
APACHE ZOOKEEPER TUTORIAL

WHAT IS ZOOKEEPER

  • Distributed, open-source coordination service for distributed applications.
  • It exposes a simple set of primitives to implement higher level services for synchronization, configuration maintenance, and groups and naming.
  • It is designed to be easy to program and use.
  • It runs in Java and has bindings for Java, Python and C.
  • Used at Yahoo, Twitter, Netflix and Facebook.

PROBLEM/SOLUTION

  • Coordination services: the integration of, and communication between, services in a distributed environment.
  • Coordination services are hard to get right. They are especially prone to errors such as race conditions and deadlock.
  • Race condition: two or more systems try to perform the same task at the same time, so the result depends on timing.
  • Deadlock: two or more operations each wait for the other to finish, so none of them can proceed.
  • ZooKeeper relieves distributed applications of the responsibility of implementing coordination services from scratch.
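A race condition is easiest to see on a single machine. The following is a minimal sketch, not ZooKeeper code: the names SharedCounter and run_workers are made up for illustration, and threads in one process stand in for the systems of a distributed application.

```python
import threading


class SharedCounter:
    """A counter that several threads update at the same time."""

    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment(self):
        # The lock turns read-modify-write into one step as seen by other
        # threads. Without it, two threads could read the same old value
        # and one update would be lost (a race condition).
        with self._lock:
            self.value += 1


def run_workers(worker_count, increments_per_worker):
    """Start worker threads that all increment one shared counter."""
    counter = SharedCounter()

    def work():
        for _ in range(increments_per_worker):
            counter.increment()

    threads = [threading.Thread(target=work) for _ in range(worker_count)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return counter.value


print(run_workers(4, 1000))  # 4000: no update is lost while the lock is held
```

Locks bring their own risk: if two workers each hold one lock and wait for the lock held by the other, the result is a deadlock. Across machines there is no shared in-process lock, which is why a coordination service is needed.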

WHAT IS A DISTRIBUTED SYSTEM

  • Multiple computer systems working on a single problem
  • It is a network that consists of autonomous computers that are connected using a distribution middleware.

  • Key features: concurrent, resource sharing, independent, global, greater fault tolerance, and a much better price/performance ratio.
  • Key goals: transparency, openness, reliability, performance, scalability.
  • Challenges: security, faults, coordination, and resource sharing.

COORDINATION CHALLENGE?

  • Why is coordination in a distributed system a hard problem?
  • Coordination or configuration management is needed for a distributed application that has many systems.
  • A master node is where the cluster data is stored.
  • Worker nodes (or slave nodes) get the data from this master node.
  • The master node is a single point of failure.
  • Synchronization is not easy.
  • Careful design and implementation are needed.

APACHE ZOOKEEPER

  • Apache Software Foundation project
  • Open source
  • Answer to the various coordination problems
  • Originally implemented at Yahoo
  • Implemented in Java
  • It is a centralized coordination service
  • ZooKeeper ensemble: cluster of servers

ZOOKEEPER ARCHITECTURE

IMPORTANT COMPONENTS

Leader & Follower

Request Processor

  • Active in the leader node and responsible for processing write requests.
  • After processing, it sends the changes to the follower nodes.

Atomic Broadcast

  • Present in both the leader node and the follower nodes.
  • It is responsible for sending the changes to other nodes.

In-memory Database (Replicated Database)

  • It is responsible for storing the data in ZooKeeper.
  • Every node contains its own copy of the database.
  • Data is also written to the file system, which provides recoverability in case of any problems with the cluster.
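The idea of an in-memory database backed by the file system can be sketched as follows. This is a toy model under stated assumptions, not ZooKeeper's actual on-disk format: ReplicaStore, the JSON log lines and the example paths are all invented for illustration.

```python
import json
import os
import tempfile


class ReplicaStore:
    """Toy in-memory store that appends every write to a log file."""

    def __init__(self, log_path):
        self.log_path = log_path
        self.data = {}
        self._replay()

    def _replay(self):
        # On startup, rebuild the in-memory state from the log, if any.
        if not os.path.exists(self.log_path):
            return
        with open(self.log_path, "r", encoding="utf-8") as log:
            for line in log:
                entry = json.loads(line)
                self.data[entry["path"]] = entry["value"]

    def write(self, path, value):
        # Persist the change first, then apply it in memory.
        with open(self.log_path, "a", encoding="utf-8") as log:
            log.write(json.dumps({"path": path, "value": value}) + "\n")
            log.flush()
            os.fsync(log.fileno())
        self.data[path] = value

    def read(self, path):
        # Reads are served from memory only.
        return self.data[path]


def demo_recovery():
    """Write two values, simulate a restart, and read the recovered value."""
    with tempfile.TemporaryDirectory() as tmp:
        log_path = os.path.join(tmp, "txn.log")
        first = ReplicaStore(log_path)
        first.write("/config/db_url", "db.example.internal")
        first.write("/config/db_url", "db2.example.internal")
        recovered = ReplicaStore(log_path)  # a fresh process would do this
        return recovered.read("/config/db_url")


print(demo_recovery())  # db2.example.internal
```

Reads never touch the disk, which keeps them fast, while the log makes it possible to rebuild the same state after a crash.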

OTHER COMPONENTS

Client:

  • One of the nodes in our distributed application cluster.
  • Accesses information from the server.
  • Every client sends a message to the server at regular intervals to let the server know that the client is alive.

Server:

  • Provides all the services to clients.
  • Gives an acknowledgement to the client.

Ensemble:

  • Group of ZooKeeper servers.
  • The minimum number of servers required to form a replicated ensemble is 3, because a majority of the servers must be available for the ensemble to keep working.
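The reason for the number 3 becomes clear from the majority arithmetic. The helper names quorum_size and tolerated_failures below are hypothetical. The sketch only assumes that an ensemble needs a strict majority of its servers to be up.

```python
def quorum_size(ensemble_size):
    """Smallest number of servers that forms a strict majority."""
    if ensemble_size < 1:
        raise ValueError("an ensemble needs at least one server")
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size):
    """How many servers can fail while a majority is still available."""
    return ensemble_size - quorum_size(ensemble_size)


for size in (1, 2, 3, 4, 5):
    print(size, quorum_size(size), tolerated_failures(size))
# 1 1 0
# 2 2 0
# 3 2 1
# 4 3 1
# 5 3 2
```

Three servers is the smallest ensemble that survives the loss of one server. A fourth server does not add fault tolerance over three, which is why odd ensemble sizes are the usual choice.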

ZOOKEEPER DATA MODEL

ZNODES

Every node in a ZooKeeper tree is referred to as a znode.

Znodes maintain a stat structure that includes:

  • Version numbers
  • ACL
  • Timestamp
  • Data length
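Version numbers are what make safe concurrent updates possible: a writer can say "only apply this change if the znode is still at the version I read". The sketch below models that idea with a made-up Znode class and BadVersionError exception. It is not the client API, although it follows the ZooKeeper convention that an expected version of -1 matches any version.

```python
import time
from dataclasses import dataclass, field


class BadVersionError(Exception):
    """Raised when the caller's expected version is out of date."""


def _now_ms():
    return int(time.time() * 1000)


@dataclass
class Znode:
    data: bytes = b""
    version: int = 0  # bumped on every data change
    mtime_ms: int = field(default_factory=_now_ms)

    @property
    def data_length(self):
        return len(self.data)

    def set_data(self, data, expected_version=-1):
        # -1 means "do not check the version".
        if expected_version != -1 and expected_version != self.version:
            raise BadVersionError(
                f"expected version {expected_version}, found {self.version}"
            )
        self.data = data
        self.version += 1
        self.mtime_ms = _now_ms()
        return self.version


node = Znode(b"v1")
print(node.set_data(b"v2", expected_version=0))  # 1
try:
    node.set_data(b"v3", expected_version=0)  # stale: the node is at 1 now
except BadVersionError as error:
    print("rejected:", error)
```

The second writer is rejected instead of silently overwriting the first change, so it can re-read the znode and retry.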

Types of znodes:

  • Persistent: alive until they are explicitly deleted.
  • Ephemeral: alive only as long as the client session that created them is active.
  • Sequential: either persistent or ephemeral, with a sequence number appended to the name.
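The three znode types can be modelled in a few lines. ToyZnodeTree is a hypothetical in-memory stand-in, not a ZooKeeper client: it skips parent checks, data and ACLs, and only shows lifetime and naming. The 10-digit zero-padded suffix mirrors how ZooKeeper names sequential znodes.

```python
class ToyZnodeTree:
    """Tracks which paths exist and which session owns each ephemeral one."""

    def __init__(self):
        self.nodes = {}      # path -> owning session id, None if persistent
        self._counters = {}  # parent path -> next sequence number

    def create(self, path, session_id=None, ephemeral=False, sequential=False):
        if ephemeral and session_id is None:
            raise ValueError("an ephemeral znode needs an owning session")
        if sequential:
            # The counter belongs to the parent, so siblings share it.
            parent = path.rsplit("/", 1)[0] or "/"
            sequence = self._counters.get(parent, 0)
            self._counters[parent] = sequence + 1
            path = f"{path}{sequence:010d}"
        if path in self.nodes:
            raise KeyError(f"{path} already exists")
        self.nodes[path] = session_id if ephemeral else None
        return path

    def close_session(self, session_id):
        # Ephemeral znodes disappear together with their session.
        owned = [p for p, owner in self.nodes.items() if owner == session_id]
        for path in owned:
            del self.nodes[path]


tree = ToyZnodeTree()
tree.create("/app")  # persistent
print(tree.create("/app/worker-", session_id=1, ephemeral=True, sequential=True))
print(tree.create("/app/worker-", session_id=2, ephemeral=True, sequential=True))
tree.close_session(1)
print(sorted(tree.nodes))
# /app/worker-0000000000
# /app/worker-0000000001
# ['/app', '/app/worker-0000000001']
```

The persistent znode survives, while the ephemeral znode of the closed session is gone.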

SESSIONS & WATCHES

Sessions

  • Requests in a session are executed in FIFO order.
  • Once a session is established, a session ID is assigned to the client.
  • The client sends heartbeats to keep the session valid.
  • Session timeouts are usually expressed in milliseconds.
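The relation between heartbeats and the session timeout can be sketched with an explicit clock. ToySession and SessionExpiredError are invented names, and the time is passed in as a plain number of milliseconds so the example runs without waiting. In real ZooKeeper it is the server that tracks and expires sessions.

```python
class SessionExpiredError(Exception):
    """Raised when a heartbeat arrives after the session has timed out."""


class ToySession:
    def __init__(self, session_id, timeout_ms, now_ms):
        self.session_id = session_id
        self.timeout_ms = timeout_ms
        self.last_heartbeat_ms = now_ms

    def is_expired(self, now_ms):
        # Expired once more than timeout_ms passed since the last heartbeat.
        return now_ms - self.last_heartbeat_ms > self.timeout_ms

    def heartbeat(self, now_ms):
        # A late heartbeat cannot revive a session that already expired.
        if self.is_expired(now_ms):
            raise SessionExpiredError(f"session {self.session_id} expired")
        self.last_heartbeat_ms = now_ms


session = ToySession(session_id=1, timeout_ms=6000, now_ms=0)
session.heartbeat(now_ms=2000)
print(session.is_expired(now_ms=7000))  # False: 5000 ms since last heartbeat
print(session.is_expired(now_ms=9000))  # True: 7000 ms since last heartbeat
```

A client that keeps sending heartbeats within the timeout stays connected. One that goes silent for longer loses its session.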

Watches

  • Watches are a mechanism for clients to get notifications about changes in ZooKeeper.
  • Clients can set watches while reading a particular znode.
  • Znode changes are modifications of the data associated with the znode or changes in the znode’s children.
  • Watches are triggered only once.
  • If the session expires, its watches are also removed.
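The "triggered only once" rule is the part that most often surprises newcomers. Here is a minimal sketch of a one-shot watch. WatchedStore is a hypothetical in-memory class, not the ZooKeeper client, and it only covers data watches on a single path.

```python
class WatchedStore:
    """Key-value store where a read can leave a one-shot watch behind."""

    def __init__(self):
        self.data = {}
        self._watches = {}  # path -> callbacks waiting for the next change

    def get(self, path, watch=None):
        # Watches are registered as part of a read.
        if watch is not None:
            self._watches.setdefault(path, []).append(watch)
        return self.data.get(path)

    def set(self, path, value):
        self.data[path] = value
        # pop() removes the watches while firing them: each fires once.
        for callback in self._watches.pop(path, []):
            callback(path)


events = []
store = WatchedStore()
store.set("/config", "v1")
store.get("/config", watch=events.append)
store.set("/config", "v2")  # fires the watch
store.set("/config", "v3")  # no watch left, nothing fires
print(events)  # ['/config']
```

To keep following a znode, a client has to set a new watch each time it is notified, usually by reading the znode again inside the callback.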

POPULAR APPLICATIONS/COMPANIES USING ZOOKEEPER

  • Apache HBase: Leader election, bootstrapping, server lease management and coordination between servers.
  • Apache Solr: Leader election, configuration and other services.
  • Yahoo!: Leader election, configuration management, sharding, locking, group membership, etc.

  • Katta: Node, master and index management in the grid.
  • Eclipse Communication Framework: Its abstract discovery services.
  • Deepdyve: Manage server state, control index deployment and other tasks.
  • AdroitLogic’s UltraESB: Clustering support and the automated round-robin-restart of the complete cluster.
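Leader election shows up in several of the entries above. A common ZooKeeper recipe has every candidate create an ephemeral sequential znode and treats the candidate with the lowest sequence number as the leader. The sketch below shows only that selection step on plain strings. The function names and the "n_" prefix are assumptions for illustration, and no ZooKeeper server is involved.

```python
def sequence_number(znode_name):
    """Read the 10-digit sequence suffix of a sequential znode name."""
    return int(znode_name[-10:])


def elect_leader(candidates):
    """Pick the candidate znode with the lowest sequence number."""
    if not candidates:
        raise ValueError("no candidates to elect from")
    return min(candidates, key=sequence_number)


candidates = ["n_0000000007", "n_0000000003", "n_0000000012"]
print(elect_leader(candidates))  # n_0000000003

# When the leader's session ends, its ephemeral znode vanishes and the
# next lowest candidate takes over.
candidates.remove("n_0000000003")
print(elect_leader(candidates))  # n_0000000007
```

Because the candidate znodes are ephemeral, a crashed leader is removed automatically, and the remaining candidates can agree on the new leader without talking to each other directly.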
