Apache ZooKeeper

Terms and Concepts

  • ensemble - a ZooKeeper cluster
  • quorum - the minimum number of servers ( a majority ) that must be up and in agreement for the ensemble to operate
  • znode - a data register ( a node in ZooKeeper's data tree )
  • each server is either a follower or the leader

  • ephemeral nodes last as long as the session of the process that created them
  • clients can connect to any host; if disconnected, they reconnect to a different host
  • data is held in memory ( fast reads )
  • ordered - every transaction is stamped with a sequence number ( good for synchronization primitives )
  • writes are logged to disk before being applied in memory
  • performance: dedicated transaction log disk, lots of RAM and heap
  • self-healing - a restarted server rejoins the ensemble automatically
  • needs a majority to operate, so clusters usually have an odd number of servers
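The majority rule is simple arithmetic: an ensemble of N servers needs floor(N/2) + 1 votes, so an even-sized ensemble tolerates no more failures than the odd-sized ensemble one server smaller. A quick shell check:

```shell
# quorum size is floor(N/2) + 1; tolerated failures = N - quorum
for n in 3 4 5 6; do
  quorum=$(( n / 2 + 1 ))
  echo "ensemble=$n quorum=$quorum tolerates=$(( n - quorum )) failures"
done
```

Running this shows a 4-server ensemble tolerates only 1 failure ( same as 3 servers ), and 6 tolerates only 2 ( same as 5 ) - hence the odd-number convention.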

  • Sequential Consistency - Updates from a client will be applied in the order that they were sent.
  • Atomicity - Updates either succeed or fail. No partial results.
  • Single System Image - A client will see the same view of the service regardless of the server that it connects to.
  • Reliability - Once an update has been applied, it will persist from that time forward until a client overwrites the update.
  • Timeliness - The clients view of the system is guaranteed to be up-to-date within a certain time bound.


Standalone Setup

  • get a JDK
  • set the heap size
  • unpack ZooKeeper
  • create conf/zoo.cfg ( at minimum tickTime=2000, dataDir, clientPort )

Manual run:
java -cp zookeeper.jar:lib/slf4j-api-1.7.5.jar:lib/slf4j-log4j12-1.7.5.jar:lib/log4j-1.2.16.jar:conf org.apache.zookeeper.server.quorum.QuorumPeerMain zoo.cfg

Control script - start/stop:
bin/zkServer.sh start
bin/zkServer.sh stop
bin/zkServer.sh status
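A minimal standalone zoo.cfg sketch ( values follow the getting-started defaults; dataDir is whatever path you choose ):

```
tickTime=2000              # basic time unit in milliseconds
dataDir=/var/lib/zookeeper # snapshots, transaction logs, myid file
clientPort=2181            # port clients connect to
```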

Don't really need this - sets up dirs if they aren't configured to be created automatically

NOTE: the ZooKeeper docs include an example of monitoring via JMX

Replicated ZooKeeper

If running multiple instances on one host:
  • different config file for each instance
  • different dataDir
  • different ports ( all three: clientPort, quorum port, election port )
Add this for each instance:
/var/lib/zookeeper/myid # holds the server id for this instance; put this file in dataDir
# id is an integer between 1 and 255

Config Details

server.id=host:port:port # one line per host; every host in the ensemble needs to know every other host

- first TCP port 2888 # quorum port, followers connect to leader
- second TCP port 3888 # leader election port
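Putting those together, the replicated part of a zoo.cfg might look like this sketch ( hostnames are placeholders; initLimit/syncLimit are typical values, not mandated ones ):

```
initLimit=10   # ticks a follower may take to connect and sync to the leader
syncLimit=5    # ticks a follower may lag behind the leader
server.1=zoo1.example.com:2888:3888
server.2=zoo2.example.com:2888:3888
server.3=zoo3.example.com:2888:3888
```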

Test Client

Java test client:
java -cp zookeeper.jar:lib/slf4j-api-1.7.5.jar:lib/slf4j-log4j12-1.7.5.jar:lib/log4j-1.2.16.jar:conf:src/java/lib/jline-2.11.jar \
  org.apache.zookeeper.ZooKeeperMain -server
or:
bin/zkCli.sh -server

C test client:
make cli_st # build single-threaded
make cli_mt # build multithreaded


ZK Shell

- connect with the client above
create /zk_test my_data # create znode that holds "my_data"
get /zk_test # check data for this znode
set /zk_test junk # set znode value
delete /zk_test # delete znode

ZooKeeper Commands ( The Four Letter Words )

echo ruok | nc localhost 2181
  • conf New in 3.3.0: Print details about serving configuration.
  • cons New in 3.3.0: List full connection/session details for all clients connected to this server.
  • crst New in 3.3.0: Reset connection/session statistics for all connections.
  • dump Lists the outstanding sessions and ephemeral nodes. This only works on the leader.
  • envi Print details about serving environment
  • ruok Tests if server is running in a non-error state. May or may not have joined the quorum.
  • srst Reset server statistics.
  • srvr New in 3.3.0: Lists full details for the server.
  • stat Lists brief details for the server and connected clients.
  • wchs New in 3.3.0: Lists brief information on watches for the server.
  • wchc New in 3.3.0: Lists detailed information on watches for the server, by session. WARNING - could be expensive (impact performance)
  • wchp New in 3.3.0: Lists detailed information on watches for the server, by path. WARNING - could be expensive (impact performance)
  • mntr New in 3.4.0: Outputs a list of variables that could be used for monitoring the health of the cluster.


AdminServer

  • embedded Jetty server, GUI for the four letter ZooKeeper commands
  • New in 3.5.0 ( ALPHA version )
  • enabled by default; disable by setting zookeeper.admin.enableServer=false
  • use commands like this: http://localhost:8080/commands/stat
  • commands listed here: http://localhost:8080/commands
  • options: zookeeper.admin.enableServer, zookeeper.admin.serverPort, zookeeper.admin.commandURL

Simple API

  • create # creates a node at a location in the tree
  • delete # deletes a node
  • exists # tests if a node exists at a location
  • get data # reads the data from a node
  • set data # writes data to a node
  • get children # retrieves a list of children of a node
  • sync # waits for data to be propagated
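A pseudocode sketch of a typical call sequence against that API ( method names follow the list above; connection handling, watches, and ACLs omitted ):

```
zk = connect("localhost:2181")
if not zk.exists("/app"):
    zk.create("/app", "v1")
data = zk.getData("/app")
zk.setData("/app", "v2")
children = zk.getChildren("/app")
zk.delete("/app")
```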

Maintenance, Monitoring, more configs

dataDir # data snapshots, myid file, transaction logs ( default )
dataLogDir # different dir for transaction logs ( for low latency )

Clear old snapshots and log files ( retain 'count' backups ):
java -cp zookeeper.jar:lib/slf4j-api-1.7.5.jar:lib/slf4j-log4j12-1.7.5.jar:lib/log4j-1.2.16.jar:conf org.apache.zookeeper.server.PurgeTxnLog <dataDir> <snapDir> -n <count>

autopurge.snapRetainCount # number of recent snapshots ( and matching logs ) to retain when auto purging
autopurge.purgeInterval # purge interval in hours; 0 disables auto purge
conf/log4j.properties # log rolling can be setup here, this file needs to be in the bin dir or on the class path
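A zoo.cfg sketch enabling auto purge ( the values here are illustrative, not defaults ):

```
autopurge.snapRetainCount=3 # keep the 3 most recent snapshots and their logs
autopurge.purgeInterval=24  # run the purge task every 24 hours
```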


  • monitor with JMX and with command port ( 4 letter words )
  • if the db is corrupt, delete the files in these dirs: dataDir/version-2 and dataLogDir/version-2
  • recovery: combine latest snapshot with transaction log


Observers

  • same as a normal member but doesn't vote ( fewer voting hosts means faster voting / writing )
  • only hear the results of votes, not the agreement protocol that leads up to them
  • clients may connect to them and send read and write requests to them
  • helps with scaling, and cluster availability isn't harmed when they fail
  • note: instances in other datacenters could create false positives if they vote
peerType=observer # add this only in the observer's config file
server.1=localhost:2181:3181:observer # add this in every config file

Cluster Setup Example - Single Host with ActiveMQ


tar xvfz zookeeper-3.4.6.tar.gz
tar xvfz apache-activemq-5.11.1-bin.tar.gz
Edit each config file:

vi conf/zoo_instance1.cfg
vi conf/zoo_instance2.cfg
vi conf/zoo_instance3.cfg
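The contents of the instance files didn't make it into these notes; a sketch of what conf/zoo_instance1.cfg might look like ( instances 2 and 3 change only dataDir and clientPort, matching the directories and ports used below ):

```
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/var/tmp/zookeeper1
clientPort=2181
server.1=localhost:2888:3888
server.2=localhost:2889:3889
server.3=localhost:2890:3890
```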

Setup directories and ID files:
mkdir /var/tmp/zookeeper1
mkdir /var/tmp/zookeeper2
mkdir /var/tmp/zookeeper3
echo 1 > /var/tmp/zookeeper1/myid
echo 2 > /var/tmp/zookeeper2/myid
echo 3 > /var/tmp/zookeeper3/myid

Start up the servers:

bin/zkServer.sh start conf/zoo_instance1.cfg
bin/zkServer.sh start conf/zoo_instance2.cfg
bin/zkServer.sh start conf/zoo_instance3.cfg

Check their status:

echo stat | nc localhost 2181
echo stat | nc localhost 2182
echo stat | nc localhost 2183

ActiveMQ Setup

WARNING - LevelDB is deprecated, we are keeping this example for now but don't use this to setup ActiveMQ

Fire up an initial instance to test it:

cd apache-activemq-5.11.1
chmod 755 bin/activemq
bin/activemq start
bin/activemq stop

Configure it:

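The XML snippet that belonged here was lost; a sketch of the replicatedLevelDB persistenceAdapter that goes in each instance's conf/activemq.xml ( attribute values are placeholders; kept only for reference, since LevelDB is deprecated per the warning above ):

```xml
<persistenceAdapter>
  <replicatedLevelDB
      directory="activemq-data"
      replicas="3"
      bind="tcp://0.0.0.0:0"
      zkAddress="localhost:2181,localhost:2182,localhost:2183"
      zkPath="/activemq/leveldb-stores"/>
</persistenceAdapter>
```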

Create instances:

bin/activemq create instance1
bin/activemq create instance2
bin/activemq create instance3

Configure web ports:

vi instance1/conf/jetty.xml
<property name="port" value="8161"/>
vi instance2/conf/jetty.xml
<property name="port" value="8162"/>
vi instance3/conf/jetty.xml
<property name="port" value="8163"/>

Startup the instances:

cd instance1/
bin/instance1 start
cd ../instance2/
bin/instance2 start
cd ../instance3/
bin/instance3 start