ZooKeeper primer and use cases

cr4zy · on Sept 3, 2014

I avoided ZooKeeper for a long time due to its apparent complexity, but am now very glad I finally took the plunge. There's a good CloudFormation script here (https://github.com/thefactory/cloudformation-zookeeper). I had to open up the load balancer to get a public endpoint like zk.yourdomain.com.

etcd and others are quickly coming onto the scene, but ZooKeeper is still the right choice if you want full-featured, robust software with a good ecosystem of tools and knowledge, like Curator, Exhibitor, CloudFormation, etc.

astrodust · on Sept 3, 2014

ZooKeeper's documentation is so exceptionally disorganized that they really should be given an award of some sort.

batbomb · on Sept 3, 2014

I started a pretty involved REST implementation (borrowing a few things from contrib/rest), complete with heartbeating and a RESTful events resource, which has since stalled as I got busy with other things. I hope to return to it soon.

My plan was to increase the tick time quite a bit and use it for global (in the literal sense) service discovery and coordination.

armon · on Sept 4, 2014

Full disclosure, I'm a HashiCorp employee, but another system to look at is Consul: http://www.consul.io. It is compared to ZooKeeper more extensively here: http://www.consul.io/intro/vs/zookeeper.html.

At a high level, ZooKeeper gives you low-level distributed primitives. While these are all you need to build leader election, service discovery, monitoring, etc., it does require you to actually build all of that yourself.
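To make "build it yourself" concrete, here is a minimal sketch of the usual leader election recipe: every participant creates an ephemeral sequential node, and whoever owns the lowest-numbered node is the leader. `FakeEnsemble` and all of its methods are hypothetical in-memory stand-ins written for this illustration, not a real ZooKeeper client API, and the `/election/n_` path is just an example name.

```python
import itertools


class FakeEnsemble:
    """Hypothetical in-memory stand-in for a ZooKeeper ensemble (not a real client)."""

    def __init__(self):
        self._counter = itertools.count()
        self._nodes = {}  # node name -> id of the session that created it

    def create_ephemeral_sequential(self, prefix, session_id):
        # Mimic the server appending a zero-padded, increasing sequence number.
        name = f"{prefix}{next(self._counter):010d}"
        self._nodes[name] = session_id
        return name

    def close_session(self, session_id):
        # Ephemeral nodes disappear when the session that created them ends.
        self._nodes = {n: s for n, s in self._nodes.items() if s != session_id}

    def children(self):
        # Zero padding makes lexicographic order match numeric order.
        return sorted(self._nodes)

    def owner(self, name):
        return self._nodes[name]


def current_leader(ensemble):
    """Return the session that owns the lowest-numbered node, or None."""
    children = ensemble.children()
    return ensemble.owner(children[0]) if children else None


zk = FakeEnsemble()
for worker in ("a", "b", "c"):
    zk.create_ephemeral_sequential("/election/n_", worker)

print(current_leader(zk))  # a
zk.close_session("a")      # the leader's session dies
print(current_leader(zk))  # b
```

A real implementation also has to watch the next-lowest node, handle session expiry, and cope with reconnects, which is the part you end up writing (or borrowing from a library) yourself.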

Consul goes the route of providing high-level features instead of low-level primitives (although they are all there, and usable via an HTTP API). It makes it easy as an operations person to deploy, and just as easy as an app developer to integrate and use.

eikenberry · on Sept 4, 2014

It's been about 6 months since I worked with ZooKeeper, but here are a couple of lessons I learned along the way:

1. Only use it from Java (the JVM). The C client library, which most other languages wrap to work with ZooKeeper, is very buggy. It had lots of corner cases that would crash the application using it.

2. If you are on AWS, make sure your ZooKeeper instances have static IPs (use EIPs or ENIs). We had a cluster in AWS classic with the cluster defined by domain names. At startup ZooKeeper would look up the IP for each domain and cache it, never checking it again. This meant you could only lose 1 node in a 5-node cluster, as you needed the other spare to handle the missing node during a rolling restart. It was a PITA.
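The arithmetic behind "only lose 1 node in a 5-node cluster" can be sketched as follows. This assumes the standard majority-quorum rule; the function names are made up for this illustration.

```python
def quorum_size(ensemble_size):
    """Smallest strict majority of the ensemble."""
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size):
    """How many servers can be down while a majority is still up."""
    return ensemble_size - quorum_size(ensemble_size)


def unplanned_losses_allowed(ensemble_size, down_for_restart=1):
    """Failures you can absorb while servers are down for a rolling restart."""
    return max(tolerated_failures(ensemble_size) - down_for_restart, 0)


print(quorum_size(5))               # 3
print(tolerated_failures(5))        # 2
print(unplanned_losses_allowed(5))  # 1
```

So a 5-node ensemble nominally tolerates two failures, but if replacing a dead node forces a rolling restart that takes one healthy server down at a time, only one of those two slots is really available for unplanned losses.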

Personally I recommend etcd now if you need functionality of this sort, particularly if you are not using Java.

atombender · on Sept 4, 2014

etcd looks very nice, but it's obvious that it's still under heavy development; for the latest release they deprecated the very useful high-level primitives (locking and election, the former of which is buggy):

https://github.com/coreos/etcd/blob/master/Documentation/mod...

olavgg · on Sept 3, 2014

Very nice summary of ZooKeeper. It is really an awesome tool for distributed systems.