open-consul/website/source/intro/vs-zookeeper.html.markdown

45 lines
2.3 KiB
Markdown

---
layout: "intro"
page_title: "Serf vs. ZooKeeper, doozerd, etcd"
sidebar_current: "vs-other-zk"
---
# Serf vs. ZooKeeper, doozerd, etcd
ZooKeeper, doozerd and etcd are all similar in their client/server
architecture. All three have server nodes that require a quorum of
nodes to operate (usually a simple majority). They are strongly consistent,
and expose various primitives that can be used through client libraries within
applications to build complex distributed systems.
Serf has a radically different architecture based on gossip and provides a
smaller feature set. Serf only provides membership, failure detection,
and user events. Serf is designed to operate under network partitions
and embraces eventual consistency. Designed as a tool, it is friendly
for both system administrators and application developers.
ZooKeeper et al. by contrast are much more complex, and cannot be used directly
as a tool. Application developers must use libraries to build the features
they need, although some libraries exist for common patterns. Most failure
detection schemes built on these systems also have intrinsic scalability issues.
Most naive failure detection schemes depend on heartbeating, which use
periodic updates and timeouts. These schemes require work linear to
the number of nodes and place the demand on a fixed number of servers.
Additionally, the failure detection window is at least as long as the timeout,
meaning that in many cases failures may not be detected for a long time.
Additionally, ZooKeeper ephemeral nodes require that many active connections
be maintained to a few nodes.
The strong consistency provided by these systems is essential for building leader
election and other types of coordination for distributed systems, but it limits
their ability to operate under network partitions. At a minimum, if a majority of
nodes are not available, writes are disallowed. Since a failure is indistinguishable
from a slow response, the performance of these systems may rapidly degrade
under certain network conditions. All of these issues can be highly
problematic when partition tolerance is needed, for example in a service
discovery layer.
Additionally, Serf is not mutually exclusive with any of these strongly
consistent systems. Instead, they can be used in combination to create systems
that are more scalable and fault tolerant, without sacrificing features.