Updating documentation for new bootstrap method

Armon Dadgar 2014-07-01 15:02:26 -07:00
parent 020802f7a5
commit 358b473e01
9 changed files with 64 additions and 71 deletions

View file

@@ -537,6 +537,7 @@ Options:
   -advertise=addr          Sets the advertise address to use
   -bootstrap               Sets server to bootstrap mode
   -bind=0.0.0.0            Sets the bind address for cluster communication
+  -bootstrap-expect=0      Sets server to expect bootstrap mode.
   -client=127.0.0.1        Sets the address to bind for client access.
                            This includes RPC, DNS and HTTP
   -config-file=foo         Path to a JSON file to read configuration from.
@@ -547,7 +548,6 @@ Options:
                            order.
   -data-dir=path           Path to a data directory to store agent state
   -dc=east-aws             Datacenter of the agent
-  -expect=0                Sets server to expect bootstrap mode.
   -join=1.2.3.4            Address of an agent to join at start time.
                            Can be specified multiple times.
   -log-level=info          Log level of the agent.
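
The updated usage text can be confirmed by asking the agent command for help; a minimal sketch of the relevant line (output heavily truncated, and the exact surrounding lines are an assumption):

```
$ consul agent -h
...
  -bootstrap-expect=0      Sets server to expect bootstrap mode.
...
```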

View file

@@ -57,8 +57,7 @@ There are several important components that `consul agent` outputs:
 * **Server**: This shows if the agent is running in the server or client mode.
   Server nodes have the extra burden of participating in the consensus quorum,
   storing cluster state, and handling queries. Additionally, a server may be
-  in "bootstrap" mode. The first server must be in this mode to allow additional
-  servers to join the cluster. Multiple servers cannot be in bootstrap mode,
+  in "bootstrap" mode. Multiple servers cannot be in bootstrap mode,
   otherwise the cluster state will be inconsistent.
 * **Client Addr**: This is the address used for client interfaces to the agent.

View file

@@ -35,11 +35,16 @@ The options below are all specified on the command-line.
   as other nodes will treat the non-routability as a failure.

 * `-bootstrap` - This flag is used to control if a server is in "bootstrap" mode. It is important that
-  no more than one server *per* datacenter be running in this mode. The initial server **must** be in bootstrap
-  mode. Technically, a server in bootstrap mode is allowed to self-elect as the Raft leader. It is important
-  that only a single node is in this mode, because otherwise consistency cannot be guaranteed if multiple
-  nodes are able to self-elect. Once there are multiple servers in a datacenter, it is generally a good idea
-  to disable bootstrap mode on all of them.
+  no more than one server *per* datacenter be running in this mode. Technically, a server in bootstrap mode
+  is allowed to self-elect as the Raft leader. It is important that only a single node is in this mode,
+  because otherwise consistency cannot be guaranteed if multiple nodes are able to self-elect.
+  It is not recommended to use this flag after a cluster has been bootstrapped.
+
+* `-bootstrap-expect` - This flag provides the number of expected servers in the datacenter.
+  Either this value should not be provided, or the value must agree with other servers in
+  the cluster. When provided, Consul waits until the specified number of servers are
+  available, and then bootstraps the cluster. This allows an initial leader to be elected
+  automatically. This cannot be used in conjunction with the `-bootstrap` flag.

 * `-bind` - The address that should be bound to for internal cluster communications.
   This is an IP address that should be reachable by all other nodes in the cluster.
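
As a rough sketch of how `-bootstrap-expect` is meant to be used (the addresses and data directory below are illustrative assumptions), every server in a three-server datacenter passes the same value and the nodes are then joined:

```
# Run on each of the three servers, substituting that server's own bind address
$ consul agent -server -bootstrap-expect 3 -data-dir /tmp/consul -bind 10.0.0.1

# From any one server, join the other two
$ consul join 10.0.0.2 10.0.0.3
```
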
@@ -148,6 +153,8 @@ definitions support being updated during a reload.

 * `bootstrap` - Equivalent to the `-bootstrap` command-line flag.

+* `bootstrap_expect` - Equivalent to the `-bootstrap-expect` command-line flag.
+
 * `bind_addr` - Equivalent to the `-bind` command-line flag.

 * `client_addr` - Equivalent to the `-client` command-line flag.
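
For the configuration-file form, here is a sketch of a server config that sets the new key alongside its neighbors (the file path and the specific values are assumptions):

```
$ cat /etc/consul.d/server.json
{
  "server": true,
  "bootstrap_expect": 3,
  "data_dir": "/var/consul",
  "bind_addr": "10.0.0.1",
  "client_addr": "127.0.0.1"
}
```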

View file

@@ -6,74 +6,62 @@ sidebar_current: "docs-guides-bootstrapping"

 # Bootstrapping a Datacenter

-When deploying Consul to a datacenter for the first time, there is an initial bootstrapping that
-must be done. Generally, the first nodes that are started are the server nodes. Remember that an
-agent can run in both client and server mode. Server nodes are responsible for running
+Before a Consul cluster can begin to service requests, it is necessary for a server node to
+be elected leader. For this reason, the first nodes that are started are generally the server nodes.
+Remember that an agent can run in both client and server mode. Server nodes are responsible for running
 the [consensus protocol](/docs/internals/consensus.html), and storing the cluster state.
 The client nodes are mostly stateless and rely on the server nodes, so they can be started easily.

-The first server that is deployed in a new datacenter must provide the `-bootstrap` [configuration
-option](/docs/agent/options.html). This option allows the server to assert leadership of the cluster
-without agreement from any other server. This is necessary because at this point, there are no other
-servers running in the datacenter! Lets call this first server `Node A`. When starting `Node A` something
-like the following will be logged:
-
-    2014/02/22 19:23:32 [INFO] consul: cluster leadership acquired
-
-Once `Node A` is running, we can start the next set of servers. There is a [deployment table](/docs/internals/consensus.html#toc_3)
-that covers various options, but it is recommended to have 3 or 5 total servers per data center.
-A single server deployment is _**highly**_ discouraged as data loss is inevitable in a failure scenario.
-
-We start the next servers **without** specifying `-bootstrap`. This is critical, since only one server
-should ever be running in bootstrap mode*. Once `Node B` and `Node C` are started, you should see a
-message to the effect of:
+The recommended way to bootstrap is to use the `-bootstrap-expect` [configuration
+option](/docs/agent/options.html). This option informs Consul of the expected number of
+server nodes, and automatically bootstraps when that many servers are available. To prevent
+inconsistencies and split-brain situations, all servers should specify the same value for `-bootstrap-expect`
+or specify no value at all. Any server that does not specify a value will not attempt to
+bootstrap the cluster.
+
+There is a [deployment table](/docs/internals/consensus.html#toc_3) that covers various options,
+but it is recommended to have 3 or 5 total servers per datacenter. A single server deployment is _**highly**_
+discouraged as data loss is inevitable in a failure scenario.
+
+Suppose we are starting a 3 server cluster. We can start `Node A`, `Node B` and `Node C`, providing
+the `-bootstrap-expect 3` flag. Once the nodes are started, you should see a message to the effect of:

     [WARN] raft: EnableSingleNode disabled, and no known peers. Aborting election.

-This indicates that the node is not in bootstrap mode, and it will not elect itself as leader.
-We can now join these machines together. Since a join operation is symmetric it does not matter
-which node initiates it. From `Node B` and `Node C` you can do the following:
+This indicates that the nodes are expecting 2 peers, but none are known yet. The servers will not elect
+themselves leader to prevent a split-brain. We can now join these machines together. Since a join operation
+is symmetric it does not matter which node initiates it. From any node you can do the following:

-    $ consul join <Node A Address>
-    Successfully joined cluster by contacting 1 nodes.
-
-Alternatively, from `Node A` you can do the following:
-
-    $ consul join <Node B Address> <Node C Address>
-    Successfully joined cluster by contacting 2 nodes.
-
-Once the join is successful, `Node A` should output something like:
-
-    [INFO] raft: Added peer 127.0.0.2:8300, starting replication
-    ....
-    [INFO] raft: Added peer 127.0.0.3:8300, starting replication
-
-Another good check is to run the `consul info` command. When run on `Node A`, you can
+    $ consul join <Node A Address> <Node B Address> <Node C Address>
+    Successfully joined cluster by contacting 3 nodes.
+
+Once the join is successful, one of the nodes will output something like:
+
+    [INFO] consul: adding server foo (Addr: 127.0.0.2:8300) (DC: dc1)
+    [INFO] consul: adding server bar (Addr: 127.0.0.1:8300) (DC: dc1)
+    [INFO] consul: Attempting bootstrap with nodes: [127.0.0.3:8300 127.0.0.2:8300 127.0.0.1:8300]
+    ...
+    [INFO] consul: cluster leadership acquired
+
+As a sanity check, the `consul info` command is a useful tool. It can be used to
 verify `raft.num_peers` is now 2, and you can view the latest log index under `raft.last_log_index`.
-When running `consul info` on `Node B` and `Node C` you should see `raft.last_log_index`
+When running `consul info` on the followers, you should see `raft.last_log_index`
 converge to the same value as the leader begins replication. That value represents the last
 log entry that has been stored on disk.

-This indicates that `Node B` and `Node C` have been added as peers. At this point,
-all three nodes see each other as peers, `Node A` is the leader, and replication
-should be working.
-
-The final step is to remove the `-bootstrap` flag. This is important since we don't
-want the node to be able to make unilateral decisions in the case of a failure of the
-other two nodes. To do this, we send a `SIGINT` to `Node A` to allow it to perform
-a graceful leave. Then we remove the `-bootstrap` flag and restart the node. The node
-will need to rejoin the cluster, since the graceful exit leaves the cluster. Any transactions
-that took place while `Node A` was offline will be replicated and the node will catch up.
-
 Now that the servers are all started and replicating to each other, all the remaining
 clients can be joined. Clients are much easier, as they can be started and perform
 a `join` against any existing node. All nodes participate in a gossip protocol to
 perform basic discovery, so clients will automatically find the servers and register
 themselves.

-<div class="alert alert-block alert-info">
-* If you accidentally start another server with the flag set, do not fret.
-Shutdown the node, and remove the `raft/` folder from the data directory. This will
-remove the bad state caused by being in `-bootstrap` mode. Then restart the
-node and join the cluster normally.
-</div>
+It should be noted that it is not strictly necessary to start the server nodes
+before the clients; however, most operations will fail until the servers are available.
+
+## Manual Bootstrapping
+
+In versions of Consul previous to 0.4, bootstrapping was a more manual process.
+For a guide on using the `-bootstrap` flag directly, see the [manual bootstrapping guide](/docs/guides/manual-bootstrap.html).
+This is not recommended, as it is more error prone than automatic bootstrapping.
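
The guide leaves the server start commands implicit; as a rough sketch (the node name, data directory, and address placeholder below are assumptions), each of `Node A`, `Node B` and `Node C` would be launched along these lines before the `consul join` step:

```
$ consul agent -server -bootstrap-expect 3 \
    -data-dir /tmp/consul -node=node-a -bind=<Node A Address>
```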

View file

@@ -18,7 +18,7 @@ add or remove a server <a href="/docs/guides/servers.html">see this page</a>.
 </div>

 If you had only a single server and it has failed, simply restart it.
-Note that a single server configuration requires the `-bootstrap` flag.
+Note that a single server configuration requires the `-bootstrap` or `-bootstrap-expect 1` flag.
 If that server cannot be recovered, you need to bring up a new server.
 See the [bootstrapping guide](/docs/guides/bootstrapping.html). Data loss
 is inevitable, since data was not replicated to any other servers. This
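
An illustrative restart command for the single-server case, using the newer flag form (the data directory is an assumption):

```
$ consul agent -server -bootstrap-expect 1 -data-dir /var/consul
```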

View file

@@ -18,8 +18,7 @@ to first add the new nodes and then remove the old nodes.

 ## Adding New Servers

-Adding new servers is generally straightforward. After the initial server, no further
-servers should ever be started with the `-bootstrap` flag. Instead, simply start the new
+Adding new servers is generally straightforward. Simply start the new
 server with the `-server` flag. At this point, the server will not be a member of
 any cluster, and should emit something like:
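
A sketch of the sequence (the data directory and the joined address are placeholders): start the new server with no bootstrap flag at all, then join it to any existing member:

```
$ consul agent -server -data-dir /tmp/consul
$ consul join <Existing Member Address>
```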

View file

@@ -20,7 +20,8 @@ will be part of the cluster.
 For simplicity, we'll run a single Consul agent in server mode right now:

 ```
-$ consul agent -server -bootstrap -data-dir /tmp/consul
+$ consul agent -server -bootstrap-expect 1 -data-dir /tmp/consul
+==> WARNING: BootstrapExpect Mode is specified as 1; this is the same as Bootstrap mode.
 ==> WARNING: Bootstrap mode enabled! Do not enable unless necessary
 ==> WARNING: It is highly recommended to set GOMAXPROCS higher than 1
 ==> Starting Consul agent...
@@ -67,15 +68,13 @@ joining clusters in the next section.

 ```
 $ consul members
-Armons-MacBook-Air 10.1.10.38:8301 alive role=consul,dc=dc1,vsn=1,vsn_min=1,vsn_max=1,port=8300,bootstrap=1
+Node                Address          Status  Type    Build  Protocol
+Armons-MacBook-Air  10.1.10.38:8301  alive   server  0.3.0  2
 ```

 The output shows our own node, the address it is running on, its
-health state, and some metadata associated with the node. Some important
-metadata keys to recognize are the `role` and `dc` keys. These tell you
-the service name and the datacenter that member is within. These can be
-used to lookup nodes and services using the DNS interface, which is covered
-shortly.
+health state, its role in the cluster, as well as some versioning information.
+Additional metadata can be viewed by providing the `-detailed` flag.

 The output from the `members` command is generated based on the
 [gossip protocol](/docs/internals/gossip.html) and is eventually consistent.

View file

@@ -34,7 +34,7 @@ will act as our server in this cluster. We're still not making a cluster
 of servers.

 ```
-$ consul agent -server -bootstrap -data-dir /tmp/consul \
+$ consul agent -server -bootstrap-expect 1 -data-dir /tmp/consul \
   -node=agent-one -bind=172.20.20.10
 ...
 ```
@@ -70,9 +70,10 @@ run `consul members` against each agent, you'll see that both agents now
 know about each other:

 ```
-$ consul members
-agent-one  172.20.20.10:8301  alive  role=consul,dc=dc1,vsn=1,vsn_min=1,vsn_max=1,port=8300,bootstrap=1
-agent-two  172.20.20.11:8301  alive  role=node,dc=dc1,vsn=1,vsn_min=1,vsn_max=1
+$ consul members -detailed
+Node       Address            Status  Tags
+agent-one  172.20.20.10:8301  alive   role=consul,dc=dc1,vsn=2,vsn_min=1,vsn_max=2,port=8300,bootstrap=1
+agent-two  172.20.20.11:8301  alive   role=node,dc=dc1,vsn=2,vsn_min=1,vsn_max=2
 ```

 <div class="alert alert-block alert-info">

View file

@@ -43,7 +43,7 @@ $ echo '{"service": {"name": "web", "tags": ["rails"], "port": 80}}' \
 Now, restart the agent we're running, providing the configuration directory:

 ```
-$ consul agent -server -bootstrap -data-dir /tmp/consul -config-dir /etc/consul.d
+$ consul agent -server -bootstrap-expect 1 -data-dir /tmp/consul -config-dir /etc/consul.d
 ==> Starting Consul agent...
 ...
 [INFO] agent: Synced service 'web'
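
Once the agent logs that the service is synced, the registration can be checked through the agent's DNS and HTTP interfaces; a sketch assuming the default ports (8600 for DNS, 8500 for HTTP) on localhost:

```
$ dig @127.0.0.1 -p 8600 web.service.consul

$ curl http://127.0.0.1:8500/v1/catalog/service/web
```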