From dd33777d177971c581b4ad42b7b51e96e541dd2a Mon Sep 17 00:00:00 2001 From: Jonathan Ballet Date: Sat, 31 Jul 2021 00:05:15 +0200 Subject: [PATCH] Improve "Integrated Storage" documentation (#12200) * Improve "Integrated Storage" documentation * add missing markup * add more links to the configuration pages * Improve the Raft Storage configuration page * More markup * Improve the "High Availability" documentation * More links to the configuration pages * More links * even more links --- website/content/docs/concepts/ha.mdx | 116 ++++++++++------- .../concepts/integrated-storage/index.mdx | 120 ++++++++++-------- .../docs/configuration/storage/raft.mdx | 69 +++++----- 3 files changed, 176 insertions(+), 129 deletions(-) diff --git a/website/content/docs/concepts/ha.mdx b/website/content/docs/concepts/ha.mdx index 78bfd958c..6d7fd93bb 100644 --- a/website/content/docs/concepts/ha.mdx +++ b/website/content/docs/concepts/ha.mdx @@ -21,21 +21,24 @@ information is also available on the To be highly available, one of the Vault server nodes grabs a lock within the data store. The successful server node then becomes the active node; all other nodes become standby nodes. At this point, if the standby nodes receive a -request, they will either forward the request or redirect the client depending -on the current configuration and state of the cluster -- see the sections below -for details. Due to this architecture, HA does not enable increased -scalability. In general, the bottleneck of Vault is the data store itself, not -Vault core. For example: to increase the scalability of Vault with Consul, you -would generally scale Consul instead of Vault. +request, they will either [forward the request](#request-forwarding) or +[redirect the client](#client-redirection) depending on the current +configuration and state of the cluster -- see the sections below for details. +Due to this architecture, HA does not enable increased scalability. In general, +the bottleneck of Vault is the data store itself, not Vault core. For example: +to increase the scalability of Vault with Consul, you would generally scale +Consul instead of Vault. Certain storage backends can support high availability mode, which enable them to store both Vault's information in addition to the HA lock. However, Vault also supports split data/HA mode, whereby the lock value and the rest of the -data live separately. This can be done by specifying both the `storage` and -`ha_storage` stanzas in the configuration file with different backends. For -instance, a Vault cluster can be set up to use Consul as the `ha_storage` to -manage the lock, and use Amazon S3 as the `storage` for all other persisted -data. +data live separately. This can be done by specifying both the +[`storage`](/docs/configuration#storage) and +[`ha_storage`](/docs/configuration#ha_storage) stanzas in the configuration file +with different backends. For instance, a Vault cluster can be set up to use +Consul as the [`ha_storage`](/docs/configuration#ha_storage) to manage the lock, +and use Amazon S3 as the [`storage`](/docs/configuration#storage) for all other +persisted data. The sections below explain the server communication patterns and each type of request handling in more detail. At a minimum, the requirements for redirection @@ -84,28 +87,36 @@ always required for all HA setups. Some HA data store drivers can autodetect the redirect address, but it is often necessary to configure it manually via a top-level value in the configuration -file. 
The key for this value is `api_addr` and the value can also be specified -by the `VAULT_API_ADDR` environment variable, which takes precedence. +file. The key for this value is [`api_addr`](/docs/configuration#api_addr) and +the value can also be specified by the `VAULT_API_ADDR` environment variable, +which takes precedence. -What the `api_addr` value should be set to depends on how Vault is set up. -There are two common scenarios: Vault servers accessed directly by clients, and -Vault servers accessed via a load balancer. +What the [`api_addr`](/docs/configuration#api_addr) value should be set to +depends on how Vault is set up. There are two common scenarios: Vault servers +accessed directly by clients, and Vault servers accessed via a load balancer. -In both cases, the `api_addr` should be a full URL including scheme -(`http`/`https`), not simply an IP address and port. +In both cases, the [`api_addr`](/docs/configuration#api_addr) should be a full +URL including scheme (`http`/`https`), not simply an IP address and port. ### Direct Access -When clients are able to access Vault directly, the `api_addr` for each -node should be that node's address. For instance, if there are two Vault nodes -`A` (accessed via `https://a.vault.mycompany.com:8200`) and `B` (accessed via -`https://b.vault.mycompany.com:8200`), node `A` would set its `api_addr` -to `https://a.vault.mycompany.com:8200` and node `B` would set its -`api_addr` to `https://b.vault.mycompany.com:8200`. +When clients are able to access Vault directly, the +[`api_addr`](/docs/configuration#api_addr) for each node should be that node's +address. For instance, if there are two Vault nodes: + +* `A`, accessed via `https://a.vault.mycompany.com:8200` +* `B`, accessed via `https://b.vault.mycompany.com:8200` + +Then node `A` would set its +[`api_addr`](/docs/configuration#api_addr) to +`https://a.vault.mycompany.com:8200` and node `B` would set its +[`api_addr`](/docs/configuration#api_addr) to +`https://b.vault.mycompany.com:8200`. This way, when `A` is the active node, any requests received by node `B` will -cause it to redirect the client to node `A`'s `api_addr` at -`https://a.vault.mycompany.com`, and vice-versa. +cause it to redirect the client to node `A`'s +[`api_addr`](/docs/configuration#api_addr) at `https://a.vault.mycompany.com`, +and vice-versa. ### Behind Load Balancers @@ -115,33 +126,42 @@ case, the Vault servers should actually be set up as described in the above section, since for redirection purposes the clients have direct access. However, if the only access to the Vault servers is via the load balancer, the -`api_addr` on each node should be the same: the address of the load -balancer. Clients that reach a standby node will be redirected back to the load -balancer; at that point hopefully the load balancer's configuration will have -been updated to know the address of the current leader. This can cause a -redirect loop and as such is not a recommended setup when it can be avoided. +[`api_addr`](/docs/configuration#api_addr) on each node should be the same: the +address of the load balancer. Clients that reach a standby node will be +redirected back to the load balancer; at that point hopefully the load +balancer's configuration will have been updated to know the address of the +current leader. This can cause a redirect loop and as such is not a recommended +setup when it can be avoided. 
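To make the two scenarios concrete, here is a minimal sketch of the relevant top-level
value (the node hostname reuses the example above; the load balancer hostname is
illustrative, not taken from this page):

```hcl
# Direct access: node A advertises its own address.
api_addr = "https://a.vault.mycompany.com:8200"

# Behind a load balancer (illustrative hostname): every node would advertise
# the load balancer's address instead.
# api_addr = "https://vault.mycompany.com:8200"
```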
### Per-Node Cluster Listener Addresses -Each `listener` block in Vault's configuration file contains an `address` value -on which Vault listens for requests. Similarly, each `listener` block can -contain a `cluster_address` on which Vault listens for server-to-server cluster -requests. If this value is not set, its IP address will be automatically set to -same as the `address` value, and its port will be automatically set to the same -as the `address` value plus one (so by default, port `8201`). +Each [`listener`](/docs/configuration/listener) block in Vault's configuration +file contains an [`address`](/docs/configuration/listener/tcp#address) value on +which Vault listens for requests. Similarly, each +[`listener`](/docs/configuration/listener) block can contain a +[`cluster_address`](/docs/configuration/listener/tcp#cluster_address) on which +Vault listens for server-to-server cluster requests. If this value is not set, +its IP address will be automatically set to same as the +[`address`](/docs/configuration/listener/tcp#address) value, and its port will +be automatically set to the same as the +[`address`](/docs/configuration/listener/tcp#address) value plus one (so by +default, port `8201`). Note that _only_ active nodes have active listeners. When a node becomes active it will start cluster listeners, and when it becomes standby it will stop them. ### Per-Node Cluster Address -Similar to the `api_addr`, `cluster_addr` is the value that each node, if -active, should advertise to the standbys to use for server-to-server +Similar to the [`api_addr`](/docs/configuration#api_addr), +[`cluster_addr`](/docs/configuration#cluster_addr) is the value that each node, +if active, should advertise to the standbys to use for server-to-server communications, and lives as a top-level value in the configuration file. On each node, this should be set to a host name or IP address that a standby can -use to reach one of that node's `cluster_address` values set in the `listener` -blocks, including port. (Note that this will always be forced to `https` since -only TLS connections are used between servers.) +use to reach one of that node's +[`cluster_address`](/docs/configuration#cluster_address) values set in the +[`listener`](/docs/configuration/listener) blocks, including port. (Note that +this will always be forced to `https` since only TLS connections are used +between servers.) This value can also be specified by the `VAULT_CLUSTER_ADDR` environment variable, which takes precedence. @@ -149,12 +169,16 @@ variable, which takes precedence. ## Storage Support Currently there are several storage backends that support high availability -mode, including Consul, ZooKeeper and etcd. These may change over time, and the -[configuration page](/docs/configuration) should be referenced. +mode, including [Consul](/docs/storage/consul), +[ZooKeeper](/docs/storage/zookeeper) and [etcd](/docs/storage/etcd). These may +change over time, and the [configuration page](/docs/configuration) should be +referenced. -The Consul backend is the recommended HA backend, as it is used in production +The [Consul backend](/docs/storage/consul) is the recommended HA backend, as it is used in production by HashiCorp and its customers with commercial support. If you're interested in implementing another backend or adding HA support to another backend, we'd love your contributions. Adding HA support requires -implementing the `physical.HABackend` interface for the storage backend. 
+implementing the +[`physical.HABackend`](https://pkg.go.dev/github.com/hashicorp/vault/sdk/physical#HABackend) +interface for the storage backend. diff --git a/website/content/docs/concepts/integrated-storage/index.mdx b/website/content/docs/concepts/integrated-storage/index.mdx index 77f4d8dd7..d63d59e81 100644 --- a/website/content/docs/concepts/integrated-storage/index.mdx +++ b/website/content/docs/concepts/integrated-storage/index.mdx @@ -30,7 +30,7 @@ Vault's cluster port. The cluster port defaults to `8201`. The TLS information is exchanged at join time and is rotated on a cadence. A requirement for integrated storage is that the -[cluster_addr](/docs/concepts/ha#per-node-cluster-address) configuration option +[`cluster_addr`](/docs/concepts/ha#per-node-cluster-address) configuration option is set. This allows Vault to assign an address to the node ID at join time. ## Cluster Membership @@ -57,10 +57,10 @@ been joined it cannot be re-joined to a different cluster. You can either join the node automatically via the config file or manually through the API (both methods described below). When joining a node, the API address of the leader node must be used. We -recommend setting the [api_addr](/docs/concepts/ha#direct-access) configuration +recommend setting the [`api_addr`](/docs/concepts/ha#direct-access) configuration option on all nodes to make joining simpler. -#### retry_join Configuration +#### `retry_join` Configuration This method enables setting one, or more, target leader nodes in the config file. When an uninitialized Vault server starts up it will attempt to join each potential @@ -69,7 +69,7 @@ leaders become active this node will successfully join. When using Shamir seal, the joined nodes will still need to be unsealed manually. When using Auto Unseal the node will be able to join and unseal automatically. -An example [retry_join](/docs/configuration/storage/raft#retry_join-stanza) +An example [`retry_join`](/docs/configuration/storage/raft#retry_join-stanza) config can be seen below: ```hcl @@ -86,14 +86,18 @@ storage "raft" { } ``` -Note, in each `retry_join` stanza, you may provide a single `leader_api_addr` or -`auto_join` value. When a cloud `auto_join` configuration value is provided, Vault -will use [go-discover](https://github.com/hashicorp/go-discover) to automatically -attempt to discover and resolve potential Raft leader addresses. +Note, in each [`retry_join`](/docs/configuration/storage/raft#retry_join-stanza) +stanza, you may provide a single +[`leader_api_addr`](/docs/configuration/storage/raft#leader_api_addr) or +[`auto_join`](/docs/configuration/storage/raft#auto_join) value. When a cloud +[`auto_join`](/docs/configuration/storage/raft#auto_join) configuration value is +provided, Vault will use [go-discover](https://github.com/hashicorp/go-discover) +to automatically attempt to discover and resolve potential Raft leader +addresses. See the go-discover [README](https://github.com/hashicorp/go-discover/blob/master/README.md) for -details on the format of the `auto_join` value. +details on the format of the [`auto_join`](/docs/configuration/storage/raft#auto_join) value. ```hcl storage "raft" { @@ -106,9 +110,11 @@ storage "raft" { } ``` -By default, Vault will attempt to reach discovered peers using HTTPS and port 8200. -Operators may override these through the `auto_join_scheme` and `auto_join_port` -fields respectively. +By default, Vault will attempt to reach discovered peers using HTTPS and port +8200. 
Operators may override these through the +[`auto_join_scheme`](/docs/configuration/storage/raft#auto_join_scheme) and +[`auto_join_port`](/docs/configuration/storage/raft#auto_join_port) fields +respectively. ```hcl storage "raft" { @@ -125,7 +131,7 @@ storage "raft" { #### Join from the CLI -Alternatively you can use the [join CLI +Alternatively you can use the [`join` CLI command](/docs/commands/operator/raft/#join) or the API to join a node. The active node's API address will need to be specified: @@ -154,7 +160,7 @@ the size of the cluster, or for many other reasons. Removing the peer will ensure the cluster stays at the desired size, and that quorum is maintained. To remove the peer you can issue a -[remove-peer](/docs/commands/operator/raft#remove-peer) command and provide the +[`remove-peer`](/docs/commands/operator/raft#remove-peer) command and provide the node ID you wish to remove: ```shell-session @@ -165,7 +171,7 @@ Peer removed successfully! ### Listing Peers To see the current peer set for the cluster you can issue a -[list-peers](/docs/commands/operator/raft#list-peers) command. All the voting +[`list-peers`](/docs/commands/operator/raft#list-peers) command. All the voting nodes that are listed here contribute to the quorum and a majority must be alive for integrated storage to continue to operate. @@ -183,22 +189,24 @@ node3 node3.vault.local:8201 leader true We've glossed over some details in the above sections on bootstrapping clusters. The instructions are sufficient for most cases, but some users have run into problems when using auto-join and TLS in conjunction with things like auto-scaling. -The issue is that go-discover on most platforms returns IPs (not hostnames), and -because the IPs aren't knowable in advance, the TLS certificates used to secure -the Vault API port don't contain these IPs in their IP SANs. +The issue is that [go-discover](https://github.com/hashicorp/go-discover) on +most platforms returns IPs (not hostnames), and because the IPs aren't knowable +in advance, the TLS certificates used to secure the Vault API port don't contain +these IPs in their IP SANs. ### Vault networking recap Before we explore solutions to this problem, let's recapitulate how Vault nodes speak to one another. -Vault exposes two TCP ports: the API port and the cluster port. +Vault exposes two TCP ports: [the API port](/docs/configuration#api_addr) and +[the cluster port](/docs/configuration#cluster_addr). The API port is where clients send their Vault HTTP requests. For a single-node Vault cluster you don't worry about a cluster port as it won't be used. -When you have multiple nodes you also need a cluster port. This is used by Vault +When you have multiple nodes, you also need a cluster port. This is used by Vault nodes to issue RPCs to one another, e.g. to forward requests from a standby node to the active node, or when Raft is in use, to handle leader election and replication of stored data. @@ -217,12 +225,12 @@ instead of the cluster port. This is currently the only situation in which OSS Vault does this (Vault Enterprise also does something similar when setting up replication.) 
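In configuration terms, both ports come from the
[`listener`](/docs/configuration/listener) stanza; a minimal sketch, with
illustrative addresses and certificate paths:

```hcl
listener "tcp" {
  # API port: client requests and the join handshake described below.
  address       = "10.0.1.10:8200"
  tls_cert_file = "/opt/vault/tls/vault.crt" # secures the API port
  tls_key_file  = "/opt/vault/tls/vault.key"

  # Cluster port: server-to-server RPCs, secured by the TLS certificate that
  # Vault generates itself (defaults to the API port plus one).
  cluster_address = "10.0.1.10:8201"
}
```

With that in place, the join handshake proceeds roughly as follows: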
-* node2 wants to join the cluster, so issues challenge API request to existing member node1 -* node1 replies to challenge request with (1) an encrypted random UUID and (2) seal config -* node2 must decrypt UUID using seal; if using auto-unseal can do it directly, if using shamir must wait for user to provide enough unseal keys to perform decryption -* node2 sends decrypted UUID back to node1 using answer API -* node1 sees node2 can be trusted (since it has seal access) and replies with a bootstrap package which includes the cluster TLS certificate and private key -* node2 gets sent a raft snapshot over the cluster port +* `node2` wants to join the cluster, so issues challenge API request to existing member `node1` +* `node1` replies to challenge request with (1) an encrypted random UUID and (2) seal config +* `node2` must decrypt UUID using seal; if using auto-unseal can do it directly, if using Shamir must wait for user to provide enough unseal keys to perform decryption +* `node2` sends decrypted UUID back to `node1` using answer API +* `node1` sees `node2` can be trusted (since it has seal access) and replies with a bootstrap package which includes the cluster TLS certificate and private key +* `node2` gets sent a raft snapshot over the cluster port After this procedure the new node will never again send traffic to the API port. All subsequent inter-node communication will use the cluster port. @@ -231,22 +239,26 @@ All subsequent inter-node communication will use the cluster port. ### Assisted raft join techniques -The simplest option is to do it by hand: issue raft join commands specifying the -explicit names or IPs of the nodes to join to. In this section we look at other -TLS-compatible options that lend themselves more to automation. +The simplest option is to do it by hand: issue [`raft +join`](/docs/commands/operator/raft#join) commands specifying the explicit names +or IPs of the nodes to join to. In this section we look at other TLS-compatible +options that lend themselves more to automation. #### Autojoin with TLS servername -As of Vault 1.6.2, the simplest option might be to specify a leader_tls_servername -in the retry_join stanza which matches a DNS SAN in the certificate. +As of Vault 1.6.2, the simplest option might be to specify a +[`leader_tls_servername`](/docs/configuration/storage/raft#leader_tls_servername) +in the [`retry_join`](/docs/configuration/storage/raft#retry_join-stanza) stanza +which matches a [DNS +SAN](https://en.wikipedia.org/wiki/Subject_Alternative_Name) in the certificate. Note that names in a certificate's DNS SAN don't actually have to be registered in a DNS server. Your nodes may have no names found in DNS, while still -using certificate(s) that contain this shared "servername" in their DNS SANs. +using certificate(s) that contain this shared `servername` in their DNS SANs. #### Autojoin but constrain CIDR, list all possible IPs in certificate -If all the vault node IPs are assigned from a small subnet, e.g. a /28, it +If all the vault node IPs are assigned from a small subnet, e.g. a `/28`, it becomes practical to put all the IPs that exist in that subnet into the IP SANs of the TLS certificate the nodes will share. @@ -258,8 +270,9 @@ using non-voting nodes and dynamically scaling clusters. Most Vault instances are going to have a load balancer (LB) between clients and the Vault nodes. 
In that case, the LB knows how to route traffic to working -Vault nodes, and there's no need for auto-join: we can just use retry_join -with the LB address as the target. +Vault nodes, and there's no need for auto-join: we can just use +[`retry_join`](/docs/configuration/storage/raft#retry_join-stanza) with the LB +address as the target. One potential issue here: some users want a public facing LB for clients to connect to Vault, but aren't comfortable with Vault internal traffic @@ -279,12 +292,13 @@ and have it reconnect to the cluster with the same host address. This will retur the cluster to a fully healthy state. If this is impractical, you need to remove the failed server. Usually, you can -issue a remove-peer command to remove the failed server if it's still a member -of the cluster. +issue a [`remove-peer`](/docs/commands/operator/raft#remove-peer) command to +remove the failed server if it's still a member of the cluster. -If the remove-peer command isn't possible or you'd rather manually re-write the -cluster membership a `raft/peers.json` file can be written to the configured -data directory. +If the [`remove-peer`](/docs/commands/operator/raft#remove-peer) command isn't +possible or you'd rather manually re-write the cluster membership a +[`raft/peers.json`](#manual-recovery-using-peers-json) file can be written to +the configured data directory. ### Quorum Lost @@ -302,22 +316,24 @@ committed could be incomplete. The recovery process implicitly commits all outstanding Raft log entries, so it's also possible to commit data that was uncommitted before the failure. -See the section below on manual recovery using peers.json for details of the -recovery procedure. You include only the remaining servers in the -raft/peers.json recovery file. The cluster should be able to elect a leader once -the remaining servers are all restarted with an identical raft/peers.json -configuration. +See the section below on manual recovery using +[`peers.json`](#manual-recovery-using-peers-json) for details of the recovery +procedure. You include only the remaining servers in the +[`peers.json`](#manual-recovery-using-peers-json) recovery file. The +cluster should be able to elect a leader once the remaining servers are all +restarted with an identical +[`peers.json`](#manual-recovery-using-peers-json) configuration. Any servers you introduce later can be fresh with totally clean data directories and joined using Vault's join command. In extreme cases, it should be possible to recover with just a single remaining server by starting that single server with itself as the only peer in the -raft/peers.json recovery file. +[`peers.json`](#manual-recovery-using-peers-json) recovery file. ### Manual Recovery Using peers.json -Using raft/peers.json for recovery can cause uncommitted Raft log entries to be +Using `raft/peers.json` for recovery can cause uncommitted Raft log entries to be implicitly committed, so this should only be used after an outage where no other option is available to recover a lost server. Make sure you don't have any automated processes that will put the peers file in place on a periodic basis. @@ -326,10 +342,10 @@ To begin, stop all remaining servers. The next step is to go to the [configured data path](/docs/configuration/storage/raft/#path) of each Vault server. Inside that -directory, there will be a raft/ sub-directory. We need to create a -raft/peers.json file. 
The file should be formatted as a JSON array containing -the node ID, address:port, and suffrage information of each Vault server you -wish to be in the cluster. +directory, there will be a `raft/` sub-directory. We need to create a +`raft/peers.json` file. The file should be formatted as a JSON array containing +the node ID, `address:port`, and suffrage information of each Vault server you +wish to be in the cluster: ```json [ @@ -352,7 +368,7 @@ wish to be in the cluster. ``` - `id` `(string: )` - Specifies the node ID of the server. This can be - found in the config file, or inside the node-id file in the server's data + found in the config file, or inside the `node-id` file in the server's data directory if it was auto-generated. - `address` `(string: )` - Specifies the host and port of the server. The port is the server's cluster port. diff --git a/website/content/docs/configuration/storage/raft.mdx b/website/content/docs/configuration/storage/raft.mdx index b4d121942..b63350b69 100644 --- a/website/content/docs/configuration/storage/raft.mdx +++ b/website/content/docs/configuration/storage/raft.mdx @@ -32,14 +32,15 @@ cluster_addr = "http://127.0.0.1:8201" ``` ~> **Note:** When using the Integrated Storage backend, it is required to provide -`cluster_addr` to indicate the address and port to be used for communication +[`cluster_addr`](/docs/concepts/ha#per-node-cluster-address) to indicate the address and port to be used for communication between the nodes in the Raft cluster. -~> **Note:** When using the Integrated Storage backend, a separate `ha_storage` +~> **Note:** When using the Integrated Storage backend, a separate +[`ha_storage`](/docs/configuration#ha_storage) backend cannot be declared. ~> **Note:** When using the Integrated Storage backend, it is strongly recommended to -set `disable_mlock` to `true`, and to disable memory swapping on the system. +set [`disable_mlock`](/docs/configuration#disable_mlock) to `true`, and to disable memory swapping on the system. ## `raft` Parameters @@ -73,43 +74,44 @@ set `disable_mlock` to `true`, and to disable memory swapping on the system. consider reducing write throughput or the amount of data stored on Vault. The default value is 10000 which is suitable for all normal workloads. -- `snapshot_threshold` `(integer: 8192)` - This controls the minimum number of raft +- `snapshot_threshold` `(integer: 8192)` - This controls the minimum number of Raft commit entries between snapshots that are saved to disk. This is a low-level parameter that should rarely need to be changed. Very busy clusters experiencing excessive disk IO may increase this value to reduce disk IO and minimize the chances of all servers taking snapshots at the same time. Increasing this trades off disk IO for disk space since the log will grow much - larger and the space in the raft.db file can't be reclaimed till the next + larger and the space in the `raft.db` file can't be reclaimed till the next snapshot. Servers may take longer to recover from crashes or failover if this is increased significantly as more logs will need to be replayed. -- `retry_join` `(list: [])` - There can be one or more `retry_join` stanzas. - When the raft cluster is getting bootstrapped, if the connection details of all - the nodes are known beforehand, then specifying this config stanzas enables the - nodes to automatically join a raft cluster. All the nodes would mention all - other nodes that they could join using this config. 
When one of the nodes is - initialized, it becomes the leader and all the other nodes will join the - leader node to form the cluster. When using Shamir seal, the joined nodes will - still need to be unsealed manually. See the section below that describes the - parameters accepted by the `retry_join` stanza. +- `retry_join` `(list: [])` - There can be one or more + [`retry_join`](#retry_join-stanza) stanzas. When the Raft cluster is getting + bootstrapped, if the connection details of all the nodes are known beforehand, + then specifying this config stanzas enables the nodes to automatically join a + Raft cluster. All the nodes would mention all other nodes that they could join + using this config. When one of the nodes is initialized, it becomes the leader + and all the other nodes will join the leader node to form the cluster. When + using Shamir seal, the joined nodes will still need to be unsealed manually. + See [the section below](#retry_join-stanza) that describes the parameters + accepted by the [`retry_join`](#retry_join-stanza) stanza. - `max_entry_size` `(integer: 1048576)` - This configures the maximum number of - bytes for a raft entry. It applies to both Put operations and transactions. + bytes for a Raft entry. It applies to both Put operations and transactions. Any put or transaction operation exceeding this configuration value will cause the respective operation to fail. Raft has a suggested max size of data in a - raft log entry. This is based on current architecture, default timing, etc. + Raft log entry. This is based on current architecture, default timing, etc. Integrated storage also uses a chunk size that is the threshold used for breaking a large value into chunks. By default, the chunk size is the same as - raft's max size log entry. The default value for this configuration is 1048576 + Raft's max size log entry. The default value for this configuration is 1048576 -- two times the chunking size. -- `autopilot_reconcile_interval` `(string: "")` - This is the interval after +- `autopilot_reconcile_interval` `(string: "10s")` - This is the interval after which autopilot will pick up any state changes. State change could mean multiple things; for example a newly joined voter node, initially added as non-voter to - the raft cluster by autopilot has successfully completed the stabilization + the Raft cluster by autopilot has successfully completed the stabilization period thereby qualifying for being promoted as a voter, a node that has become unhealthy and needs to be shown as such in the state API, a node has been marked - as dead needing eviction from raft configuration, etc. Defaults to 10s. + as dead needing eviction from Raft configuration, etc. ### `retry_join` stanza @@ -123,8 +125,11 @@ set `disable_mlock` to `true`, and to disable memory swapping on the system. - `auto_join_port` `(uint: "")` - The optional port used for addressed discovered via auto-join. -- `leader_tls_servername` `(string: "")` - TLS servername to use when connecting with HTTPS. - Should match one of the names in the DNS SANs of the remote server certificate. +- `leader_tls_servername` `(string: "")` - The TLS server name to use when + connecting with HTTPS. + Should match one of the names in the [DNS + SANs](https://en.wikipedia.org/wiki/Subject_Alternative_Name) of the remote + server certificate. 
See also [Integrated Storage and TLS](https://www.vaultproject.io/docs/concepts/integrated-storage#autojoin-with-tls-servername) - `leader_ca_cert_file` `(string: "")` - File path to the CA cert of the @@ -145,20 +150,22 @@ set `disable_mlock` to `true`, and to disable memory swapping on the system. - `leader_client_key` `(string: "")` - Client key for the follower node to establish client authentication with the possible leader node. -Each `retry_join` block may provide TLS certificates via file paths or as a -single-line certificate string value with newlines delimited by `\n`, but not a -combination of both. Each `retry_join` stanza may contain either a `leader_api_addr` -value or a cloud `auto_join` configuration value, but not both. When an `auto_join` -value is provided, Vault will automatically attempt to discover and resolve -potential Raft leader addresses. +Each [`retry_join`](#retry_join-stanza) block may provide TLS certificates via +file paths or as a single-line certificate string value with newlines delimited +by `\n`, but not a combination of both. Each [`retry_join`](#retry_join-stanza) +stanza may contain either a [`leader_api_addr`](#leader_api_addr) value or a +cloud [`auto_join`](#auto_join) configuration value, but not both. When an +[`auto_join`](#auto_join) value is provided, Vault will automatically attempt to +discover and resolve potential Raft leader addresses. -By default, Vault will attempt to reach discovered peers using HTTPS and port 8200. -Operators may override these through the `auto_join_scheme` and `auto_join_port` +By default, Vault will attempt to reach discovered peers using HTTPS and port +8200. Operators may override these through the +[`auto_join_scheme`](#auto_join_scheme) and [`auto_join_port`](#auto_join_port) fields respectively. Example Configuration: -``` +```hcl storage "raft" { path = "/Users/foo/raft/" node_id = "node1"