---
layout: docs
page_title: Raft - Storage Backends - Configuration
sidebar_title: Raft
description: |-

  The Raft storage backend is used to persist Vault's data. Unlike all the other
  storage backends, this backend does not operate from a single source for the
  data. Instead all the nodes in a Vault cluster will have a replicated copy of
  the entire data. The data is replicated across the nodes using the Raft
  Consensus Algorithm.
---

# Raft Storage Backend

The Raft storage backend is used to persist Vault's data. Unlike other storage
backends, Raft storage does not operate from a single source of data. Instead,
every node in a Vault cluster holds a replicated copy of Vault's data.
Data is replicated across all the nodes via the [Raft Consensus
Algorithm][raft].

- **High Availability** – the Raft storage backend supports high availability.

- **HashiCorp Supported** – the Raft storage backend is officially supported
  by HashiCorp.

```hcl
storage "raft" {
  path    = "/path/to/raft/data"
  node_id = "raft_node_1"
}
cluster_addr = "http://127.0.0.1:8201"
```

~> **Note:** When using the Raft storage backend, it is required to provide
`cluster_addr` to indicate the address and port to be used for communication
between the nodes in the Raft cluster.

~> **Note:** When using the Raft storage backend, a separate `ha_storage`
backend cannot be declared.

~> **Note:** When using the Raft storage backend, it is strongly recommended to
set `disable_mlock` to `true`, and to disable memory swapping on the system.
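
Taken together, the notes above suggest a server configuration along the
following lines. This is a minimal sketch rather than a complete or recommended
setup: the path, node name, and address are the same placeholder values used in
the example above, and disabling swap still has to be done separately at the
operating system level.

```hcl
# Illustrative values only; adjust the path, node name, and address.
storage "raft" {
  path    = "/path/to/raft/data"
  node_id = "raft_node_1"
}

# Required with Raft storage: the address and port used for communication
# between the nodes in the Raft cluster.
cluster_addr = "http://127.0.0.1:8201"

# Strongly recommended with Raft storage.
disable_mlock = true

# No separate ha_storage stanza is declared alongside Raft storage.
```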

## `raft` Parameters

- `path` `(string: "")` - The file system path where all the Vault data gets
  stored.
  This value can be overridden by setting the `VAULT_RAFT_PATH` environment variable.

- `node_id` `(string: "")` - The identifier for the node in the Raft cluster.
  This value can be overridden by setting the `VAULT_RAFT_NODE_ID` environment variable.

- `performance_multiplier` `(integer: 0)` - An integer multiplier used by
  servers to scale key Raft timing parameters. Tuning this affects the time it
  takes Vault to detect leader failures and to perform leader elections, at the
  expense of requiring more network and CPU resources for better performance.
  Omitting this value or setting it to 0 uses the default timing described below.
  Lower values tighten timing and increase sensitivity, while higher
  values relax timing and reduce sensitivity.

  By default, Vault will use a lower-performance timing that's suitable for
  minimal Vault servers, currently equivalent to setting this to a value of 5
  (this default may change in future versions of Vault, depending on whether the
  target minimum server profile changes). Setting this to a value of 1 will
  configure Raft to its highest-performance mode and is recommended for
  production Vault servers. The maximum allowed value is 10.

- `trailing_logs` `(integer: 10000)` - This controls how many log entries are
  left in the log store on disk after a snapshot is made. This should only be
  adjusted when followers cannot catch up to the leader due to a very large
  snapshot size and high write throughput causing log truncation before a
  snapshot can be fully installed. If you need to use this to recover a cluster,
  consider reducing write throughput or the amount of data stored on Vault. The
  default value is 10000, which is suitable for all normal workloads.

- `snapshot_threshold` `(integer: 8192)` - This controls the minimum number of Raft
  commit entries between snapshots that are saved to disk. This is a low-level
  parameter that should rarely need to be changed. Very busy clusters
  experiencing excessive disk IO may increase this value to reduce disk IO and
  minimize the chances of all servers taking snapshots at the same time.
  Increasing this trades off disk IO for disk space, since the log will grow much
  larger and the space in the raft.db file can't be reclaimed until the next
  snapshot. Servers may take longer to recover from crashes or failover if this
  is increased significantly, as more logs will need to be replayed.

- `retry_join` `(list: [])` - There can be one or more `retry_join` stanzas.
  When the Raft cluster is being bootstrapped, if the connection details of all
  the nodes are known beforehand, then specifying these config stanzas enables the
  nodes to automatically join a Raft cluster. Each node lists all the
  other nodes that it could join using this config. When one of the nodes is
  initialized, it becomes the leader and all the other nodes will join the
  leader node to form the cluster. When using Shamir seal, the joined nodes will
  still need to be unsealed manually. See the section below that describes the
  parameters accepted by the `retry_join` stanza.

- `max_entry_size` `(integer: 1048576)` - This configures the maximum number of
  bytes for a Raft entry. It applies to both Put operations and transactions.
  Any Put or transaction operation exceeding this configuration value will cause
  the respective operation to fail. Raft has a suggested max size of data in a
  Raft log entry. This is based on current architecture, default timing, etc.
  Integrated storage also uses a chunk size that is the threshold used for
  breaking a large value into chunks. By default, the chunk size is the same as
  Raft's max size log entry. The default value for this configuration is 1048576,
  which is two times the chunking size.
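
As a rough illustration of how the tuning parameters above sit inside the
`storage` stanza, the sketch below sets `performance_multiplier` to the value
the text recommends for production servers and writes out the other parameters
at their documented defaults. The path and node name are placeholders, and
spelling out the defaults is purely for illustration; omitting those lines
should behave the same way.

```hcl
storage "raft" {
  path    = "/path/to/raft/data"
  node_id = "raft_node_1"

  # 1 selects the highest-performance timing. Omitting this or setting it to 0
  # uses the default timing, currently equivalent to a value of 5. Maximum: 10.
  performance_multiplier = 1

  # The values below restate the documented defaults.
  trailing_logs      = 10000   # log entries kept on disk after a snapshot
  snapshot_threshold = 8192    # minimum commit entries between snapshots
  max_entry_size     = 1048576 # bytes; larger Put or transaction operations fail
}
```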

### `retry_join` stanza

- `leader_api_addr` `(string: "")` - Address of a possible leader node.

- `leader_ca_cert_file` `(string: "")` - File path to the CA cert of the
  possible leader node.

- `leader_client_cert_file` `(string: "")` - File path to the client certificate
  for the follower node to establish client authentication with the possible
  leader node.

- `leader_client_key_file` `(string: "")` - File path to the client key for the
  follower node to establish client authentication with the possible leader node.

- `leader_ca_cert` `(string: "")` - CA cert of the possible leader node.

- `leader_client_cert` `(string: "")` - Client certificate for the follower node
  to establish client authentication with the possible leader node.

- `leader_client_key` `(string: "")` - Client key for the follower node to
  establish client authentication with the possible leader node.

Each `retry_join` block may provide TLS certificates via file paths or as a
single-line certificate string value with newlines delimited by `\n`, but not a
combination of both.

Example Configuration:

```hcl
storage "raft" {
  path    = "/Users/foo/raft/"
  node_id = "node1"
  retry_join {
    leader_api_addr         = "http://127.0.0.2:8200"
    leader_ca_cert_file     = "/path/to/ca1"
    leader_client_cert_file = "/path/to/client/cert1"
    leader_client_key_file  = "/path/to/client/key1"
  }
  retry_join {
    leader_api_addr         = "http://127.0.0.3:8200"
    leader_ca_cert_file     = "/path/to/ca2"
    leader_client_cert_file = "/path/to/client/cert2"
    leader_client_key_file  = "/path/to/client/key2"
  }
  retry_join {
    leader_api_addr         = "http://127.0.0.4:8200"
    leader_ca_cert_file     = "/path/to/ca3"
    leader_client_cert_file = "/path/to/client/cert3"
    leader_client_key_file  = "/path/to/client/key3"
  }
}
```
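
The example above supplies TLS material through file paths. For comparison, a
`retry_join` block using the inline, single-line form might look like the sketch
below. The bracketed certificate and key bodies are placeholders that must be
replaced with real PEM content, and the `PRIVATE KEY` header is an assumption
that depends on how the key was generated. Within a single block, use either
the `*_file` parameters or the inline ones, not both.

```hcl
storage "raft" {
  path    = "/Users/foo/raft/"
  node_id = "node1"

  retry_join {
    leader_api_addr = "http://127.0.0.2:8200"

    # Placeholder PEM bodies: each value is one line, with the original
    # newlines written as \n.
    leader_ca_cert     = "-----BEGIN CERTIFICATE-----\n<ca-cert-body>\n-----END CERTIFICATE-----"
    leader_client_cert = "-----BEGIN CERTIFICATE-----\n<client-cert-body>\n-----END CERTIFICATE-----"
    leader_client_key  = "-----BEGIN PRIVATE KEY-----\n<client-key-body>\n-----END PRIVATE KEY-----"
  }
}
```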

[raft]: https://raft.github.io/ 'The Raft Consensus Algorithm'