---
layout: docs
page_title: Integrated Storage
sidebar_title: Integrated Storage
description: Learn about the integrated raft storage in Vault.
---

# Integrated Storage

Vault supports a number of storage options for the durable storage of Vault's
information. As of Vault 1.4, an integrated storage option is offered. This
storage backend does not rely on any third-party systems; it implements high
availability semantics, supports Enterprise Replication features, and provides
backup/restore workflows.

The integrated storage option stores Vault's data on the server's filesystem and
uses a consensus protocol to replicate data to each server in the cluster. More
information on the internals of integrated storage can be found in the
[integrated storage internals
documentation](/docs/internals/integrated-storage/). Additionally, the
[Configuration](/docs/configuration/storage/raft/) docs can help in configuring
Vault to use integrated storage.

The sections below go into further detail on how to operate Vault with
integrated storage.

## Cluster Membership

This section outlines how to bootstrap and manage a cluster of Vault nodes
running integrated storage.

Integrated storage is bootstrapped during the [initialization
process](https://learn.hashicorp.com/vault/getting-started/deploy#initializing-the-vault),
and results in a cluster of size 1. Depending on the [desired deployment
size](/docs/internals/integrated-storage/#deployment-table), additional nodes can be joined
to the active Vault node.

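For example, bootstrapping the first node is just the standard initialization
step run against that node (shown here with default options):

```shell-session
$ vault operator init
```
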
### Joining Nodes

Joining is the process of taking an uninitialized Vault node and making it a
member of an existing cluster. In order to authenticate the new node to the
cluster, it must use the same seal mechanism. If using Auto Unseal, the node
must be configured to use the same KMS provider and key as the cluster it's
attempting to join. If using a Shamir seal, the unseal keys must be provided to
the new node before the join process can complete. Once a node has successfully
joined, data from the active node can begin to replicate to it. Once a node has
joined, it cannot be re-joined to a different cluster.

You can either join the node automatically via the config file or manually through the
API (both methods are described below). When joining a node, the API address of the leader node must be used. We
recommend setting the [api_addr](/docs/concepts/ha#direct-access) configuration
option on all nodes to make joining simpler.

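As a minimal sketch, `api_addr` can be set at the top level of each node's
server configuration; the hostname below is a placeholder for that node's own
address:

```hcl
# Placeholder hostname; each node advertises its own API address.
api_addr = "https://node1.vault.local:8200"
```
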
#### retry_join Configuration

This method enables setting one or more target leader nodes in the config file. When an
uninitialized Vault server starts up, it will attempt to join each potential
leader that is defined, retrying until successful. When one of the specified
leaders becomes active, this node will successfully join. When using a Shamir seal,
the joined nodes will still need to be unsealed manually. When using Auto Unseal,
the node will be able to join and unseal automatically.

An example [retry_join](/docs/configuration/storage/raft#retry_join-stanza)
config can be seen below:

```hcl
storage "raft" {
  path    = "/var/raft/"
  node_id = "node3"

  retry_join {
    leader_api_addr = "https://node1.vault.local:8200"
  }

  retry_join {
    leader_api_addr = "https://node2.vault.local:8200"
  }
}
```

#### Join from the CLI

Alternatively, you can use the [join CLI
command](/docs/commands/operator/raft/#join) or the API to join a node. The
active node's API address will need to be specified:

```shell-session
$ vault operator raft join https://node1.vault.local:8200
```

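The join can also be performed over the API by sending a request to the new
node's `sys/storage/raft/join` endpoint. A hedged sketch, where
`node3.vault.local` is a placeholder for the joining node:

```shell-session
$ curl \
    --request POST \
    --data '{"leader_api_addr": "https://node1.vault.local:8200"}' \
    https://node3.vault.local:8200/v1/sys/storage/raft/join
```
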
#### Non-Voting Nodes (Enterprise Only)

Nodes that are joined to a cluster can be specified as non-voters. A non-voting
node has all of Vault's data replicated to it, but does not contribute to the
quorum count. This can be used in conjunction with [Performance
Standby](/docs/enterprise/performance-standby/) nodes to add read scalability to
a cluster in cases where a high volume of reads is needed.

```shell-session
$ vault operator raft join -non-voter https://node1.vault.local:8200
```

### Removing Peers

Removing a peer node is a necessary step when you no longer want the node in the
cluster. This could happen because the node was replaced with a new one, the hostname
permanently changed and the node can no longer be reached, you're attempting to shrink
the size of the cluster, or for many other reasons. Removing the peer
ensures the cluster stays at the desired size and that quorum is maintained.

To remove the peer, issue a
[remove-peer](/docs/commands/operator/raft#remove-peer) command and provide the
node ID you wish to remove:

```shell-session
$ vault operator raft remove-peer node1
Peer removed successfully!
```

### Listing Peers

To see the current peer set for the cluster, issue a
[list-peers](/docs/commands/operator/raft#list-peers) command. All of the voting
nodes listed here contribute to the quorum, and a majority of them must be alive
for integrated storage to continue to operate.

```shell-session
$ vault operator raft list-peers
Node     Address                   State       Voter
----     -------                   -----       -----
node1    node1.vault.local:8201    follower    true
node2    node2.vault.local:8201    follower    true
node3    node3.vault.local:8201    leader      true
```

## Server-to-Server Communication

Once nodes are joined to one another, they begin to communicate using mTLS over
Vault's cluster port. The cluster port defaults to `8201`. The TLS information
is exchanged at join time and is rotated on a cadence.

A requirement for integrated storage is that the
[cluster_addr](/docs/concepts/ha#per-node-cluster-address) configuration option
is set. This allows Vault to assign an address to the node ID at join time.

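As a minimal sketch (hostnames are placeholders), `cluster_addr` is set
alongside `api_addr` at the top level of each node's configuration:

```hcl
# Placeholders; each node advertises its own API and cluster addresses.
api_addr     = "https://node1.vault.local:8200"
cluster_addr = "https://node1.vault.local:8201"
```
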
## Outage Recovery

### Quorum Maintained

This section outlines the steps to take when a single server or multiple servers
are in a failed state but quorum is still maintained. This means the remaining
live servers are still operational, can elect a leader, and are able to process
write requests.

If the failed server is recoverable, the best option is to bring it back online
and have it reconnect to the cluster with the same host address. This will return
the cluster to a fully healthy state.

If this is impractical, you need to remove the failed server. Usually, you can
issue a remove-peer command to remove the failed server if it's still a member
of the cluster.

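For instance, assuming the failed server's node ID is `node2` (a placeholder),
the removal uses the same command shown earlier:

```shell-session
$ vault operator raft remove-peer node2
```
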
If the remove-peer command isn't possible, or you'd rather manually rewrite the
cluster membership, a `raft/peers.json` file can be written to the configured
data directory.

### Quorum Lost

In the event that multiple servers are lost, causing a loss of quorum and a
complete outage, partial recovery is still possible.

If the failed servers are recoverable, the best option is to bring them back
online and have them reconnect to the cluster using the same host addresses.
This will return the cluster to a fully healthy state.

If the failed servers are not recoverable, partial recovery is possible using
data on the remaining servers in the cluster. There may be data loss in this
situation because multiple servers were lost, so information about what's
committed could be incomplete. The recovery process implicitly commits all
outstanding Raft log entries, so it's also possible to commit data that was
uncommitted before the failure.

See the section below on manual recovery using peers.json for details of the
recovery procedure. You include only the remaining servers in the
raft/peers.json recovery file. The cluster should be able to elect a leader once
the remaining servers are all restarted with an identical raft/peers.json
configuration.

Any servers you introduce later can be fresh with totally clean data
directories and joined using Vault's join command.

In extreme cases, it should be possible to recover with just a single remaining
server by starting that single server with itself as the only peer in the
raft/peers.json recovery file.

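For that extreme case, a sketch of a single-entry `raft/peers.json` (the node ID
and address are placeholders for the surviving server) would look like:

```json
[
  {
    "id": "node1",
    "address": "node1.vault.local:8201",
    "non_voter": false
  }
]
```
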
### Manual Recovery Using peers.json

Using raft/peers.json for recovery can cause uncommitted Raft log entries to be
implicitly committed, so this should only be used after an outage where no other
option is available to recover a lost server. Make sure you don't have any
automated processes that will put the peers file in place on a periodic basis.

To begin, stop all remaining servers.

The next step is to go to the [configured data
path](/docs/configuration/storage/raft/#path) of each Vault server. Inside that
directory, there will be a raft/ sub-directory. We need to create a
raft/peers.json file. The file should be formatted as a JSON array containing
the node ID, address:port, and suffrage information of each Vault server you
wish to be in the cluster:

```json
[
  {
    "id": "node1",
    "address": "node1.vault.local:8201",
    "non_voter": false
  },
  {
    "id": "node2",
    "address": "node2.vault.local:8201",
    "non_voter": false
  },
  {
    "id": "node3",
    "address": "node3.vault.local:8201",
    "non_voter": false
  }
]
```

- `id` `(string: <required>)` - Specifies the node ID of the server. This can be
  found in the config file, or inside the node-id file in the server's data
  directory if it was auto-generated.
- `address` `(string: <required>)` - Specifies the host and port of the server. The
  port is the server's cluster port.
- `non_voter` `(bool: <false>)` - Controls whether the server is a non-voter. If
  omitted, it defaults to false, which is typical for most clusters. This is an
  Enterprise-only feature.

Create entries for all servers. You must confirm that servers you do not
include here have indeed failed and will not later rejoin the cluster. Ensure
that this file is the same across all remaining server nodes.

At this point, you can restart all the remaining servers. The cluster should be
in an operable state again. One of the nodes should claim leadership and become
active.

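As a sanity check, you can confirm the rebuilt membership with the same
list-peers command shown earlier:

```shell-session
$ vault operator raft list-peers
```
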
### Other Recovery Methods

For other, non-quorum-related recovery scenarios, [Vault's recovery
mode](/docs/concepts/recovery-mode/) can be used.

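Recovery mode is started on the server itself; a minimal sketch, assuming a
placeholder config path:

```shell-session
$ vault server -recovery -config=/etc/vault.d/vault.hcl
```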