b76a56d40c
* migrates nav data format and updates docs pages * removes sidebar_title from content files
188 lines
11 KiB
Plaintext
188 lines
11 KiB
Plaintext
---
|
||
layout: docs
|
||
page_title: Replication - Vault Enterprise
|
||
description: >-
|
||
Vault Enterprise has support for Replication, allowing critical data to be
|
||
replicated across clusters to support horizontally scaling and disaster
|
||
recovery workloads.
|
||
---
|
||
|
||
# Vault Enterprise Replication
|
||
|
||
## Overview
|
||
|
||
-> **Note**: All versions of [Vault Enterprise](https://www.hashicorp.com/products/vault/)
|
||
have support for Disaster Recovery replication. Performance Replication requires the
|
||
Multi-Datacenter & Scale module.
|
||
|
||
Many organizations have infrastructure that spans multiple datacenters. Vault
|
||
provides the critical services of identity management, secrets storage, and
|
||
policy management. This functionality is expected to be highly available and
|
||
to scale as the number of clients and their functional needs increase; at the
|
||
same time, operators would like to ensure that a common set of policies are
|
||
enforced globally, and a consistent set of secrets and keys are exposed to
|
||
applications that need to interoperate.
|
||
|
||
Vault replication addresses both of these needs in providing consistency,
|
||
scalability, and highly-available disaster recovery.
|
||
|
||
Note: Using replication requires a storage backend that supports transactional updates, such as Consul.
|
||
|
||
## Architecture
|
||
|
||
The core unit of Vault replication is a **cluster**, which is comprised of a
|
||
collection of Vault nodes (an active and its corresponding HA nodes). Multiple Vault
|
||
clusters communicate in a one-to-many near real-time flow.
|
||
|
||
Replication operates on a leader/follower model, wherein a leader cluster (known as a
|
||
**primary**) is linked to a series of follower **secondary** clusters. The primary
|
||
cluster acts as the system of record and asynchronously replicates most Vault data.
|
||
|
||
All communication between primaries and secondaries is end-to-end encrypted
|
||
with mutually-authenticated TLS sessions, setup via replication tokens which are
|
||
exchanged during bootstrapping.
|
||
|
||
What data is replicated between the primary and secondary depends on the type of
|
||
replication that is configured between the primary and secondary. These types
|
||
of relationships are either **disaster recovery** or **performance**
|
||
relationships.
|
||
|
||
## Performance Replication and Disaster Recovery (DR) Replication
|
||
|
||
_Performance Replication_:
|
||
In performance replication, secondaries keep track of their own tokens and leases
|
||
but share the underlying configuration, policies, and supporting secrets (K/V values,
|
||
encryption keys for `transit`, etc).
|
||
|
||
If a user action would modify underlying shared state, the secondary forwards the request
|
||
to the primary to be handled; this is transparent to the client. In practice, most
|
||
high-volume workloads (reads in the `kv` backend, encryption/decryption operations
|
||
in `transit`, etc.) can be satisfied by the local secondary, allowing Vault to scale
|
||
relatively horizontally with the number of secondaries rather than vertically as
|
||
in the past.
|
||
|
||
_Disaster Recovery (DR) Replication_:
|
||
In disaster recovery (or DR) replication, secondaries share the same underlying configuration,
|
||
policy, and supporting secrets (K/V values, encryption keys for `transit`, etc) infrastructure
|
||
as the primary. They also share the same token and lease infrastructure as the primary, as
|
||
they are designed to allow for continuous operations with applications connecting to the
|
||
original primary on the election of the DR secondary.
|
||
|
||
DR is designed to be a mechanism to protect against catastrophic failure of entire clusters.
|
||
They do not forward service read or write requests until they are elected and become a new primary.
|
||
|
||
| Capability | Disaster Recovery | Performance |
|
||
| -------------------------------------------------------------------------------------------------------------------- | ----------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
|
||
| Mirrors the configuration of a primary cluster | Yes | Yes |
|
||
| Mirrors the configuration of a primary cluster’s backends (i.e.: auth methods, secrets engines, audit devices, etc.) | Yes | Yes |
|
||
| Mirrors the tokens and leases for applications and users interacting with the primary cluster | Yes | No. Secondaries keep track of their own tokens and leases. When the secondary is promoted, applications must reauthenticate and obtain new leases from the newly-promoted primary. |
|
||
| Allows the secondary cluster to handle client requests | No | Yes |
|
||
|
||
For more information on the capabilities of performance and disaster recovery replication, see the Vault Replication [API Documentation](/api/system/replication).
|
||
|
||
## Primary and Secondary Cluster Compatibility
|
||
|
||
### Storage engines
|
||
|
||
There is no requirement that both clusters use the same storage engine.
|
||
|
||
### Seals
|
||
|
||
There is no requirement that both clusters use the same seal type, but see
|
||
[sealwrap](/docs/enterprise/sealwrap#seal-wrap-and-replication) for the full
|
||
details.
|
||
|
||
Also note that enabling replication will modify the secondary seal.
|
||
If the secondary uses an auto seal, its recovery configuration and keys
|
||
will be replaced; if it uses shamir, its seal configuration and unseal
|
||
keys will be replaced. Here seal/recovery configuration means the number of
|
||
seal/recovery key fragments and the required threshold of those
|
||
fragments.
|
||
|
||
| Primary seal | Secondary seal | Impact on secondary of enabling replication |
|
||
| ------------ | -------------- | ------------------------------------------------------------------------------------------ |
|
||
| Shamir | Shamir | Seal config and unseal keys replaced with primary's |
|
||
| Shamir | Auto | Seal recovery config and recovery keys replaced with primary's seal config and unseal keys |
|
||
| Auto | Auto | Seal recovery config and recovery keys replaced with primary's |
|
||
| Auto | Shamir | Seal config and unseal keys replaced with primary's recovery seal config and recovery keys |
|
||
|
||
### Vault versions
|
||
|
||
Vault changes are designed and tested to ensure that the
|
||
[upgrade instructions](/docs/upgrading#replication-installations) are viable, i.e.
|
||
that a secondary can run a newer Vault version than its primary.
|
||
|
||
That said, we do not recommend running replicated Vault clusters with different
|
||
versions any longer than necessary to perform the upgrade.
|
||
|
||
## Internals
|
||
|
||
Details on the internal design of the replication feature can be found in the
|
||
[replication internals](/docs/internals/replication) document.
|
||
|
||
## Security Model
|
||
|
||
Vault is trusted all over the world to keep secrets safe. As such, we have put
|
||
extreme focus to detail to our replication model as well.
|
||
|
||
### Primary/Secondary Communication
|
||
|
||
When a cluster is marked as the primary it generates a self-signed CA
|
||
certificate. On request, and given a user-specified identifier, the primary
|
||
uses this CA certificate to generate a private key and certificate and packages
|
||
these, along with some other information, into a replication bootstrapping
|
||
bundle, a.k.a. a secondary activation token. The certificate is used to perform
|
||
TLS mutual authentication between the primary and that secondary.
|
||
|
||
This CA certificate is never shared with secondaries, and no secondary ever has
|
||
access to any other secondary’s certificate. In practice this means that
|
||
revoking a secondary’s access to the primary does not allow it continue
|
||
replication with any other machine; it also means that if a primary goes down,
|
||
there is full administrative control over which cluster becomes primary. An
|
||
attacker cannot spoof a secondary into believing that a cluster the attacker
|
||
controls is the new primary without also being able to administratively direct
|
||
the secondary to connect by giving it a new bootstrap package (which is an
|
||
ACL-protected call).
|
||
|
||
Vault makes use of Application Layer Protocol Negotiation on its cluster port.
|
||
This allows the same port to handle both request forwarding and replication,
|
||
even while keeping the certificate root of trust and feature set different.
|
||
|
||
### Secondary Activation Tokens
|
||
|
||
A secondary activation token is an extremely sensitive item and as such is
|
||
protected via response wrapping. Experienced Vault users will note that the
|
||
wrapping format for replication bootstrap packages is different from normal
|
||
response wrapping tokens: it is a signed JWT. This allows the replication token
|
||
to carry the redirect address of the primary cluster as part of the token. In
|
||
most cases this means that simply providing the token to a new secondary is
|
||
enough to activate replication, although this can also be overridden when the
|
||
token is provided to the secondary.
|
||
|
||
Secondary activation tokens should be treated like Vault root tokens. If
|
||
disclosed to a bad actor, that actor can gain access to all Vault data. It
|
||
should therefore be treated with utmost sensitivity. Like all
|
||
response-wrapping tokens, once the token is used successfully (in this case, to
|
||
activate a secondary) it is useless, so it is only necessary to safeguard it
|
||
from one machine to the next. Like with root tokens, HashiCorp recommends that
|
||
when a secondary activation token is live, there are multiple eyes on it from
|
||
generation until it is used.
|
||
|
||
Once a secondary is activated, its cluster information is stored safely behind
|
||
its encrypted barrier.
|
||
|
||
## Learn
|
||
|
||
Refer to the following tutorials replication setup and best practices:
|
||
|
||
- [Setting up Performance Replication](https://learn.hashicorp.com/vault/operations/ops-replication)
|
||
- [Disaster Recovery Replication Setup](https://learn.hashicorp.com/vault/operations/ops-disaster-recovery)
|
||
- [Performance Replication with Mount Filters](https://learn.hashicorp.com/vault/operations/mount-filter)
|
||
- [Monitoring Vault Replication](https://learn.hashicorp.com/vault/operations/monitor-replication)
|
||
|
||
## API
|
||
|
||
The Vault replication component has a full HTTP API. Please see the
|
||
[Vault Replication API](/api/system/replication) for more
|
||
details.
|