Updates made by Michael Lucas (#4855)

Since this PR was created on behalf of EA and I approve all the changes, I'm merging this.
Yoko 2018-07-02 08:56:15 -07:00 committed by GitHub
parent 3a5d5cb872
commit dcb229d36d
2 changed files with 58 additions and 31 deletions

website/source/assets/images/vault-ref-arch-9.png (new binary file, stored with Git LFS; not shown)

@ -32,8 +32,8 @@ storage backend for production deployments.
## <a name="one-dc"></a>Deployment Topology within One Datacenter
This section explains how to deploy a Vault open source cluster in one datacenter.
Once a Vault cluster is running successfully within one datacenter,
move on to deploying Vault Enterprise across [multiple datacenters](#multi-dc).
Support for [multiple datacenters](#multi-dc) is included in Vault Enterprise through
cluster replication.
### Reference Diagram
@ -47,7 +47,9 @@ provides flexibility and resilience. Consul servers are separate
from the Vault servers so that software upgrades are easier to perform. Additionally,
running separate Consul and Vault servers allows each to be sized appropriately for its workload.
Vault to Consul backend connectivity is over HTTP and should be
secured with TLS as well as a Consul token to provide encryption of all traffic.
> **NOTE:** Refer to the online documentation to learn more about running [Consul in encrypted mode](https://www.consul.io/docs/agent/options.html#encrypt).
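
For illustration, a minimal sketch of that setup, assuming Consul's HTTPS API is exposed on port 8501 and an ACL token has already been created for Vault (addresses, file paths, and the token value below are placeholders):

```shell
# Generate a gossip encryption key once and distribute it to every Consul
# agent's `encrypt` setting (see the Consul encryption documentation above).
consul keygen

# Point Vault at the local Consul client over TLS, authenticating with an
# ACL token.
cat > /etc/vault.d/storage.hcl <<'EOF'
storage "consul" {
  address     = "127.0.0.1:8501"
  scheme      = "https"
  path        = "vault/"
  token       = "REPLACE_WITH_VAULT_ACL_TOKEN"
  tls_ca_file = "/etc/vault.d/tls/consul-ca.pem"
}
EOF
```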
#### Failure Tolerance
@ -55,10 +57,13 @@ Typical distribution in a cloud environment is to spread Consul/Vault nodes into
separate Availability Zones (AZs) within a high bandwidth, low latency network,
such as an AWS Region. The diagram below shows Vault and Consul spread between
AZs, with Consul servers in Redundancy Zone configurations, promoting a single
voting member per AZ, providing both Zone and Node level failure protection.
> **NOTE:** Refer to the online documentation to learn more about the [Consul leader election process](https://www.consul.io/docs/guides/leader-election.html).
![Failure tolerance|40%](/assets/images/vault-ref-arch-3.png)
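
As a quick check of that topology, the Raft peer set and autopilot settings can be inspected from any Consul server; with redundancy zones configured (a Consul Enterprise autopilot feature), each zone should contribute a single voter:

```shell
# List Raft peers and their voter status; with redundancy zones in place,
# expect one voting server per Availability Zone and the rest as non-voters.
consul operator raft list-peers

# Review the autopilot configuration, including the redundancy zone tag.
consul operator autopilot get-config
```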
### Network Connectivity Details
![Network Connectivity Details](/assets/images/vault-ref-arch.png)
@ -66,8 +71,8 @@ voting member per AZ, providing both Zone and Node level failure protection.
### Deployment System Requirements
The following table provides guidelines for server sizing. Of particular note is
the strong recommendation to avoid non-fixed performance CPUs, or "Burstable
CPU" in AWS terms, such as T-series instances.
#### Sizing for Vault Servers
@ -75,10 +80,10 @@ CPU” in AWS terms, such as T-series instances.
|-------|----------|-----------------|-----------|--------------------------------------------|
| Small | 2 core | 4-8 GB RAM | 25 GB | **AWS:** m5.large |
| | | | | **Azure:** Standard_A2_v2, Standard_A4_v2 |
| | | | | **GCE:** n1-standard-2, n1-standard-4 |
| Large | 4-8 core | 16-32 GB RAM | 50 GB | **AWS:** m5.xlarge, m5.2xlarge |
| | | | | **Azure:** Standard_D4_v3, Standard_D8_v3 |
| | | | | **GCE:** n1-standard-8, n1-standard-16 |
#### Sizing for Consul Servers
@ -86,10 +91,10 @@ CPU” in AWS terms, such as T-series instances.
|-------|----------|-----------------|-----------|--------------------------------------------|
| Small | 2 core | 8-16 GB RAM | 50 GB | **AWS:** m5.large, m5.xlarge |
| | | | | **Azure:** Standard_A4_v2, Standard_A8_v2 |
| | | | | **GCE:** n1-standard-4, n1-standard-8 |
| Large | 4-8 core | 32-64+ GB RAM | 100 GB | **AWS:** m5.2xlarge, m5.4xlarge |
| | | | | **Azure:** Standard_D4_v3, Standard_D5_v3 |
| | | | | **GCE:** n1-standard-16, n1-standard-32 |
### Hardware Considerations
@ -113,12 +118,12 @@ used.
The Consul servers' function in this deployment is to serve as the storage backend
for Vault. This means that all content stored for persistence in Vault is
encrypted by Vault, and written to the storage backend at rest. This data is
written to the key-value store section of Consul's Service Catalog, which is
required to be stored in its entirety in-memory on each Consul server. This
means that memory can be a constraint in scaling as more clients authenticate to
Vault, more secrets are persistently stored in Vault, and more temporary secrets
are leased from Vault. This also has the effect of requiring vertical scaling on
Consul server's memory if additional space is required, as the entire Service
Catalog is stored in memory on each Consul server.
Furthermore, network throughput is a common consideration for Vault and Consul
@ -128,6 +133,14 @@ Consul cluster members, communications with external systems (per auth or secret
engine configuration, and some audit logging configurations) and responses
consume network bandwidth.
Due to network performance considerations in Consul cluster operations,
replication of Vault datasets across network boundaries should be achieved
through Performance or DR Replication, rather than spreading the Consul cluster
across network and physical boundaries. If a single Consul cluster is spread
across network segments that are distant or inter-regional, this can cause
synchronization issues within the cluster or additional data transfer charges
in some cloud providers.
### Other Considerations
[Vault Production Hardening Recommendations](/guides/operations/production.html)
@ -150,8 +163,10 @@ operate just as a typical DNS resolution operation.
### <a name="external-lb"></a>Load Balancing Using External Load Balancer
![Vault Behind a Load Balancer](/assets/images/vault-ref-arch-9.png)
External load balancers are also supported. They are placed in front of the
Vault cluster and poll specific Vault URLs to detect the active node and route
traffic accordingly. An HTTP request to the active node with the following
URL will respond with a 200 status: `http://<Vault Node URL>:8200/v1/sys/health`
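
As a sketch, an active-node probe can be as simple as the following (the hostname is a placeholder; 429 indicates an unsealed standby, 503 a sealed node, and 501 an uninitialized one):

```shell
# Print only the HTTP status code returned by the health endpoint.
curl -s -o /dev/null -w '%{http_code}\n' \
  http://vault-node-1.example.internal:8200/v1/sys/health
```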
@ -183,31 +198,22 @@ listen vault
#### Client IP Address Handling
There are two supported methods for handling client IP addressing behind a proxy
or load balancer: [X-Forwarded-For Headers](https://www.vaultproject.io/docs/configuration/listener/tcp.html#x_forwarded_for_authorized_addrs)
and [PROXY v1](https://www.vaultproject.io/docs/configuration/listener/tcp.html#proxy_protocol_authorized_addrs). Both require a trusted load balancer and IP address
whitelisting to adhere to security best practices.
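
A minimal listener sketch showing both options, assuming the load balancers sit on 10.0.1.0/24 (addresses are placeholders, TLS settings are omitted for brevity, and only the mechanism your load balancer actually uses should be enabled):

```shell
cat > /etc/vault.d/listener.hcl <<'EOF'
listener "tcp" {
  address = "0.0.0.0:8200"

  # Trust X-Forwarded-For headers only from the load balancer subnet.
  x_forwarded_for_authorized_addrs = ["10.0.1.0/24"]

  # Or, for PROXY protocol v1 instead of X-Forwarded-For:
  # proxy_protocol_behavior         = "allow_authorized"
  # proxy_protocol_authorized_addrs = ["10.0.1.0/24"]
}
EOF
```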
### High Availability
A Vault cluster is the highly available unit of deployment within one datacenter.
A recommended approach is three Vault servers with a Consul storage backend.
With this configuration, during a Vault server outage, failover is handled
immediately without human intervention. To learn more about setting up your
Vault servers in HA mode, read the [_Vault HA with
Consul_](/guides/operations/vault-ha-consul.html) guide.
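
Once the three nodes are running and unsealed, each node's role can be confirmed from the CLI; a minimal sketch (the address is a placeholder and the exact output fields vary by Vault version):

```shell
# Report seal and high-availability status for the node VAULT_ADDR points at;
# repeat against each node to confirm one active node and two standbys.
VAULT_ADDR=http://vault-node-1.example.internal:8200 vault status
```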
High availability and data locality across datacenters require
Vault Enterprise.
@ -232,7 +238,7 @@ Vault performance replication allows for secrets management across many sites.
Secrets, authentication methods, authorization policies and other details are
replicated to be active and available in multiple locations.
> **NOTE:** Refer to the [Vault Mount Filter](/guides/operations/mount-filter.html) guide
about filtering out secret engines from being replicated across regions.
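
For reference, performance replication is driven through the `sys/replication` API; a minimal sketch of the workflow (the secondary identifier and token are placeholders):

```shell
# On the primary cluster: enable performance replication and issue an
# activation token for a secondary.
vault write -f sys/replication/performance/primary/enable
vault write sys/replication/performance/primary/secondary-token id="secondary-east"

# On the secondary cluster: activate replication with the wrapped token
# returned by the previous command.
vault write sys/replication/performance/secondary/enable token="<wrapping-token>"
```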
#### Disaster Recovery Replication
@ -241,7 +247,8 @@ Vault disaster recovery replication ensures that a standby Vault cluster is kept
synchronized with an active Vault cluster. This mode of replication includes
data such as ephemeral authentication tokens, time-based token information as
well as token usage data. This provides for an aggressive recovery point objective
in environments where preventing loss of ephemeral operational data is of the
utmost concern.
#### Cross-Region Disaster Recovery
@ -265,7 +272,21 @@ scenario.
![Replication Pattern](/assets/images/vault-ref-arch-7.png)
> **NOTE:** Refer to the [Vault Disaster Recovery Setup](/guides/operations/disaster-recovery.html) guide for additional information.
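
Disaster recovery replication follows the same workflow; as a sketch:

```shell
# On the primary cluster: enable DR replication (a DR secondary is then
# activated with a token from sys/replication/dr/primary/secondary-token).
vault write -f sys/replication/dr/primary/enable

# On either cluster, check replication health and lag once activated.
vault read sys/replication/dr/status
```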
#### Corruption or Sabotage Disaster Recovery
Another common scenario to protect against, more prevalent in cloud environments
that provide very high levels of intrinsic resiliency, is the purposeful or
accidental corruption of data and configuration, and/or a loss of cloud account
control. Vault's DR replication is designed to replicate live data, so it would
propagate intentional or accidental data corruption or deletion. To protect against
these possibilities, you should back up Vault's storage backend. This is supported
through the Consul snapshot feature, which can be automated for regular archival
backups. A cold site or new infrastructure could be re-hydrated from a Consul
snapshot.
> **NOTE:** Refer to the online documentation to learn more about [Consul snapshots](https://www.consul.io/docs/commands/snapshot.html).
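
A sketch of the snapshot workflow referenced above (file names are placeholders; pass a management token via CONSUL_HTTP_TOKEN if ACLs are enabled):

```shell
# Take and verify a point-in-time snapshot of Consul's data, which includes
# Vault's encrypted key-value store.
consul snapshot save vault-backup-$(date +%F).snap
consul snapshot inspect vault-backup-$(date +%F).snap

# Re-hydrate a cold site or rebuilt cluster from an archived snapshot:
# consul snapshot restore <archived-snapshot>.snap
```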
#### Replication Notes
@ -306,6 +327,9 @@ each Vault component
- Refer to the [AppRole Pull
Authentication](/guides/identity/authentication.html) guide to programmatically
generate a token for a machine or app
- Consul is an integral part of running a resilient Vault cluster, regardless of
location. Refer to the online [Consul documentation](https://www.consul.io/intro/getting-started/install.html) to
learn more.
## Next steps