---
layout: "docs"
page_title: "High Availability"
sidebar_title: "High Availability"
sidebar_current: "docs-internals-ha"
description: |-
  Learn about the high availability design of Vault.
---

# High Availability

Vault is primarily used in production environments to manage secrets.
As a result, any downtime of the Vault service can affect downstream clients.
Vault is designed to support a highly available deployment to ensure that a
machine or process failure is minimally disruptive.

~> **Advanced Topic!** This page covers technical details
of Vault. You don't need to understand these details to
effectively use Vault. The details are documented here for
those who wish to learn about them without having to go
spelunking through the source code. However, if you're an
operator of Vault, we recommend learning about the architecture
due to the importance of Vault in an environment.

# Design Overview

The primary design goal in making Vault highly available (HA) was to
minimize downtime, not to provide horizontal scalability. Vault is typically
bound by the IO limits of the storage backend rather than by its compute
requirements. This simplifies the HA approach and allows more complex
coordination to be avoided.

Certain storage backends, such as Consul, provide additional coordination
functions that enable Vault to run in an HA configuration. When supported
by the backend, Vault will automatically run in HA mode without additional
configuration.

When running in HA mode, Vault servers have two additional states they
can be in: standby and active. For multiple Vault servers sharing a storage
backend, only a single instance will be active at any time while all other
instances are hot standbys.

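One way to observe these states from outside a node is its health endpoint. The sketch below assumes the default status codes of Vault's `/v1/sys/health` endpoint (200 for active, 429 for standby, 473 for a performance standby, 501 for uninitialized, 503 for sealed). The helper names `describe_node` and `probe`, and any address passed to `probe`, are illustrative and not part of Vault.

```python
import urllib.error
import urllib.request

# Default status codes of /v1/sys/health, mapped to the node states
# discussed on this page. The labels on the right are our own wording.
HEALTH_STATES = {
    200: "active",               # initialized, unsealed, and active
    429: "standby",              # unsealed and standby
    473: "performance standby",  # Enterprise performance standby
    501: "not initialized",
    503: "sealed",
}


def describe_node(status_code: int) -> str:
    """Map a health-check status code to a human-readable node state."""
    return HEALTH_STATES.get(status_code, "unknown")


def probe(addr: str) -> str:
    """Query a node's health endpoint and describe its state.

    `addr` is a hypothetical base URL such as "http://127.0.0.1:8200".
    """
    try:
        with urllib.request.urlopen(f"{addr}/v1/sys/health", timeout=5) as resp:
            code = resp.status
    except urllib.error.HTTPError as err:
        # urllib raises for 4xx/5xx responses, but the code is what we want.
        code = err.code
    return describe_node(code)
```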

The active server operates in a standard fashion and processes all requests.
The standby servers do not process requests, and instead redirect to the active
Vault. Meanwhile, if the active server is sealed, fails, or loses network
connectivity, then one of the standbys will take over and become the active
instance.

It is important to note that only _unsealed_ servers act as a standby.
If a server is still in the sealed state, then it cannot act as a standby,
as it would be unable to serve any requests should the active server fail.

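The behavior described above can be illustrated with a toy model. This is a minimal sketch, not Vault's implementation: the coordination provided by the storage backend is reduced to a single `active` slot, and the `Node` and `Cluster` classes, the `reachable` flag, and the node names are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    name: str
    sealed: bool = False
    reachable: bool = True  # stands in for "has not failed or lost network"

    def eligible(self) -> bool:
        # Only unsealed, healthy nodes can be active or act as a standby.
        return self.reachable and not self.sealed


class Cluster:
    """Toy model of servers sharing one storage backend."""

    def __init__(self, nodes):
        self.nodes = list(nodes)
        self.active: Optional[Node] = None

    def elect(self) -> Optional[Node]:
        # Keep the current active node while it remains healthy.
        if self.active is not None and self.active.eligible():
            return self.active
        # Otherwise the first eligible standby takes over.
        self.active = next((n for n in self.nodes if n.eligible()), None)
        return self.active

    def handle(self, node: Node, request: str) -> str:
        active = self.elect()
        if active is None:
            return "error: no active node"
        if not node.eligible():
            return f"error: {node.name} unavailable"
        if node is active:
            return f"{node.name} processed {request}"
        # Standbys do not process requests; they point at the active node.
        return f"{node.name} redirected {request} to {active.name}"


# Example: vault-b is sealed, so it is skipped when vault-a fails.
cluster = Cluster([Node("vault-a"), Node("vault-b", sealed=True), Node("vault-c")])
first = cluster.handle(cluster.nodes[2], "read secret/foo")
cluster.nodes[0].reachable = False
new_active = cluster.elect()  # vault-c takes over
```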

# Performance Standby Nodes (Enterprise)

Performance Standby Nodes are just like traditional High Availability standby
nodes, but they can service read-only requests from users or applications.
Read-only requests are requests that do not modify Vault's storage. This allows
Vault to quickly scale its ability to service these kinds of operations,
providing near-linear request-per-second scaling in many common scenarios for
some secrets engines like K/V and Transit. By spreading traffic across
performance standby nodes, clients can scale these IOPS horizontally to handle
extremely high traffic workloads.

If a request that causes a storage write comes into a Performance Standby Node,
the request will be forwarded to the active server. If the request is
read-only, it will be serviced locally on the Performance Standby.

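The routing rule above can be summarized in a few lines. This is a simplified sketch under stated assumptions: the `Request` type, the role strings, and the `writes_storage` flag are hypothetical, and the flag is assumed to be known up front, which a real server would not generally know before handling the request.

```python
from dataclasses import dataclass


@dataclass
class Request:
    path: str
    writes_storage: bool  # assumed known in advance for this sketch


def route(request: Request, role: str) -> str:
    """Decide where a request is serviced, given the receiving node's role."""
    if role == "active":
        return "serve locally"
    if role == "performance-standby":
        # Read-only requests stay local; writes go to the active server.
        return "forward to active" if request.writes_storage else "serve locally"
    if role == "standby":
        # Traditional standbys do not service requests themselves.
        return "redirect to active"
    raise ValueError(f"unknown role: {role}")
```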

Just like traditional HA standbys, if the active node is sealed, fails, or loses
network connectivity, then a performance standby can take over and become the
active instance.