Fault tolerance is a system's ability to operate without interruption despite component failure. Learn how a set of Consul servers provides fault tolerance through the use of a quorum, and how to further improve control plane resilience through the use of infrastructure zones and Enterprise redundancy zones.
Effectively mitigating your risk is more nuanced than just increasing the fault tolerance
metric described above. You must consider:
### Correlated Risks
Are you protected against correlated risks? Infrastructure-level failures can cause multiple servers to fail at the same time. This means that a single infrastructure-level failure could cause a Consul outage, even if your server-level fault tolerance is 2.
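
To make the arithmetic concrete, the following is a minimal sketch rather than anything Consul ships. It assumes quorum is a simple majority of voting servers, and it uses a hypothetical layout (`servers_by_rack`) in which 3 of 5 servers share one rack. The function and variable names are illustrative only.

```python
def quorum_size(voters: int) -> int:
    """Smallest majority of a set of voting servers."""
    return voters // 2 + 1


def fault_tolerance(voters: int) -> int:
    """Number of voting servers that can fail while a majority remains."""
    return voters - quorum_size(voters)


# Hypothetical layout: 5 voting servers, 3 of which share a single rack.
servers_by_rack = {"rack-a": 3, "rack-b": 1, "rack-c": 1}
voters = sum(servers_by_rack.values())

print(fault_tolerance(voters))  # 2

# One infrastructure-level failure (rack-a) removes 3 servers at once.
survivors = voters - servers_by_rack["rack-a"]
print(survivors >= quorum_size(voters))  # False
```

Although the server-level fault tolerance is 2, the single rack failure leaves only 2 of the 3 servers needed for quorum.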
### Mitigation Costs
What are the costs of the mitigation? Different mitigation options present different trade-offs for operational complexity, computing cost, and Consul request performance.
## Strategies to Increase Fault Tolerance
The following sections explore several options for increasing Consul's fault tolerance.
HashiCorp recommends all production deployments consider:
- [Spreading Consul servers across availability zones](#spread-servers-across-infrastructure-availability-zones)
- <EnterpriseAlert inline /><a href="#use-backup-voting-servers-to-replace-lost-voters">Using backup voting servers to replace lost voters</a>
### Spread Servers Across Infrastructure Availability Zones
The cloud or on-premises infrastructure underlying your [Consul datacenter](/docs/install/glossary#datacenter)
may be split into several "availability zones".
An availability zone is meant to share no points of failure with other zones by:
- Having power, cooling, and networking systems independent from other zones
- Being physically distant enough from other zones so that large-scale disruptions
such as natural disasters (flooding, earthquakes) are very unlikely to affect multiple zones
Availability zones are available in the regions of most cloud providers and in some on-premises installations.
If possible, spread your Consul voting servers across 3 availability zones
to protect your Consul datacenter from a single zone-level failure.
For example, if deploying 5 Consul servers across 3 availability zones, place no more than 2 servers in each zone.
If one zone fails, at most 2 servers are lost and the 3 remaining servers maintain quorum.
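
As a rough illustration of this placement rule, the sketch below spreads servers round-robin across zones and checks whether losing any single zone still leaves a majority. It assumes quorum is a simple majority of voting servers. The helper names (`spread_across_zones`, `survives_any_single_zone_failure`) and zone labels are hypothetical and are not part of Consul.

```python
def spread_across_zones(servers: int, zones: list[str]) -> dict[str, int]:
    """Assign servers round-robin so zone counts differ by at most one."""
    placement = {zone: 0 for zone in zones}
    for i in range(servers):
        placement[zones[i % len(zones)]] += 1
    return placement


def survives_any_single_zone_failure(placement: dict[str, int]) -> bool:
    """True if the servers outside every individual zone still form a majority."""
    total = sum(placement.values())
    quorum = total // 2 + 1
    return all(total - count >= quorum for count in placement.values())


placement = spread_across_zones(5, ["zone-1", "zone-2", "zone-3"])
print(placement)  # {'zone-1': 2, 'zone-2': 2, 'zone-3': 1}
print(survives_any_single_zone_failure(placement))  # True

# Counter-example: 5 servers in only 2 zones cannot survive losing the larger zone.
print(survives_any_single_zone_failure({"zone-1": 3, "zone-2": 2}))  # False
```

The counter-example shows why 3 zones are suggested: with only 2 zones, one zone must hold a majority of the servers, so its failure always breaks quorum.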
To distribute your Consul servers across availability zones, modify your infrastructure configuration with your infrastructure provider. No change is needed to your Consul servers' agent configuration.