Adds XL machine spec and notes on large deployments (#4622)

* Adds XL machine spec and notes on large deployments
* Clarifies machine sizes
* Fixes internal links within the document
* Moves datacenter size guidelines to "Single Datacenter" section
Geoffrey Grosenbach 2018-08-31 10:41:48 -05:00 committed by GitHub
parent 8a6276ad8e
commit 36fa155675
GPG Key ID: 4AEE18F83AFDEB23
1 changed file with 8 additions and 1 deletion

@@ -26,11 +26,16 @@ The following instance configurations are recommended.
|Large|4-8 core|32-64+ GB RAM|100 GB|**AWS**: m5.2xlarge, m5.4xlarge|
|||||**Azure**: Standard\_D4\_v3, Standard\_D5\_v3|
|||||**GCE**: n1-standard-32, n1-standard-64|
|XL|12-24 core|64+ GB RAM|SSD|**AWS**: m5d.4xlarge|
|||||**Azure**: Standard\_D16\_v3, Standard\_D32\_v3|
|||||**GCE**: n1-standard-32, n1-standard-64|
The **small size** instance configuration is appropriate for most initial production deployments, or for development/testing environments. The large size is for production environments where there is a consistently high workload. Suggested instance types are provided for common platforms, but refer to each platform's documentation for up-to-date instance types that align with the recommended resources.
~> **NOTE** For large workloads, ensure that the disks support a high number of IOPS to keep up with the rapid Raft log update rate.
For a write-heavy and/or read-heavy cluster, the number of clients may need to be reduced further, taking into account the number of services and/or watches registered and the number and size of KV pairs. Alternatively, large-scale read requests can be served by increasing the number of non-voting servers ([Enterprise feature](/docs/enterprise/read-scale/index.html)) while maintaining the recommended number of servers (3 or 5) in the quorum. See [Performance Tuning](#performance-tuning) for more recommendations for read-heavy clusters.
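
The recommendation of 3 or 5 servers in the quorum follows from majority-based consensus arithmetic. The sketch below is illustrative only: the function names are hypothetical and not part of Consul. It assumes the standard Raft rule that a quorum is a strict majority of voting servers, and that non-voting servers do not count toward it.

```python
def quorum_size(voting_servers: int) -> int:
    """Return the smallest strict majority of the voting servers."""
    if voting_servers < 1:
        raise ValueError("at least one voting server is required")
    return voting_servers // 2 + 1


def failure_tolerance(voting_servers: int) -> int:
    """Return how many voting servers can fail while a quorum remains."""
    return voting_servers - quorum_size(voting_servers)


if __name__ == "__main__":
    # Compare the two recommended server counts.
    for n in (3, 5):
        print(f"{n} servers: quorum={quorum_size(n)}, "
              f"tolerates {failure_tolerance(n)} failure(s)")
```

Under this rule, adding non-voting servers increases read capacity without changing the quorum size or the failure tolerance.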
## Datacenter Design
Consul may be deployed in a single physical datacenter or it may span multiple datacenters.
@@ -47,6 +52,8 @@ Typically, there must be three or five servers to balance between availability a
Consul is proven to work well with up to `5,000` nodes in a single datacenter gossip pool. Some deployments have stretched this number much further but they require care and testing to ensure they remain reliable and can converge their cluster state quickly. It's highly recommended that clusters are increased in size gradually when approaching or exceeding `5,000` nodes.
Consul can support larger single datacenter cluster sizes by tuning the [gossip parameters](/docs/agent/options.html#gossip_lan) and ensuring Consul agents -- particularly servers -- are running on sufficient hardware. There are real production users of Consul running with greater than 25,000 nodes in a single datacenter today by tuning these parameters. [XL server instances](#system-requirements) or better are required to achieve this scale.
~> For write-heavy clusters, consider scaling vertically with larger machine instances and lower latency storage.
In cases where a full mesh among all agents cannot be established due to network segmentation, Consul's own [network segments](/docs/enterprise/network-segments/index.html) can be used. Network segments is an Enterprise feature that allows the creation of multiple tenants which share Raft servers in the same cluster. Each tenant has its own gossip pool and doesn't communicate with the agents outside this pool. The KV store, however, is shared between all tenants. If Consul network segments cannot be used, isolation between agents can be accomplished by creating discrete [Consul datacenters](/docs/guides/datacenters.html).
@@ -87,7 +94,7 @@ In a larger network that spans L2 segments, traffic typically traverses through
|HTTP API|8500|`-1` to disable|Used by clients to talk to the HTTP API. TCP only.|
|DNS Interface|8600|`-1` to disable||
-> As mentioned in the [datacenter design section](#1-1-1-single-datacenter), network areas and network segments can be used to prevent opening up firewall ports between different subnets.
-> As mentioned in the [datacenter design section](#single-datacenter), network areas and network segments can be used to prevent opening up firewall ports between different subnets.
By default agents will only listen for HTTP and DNS traffic on the local interface.
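
One way to confirm which listeners are reachable from a given host is a plain TCP connection test. The following is a minimal sketch, not a Consul tool: `tcp_port_open` is a hypothetical helper, the ports are the defaults from the table above, and only TCP is checked.

```python
import socket


def tcp_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # Default ports from the table above; adjust if the agent overrides them.
    for name, port in (("HTTP API", 8500), ("DNS Interface", 8600)):
        state = "open" if tcp_port_open("127.0.0.1", port) else "closed"
        print(f"{name} (127.0.0.1:{port}): {state}")
```

Running the same check against the host's non-loopback address from another machine would be expected to fail under the default configuration, since the agent listens for HTTP and DNS only on the local interface.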