layout | page_title | sidebar_current | description |
---|---|---|---|
docs | DNS Caching | docs-guides-dns-cache | One of the main interfaces to Consul is DNS. Using DNS is a simple way to integrate Consul into an existing infrastructure without any high-touch integration. |
DNS Caching
One of the main interfaces to Consul is DNS. Using DNS is a simple way to integrate Consul into an existing infrastructure without any high-touch integration.
By default, Consul serves all DNS results with a 0 TTL value. This prevents any caching. The advantage is that each DNS lookup is always re-evaluated, so the most timely information is served. However, this adds a latency hit for each lookup and can potentially exhaust the query throughput of a cluster.
For this reason, Consul provides a number of tuning parameters that can customize how DNS queries are handled.
Stale Reads
Stale reads can be used to reduce latency and increase the throughput of DNS queries. By default, all reads are serviced by a single leader node. These reads are strongly consistent but are limited by the throughput of a single node. A stale read allows any Consul server to service a query, but non-leader nodes may return data that is out of date. By allowing data to be slightly stale, we get horizontal read scalability: any Consul server can service the request, so read throughput scales with the number of servers in the cluster.
The settings used to control stale reads are dns_config.allow_stale, which must be set to enable stale reads, and dns_config.max_stale, which limits how stale results are allowed to be.
By default, allow_stale is disabled, meaning no stale results may be served. The default for max_stale is 5 seconds. This means that if allow_stale is enabled, we will use data from any Consul server that is within 5 seconds of the leader.
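As a minimal sketch, an agent configuration that enables stale reads while keeping the 5 second bound described above might look like the following. The "5s" duration string is an assumption that mirrors the format used for TTL values in this guide; choose a bound that matches the staleness your workload can tolerate.

```json
{
  "dns_config": {
    "allow_stale": true,
    "max_stale": "5s"
  }
}
```

With a configuration like this, a server that has fallen more than 5 seconds behind the leader would not be used to answer the query from its own data.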
Negative Response Caching
Some DNS clients cache negative responses, that is, "not found" style responses that Consul returns when a service exists but has no healthy endpoints. In practice, this means a service can appear "down" for longer than it is actually unavailable when DNS is used for service discovery.
One common example is Windows, which defaults to caching negative responses for 15 minutes. DNS forwarders may also cache negative responses, with the same effect. To avoid this problem, check the negative response cache defaults for your client operating system and for any DNS forwarder on the path between the client and Consul, and set the cache values appropriately. In many cases, "appropriately" simply means turning negative response caching off to get the best recovery time when a service becomes available again.
TTL Values
TTL values can be set to allow DNS results to be cached downstream of Consul. Higher TTL values reduce the number of lookups on the Consul servers and speed lookups for clients, at the cost of increasingly stale results. By default, all TTLs are zero, preventing any caching.
To enable caching of node lookups (e.g. "foo.node.consul"), we can set the dns_config.node_ttl value. This can be set to "10s", for example, and all node lookups will serve results with a 10 second TTL.
Service TTLs can be specified in a more granular fashion. You can set TTLs per-service, with a wildcard TTL as the default. This is specified using the dns_config.service_ttl map. The "*" service is the wildcard service.
For example, a dns_config that provides a wildcard TTL and a specific TTL for a service might look like this:
{
  "dns_config": {
    "service_ttl": {
      "*": "5s",
      "web": "30s"
    }
  }
}
With this configuration, lookups for "web.service.consul" use a 30 second TTL, while lookups for "db.service.consul" or "api.service.consul" use the 5 second TTL from the wildcard.