Refactor dns to have same behavior between A and SRV.
Current implementation returns the node name instead of the service
address.
With this fix when querying for SRV record service address is return in
the SRV record.
And when performing a simple dns lookup it returns a CNAME to the
service address.
This allows addresses to be tagged at the service level similar to what we allow for nodes already. The address translation that can be enabled with the `translate_wan_addrs` config was updated to take these new addresses into account as well.
The DNS config parameters `recursors` and `dns_config.*` are now hot
reloaded on SIGHUP or `consul reload` and do not need an agent restart
to be modified.
Config is stored in an atomic.Value and loaded at the beginning of each
request. Reloading only affects requests that start _after_ the
reload. Ongoing requests are not affected. To match the current
behavior the recursor handler is loaded and unloaded as needed on config
reload.
* Fix race condition in DNS when using cache
The healty node filtering was modifying the result from the cache, which
caused a crash when multiple queries were made to the same service
simultaneously.
We now copy the node slice before filtering to ensure we do not modify
the data stored in the cache.
* Fix wording in dns cache config doc
s/dns_max_age/cache_max_age/
Adds two new configuration parameters "dns_config.use_cache" and
"dns_config.cache_max_age" controlling how DNS requests use the agent
cache when querying servers.
* Avoid to have infinite recursion in DNS lookups when resolving CNAMEs
This will avoid killing Consul when a Service.Address is using CNAME
to a Consul CNAME that creates an infinite recursion.
This will fix https://github.com/hashicorp/consul/issues/4907
* Use maxRecursionLevel = 3 to allow several recursions
* bugfix: use ServiceTags to generate cahce key hash
* update unit test
* update
* remote print log
* Update .gitignore
* Completely deprecate ServiceTag field internally for clarity
* Add explicit test for CacheInfo cases
This implements parts of RFC 7871 where Consul is acting as an authoritative name server (or forwarding resolver when recursors are configured)
If ECS opt is present in the request we will mirror it back and return a response with a scope of 0 (global) or with the same prefix length as the request (indicating its valid specifically for that subnet).
We only mirror the prefix-length (non-global) for prepared queries as those could potentially use nearness checks that could be affected by the subnet. In the future we could get more sophisticated with determining the scope bits and allow for better caching of prepared queries that don’t rely on nearness checks.
The other thing this does not do is implement the part of the ECS RFC related to originating ECS headers when acting as a intermediate DNS server (forwarding resolver). That would take a quite a bit more effort and in general provide very little value. Consul will currently forward the ECS headers between recursors and the clients transparently, we just don't originate them for non-ECS clients to get potentially more accurate "location aware" results.
* Implementation of Weights Data structures
Adding this datastructure will allow us to resolve the
issues #1088 and #4198
This new structure defaults to values:
```
{ Passing: 1, Warning: 0 }
```
Which means, use weight of 0 for a Service in Warning State
while use Weight 1 for a Healthy Service.
Thus it remains compatible with previous Consul versions.
* Implemented weights for DNS SRV Records
* DNS properly support agents with weight support while server does not (backwards compatibility)
* Use Warning value of Weights of 1 by default
When using DNS interface with only_passing = false, all nodes
with non-Critical healthcheck used to have a weight value of 1.
While having weight.Warning = 0 as default value, this is probably
a bad idea as it breaks ascending compatibility.
Thus, we put a default value of 1 to be consistent with existing behaviour.
* Added documentation for new weight field in service description
* Better documentation about weights as suggested by @banks
* Return weight = 1 for unknown Check states as suggested by @banks
* Fixed typo (of -> or) in error message as requested by @mkeeler
* Fixed unstable unit test TestRetryJoin
* Fixed unstable tests
* Fixed wrong Fatalf format in `testrpc/wait.go`
* Added notes regarding DNS SRV lookup limitations regarding number of instances
* Documentation fixes and clarification regarding SRV records with weights as requested by @banks
* Rephrase docs
* Fixes the DNS recursor properly resolving the requests
* Added a test case for the recursor bug
* Refactored code && added a test case for all failing recursors
* Inner indentation moved into else if check
This also changes where the enforcement of the enable_additional_node_meta_txt configuration gets applied.
formatNodeRecord returns the main RRs and the meta/TXT RRs in separate slices. Its then up to the caller to add to the appropriate sections or not.
This just makes sure that if multiple services are registered with unique service addresses that we don’t blast back multiple CNAMEs for the same service DNS name and keeps us within the DNS specs.
Will fix https://github.com/hashicorp/consul/issues/4036
Instead of removing one by one the entries, find the optimal
size using binary search.
For SRV records, with 5k nodes, duration of DNS lookups is
divided by 4 or more.
Update docs a little
Update/add tests. Make sure all the various ways of determining the source IP work
Update X-Forwarded-For header parsing. This can be a comma separated list with the first element being the original IP so we now handle csv data there.
Got rid of error return from sourceAddrFromRequest