This commit copies many of the connect-proxy xds testcases and reuses
for ingress gateways. This allows us to more easily see changes to the
envoy configuration when make updates to ingress gateways.
This makes it so that both OSS and enterprise tests pass correctly
In the api tests, explicitly set namespace to empty string so that tests
can be shared.
On every service registration, we check to see if a service should be
assassociated to a wildcard gateway-service. This fixes an issue where
we did not correctly check to see if the service being registered was a
"typical" service or not.
A few of the unexported functions in agent/cache took a large number of
arguments. These arguments were effectively overrides for values that
were provided in RequestInfo.
By using a struct we can not only reduce the number of arguments, but
also simplify the logic by removing the need for overrides.
Previously the SupportsBlocking option was specified by a method on the
type, and all the other options were specified from RegisterOptions.
This change moves RegisterOptions to a method on the type, and moves
SupportsBlocking into the options struct.
Currently there are only 2 cache-types. So all cache-types can implement
this method by embedding a struct with those predefined values. In the
future if a cache type needs to be registered more than once with different
options it can remove the embedded type and implement the method in a way
that allows for paramaterization.
* Implements a simple, tcp ingress gateway workflow
This adds a new type of gateway for allowing Ingress traffic into Connect from external services.
Co-authored-by: Chris Piraino <cpiraino@hashicorp.com>
The Init method provided the same functionality as the New constructor.
The constructor is both more widely used, and more idiomatic, so remove
the Init method.
This change is in preparation for fixing printing of these IDs.
Also reduce the log level of some version checking messages on the server as they can be pretty noisy during upgrades and really are more for debugging purposes.
This should very slightly reduce the amount of memory required to store each item in
the cache.
It will also enable setting different TTLs based on the type of result. For example
we may want to use a shorter TTL when the result indicates the resource does not exist,
as storing these types of records could easily lead to a DOS caused by
OOM.
These two notify functions are very similar. There appear to be just
enough differences that trying to parameterize the differences may not
improve things.
For now, reduce some of the cosmetic differences so that the material
differences are more obvious.
Use named returned so that the caller has a better idea of what these
bools mean.
Return early to reduce the scope, and make it more obvious what values
are returned in which cases. Also reduces the number of conditional
expressions in each case.
The test had two racy bugs related to memdb references.
The first was when we initially populated data and retained the FederationState objects in a slice. Due to how the `inmemCodec` works these were actually the identical objects passed into memdb.
The second was that the `checkSame` assertion function was reading from memdb and setting the RaftIndexes to zeros to aid in equality checks. This was mutating the contents of memdb which is a no-no.
With this fix, the command:
```
i=0; while /usr/local/bin/go test -count=1 -timeout 30s github.com/hashicorp/consul/agent/consul -run '^(TestReplication_FederationStates)$'; do i=$((i + 1)); printf "$i "; done
```
That used to break on my machine in less than 20 runs is now running 150+ times without any issue.
Might also fix#7575
On recent Mac OS versions, the ulimit defaults to 256 by default, but many
systems (eg: some Linux distributions) often limit this value to 1024.
On validation of configuration, Consul now validates that the number of
allowed files descriptors is bigger than http_max_conns_per_client.
This make some unit tests failing on Mac OS.
Use a less important value in unit test, so tests runs well by default
on Mac OS without need for tuning the OS.
* Enable filtering language support for the v1/connect/intentions listing API
* Update website for filtering of Intentions
* Update website/source/api/connect/intentions.html.md
This change moves all the typeEntry lookups to the first step in the exported methods,
and makes unexporter internals accept the typeEntry struct.
This change is primarily intended to make it easier to extract the container of caches
from the Cache type.
It may incidentally reduce locking in fetch, but that was not a goal.
Sometimes, in the CI, it could receive a SIGURG, producing this line:
FAIL: TestForwardSignals/signal-interrupt (0.06s)
util_test.go:286: expected to read line "signal: interrupt" but got "signal: urgent I/O condition"
Only forward the signals we test to avoid this kind of false positive
Example of such unstable errors in CI:
https://circleci.com/gh/hashicorp/consul/153571
Exposing checks is supposed to allow a Consul agent bound to a different
IP address (e.g., in a different Kubernetes pod) to access healthchecks
through the proxy while the underlying service binds to localhost. This
is an important security feature that makes sure no external traffic
reaches the service except through the proxy.
However, as far as I can tell, this is subtly broken in the case where
the Consul agent cannot reach the proxy over localhost.
If a proxy is configured with: `{ LocalServiceAddress: "127.0.0.1",
Checks: true }`, as is typical with a sidecar proxy, the Consul checks
are currently rewritten to `127.0.0.1:<random port>`. A Consul agent
that does not share the loopback address cannot reach this address. Just
to make sure I was not misunderstanding, I tried configuring the proxy
with `{ LocalServiceAddress: "<pod ip>", Checks: true }`. In this case,
while the checks are rewritten as expected and the agent can reach the
dynamic port, the proxy can no longer reach its backend because the
traffic is no longer on the loopback interface.
I think rewriting the checks to use `proxy.Address`, the proxy's own
address, is more correct in this case. That is the IP where the proxy
can be reached, both by other proxies and by a Consul agent running on
a different IP. The local service address should continue to use
`127.0.0.1` in most cases.
If a proxied service is a gRPC or HTTP2 service, but a path is exposed
using the HTTP1 or TCP protocol, Envoy should not be configured with
`http2ProtocolOptions` for the cluster backing the path.
A situation where this comes up is a gRPC service whose healthcheck or
metrics route (e.g. for Prometheus) is an HTTP1 service running on
a different port. Previously, if these were exposed either using
`Expose: { Checks: true }` or `Expose: { Paths: ... }`, Envoy would
still be configured to communicate with the path over HTTP2, which would
not work properly.
I spent some time today on my local Mac to figure out why Consul 1.6.3+
was not accepting limits.http_max_conns_per_client.
This adds an explicit check on number of file descriptors to be sure
it might work (this is no guarantee as if many clients are reaching
the agent, it might consume even more file descriptors)
Anyway, many users are fighting with RLIMIT_NOFILE, having a clear
message would allow them to figure out what to fix.
Example of message (reload or start):
```
2020-03-11T16:38:37.062+0100 [ERROR] agent: Error starting agent: error="system allows a max of 512 file descriptors, but limits.http_max_conns_per_client: 8192 needs at least 8212"
```
Due to merge #7562, upstream does not compile anymore.
Error is:
ERRO Running error: gofmt: analysis skipped: errors in package: [/Users/p.souchay/go/src/github.com/hashicorp/consul/agent/config_endpoint_test.go:188:33: too many arguments]