This is needed in case the client proxy is in TransparentProxy mode.
Typically they won't have explicit configuration for every upstream, so
this ensures the settings can be applied to all of them when generating
xDS config.
New clients in transparent proxy mode can send requests for service
config resolution without any upstream args because they do not have
explicitly defined upstreams.
Old clients on the other hand will never send requests without the
Upstreams args unless they don't have upstreams, in which case we do not
send back upstream config.
As part of this change the indexer will now be case insensitive by using
the lower case value. This should be safe because previously we always
had lower case strings.
This change was made out of convenience. All the other indexers use
lowercase, so we can re-use the indexFromQuery function by using
lowercase here as well.
Previously we were encoding the UUID as a string, but the index it references uses a UUID
so this index can also use an encoded UUID to save a bit of memory.
Prefix queries are generally being used to match part of a partial
index. We can support these indexes by using a function that accept
different types for each subset of the index.
What I found interesting is that in the generic StringFieldIndexer the
implementation for PrefixFromArgs would remove the trailing null, but
at least in these 2 cases we actually want a null terminated string.
We simply want fewer components in the string.
The TestServiceHealthEventsFromChanges function was over 1400 lines.
Attempting to debug test failures in test functions this large is
difficult. It requires scrolling to the line which defines the testcase
because the failure message only includes the line number of the
assertion, not the line number of the test case.
This is an excellent example of where test tables stop working well, and
start being a problem. To mitigate this problem, the runCase pattern can
be used. When one of these tests fails, a failure message will print the
line number of both the test case and the assertion. This allows a
developer to quickly jump to both of the relevant lines, signficanting
reducing the time it takes to debug test failures.
For example, one such failure could look like this:
catalog_events_test.go:1610: case: service reg, new node
catalog_events_test.go:1605: assertion failed: values are not equal
ResolveServiceConfig is called by service manager before the proxy
registration is in the catalog. Therefore we should pass proxy
registration flags in the request rather than trying to fetch
them from the state store (where they may not exist yet).
This is done because after removing ID and NodeName from
ServiceConfigRequest we will no longer know whether a request coming in
is for a Consul client earlier than v1.10.
This enables it to be called for many upstreams or downstreams of a
service while only querying intentions once.
Additionally, decisions are now optionally denied due to L7 permissions
being present. This enables the function to be used to filter for
potential upstreams/downstreams of a service.
Previously this type was defined in structs, but unlike the other types in structs this type
is not used by RPC requests. By moving it to state we can better indicate that this is not
an API type, but part of the state implementation.
I added this recently without realizing that the method already existed and was named
NamespaceOrEmpty. Replace all calls to GetNamespace with NamespaceOrEmpty or NamespaceOrDefault
as appropriate.
Document that this comparison should roughly match MatchesKey
Only sort by overrideKey or service name, but not both
Add namespace to the sort.
The client side also builds a map of these based on the namespace/node/service key, so the only order
that really matters is the ordering of register/dereigster events.
Refactored out a function that can be used for both the snapshot and stream of events to translate
an event into an appropriate connect event.
Previously terminating gateway events would have used the wrong key in the snapshot, which would have
caused them to be filtered out later on.
Also removed an unused function, and some commented out code.
Health of a terminating gateway instance changes
- Generate an event for creating/destroying this instance of the terminating gateway,
duplicate it for each affected service
Co-Authored-By: Kyle Havlovitz <kylehav@gmail.com>
These new functional indexers provide a few advantages:
1. enterprise differences can be isolated to a single function (the
indexer function), making code easier to change
2. as a consequence of (1) we no longer need to wrap all the calls to
Txn operations, making code easier to read.
3. by removing reflection we should increase the performance of all
operations.
One important change is in making all the function signatures the same.
https://blog.golang.org/errors-are-values
An extra boolean return value for SingleIndexer.FromObject is superfluous.
The error value can indicate when the index value could not be created.
By removing this extra return value we can use the same signature for both
indexer functions.
This has the nice properly of a function being usable for both indexing operations.
By using a new pattern for more specific indexes. This allows us to use
the same index for both service checks and node checks. It removes the
abstraction around memdb.Txn operations, and isolates all of the
enterprise differences in a single place (the indexer).
Previously a snapshot created as part of a resumse-stream request could have incorrectly
cached the newSnapshotToFollow event. This would cause clients to error because they
received an unexpected framing event.
This fixes an issue where leaf certificates issued in primary
datacenters using Vault as a Connect CA would be reissued very
frequently (every ~20 seconds) because the logic meant to detect root
rotation was errantly triggering.
The hash of the rootCA was being compared against a hash of the
intermediateCA and always failing. This doesn't apply to the Consul
built-in CA provider because there is no intermediate in use in the
primary DC.
This is reminiscent of #6513
registerSchema creates some indirection which is not necessary in this
case. newDBSchema can call each of the tables.
Enterprise tables can be added from the existing withEnterpriseSchema
shim.
This allows setting ForceWithoutCrossSigning when reconfiguring the CA
for any provider, in order to forcibly move to a new root in cases where
the old provider isn't reachable or able to cross-sign for whatever
reason.
This way we only have to wait for the serf barrier to pass once before
we can make use of federation state APIs Without this patch every
restart needs to re-compute the change.
Deleting from memdb inside an interation can cause a panic from Iterator.Next. This
case is technically safe (for now) because the iterator is using the root radix tree
not a modified one.
However this could break at any time if someone adds an insert or delete to the coordinates table
before this place in the function.
It also sets a bad example, because generally deletes in an interator are not safe. So this
commit uses the pattern we have in other places to move the deletes out of the iteration.
After fixing that bug I uncovered a couple more:
Fix an issue where we might try to cross sign a cert when we never had a valid root.
Fix a potential issue where reconfiguring the CA could cause either the Vault or AWS PCA CA providers to delete resources that are still required by the new incarnation of the CA.
Using withEnterpriseSchema() we can apply any enterprise schema changes
with a single shim, removing the need to duplicate all of the table
definitions.
Also move all the catalog schemas to a new file to shrink catalog.go a bit.
I believe this commit also fixes a bug. Previously RPCMaxConnsPerClient was not being re-read from the RuntimeConfig, so passing it to Server.ReloadConfig was never changing the value.
Also improve the test runtime by not doing a lot of unnecessary work.
* Fix bug in usage metrics that caused a negative count to occur
There were a couple of instances were usage metrics would do the wrong
thing and result in incorrect counts, causing the count to attempt to
decrement below zero and return an error. The usage metrics did not
account for various places where a single transaction could
delete/update/add multiple service instances at once.
We also remove the error when attempting to decrement below zero, and
instead just make sure we do not accidentally underflow the unsigned
integer. This is a more graceful failure than returning an error and not
allowing a transaction to commit.
* Add changelog
These types are used as values (not pointers) in other structs. Using a pointer receiver causes
problems when the value is printed. fmt will not call the String method if it is passed a value
and the String method has a pointer receiver. By using a value receiver the correct string is printed.
Also remove some unused methods.
This way we only have to wait for the serf barrier to pass once before
we can upgrade to v2 acls. Without this patch every restart needs to
re-compute the change, and potentially if a stray older node joins after
a migration it might regress back to v1 mode which would be problematic.
This can happen when one other node in the cluster such as a client is unable to communicate with the leader server and sees it as failed. When that happens its failing status eventually gets propagated to the other servers in the cluster and eventually this can result in RPCs returning “No cluster leader” error.
That error is misleading and unhelpful for determing the root cause of the issue as its not raft stability but rather and client -> server networking issue. Therefore this commit will add a new error that will be returned in that case to differentiate between the two cases.
Previously the tokens would fail to insert into the secondary's state
store because the AuthMethod field of the ACLToken did not point to a
known auth method from the primary.
Add a skip condition to all tests slower than 100ms.
This change was made using `gotestsum tool slowest` with data from the
last 3 CI runs of master.
See https://github.com/gotestyourself/gotestsum#finding-and-skipping-slow-tests
With this change:
```
$ time go test -count=1 -short ./agent
ok github.com/hashicorp/consul/agent 0.743s
real 0m4.791s
$ time go test -count=1 -short ./agent/consul
ok github.com/hashicorp/consul/agent/consul 4.229s
real 0m8.769s
```
* server: fix panic when deleting a non existent intention
* add changelog
* Always return an error when deleting non-existent ixn
Co-authored-by: freddygv <gh@freddygv.xyz>
A vulnerability was identified in Consul and Consul Enterprise (“Consul”) such that operators with `operator:read` ACL permissions are able to read the Consul Connect CA configuration when explicitly configured with the `/v1/connect/ca/configuration` endpoint, including the private key. This allows the user to effectively privilege escalate by enabling the ability to mint certificates for any Consul Connect services. This would potentially allow them to masquerade (receive/send traffic) as any service in the mesh.
--
This PR increases the permissions required to read the Connect CA's private key when it was configured via the `/connect/ca/configuration` endpoint. They are now `operator:write`.
* ci: stop building darwin/386 binaries
Go 1.15 drops support for 32-bit binaries on Darwin https://golang.org/doc/go1.15#darwin
* tls: ConnectionState::NegotiatedProtocolIsMutual is deprecated in Go 1.15, this value is always true
* correct error messages that changed slightly
* Completely regenerate some TLS test data
Co-authored-by: R.B. Boyer <rb@hashicorp.com>
The Intention.Apply RPC is quite large, so this PR attempts to break it down into smaller functions and dissolves the pre-config-entry approach to the breakdown as it only confused things.
Header is: X-Consul-Default-ACL-Policy=<allow|deny>
This is of particular utility when fetching matching intentions, as the
fallthrough for a request that doesn't match any intentions is to
enforce using the default acl policy.
The Catalog, Config Entry, KV and Session resources potentially re-validate the input as its coming in. We need to prevent snapshot restoration failures due to missing namespaces or namespaces that are being deleted in enterprise.
1. do a state store query to list intentions as the agent would do over in `agent/proxycfg` backing `agent/xds`
2. upgrade the database and do a fresh `service-intentions` config entry write
3. the blocking query inside of the agent cache in (1) doesn't notice (2)
Makes Payload a type with FilterByKey so that Payloads can implement
filtering by key. With this approach we don't need to expose a Namespace
field on Event, and we don't need to invest micro formats or require a
bunch of code to be aware of exactly how the key field is encoded.
The output of the previous assertions made it impossible to debug the tests without code changes.
With go-cmp comparing the entire slice we can see the full diffs making it easier to debug failures.
Previously config entries sharing a kind & name but in different
namespaces could occasionally cause "stuck states" in replication
because the namespace fields were ignored during the differential
comparison phase.
Example:
Two config entries written to the primary:
kind=A,name=web,namespace=bar
kind=A,name=web,namespace=foo
Under the covers these both get saved to memdb, so they are sorted by
all 3 components (kind,name,namespace) during natural iteration. This
means that before the replication code does it's own incomplete sort,
the underlying data IS sorted by namespace ascending (bar comes before
foo).
After one pass of replication the primary and secondary datacenters have
the same set of config entries present. If
"kind=A,name=web,namespace=bar" were to be deleted, then things get
weird. Before replication the two sides look like:
primary: [
kind=A,name=web,namespace=foo
]
secondary: [
kind=A,name=web,namespace=bar
kind=A,name=web,namespace=foo
]
The differential comparison phase walks these two lists in sorted order
and first compares "kind=A,name=web,namespace=foo" vs
"kind=A,name=web,namespace=bar" and falsely determines they are the SAME
and are thus cause an update of "kind=A,name=web,namespace=foo". Then it
compares "<nothing>" with "kind=A,name=web,namespace=foo" and falsely
determines that the latter should be DELETED.
During reconciliation the deletes are processed before updates, and so
for a brief moment in the secondary "kind=A,name=web,namespace=foo" is
erroneously deleted and then immediately restored.
Unfortunately after this replication phase the final state is identical
to the initial state, so when it loops around again (rate limited) it
repeats the same set of operations indefinitely.
Required also converting some of the transaction functions to WriteTxn
because TxnRO() called the same helper as TxnRW.
This change allows us to return a memdb.Txn for read-only txn instead of
wrapping them with state.txn.
* Consul Service meta wrongly computes and exposes non_voter meta
In Serf Tags, entreprise members being non-voters use the tag
`nonvoter=1`, not `non_voter = false`, so non-voters in members
were wrongly displayed as voter.
Demonstration:
```
consul members -detailed|grep voter
consul20-hk5 10.200.100.110:8301 alive acls=1,build=1.8.4+ent,dc=hk5,expect=3,ft_fs=1,ft_ns=1,id=xxxxxxxx-5629-08f2-3a79-10a1ab3849d5,nonvoter=1,port=8300,raft_vsn=3,role=consul,segment=<all>,use_tls=1,vsn=2,vsn_max=3,vsn_min=2,wan_join_port=8302
```
* Added changelog
* Added changelog entry
Previously, we would emit service usage metrics both with and without a
namespace label attached. This is problematic in the case when you want
to aggregate metrics together, i.e. "sum(consul.state.services)". This
would cause services to be counted twice in that aggregate, once via the
metric emitted with a namespace label, and once in the metric emited
without any namespace label.
This allows for client agent to be run in a more stateless manner where they may be abruptly terminated and not expected to come back. If advertising a per-agent reconnect timeout using the advertise_reconnect_timeout configuration when that agent leaves, other agents will wait only that amount of time for the agent to come back before reaping it.
This has the advantageous side effect of causing servers to deregister the node/services/checks for that agent sooner than if the global reconnect_timeout was used.
Extend Consul’s intentions model to allow for request-based access control enforcement for HTTP-like protocols in addition to the existing connection-based enforcement for unspecified protocols (e.g. tcp).
- Upgrade the ConfigEntry.ListAll RPC to be kind-aware so that older
copies of consul will not see new config entries it doesn't understand
replicate down.
- Add shim conversion code so that the old API/CLI method of interacting
with intentions will continue to work so long as none of these are
edited via config entry endpoints. Almost all of the read-only APIs will
continue to function indefinitely.
- Add new APIs that operate on individual intentions without IDs so that
the UI doesn't need to implement CAS operations.
- Add a new serf feature flag indicating support for
intentions-as-config-entries.
- The old line-item intentions way of interacting with the state store
will transparently flip between the legacy memdb table and the config
entry representations so that readers will never see a hiccup during
migration where the results are incomplete. It uses a piece of system
metadata to control the flip.
- The primary datacenter will begin migrating intentions into config
entries on startup once all servers in the datacenter are on a version
of Consul with the intentions-as-config-entries feature flag. When it is
complete the old state store representations will be cleared. We also
record a piece of system metadata indicating this has occurred. We use
this metadata to skip ALL of this code the next time the leader starts
up.
- The secondary datacenters continue to run the old intentions
replicator until all servers in the secondary DC and primary DC support
intentions-as-config-entries (via serf flag). Once this condition it met
the old intentions replicator ceases.
- The secondary datacenters replicate the new config entries as they are
migrated in the primary. When they detect that the primary has zeroed
it's old state store table it waits until all config entries up to that
point are replicated and then zeroes its own copy of the old state store
table. We also record a piece of system metadata indicating this has
occurred. We use this metadata to skip ALL of this code the next time
the leader starts up.
This adds a new very tiny memdb table and corresponding raft operation
for updating a very small effective map[string]string collection of
"system metadata". This can persistently record a fact about the Consul
state machine itself.
The first use of this feature will come in a later PR.
Reduce Jitter to one function
Rename NewRetryWaiter
Fix a bug in calculateWait where maxWait was applied before jitter, which would make it
possible to wait longer than maxWait.
This really only matters for unit tests, since typically if an agent shuts down its server, it follows that up by exiting the process, which would also clean up all of the networking anyway.
The subscribe endpoint needs to be able to inspect the payload to filter
events, and convert them into the protobuf types.
Use the protobuf CatalogOp type for the operation field, for now. In the
future if we end up with multiple interfaces we should be able to remove
the protobuf dependency by changing this to an int32 and adding a test
for the mapping between the values.
Make the value of the payload a concrete type instead of interface{}. We
can create other payloads for other event types.
Rename GRPCClient to ClientConnPool. This type appears to be more of a
conn pool than a client. The clients receive the connections from this
pool.
Reduce some dependencies by adjusting the interface baoundaries.
Remove the need to create a second slice of Servers, just to pick one and throw the rest away.
Unexport serverResolver, it is not used outside the package.
Use a RWMutex for ServerResolverBuilder, some locking is read-only.
Add more godoc.
* fix lessThanHalfTime
* get lock for CAProvider()
* make a var to relate both vars
* rename to getCAProviderWithLock
* move CertificateTimeDriftBuffer to agent/connect/ca
In an upcoming change we will need to pass a grpc.ClientConnPool from
BaseDeps into Server. While looking at that change I noticed all of the
existing consulOption fields are already on BaseDeps.
Instead of duplicating the fields, we can create a struct used by
agent/consul, and use that struct in BaseDeps. This allows us to pass
along dependencies without translating them into different
representations.
I also looked at moving all of BaseDeps in agent/consul, however that
created some circular imports. Resolving those cycles wouldn't be too
bad (it was only an error in agent/consul being imported from
cache-types), however this change seems a little better by starting to
introduce some structure to BaseDeps.
This change is also a small step in reducing the scope of Agent.
Also remove some constants that were only used by tests, and move the
relevant comment to where the live configuration is set.
Removed some validation from NewServer and NewClient, as these are not
really runtime errors. They would be code errors, which will cause a
panic anyway, so no reason to handle them specially here.
secondaryIntermediateCertRenewalWatch was using `retryLoopBackoff` to
renew the intermediate certificate. Once it entered the inner loop and
started `retryLoopBackoff` it would never leave that.
`retryLoopBackoffAbortOnSuccess` will return when renewing is
successful, like it was intended originally.
The nodeCheck slice was being used as the first arg in append, which in some cases will modify the array backing the slice. This would lead to service checks for other services in the wrong event.
Also refactor some things to reduce the arguments to functions.
Creating a new readTxn does not work because it will not see the newly created objects that are about to be committed. Instead use the active write Txn.
Whenever an upsert/deletion of a config entry happens, within the open
state store transaction we speculatively test compile all discovery
chains that may be affected by the pending modification to verify that
the write would not create an erroneous scenario (such as splitting
traffic to a subset that did not exist).
If a single discovery chain evaluation references two config entries
with the same kind and name in different namespaces then sometimes the
upsert/deletion would be falsely rejected. It does not appear as though
this bug would've let invalid writes through to the state store so the
correction does not require a cleanup phase.
This commit refactors the state store usage code to track unique service
name changes on transaction commit. This means we only need to lookup
usage entries when reading the information, as opposed to iterating over
a large number of service indices.
- Take into account a service instance's name being changed
- Do not iterate through entire list of service instances, we only care
about whether there is 0, 1, or more than 1.
We add a WriteTxn interface for use in updating the usage memdb table,
with the forward-looking prospect of incrementally converting other
functions to accept interfaces.
As well, we use the ReadTxn in new usage code, and as a side effect
convert a couple of existing functions to use that interface as well.
Using the newly provided state store methods, we periodically emit usage
metrics from the servers.
We decided to emit these metrics from all servers, not just the leader,
because that means we do not have to care about leader election flapping
causing metrics turbulence, and it seems reasonable for each server to
emit its own view of the state, even if they should always converge
rapidly.
During gossip encryption key rotation it would be nice to be able to see if all nodes are using the same key. This PR adds another field to the json response from `GET v1/operator/keyring` which lists the primary keys in use per dc. That way an operator can tell when a key was successfully setup as primary key.
Based on https://github.com/hashicorp/serf/pull/611 to add primary key to list keyring output:
```json
[
{
"WAN": true,
"Datacenter": "dc2",
"Segment": "",
"Keys": {
"0OuM4oC3Os18OblWiBbZUaHA7Hk+tNs/6nhNYtaNduM=": 6,
"SINm887hKTzmMWeBNKTJReaTLX3mBEJKriDyt88Ad+g=": 6
},
"PrimaryKeys": {
"SINm887hKTzmMWeBNKTJReaTLX3mBEJKriDyt88Ad+g=": 6
},
"NumNodes": 6
},
{
"WAN": false,
"Datacenter": "dc2",
"Segment": "",
"Keys": {
"0OuM4oC3Os18OblWiBbZUaHA7Hk+tNs/6nhNYtaNduM=": 8,
"SINm887hKTzmMWeBNKTJReaTLX3mBEJKriDyt88Ad+g=": 8
},
"PrimaryKeys": {
"SINm887hKTzmMWeBNKTJReaTLX3mBEJKriDyt88Ad+g=": 8
},
"NumNodes": 8
},
{
"WAN": false,
"Datacenter": "dc1",
"Segment": "",
"Keys": {
"0OuM4oC3Os18OblWiBbZUaHA7Hk+tNs/6nhNYtaNduM=": 3,
"SINm887hKTzmMWeBNKTJReaTLX3mBEJKriDyt88Ad+g=": 8
},
"PrimaryKeys": {
"SINm887hKTzmMWeBNKTJReaTLX3mBEJKriDyt88Ad+g=": 8
},
"NumNodes": 8
}
]
```
I intentionally did not change the CLI output because I didn't find a good way of displaying this information. There are a couple of options that we could implement later:
* add a flag to show the primary keys
* add a flag to show json output
Fixes#3393.
Fixes#8466
Since Consul 1.8.0 there was a bug in how ingress gateway protocol
compatibility was enforced. At the point in time that an ingress-gateway
config entry was modified the discovery chain for each upstream was
checked to ensure the ingress gateway protocol matched. Unfortunately
future modifications of other config entries were not validated against
existing ingress-gateway definitions, such as:
1. create tcp ingress-gateway pointing to 'api' (ok)
2. create service-defaults for 'api' setting protocol=http (worked, but not ok)
3. create service-splitter or service-router for 'api' (worked, but caused an agent panic)
If you were to do these in a different order, it would fail without a
crash:
1. create service-defaults for 'api' setting protocol=http (ok)
2. create service-splitter or service-router for 'api' (ok)
3. create tcp ingress-gateway pointing to 'api' (fail with message about
protocol mismatch)
This PR introduces the missing validation. The two new behaviors are:
1. create tcp ingress-gateway pointing to 'api' (ok)
2. (NEW) create service-defaults for 'api' setting protocol=http ("ok" for back compat)
3. (NEW) create service-splitter or service-router for 'api' (fail with
message about protocol mismatch)
In consideration for any existing users that may be inadvertently be
falling into item (2) above, that is now officiall a valid configuration
to be in. For anyone falling into item (3) above while you cannot use
the API to manufacture that scenario anymore, anyone that has old (now
bad) data will still be able to have the agent use them just enough to
generate a new agent/proxycfg error message rather than a panic.
Unfortunately we just don't have enough information to properly fix the
config entries.
* changes some functions to return data instead of modifying pointer
arguments
* renames globalRPC() to keyringRPCs() to make its purpose more clear
* restructures KeyringOperation() to make it more understandable
NotifyShutdown was only used for testing. Now that t.Cleanup exists, we
can use that instead of attaching cleanup to the Server shutdown.
The Autopilot test which used NotifyShutdown doesn't need this
notification because Shutdown is synchronous. Waiting for the function
to return is equivalent.
Ensure that enabling AutoConfig sets the tls configurator properly
This also refactors the TLS configurator a bit so the naming doesn’t imply only AutoEncrypt as the source of the automatically setup TLS cert info.