Peerings are terminated when a peer decides to delete the peering from
their end. Deleting a peering sends a termination message to the peer
and triggers them to mark the peering as terminated but does NOT delete
the peering itself. This is to prevent peerings from disappearing from
both sides just because one side deleted them.
Previously the Delete endpoint was skipping the deletion if the peering
was not marked as active. However, terminated peerings are also
inactive.
This PR makes some updates so that peerings marked as terminated can be
deleted by users.
We need to watch for changes to peerings and update the server addresses which get served by the ring buffer.
Also, if there is an active connection for a peer, we are getting up-to-date server addresses from the replication stream and can safely ignore the token's addresses which may be stale.
Contains 2 changes to the GetEnvoyBootstrapParams response to support
consul-dataplane.
Exposing node_name and node_id:
consul-dataplane will support providing either the node_id or node_name in its
configuration. Unfortunately, supporting both in the xDS meta adds a fair amount
of complexity (partly because most tables are currently indexed on node_name)
so for now we're going to return them both from the bootstrap params endpoint,
allowing consul-dataplane to exchange a node_id for a node_name (which it will
supply in the xDS meta).
Properly setting service for gateways:
To avoid the need to special case gateways in consul-dataplane, service will now
either be the destination service name for connect proxies, or the gateway
service name. This means it can be used as-is in Envoy configuration (i.e. as a
cluster name or in metric tags).
Consul 1.13.0 changed ServiceVirtualIP to use PeeredServiceName instead of ServiceName which was a breaking change for those using service mesh and wanted to restore their snapshot after upgrading to 1.13.0.
This commit handles existing data with older ServiceName and converts it during restore so that there are no issues when restoring from older snapshots.
1. Create a bexpr filter for performing the filtering
2. Change the state store functions to return the raw (not aggregated)
list of ServiceNodes.
3. Move the aggregate service tags by name logic out of the state store
functions into a new function called from the RPC endpoint
4. Perform the filtering in the endpoint before aggregation.
Previously establishment and pending secrets were only checked at the
RPC layer. However, given that these are Check-and-Set transactions we
should ensure that the given secrets are still valid when persisting a
secret exchange or promotion.
Otherwise it would be possible for concurrent requests to overwrite each
other.
Previously there was a field indicating the operation that triggered a
secrets write. Now there is a message for each operation and it contains
the secret ID being persisted.
Previously the updates to the peering secrets UUID table relied on
inferring what action triggered the update based on a reconciliation
against the existing secrets.
Instead we now explicitly require the operation to be given so that the
inference isn't necessary. This makes the UUID table logic easier to
reason about and fixes some related bugs.
There is also an update so that the peering secrets get handled on
snapshots/restores.
Dialers do not keep track of peering secret UUIDs, so they should not
attempt to clean up data from that table when their peering is deleted.
We also now keep peer server addresses when marking peerings for
deletion. Peer server addresses are used by the ShouldDial() helper
when determining whether the peering is for a dialer or an acceptor.
We need to keep this data so that peering secrets can be cleaned up
accordingly.
* Avoid logging StreamSecretID
* Wrap additional errors in stream handler
* Fix flakiness in leader test and rename servers for clarity. There was
a race condition where the peering was being deleted in the test
before the stream was active. Now the test waits for the stream to be
connected on both sides before deleting the associated peering.
* Run flaky test serially
* defaulting to false because peering will be released as beta
* Ignore peering disabled error in bundles cachetype
Co-authored-by: Matt Keeler <mkeeler@users.noreply.github.com>
Co-authored-by: freddygv <freddy@hashicorp.com>
Co-authored-by: Matt Keeler <mjkeeler7@gmail.com>
When we receive a FailedPrecondition error, retry that more quickly
because we expect it will resolve shortly. This is particularly
important in the context of Consul servers behind a load balancer
because when establishing a connection we have to retry until we
randomly land on a leader node.
The default retry backoff goes from 2s, 4s, 8s, etc. which can result in
very long delays quite quickly. Instead, this backoff retries in 8ms
five times, then goes exponentially from there: 16ms, 32ms, ... up to a
max of 8152ms.
There were 16 combinations of tests but 4 of them were duplicates since the default key type and bits were "ec" and 256. That entry was commented out to reduce the subtest count to 12.
testrpc.WaitForLeader was failing on arm64 environments; the cause is unknown but it might be due to the environment being flooded with parallel tests making RPC calls. The RPC polling+retry was replaced with a simpler check for leadership based on raft.
- when register service using catalog endpoint, the key of service
name actually should be "service". Add this information to the
error message will help user to quickly fix in the request.
This is the OSS portion of enterprise PR 2352.
It adds a server-local implementation of the proxycfg.PeeredUpstreams interface
based on a blocking query against the server's state store.
It also fixes an omission in the Virtual IP freeing logic where we were never
updating the max index (and therefore blocking queries against
VirtualIPsForAllImportedServices would not return on service deletion).
* peerstream: dialer should reconnect when stream closes
If the stream is closed unexpectedly (i.e. when we haven't received
a terminated message), the dialer should attempt to re-establish the
stream.
Previously, the `HandleStream` would return `nil` when the stream
was closed. The caller then assumed the stream was terminated on purpose
and so didn't reconnect when instead it was stopped unexpectedly and
the dialer should have attempted to reconnect.
Ensure that the peer stream replication rpc can successfully be used with TLS activated.
Also:
- If key material is configured for the gRPC port but HTTPS is not
enabled now TLS will still be activated for the gRPC port.
- peerstream replication stream opened by the establishing-side will now
ignore grpc.WithBlock so that TLS errors will bubble up instead of
being awkwardly delayed or suppressed
This is the OSS portion of enterprise PR 2242.
This PR introduces a server-local implementation of the proxycfg.ServiceList
interface, backed by streaming events and a local materializer.
Previously, public referred to gRPC services that are both exposed on
the dedicated gRPC port and have their definitions in the proto-public
directory (so were considered usable by 3rd parties). Whereas private
referred to services on the multiplexed server port that are only usable
by agents and other servers.
Now, we're splitting these definitions, such that external/internal
refers to the port and public/private refers to whether they can be used
by 3rd parties.
This is necessary because the peering replication API needs to be
exposed on the dedicated port, but is not (yet) suitable for use by 3rd
parties.
- Use some protobuf construction helper methods for brevity.
- Rename a local variable to avoid later shadowing.
- Rename the Nonce field to be more like xDS's naming.
- Be more explicit about which PeerID fields are empty.
For L4/tcp exported services the mesh gateways will not be terminating
TLS. A caller in one peer will be directly establishing TLS connections
to the ultimate exported service in the other peer.
The caller will be doing SAN validation using the replicated SpiffeID
values shipped from the exporting side. There are a class of discovery
chain edits that could be done on the exporting side that would cause
the introduction of a new SpiffeID value. In between the time of the
config entry update on the exporting side and the importing side getting
updated peer stream data requests to the exported service would fail due
to SAN validation errors.
This is unacceptable so instead prohibit the exporting peer from making
changes that would break peering in this way.
Because peerings are pairwise, between two tuples of (datacenter,
partition) having any exported reference via a discovery chain that
crosses out of the peered datacenter or partition will ultimately not be
able to work for various reasons. The biggest one is that there is no
way in the ultimate destination to configure an intention that can allow
an external SpiffeID to access a service.
This PR ensures that a user simply cannot do this, so they won't run
into weird situations like this.
Peer replication is intended to be between separate Consul installs and
effectively should be considered "external". This PR moves the peer
stream replication bidirectional RPC endpoint to the external gRPC
server and ensures that things continue to function.
Currently servers exchange information about their WAN serf port
and RPC port with serf tags, so that they all learn of each other's
addressing information. We intend to make larger use of the new
public-facing gRPC port exposed on all of the servers, so this PR
addresses that by passing around the gRPC port via serf tags and
then ensuring the generated consul service in the catalog has
metadata about that new port as well for ease of non-serf-based lookup.
These changes are primarily for Consul's UI, where we want to be more
specific about the state a peering is in.
- The "initial" state was renamed to pending, and no longer applies to
peerings being established from a peering token.
- Upon request to establish a peering from a peering token, peerings
will be set as "establishing". This will help distinguish between the
two roles: the cluster that generates the peering token and the
cluster that establishes the peering.
- When marked for deletion, peering state will be set to "deleting".
This way the UI determines the deletion via the state rather than the
"DeletedAt" field.
Co-authored-by: freddygv <freddy@hashicorp.com>
This is the OSS portion of enterprise PR 2157.
It builds on the local blocking query work in #13438 to implement the
proxycfg.IntentionUpstreams interface using server-local data.
Also moves the ACL filtering logic from agent/consul into the acl/filter
package so that it can be reused here.
This is the OSS portion of enterprise PR 2141.
This commit provides a server-local implementation of the `proxycfg.Intentions`
interface that sources data from streaming events.
It adds events for the `service-intentions` config entry type, and then consumes
event streams (via materialized views) for the service's explicit intentions and
any applicable wildcard intentions, merging them into a single list of intentions.
An alternative approach I considered was to consume _all_ intention events (via
`SubjectWildcard`) and filter out the irrelevant ones. This would admittedly
remove some complexity in the `agent/proxycfg-glue` package but at the expense
of considerable overhead from waking potentially many thousands of connect
proxies every time any intention is updated.
This is the OSS portion of enterprise PR 2056.
This commit provides server-local implementations of the proxycfg.ConfigEntry
and proxycfg.ConfigEntryList interfaces, that source data from streaming events.
It makes use of the LocalMaterializer type introduced for peering replication,
adding the necessary support for authorization.
It also adds support for "wildcard" subscriptions (within a topic) to the event
publisher, as this is needed to fetch service-resolvers for all services when
configuring mesh gateways.
Currently, events will be emitted for just the ingress-gateway, service-resolver,
and mesh config entry types, as these are the only entries required by proxycfg
— the events will be emitted on topics named IngressGateway, ServiceResolver,
and MeshConfig topics respectively.
Though these events will only be consumed "locally" for now, they can also be
consumed via the gRPC endpoint (confirmed using grpcurl) so using them from
client agents should be a case of swapping the LocalMaterializer for an
RPCMaterializer.
For initial cluster peering TProxy support we consider all imported services of a partition to be potential upstreams.
We leverage the VirtualIP table because it stores plain service names (e.g. "api", not "api-sidecar-proxy").
When traversing an exported peered service, the discovery chain
evaluation at the other side may re-route the request to a variety of
endpoints. Furthermore we intend to terminate mTLS at the mesh gateway
for arriving peered traffic that is http-like (L7), so the caller needs
to know the mesh gateway's SpiffeID in that case as well.
The following new SpiffeID values will be shipped back in the peerstream
replication:
- tcp: all possible SpiffeIDs resulting from the service-resolver
component of the exported discovery chain
- http-like: the SpiffeID of the mesh gateway
Adds fine-grained node.[node] entries to the index table, allowing blocking queries to return fine-grained indexes that prevent them from returning immediately when unrelated nodes/services are updated.
Co-authored-by: kisunji <ckim@hashicorp.com>
We have many indexer functions in Consul which take interface{} and type assert before building the index. We can use generics to get rid of the initial plumbing and pass around functions with better defined signatures. This has two benefits: 1) Less verbosity; 2) Developers can parse the argument types to memdb schemas without having to introspect the function for the type assertion.
The name of the metric as registered with the metrics library to provide
the help string, was incorrect compared with the actual code that sets
the metric value - bring them into sync.
Also, the help message was incorrect. Rather than copy the help message
from telemetry.mdx, which was correct, but felt a bit unnatural in the
way it was worded, update both of them to a new wording.
Having this type live in the agent/consul package makes it difficult to
put anything that relies on token resolution (e.g. the new gRPC services)
in separate packages without introducing import cycles.
For example, if package foo imports agent/consul for the ACLResolveResult
type it means that agent/consul cannot import foo to register its service.
We've previously worked around this by wrapping the ACLResolver to
"downgrade" its return type to an acl.Authorizer - aside from the
added complexity, this also loses the resolved identity information.
In the future, we may want to move the whole ACLResolver into the
acl/resolver package. For now, putting the result type there at least,
fixes the immediate import cycle issues.
This is only configured in xDS when a service with an L7 protocol is
exported.
They also load any relevant trust bundles for the peered services to
eventually use for L7 SPIFFE validation during mTLS termination.
Adds the merge-central-config query param option to the /catalog/node-services/:node-name API,
to get a service definition in the response that is merged with central defaults (proxy-defaults/service-defaults).
Updated the consul connect envoy command to use this option when
retrieving the proxy service details so as to render the bootstrap configuration correctly.
When our peer deletes the peering it is locally marked as terminated.
This termination should kick off deleting all imported data, but should
not delete the peering object itself.
Keeping peerings marked as terminated acts as a signal that the action
took place.
Once a peering is marked for deletion a new leader routine will now
clean up all imported resources and then the peering itself.
A lot of the logic was grabbed from the namespace/partitions deferred
deletions but with a handful of simplifications:
- The rate limiting is not configurable.
- Deleting imported nodes/services/checks is done by deleting nodes with
the Txn API. The services and checks are deleted as a side-effect.
- There is no "round rate limiter" like with namespaces and partitions.
This is because peerings are purely local, and deleting a peering in
the datacenter does not depend on deleting data from other DCs like
with WAN-federated namespaces. All rate limiting is handled by the
Raft rate limiter.
1. Fix a bug where the peering leader routine would not track all active
peerings in the "stored" reconciliation map. This could lead to
tearing down streams where the token was generated, since the
ConnectedStreams() method used for reconciliation returns all streams
and not just the ones initiated by this leader routine.
2. Fix a race where stream contexts were being canceled before
termination messages were being processed by a peer.
Previously the leader routine would tear down streams by canceling
their context right after the termination message was sent. This
context cancelation could be propagated to the server side faster
than the termination message. Now there is a change where the
dialing peer uses CloseSend() to signal when no more messages will
be sent. Eventually the server peer will read an EOF after receiving
and processing the preceding termination message.
Using CloseSend() is actually not enough to address the issue
mentioned, since it doesn't wait for the server peer to finish
processing messages. Because of this now the dialing peer also reads
from the stream until an error signals that there are no more
messages. Receiving an EOF from our peer indicates that they
processed the termination message and have no additional work to do.
Given that the stream is being closed, all the messages received by
Recv are discarded. We only check for errors to avoid importing new
data.
When deleting a peering we do not want to delete the peering and all
imported data in a single operation, since deleting a large amount of
data at once could overload Consul.
Instead we defer deletion of peerings so that:
1. When a peering deletion request is received via gRPC the peering is
marked for deletion by setting the DeletedAt field.
2. A leader routine will monitor for peerings that are marked for
deletion and kick off a throttled deletion of all imported resources
before deleting the peering itself.
This commit mostly addresses point #1 by modifying the peering service
to mark peerings for deletion. Another key change is to add a
PeeringListDeleted state store function which can return all peerings
marked for deletion. This function is what will be watched by the
deferred deletion leader routine.
Previously, imported data would never be deleted. As
nodes/services/checks were registered and deregistered, resources
deleted from the exporting cluster would accumulate in the imported
cluster.
This commit makes updates to replication so that whenever an update is
received for a service name we reconcile what was present in the catalog
against what was received.
This handleUpdateService method can handle both updates and deletions.
* when enterprise meta are wildcard assume it's a service intention
* fix partition and namespace
* move kind outside the loops
* get the kind check outside the loop and add a comment
Co-authored-by: github-team-consul-core <github-team-consul-core@hashicorp.com>
* update gateway-services table with endpoints
* fix failing test
* remove unneeded config in test
* rename "endpoint" to "destination"
* more endpoint renaming to destination in tests
* update isDestination based on service-defaults config entry creation
* use a 3 state kind to be able to set the kind to unknown (when neither a service or a destination exist)
* set unknown state to empty to avoid modifying alot of tests
* fix logic to set the kind correctly on CRUD
* fix failing tests
* add missing tests and fix service delete
* fix failing test
* Apply suggestions from code review
Co-authored-by: Dan Stough <dan.stough@hashicorp.com>
* fix a bug with kind and add relevant test
* fix compile error
* fix failing tests
* add kind to clone
* fix failing tests
* fix failing tests in catalog endpoint
* fix service dump test
* Apply suggestions from code review
Co-authored-by: Dan Stough <dan.stough@hashicorp.com>
* remove duplicate tests
* first draft of destinations intention in connect proxy
* remove ServiceDestinationList
* fix failing tests
* fix agent/consul failing tests
* change to filter intentions in the state store instead of adding a field.
* fix failing tests
* fix comment
* fix comments
* store service kind destination and add relevant tests
* changes based on review
* filter on destinations when querying source match
* change state store API to get an IntentionTarget parameter
* add intentions tests
* add destination upstream endpoint
* fix failing test
* fix failing test and a bug with wildcard intentions
* fix failing test
* Apply suggestions from code review
Co-authored-by: alex <8968914+acpana@users.noreply.github.com>
* add missing test and clarify doc
* fix style
* gofmt intention.go
* fix merge introduced issue
Co-authored-by: Dan Stough <dan.stough@hashicorp.com>
Co-authored-by: alex <8968914+acpana@users.noreply.github.com>
Co-authored-by: github-team-consul-core <github-team-consul-core@hashicorp.com>
* update gateway-services table with endpoints
* fix failing test
* remove unneeded config in test
* rename "endpoint" to "destination"
* more endpoint renaming to destination in tests
* update isDestination based on service-defaults config entry creation
* use a 3 state kind to be able to set the kind to unknown (when neither a service or a destination exist)
* set unknown state to empty to avoid modifying alot of tests
* fix logic to set the kind correctly on CRUD
* fix failing tests
* add missing tests and fix service delete
* fix failing test
* Apply suggestions from code review
Co-authored-by: Dan Stough <dan.stough@hashicorp.com>
* fix a bug with kind and add relevant test
* fix compile error
* fix failing tests
* add kind to clone
* fix failing tests
* fix failing tests in catalog endpoint
* fix service dump test
* Apply suggestions from code review
Co-authored-by: Dan Stough <dan.stough@hashicorp.com>
* remove duplicate tests
* first draft of destinations intention in connect proxy
* remove ServiceDestinationList
* fix failing tests
* fix agent/consul failing tests
* change to filter intentions in the state store instead of adding a field.
* fix failing tests
* fix comment
* fix comments
* store service kind destination and add relevant tests
* changes based on review
* filter on destinations when querying source match
* Apply suggestions from code review
Co-authored-by: alex <8968914+acpana@users.noreply.github.com>
* fix style
* Apply suggestions from code review
Co-authored-by: Dan Stough <dan.stough@hashicorp.com>
* rename destinationType to targetType.
Co-authored-by: Dan Stough <dan.stough@hashicorp.com>
Co-authored-by: alex <8968914+acpana@users.noreply.github.com>
Co-authored-by: github-team-consul-core <github-team-consul-core@hashicorp.com>
Mesh gateways will now enable tcp connections with SNI names including peering information so that those connections may be proxied.
Note: this does not change the callers to use these mesh gateways.
There are a handful of changes in this commit:
* When querying trust bundles for a service we need to be able to
specify the namespace of the service.
* The endpoint needs to track the index because the cache watches use
it.
* Extracted bulk of the endpoint's logic to a state store function
so that index tracking could be tested more easily.
* Removed check for service existence, deferring that sort of work to ACL authz
* Added the cache type
Given that the exported-services config entry can use wildcards, the
precedence for wildcards is handled as with intentions. The most exact
match is the match that applies for any given service. We do not take
the union of all that apply.
Another update that was made was to reflect that only one
exported-services config entry applies to any given service in a
partition. This is a pre-existing constraint that gets enforced by
the Normalize() method on that config entry type.
For mTLS to work between two proxies in peered clusters with different root CAs,
proxies need to configure their outbound listener to use different root certificates
for validation.
Up until peering was introduced proxies would only ever use one set of root certificates
to validate all mesh traffic, both inbound and outbound. Now an upstream proxy
may have a leaf certificate signed by a CA that's different from the dialing proxy's.
This PR makes changes to proxycfg and xds so that the upstream TLS validation
uses different root certificates depending on which cluster is being dialed.
* update gateway-services table with endpoints
* fix failing test
* remove unneeded config in test
* rename "endpoint" to "destination"
* more endpoint renaming to destination in tests
* update isDestination based on service-defaults config entry creation
* use a 3 state kind to be able to set the kind to unknown (when neither a service or a destination exist)
* set unknown state to empty to avoid modifying alot of tests
* fix logic to set the kind correctly on CRUD
* fix failing tests
* add missing tests and fix service delete
* fix failing test
* Apply suggestions from code review
Co-authored-by: Dan Stough <dan.stough@hashicorp.com>
* fix a bug with kind and add relevant test
* fix compile error
* fix failing tests
* add kind to clone
* fix failing tests
* fix failing tests in catalog endpoint
* fix service dump test
* Apply suggestions from code review
Co-authored-by: Dan Stough <dan.stough@hashicorp.com>
* remove duplicate tests
* rename consts and fix kind when no destination is defined in the service-defaults.
* rename Kind to ServiceKind and change switch to use .(type)
Co-authored-by: Dan Stough <dan.stough@hashicorp.com>
OSS port of enterprise PR 1822
Includes the necessary changes to the `proxycfg` and `xds` packages to enable
Consul servers to configure arbitrary proxies using catalog data.
Broadly, `proxycfg.Manager` now has public methods for registering,
deregistering, and listing registered proxies — the existing local agent
state-sync behavior has been moved into a separate component that makes use of
these methods.
When an xDS session is started for a proxy service in the catalog, a goroutine
will be spawned to watch the service in the server's state store and
re-register it with the `proxycfg.Manager` whenever it is updated (and clean
it up when the client goes away).
Signed-off-by: acpana <8968914+acpana@users.noreply.github.com>
Co-authored-by: Chris S. Kim <ckim@hashicorp.com>
Co-authored-by: Freddy <freddygv@users.noreply.github.com>
Adds a new query param merge-central-config for use with the below endpoints:
/catalog/service/:service
/catalog/connect/:service
/health/service/:service
/health/connect/:service
If set on the request, the response will include a fully resolved service definition which is merged with the proxy-defaults/global and service-defaults/:service config entries (on-demand style). This is useful to view the full service definition for a mesh service (connect-proxy kind or gateway kind) which might not be merged before being written into the catalog (example: in case of services in the agentless model).
The importing peer will need to know what SNI and SPIFFE name
corresponds to each exported service. Additionally it will need to know
at a high level the protocol in use (L4/L7) to generate the appropriate
connection pool and local metrics.
For replicated connect synthetic entities we edit the `Connect{}` part
of a `NodeService` to have a new section:
{
"PeerMeta": {
"SNI": [
"web.default.default.owt.external.183150d5-1033-3672-c426-c29205a576b8.consul"
],
"SpiffeID": [
"spiffe://183150d5-1033-3672-c426-c29205a576b8.consul/ns/default/dc/dc1/svc/web"
],
"Protocol": "tcp"
}
}
This data is then replicated and saved as-is at the importing side. Both
SNI and SpiffeID are slices for now until I can be sure we don't need
them for how mesh gateways will ultimately work.
Occasionally we had seen the TestWatchServers_ACLToken_PermissionDenied be flagged as flaky in circleci. This change should fix that.
Why it fixes it is complicated. The test was failing with a panic when a mocked ACL Resolver was being called more times than expected. I struggled for a while to determine how that could be. This test should call authorize once and only once and the error returned should cause the stream to be terminated and the error returned to the gRPC client. Another oddity was no amount of running this test locally seemed to be able to reproduce the issue. I ran the test hundreds of thousands of time and it always passed.
It turns out that there is nothing wrong with the test. It just so happens that the panic from unexpected invocation of a mocked call happened during the test but was caused by a previous test (specifically the TestWatchServers_StreamLifecycle test)
The stream from the previous test remained open after all the test Cleanup functions were run and it just so happened that when the EventPublisher eventually picked up that the context was cancelled during cleanup, it force closes all subscriptions which causes some loops to be re-entered and the streams to be reauthorized. Its that looping in response to forced subscription closures that causes the mock to eventually panic. All the components, publisher, server, client all operate based on contexts. We cancel all those contexts but there is no syncrhonous way to know when they are stopped.
We could have implemented a syncrhonous stop but in the context of an actual running Consul, context cancellation + async stopping is perfectly fine. What we (Dan and I) eventually thought was that the behavior of grpc streams such as this when a server was shutting down wasn’t super helpful. What we would want is for a client to be able to distinguish between subscription closed because something may have changed requiring re-authentication and subscription closed because the server is shutting down. That way we can send back appropriate error messages to detail that the server is shutting down and not confuse users with potentially needing to resubscribe.
So thats what this PR does. We have introduced a shutting down state to our event subscriptions and the various streaming gRPC services that rely on the event publisher will all just behave correctly and actually stop the stream (not attempt transparent reauthorization) if this particular error is the one we get from the stream. Additionally the error that gets transmitted back through gRPC when this does occur indicates to the consumer that the server is going away. That is more helpful so that a client can then attempt to reconnect to another server.
Treat each exported service as a "discovery chain" and replicate one
synthetic CheckServiceNode for each chain and remote mesh gateway.
The health will be a flattened generated check of the checks for that
mesh gateway node.
Introduces two new public gRPC endpoints (`Login` and `Logout`) and
includes refactoring of the equivalent net/rpc endpoints to enable the
majority of logic to be reused (i.e. by extracting the `Binder` and
`TokenWriter` types).
This contains the OSS portions of the following enterprise commits:
- 75fcdbfcfa6af21d7128cb2544829ead0b1df603
- bce14b714151af74a7f0110843d640204082630a
- cc508b70fbf58eda144d9af3d71bd0f483985893
The primary bug here is in the streaming subsystem that makes the overall v1/health/service/:service request behave incorrectly when servicing a blocking request with a filter provided.
There is a secondary non-streaming bug being fixed here that is much less obvious related to when to update the `reply` variable in a `blockingQuery` evaluation. It is unlikely that it is triggerable in practical environments and I could not actually get the bug to manifest, but I fixed it anyway while investigating the original issue.
Simple reproduction (streaming):
1. Register a service with a tag.
curl -sL --request PUT 'http://localhost:8500/v1/agent/service/register' \
--header 'Content-Type: application/json' \
--data-raw '{ "ID": "ID1", "Name": "test", "Tags":[ "a" ], "EnableTagOverride": true }'
2. Do an initial filter query that matches on the tag.
curl -sLi --get 'http://localhost:8500/v1/health/service/test' --data-urlencode 'filter=a in Service.Tags'
3. Note you get one result. Use the `X-Consul-Index` header to establish
a blocking query in another terminal, this should not return yet.
curl -sLi --get 'http://localhost:8500/v1/health/service/test?index=$INDEX' --data-urlencode 'filter=a in Service.Tags'
4. Re-register that service with a different tag.
curl -sL --request PUT 'http://localhost:8500/v1/agent/service/register' \
--header 'Content-Type: application/json' \
--data-raw '{ "ID": "ID1", "Name": "test", "Tags":[ "b" ], "EnableTagOverride": true }'
5. Your blocking query from (3) should return with a header
`X-Consul-Query-Backend: streaming` and empty results if it works
correctly `[]`.
Attempts to reproduce with non-streaming failed (where you add `&near=_agent` to the read queries and ensure `X-Consul-Query-Backend: blocking-query` shows up in the results).
* update raft to v1.3.7
* add changelog
* fix compilation error
* fix HeartbeatTimeout
* fix ElectionTimeout to reload only if value is valid
* fix default values for `ElectionTimeout` and `HeartbeatTimeout`
* fix test defaults
* bump raft to v1.3.8
Adds a timeout (deadline) to client RPC calls, so that streams will no longer hang indefinitely in unstable network conditions.
Co-authored-by: kisunji <ckim@hashicorp.com>
* Implement the ServerDiscovery.WatchServers gRPC endpoint
* Fix the ConnectCA.Sign gRPC endpoints metadata forwarding.
* Unify public gRPC endpoints around the public.TraceID function for request_id logging
Adds a new gRPC endpoint to get envoy bootstrap params. The new consul-dataplane service will use this
endpoint to generate an envoy bootstrap configuration.
Whenever autopilot updates its state it notifies Consul. That notification will then trigger Consul to extract out the ready server information. If the ready servers have changed, then an event will be published to notify any subscribers of the full set of ready servers.
All these ready server event things are contained within an autopilotevents package instead of the consul package to make importing them into the grpc related packages possible
Introduces a gRPC endpoint for signing Connect leaf certificates. It's also
the first of the public gRPC endpoints to perform leader-forwarding, so
establishes the pattern of forwarding over the multiplexed internal RPC port.
Previously we had 1 EventPublisher per state.Store. When a state store was closed/abandoned such as during a consul snapshot restore, this had the behavior of force closing subscriptions for that topic and evicting event snapshots from the cache.
The intention of this commit is to keep all that behavior. To that end, the shared EventPublisher now supports the ability to refresh a topic. That will perform the force close + eviction. The FSM upon abandoning the previous state.Store will call RefreshTopic for all the topics with events generated by the state.Store.
Just like standard upstreams the order of applicability in descending precedence:
1. caller's `service-defaults` upstream override for destination
2. caller's `service-defaults` upstream defaults
3. destination's `service-resolver` ConnectTimeout
4. system default of 5s
Co-authored-by: mrspanishviking <kcardenas@hashicorp.com>
* Fixes a lint warning about t.Errorf not supporting %w
* Enable running autopilot on all servers
On the non-leader servers all they do is update the state and do not attempt any modifications.
* Fix the RPC conn limiting tests
Technically they were relying on racey behavior before. Now they should be reliable.
Adds a new gRPC service and endpoint to return the list of supported
consul dataplane features. The Consul Dataplane will use this API to
customize its interaction with that particular server.
Adds a new gRPC streaming endpoint (WatchRoots) that dataplane clients will
use to fetch the current list of active Connect CA roots and receive new
lists whenever the roots are rotated.
This adds an aws-iam auth method type which supports authenticating to Consul using AWS IAM identities.
Co-authored-by: R.B. Boyer <4903+rboyer@users.noreply.github.com>
* tlsutil: initial implementation of types/TLSVersion
tlsutil: add test for parsing deprecated agent TLS version strings
tlsutil: return TLSVersionInvalid with error
tlsutil: start moving tlsutil cipher suite lookups over to types/tls
tlsutil: rename tlsLookup to ParseTLSVersion, add cipherSuiteLookup
agent: attempt to use types in runtime config
agent: implement b.tlsVersion validation in config builder
agent: fix tlsVersion nil check in builder
tlsutil: update to renamed ParseTLSVersion and goTLSVersions
tlsutil: fixup TestConfigurator_CommonTLSConfigTLSMinVersion
tlsutil: disable invalid config parsing tests
tlsutil: update tests
auto_config: lookup old config strings from base.TLSMinVersion
auto_config: update endpoint tests to use TLS types
agent: update runtime_test to use TLS types
agent: update TestRuntimeCinfig_Sanitize.golden
agent: update config runtime tests to expect TLS types
* website: update Consul agent tls_min_version values
* agent: fixup TLS parsing and compilation errors
* test: fixup lint issues in agent/config_runtime_test and tlsutil/config_test
* tlsutil: add CHACHA20_POLY1305 cipher suites to goTLSCipherSuites
* test: revert autoconfig tls min version fixtures to old format
* types: add TLSVersions public function
* agent: add warning for deprecated TLS version strings
* agent: move agent config specific logic from tlsutil.ParseTLSVersion into agent config builder
* tlsutil(BREAKING): change default TLS min version to TLS 1.2
* agent: move ParseCiphers logic from tlsutil into agent config builder
* tlsutil: remove unused CipherString function
* agent: fixup import for types package
* Revert "tlsutil: remove unused CipherString function"
This reverts commit 6ca7f6f58d268e617501b7db9500113c13bae70c.
* agent: fixup config builder and runtime tests
* tlsutil: fixup one remaining ListenerConfig -> ProtocolConfig
* test: move TLS cipher suites parsing test from tlsutil into agent config builder tests
* agent: remove parseCiphers helper from auto_config_endpoint_test
* test: remove unused imports from tlsutil
* agent: remove resolved FIXME comment
* tlsutil: remove TODO and FIXME in cipher suite validation
* agent: prevent setting inherited cipher suite config when TLS 1.3 is specified
* changelog: add entry for converting agent config to TLS types
* agent: remove FIXME in runtime test, this is covered in builder tests with invalid tls9 value now
* tlsutil: remove config tests for values checked at agent config builder boundary
* tlsutil: remove tls version check from loadProtocolConfig
* tlsutil: remove tests and TODOs for logic checked in TestBuilder_tlsVersion and TestBuilder_tlsCipherSuites
* website: update search link for supported Consul agent cipher suites
* website: apply review suggestions for tls_min_version description
* website: attempt to clean up markdown list formatting for tls_min_version
* website: moar linebreaks to fix tls_min_version formatting
* Revert "website: moar linebreaks to fix tls_min_version formatting"
This reverts commit 38585927422f73ebf838a7663e566ac245f2a75c.
* autoconfig: translate old values for TLSMinVersion
* agent: rename var for translated value of deprecated TLS version value
* Update agent/config/deprecated.go
Co-authored-by: Dan Upton <daniel@floppy.co>
* agent: fix lint issue
* agent: fixup deprecated config test assertions for updated warning
Co-authored-by: Dan Upton <daniel@floppy.co>
Looks like something got munged at some point. Not sure how it slipped in, but my best guess is that because TestTxn_Apply_ACLDeny is marked flaky we didn't block merge because it failed.
Signed-off-by: Mark Anderson <manderson@hashicorp.com>
This extends the acl.AllowAuthorizer with source of authority information.
The next step is to unify the AllowAuthorizer and ACLResolveResult structures; that will be done in a separate PR.
Part of #12481
Signed-off-by: Mark Anderson <manderson@hashicorp.com>
Introduces the capability to configure TLS differently for Consul's
listeners/ports (i.e. HTTPS, gRPC, and the internal multiplexed RPC
port) which is useful in scenarios where you may want the HTTPS or
gRPC interfaces to present a certificate signed by a well-known/public
CA, rather than the certificate used for internal communication which
must have a SAN in the form `server.<dc>.consul`.
Currently the config_entry.go subsystem delegates authorization decisions via the ConfigEntry interface CanRead and CanWrite code. Unfortunately this returns a true/false value and loses the details of the source.
This is not helpful, especially since it the config subsystem can be more complex to understand, since it covers so many domains.
This refactors CanRead/CanWrite to return a structured error message (PermissionDenied or the like) with more details about the reason for denial.
Part of #12241
Signed-off-by: Mark Anderson <manderson@hashicorp.com>
* First pass for helper for bulk changes
Signed-off-by: Mark Anderson <manderson@hashicorp.com>
* Convert ACLRead and ACLWrite to new form
Signed-off-by: Mark Anderson <manderson@hashicorp.com>
* AgentRead and AgentWRite
Signed-off-by: Mark Anderson <manderson@hashicorp.com>
* Fix EventWrite
Signed-off-by: Mark Anderson <manderson@hashicorp.com>
* KeyRead, KeyWrite, KeyList
Signed-off-by: Mark Anderson <manderson@hashicorp.com>
* KeyRing
Signed-off-by: Mark Anderson <manderson@hashicorp.com>
* NodeRead NodeWrite
Signed-off-by: Mark Anderson <manderson@hashicorp.com>
* OperatorRead and OperatorWrite
Signed-off-by: Mark Anderson <manderson@hashicorp.com>
* PreparedQuery
Signed-off-by: Mark Anderson <manderson@hashicorp.com>
* Intention partial
Signed-off-by: Mark Anderson <manderson@hashicorp.com>
* Fix ServiceRead, Write ,etc
Signed-off-by: Mark Anderson <manderson@hashicorp.com>
* Error check ServiceRead?
Signed-off-by: Mark Anderson <manderson@hashicorp.com>
* Fix Sessionread/Write
Signed-off-by: Mark Anderson <manderson@hashicorp.com>
* Fixup snapshot ACL
Signed-off-by: Mark Anderson <manderson@hashicorp.com>
* Error fixups for txn
Signed-off-by: Mark Anderson <manderson@hashicorp.com>
* Add changelog
Signed-off-by: Mark Anderson <manderson@hashicorp.com>
* Fixup review comments
Signed-off-by: Mark Anderson <manderson@hashicorp.com>
Improves tests from #12362
These tests try to setup the following concurrent scenario:
1. (goroutine 1) execute read RPC with index=0
2. (goroutine 1) get response from (1) @ index=10
3. (goroutine 1) execute read RPC with index=10 and block
4. (goroutine 2) WHILE (3) is blocking, start slamming the system with stray writes that will cause the WatchSet to wakeup
5. (goroutine 2) after doing all writes, shut down the reader above
6. (goroutine 1) stops reading, double checks that it only ever woke up once (from 1)
Minor fix for behavior in #12362
IsDefault sometimes returns true even if there was a proxy-defaults or service-defaults config entry that was consulted. This PR fixes that.
before:
$ go test ./agent/consul -run TestLeader_ReapOrLeftMember_IgnoreSelf
ok github.com/hashicorp/consul/agent/consul 21.147s
after:
$ go test ./agent/consul -run TestLeader_ReapOrLeftMember_IgnoreSelf
ok github.com/hashicorp/consul/agent/consul 5.402s
Starting from and extending the mechanism introduced in #12110 we can specially handle the 3 main special Consul RPC endpoints that react to many config entries in a single blocking query in Connect:
- `DiscoveryChain.Get`
- `ConfigEntry.ResolveServiceConfig`
- `Intentions.Match`
All of these will internally watch for many config entries, and at least one of those will likely be not found in any given query. Because these are blends of multiple reads the exact solution from #12110 isn't perfectly aligned, but we can tweak the approach slightly and regain the utility of that mechanism.
### No Config Entries Found
In this case, despite looking for many config entries none may be found at all. Unlike #12110 in this scenario we do not return an empty reply to the caller, but instead synthesize a struct from default values to return. This can be handled nearly identically to #12110 with the first 1-2 replies being non-empty payloads followed by the standard spurious wakeup suppression mechanism from #12110.
### No Change Since Last Wakeup
Once a blocking query loop on the server has completed and slept at least once, there is a further optimization we can make here to detect if any of the config entries that were present at specific versions for the prior execution of the loop are identical for the loop we just woke up for. In that scenario we can return a slightly different internal sentinel error and basically externally handle it similar to #12110.
This would mean that even if 20 discovery chain read RPC handling goroutines wakeup due to the creation of an unrelated config entry, the only ones that will terminate and reply with a blob of data are those that genuinely have new data to report.
### Extra Endpoints
Since this pattern is pretty reusable, other key config-entry-adjacent endpoints used by `agent/proxycfg` also were updated:
- `ConfigEntry.List`
- `Internal.IntentionUpstreams` (tproxy)
Many places in consul already treated node names case insensitively.
The state store indexes already do it, but there are a few places that
did a direct byte comparison which have now been corrected.
One place of particular consideration is ensureCheckIfNodeMatches
which is executed during snapshot restore (among other places). If a
node check used a slightly different casing than the casing of the node
during register then the snapshot restore here would deterministically
fail. This has been fixed.
Primary approach:
git grep -i "node.*[!=]=.*node" -- ':!*_test.go' ':!docs'
git grep -i '\[[^]]*member[^]]*\]
git grep -i '\[[^]]*\(member\|name\|node\)[^]]*\]' -- ':!*_test.go' ':!website' ':!ui' ':!agent/proxycfg/testing.go:' ':!*.md'
There are some cross-config-entry relationships that are enforced during
"graph validation" at persistence time that are required to be
maintained. This means that config entries may form a digraph at times.
Config entry replication procedes in a particular sorted order by kind
and name.
Occasionally there are some fixups to these digraphs that end up
replicating in the wrong order and replicating the leaves
(ingress-gateway) before the roots (service-defaults) leading to
replication halting due to a graph validation error related to things
like mismatched service protocol requirements.
This PR changes replication to give each computed change (upsert/delete)
a fair shot at being applied before deciding to terminate that round of
replication in error. In the case where we've simply tried to do the
operations in the wrong order at least ONE of the outstanding requests
will complete in the right order, leading the subsequent round to have
fewer operations to do, with a smaller likelihood of graph validation
errors.
This does not address all scenarios, but for scenarios where the edits
are being applied in the wrong order this should avoid replication
halting.
Fixes#9319
The scenario that is NOT ADDRESSED by this PR is as follows:
1. create: service-defaults: name=new-web, protocol=http
2. create: service-defaults: name=old-web, protocol=http
3. create: service-resolver: name=old-web, redirect-to=new-web
4. delete: service-resolver: name=old-web
5. update: service-defaults: name=old-web, protocol=grpc
6. update: service-defaults: name=new-web, protocol=grpc
7. create: service-resolver: name=old-web, redirect-to=new-web
If you shutdown dc2 just before (4) and turn it back on after (7)
replication is impossible as there is no single edit you can make to
make forward progress.
Otherwise when the query times out we might incorrectly send a value for
the reply, when we should send an empty reply.
Also document errNotFound and how to handle the result in that case.
By using the query results as state.
Blocking queries are efficient when the query matches some results,
because the ModifyIndex of those results, returned as queryMeta.Mindex,
will never change unless the items themselves change.
Blocking queries for non-existent items are not efficient because the
queryMeta.Index can (and often does) change when other entities are
written.
This commit reduces the churn of these queries by using a different
comparison for "has changed". Instead of using the modified index, we
use the existence of the results. If the previous result was "not found"
and the new result is still "not found", we know we can ignore the
modified index and continue to block.
This is done by setting the minQueryIndex to the returned
queryMeta.Index, which prevents the query from returning before a state
change is observed.
This test shows how blocking queries are not efficient when the query
returns no results. The test fails with 100+ calls instead of the
expected 2.
This test is still a bit flaky because it depends on the timing of the
writes. It can sometimes return 3 calls.
A future commit should fix this and make blocking queries even more
optimal for not-found results.
Follow the Go convention of accepting a small interface that documents
the methods used by the function.
Clarify the rules for implementing a query function passed to
blockingQuery.
This will both save on unnecessary raft operations as well as
unnecessarily incrementing the raft modify index of config entries
subject to no-op updates.
This commit syncs ENT changes to the OSS repo.
Original commit details in ENT:
```
commit 569d25f7f4578981c3801e6e067295668210f748
Author: FFMMM <FFMMM@users.noreply.github.com>
Date: Thu Feb 10 10:23:33 2022 -0800
Vendor fork net rpc (#1538)
* replace net/rpc w consul-net-rpc/net/rpc
Signed-off-by: FFMMM <FFMMM@users.noreply.github.com>
* replace msgpackrpc and go-msgpack with fork from mono repo
Signed-off-by: FFMMM <FFMMM@users.noreply.github.com>
* gofmt all files touched
Signed-off-by: FFMMM <FFMMM@users.noreply.github.com>
```
Signed-off-by: FFMMM <FFMMM@users.noreply.github.com>
* First phase of refactoring PermissionDeniedError
Add extended type PermissionDeniedByACLError that captures information
about the accessor, particular permission type and the object and name
of the thing being checked.
It may be worth folding the test and error return into a single helper
function, that can happen at a later date.
Signed-off-by: Mark Anderson <manderson@hashicorp.com>
Due to timing, a transparent proxy could have two upstreams to dial
directly with the same address.
For example:
- The orders service can dial upstreams shipping and payment directly.
- An instance of shipping at address 10.0.0.1 is deregistered.
- Payments is scaled up and scheduled to have address 10.0.0.1.
- The orders service receives the event for the new payments instance
before seeing the deregistration for the shipping instance. At this
point two upstreams have the same passthrough address and Envoy will
reject the listener configuration.
To disambiguate this commit considers the Raft index when storing
passthrough addresses. In the example above, 10.0.0.1 would only be
associated with the newer payments service instance.
This commit makes two changes to the validation.
Previously we would call this validation in GenerateRoot, which happens
both on initialization (when a follower becomes leader), and when a
configuration is updated. We only want to do this validation during
config update so the logic was moved to the UpdateConfiguration
function.
Previously we would compare the config values against the actual cert.
This caused problems when the cert was created manually in Vault (not
created by Consul). Now we compare the new config against the previous
config. Using a already created CA cert should never error now.
Adding the key bit and types to the config should only error when
the previous values were not the defaults.
These two tests require debug logging enabled, because they look for log lines.
Also switched to testify assertions because the previous errors were not clear.
This test found a bug in the secondary. We were appending the root cert
to the PEM, but that cert was already appended. This was failing
validation in Vault here:
https://github.com/hashicorp/vault/blob/sdk/v0.3.0/sdk/helper/certutil/types.go#L329
Previously this worked because self signed certs have the same
SubjectKeyID and AuthorityKeyID. So having the same self-signed cert
repeated doesn't fail that check.
However with an intermediate that is not self-signed, those values are
different, and so we fail the check. A test I added in a previous commit
should show that this continues to work with self-signed root certs as
well.
This is safer than embedding two interface because there are a number of
places where we check the concrete type. If we check the concrete type
on the top-level interface it will fail. So instead expose the
ACLIdentity from a method.
This change allows us to remove one of the last remaining duplicate
resolve token methods (Server.ResolveToken).
With this change we are down to only 2, where the second one also
handles setting the default EnterpriseMeta from the token.