Commit Graph

4360 Commits

Author SHA1 Message Date
DanStough 37694eefb5 feat: tgtwy xDS generation for destinations
Signed-off-by: Dhia Ayachi <dhia@hashicorp.com>
2022-06-16 16:17:49 -04:00
alex d73adfef81
peering: block Intention.Apply ops (#13451)
Signed-off-by: acpana <8968914+acpana@users.noreply.github.com>
2022-06-16 12:07:28 -07:00
alex ba1f235d70
peering, state: account for peer intentions (#13443)
Signed-off-by: acpana <8968914+acpana@users.noreply.github.com>
2022-06-16 10:27:31 -07:00
R.B. Boyer 9c5d818546
xds: begin refactor to always pass test snapshots through all xDS types (#13461) 2022-06-15 14:58:28 -05:00
R.B. Boyer 93611819e2
xds: mesh gateways now have their own leaf certificate when involved in a peering (#13460)
This is only configured in xDS when a service with an L7 protocol is
exported.

They also load any relevant trust bundles for the peered services to
eventually use for L7 SPIFFE validation during mTLS termination.
2022-06-15 14:36:18 -05:00
Riddhi Shah 414bb7e34e
[OSS] Support merge-central-config option in node services list API (#13450)
Adds the merge-central-config query param option to the /catalog/node-services/:node-name API,
to get a service definition in the response that is merged with central defaults (proxy-defaults/service-defaults).

Updated the consul connect envoy command to use this option when
retrieving the proxy service details so as to render the bootstrap configuration correctly.
2022-06-15 08:30:31 -07:00
Evan Culver ca7acd2970
connect: Use Envoy 1.22.2 instead of 1.22.1 (#13444) 2022-06-14 15:29:41 -07:00
freddygv a288d0c388 Avoid deleting peerings marked as terminated.
When our peer deletes the peering it is locally marked as terminated.
This termination should kick off deleting all imported data, but should
not delete the peering object itself.

Keeping peerings marked as terminated acts as a signal that the action
took place.
2022-06-14 15:37:09 -06:00
freddygv a5283e4361 Add leader routine to clean up peerings
Once a peering is marked for deletion a new leader routine will now
clean up all imported resources and then the peering itself.

A lot of the logic was grabbed from the namespace/partitions deferred
deletions but with a handful of simplifications:
- The rate limiting is not configurable.

- Deleting imported nodes/services/checks is done by deleting nodes with
  the Txn API. The services and checks are deleted as a side-effect.

- There is no "round rate limiter" like with namespaces and partitions.
  This is because peerings are purely local, and deleting a peering in
  the datacenter does not depend on deleting data from other DCs like
  with WAN-federated namespaces. All rate limiting is handled by the
  Raft rate limiter.
2022-06-14 15:36:50 -06:00
Evan Culver 2adb9f7c8a
connect: Update Envoy support matrix to latest patch releases (#13431) 2022-06-14 13:19:09 -07:00
alex 6dbcb1d88e
peering: intentions list test (#13435) 2022-06-14 10:59:53 -07:00
freddygv dbcbf3978f Fixup stream tear-down steps.
1. Fix a bug where the peering leader routine would not track all active
   peerings in the "stored" reconciliation map. This could lead to
   tearing down streams where the token was generated, since the
   ConnectedStreams() method used for reconciliation returns all streams
   and not just the ones initiated by this leader routine.

2. Fix a race where stream contexts were being canceled before
   termination messages were being processed by a peer.

   Previously the leader routine would tear down streams by canceling
   their context right after the termination message was sent. This
   context cancelation could be propagated to the server side faster
   than the termination message. Now there is a change where the
   dialing peer uses CloseSend() to signal when no more messages will
   be sent. Eventually the server peer will read an EOF after receiving
   and processing the preceding termination message.

   Using CloseSend() is actually not enough to address the issue
   mentioned, since it doesn't wait for the server peer to finish
   processing messages. Because of this now the dialing peer also reads
   from the stream until an error signals that there are no more
   messages. Receiving an EOF from our peer indicates that they
   processed the termination message and have no additional work to do.

   Given that the stream is being closed, all the messages received by
   Recv are discarded. We only check for errors to avoid importing new
   data.
2022-06-13 12:10:42 -06:00
freddygv 6d368b5eed Update peering state and RPC for deferred deletion
When deleting a peering we do not want to delete the peering and all
imported data in a single operation, since deleting a large amount of
data at once could overload Consul.

Instead we defer deletion of peerings so that:

1. When a peering deletion request is received via gRPC the peering is
   marked for deletion by setting the DeletedAt field.

2. A leader routine will monitor for peerings that are marked for
   deletion and kick off a throttled deletion of all imported resources
   before deleting the peering itself.

This commit mostly addresses point #1 by modifying the peering service
to mark peerings for deletion. Another key change is to add a
PeeringListDeleted state store function which can return all peerings
marked for deletion. This function is what will be watched by the
deferred deletion leader routine.
2022-06-13 12:10:32 -06:00
Freddy 9eeb9e4ee3
Clean up imported nodes/services/checks as needed (#13367)
Previously, imported data would never be deleted. As
nodes/services/checks were registered and deregistered, resources
deleted from the exporting cluster would accumulate in the imported
cluster.

This commit makes updates to replication so that whenever an update is
received for a service name we reconcile what was present in the catalog
against what was received.

This handleUpdateService method can handle both updates and deletions.
2022-06-13 11:52:28 -06:00
Mark Anderson 6cb05666e6
Merge pull request #13357 from hashicorp/ma/add-build-date-oss
Add build date (oss)
2022-06-13 08:43:20 -07:00
Chris S. Kim 4cb251497f
Update RBAC to handle imported services (#13404)
When converting from Consul intentions to xds RBAC rules, services imported from other peers must encode additional data like partition (from the remote cluster) and trust domain.

This PR updates the PeeringTrustBundle to hold the sending side's local partition as ExportedPartition. It also updates RBAC code to encode SpiffeIDs of imported services with the ExportedPartition and TrustDomain.
2022-06-10 17:15:22 -04:00
R.B. Boyer 4626b65124
xds: allow for peered upstreams to use tagged addresses that are hostnames (#13422)
Mesh gateways can use hostnames in their tagged addresses (#7999). This is useful
if you were to expose a mesh gateway using a cloud networking load balancer appliance
that gives you a DNS name but no reliable static IPs.

Envoy cannot accept hostnames via EDS and those must be configured using CDS.
There was already logic when configuring gateways in other locations in the code, but
given the illusions in play for peering the downstream of a peered service wasn't aware
that it should be doing that.

Also:
- ensuring that we always try to use wan-like addresses to cross peer boundaries.
2022-06-10 16:11:40 -05:00
Kyle Havlovitz cfc90ea2d5 Add dns node lookup support in partitions 2022-06-10 11:23:51 -07:00
R.B. Boyer 33b497e7c9
peering: rename initiate to establish in the context of the APIs (#13419) 2022-06-10 11:10:46 -05:00
Mark Anderson 8fc25aefcf Change default dates
Signed-off-by: Mark Anderson <manderson@hashicorp.com>
2022-06-09 17:07:41 -07:00
Mark Anderson 41446f449a Fixup some more tests
Signed-off-by: Mark Anderson <manderson@hashicorp.com>
2022-06-09 17:04:05 -07:00
Mark Anderson 18d3cee29e Add build date to self endpoint
Signed-off-by: Mark Anderson <manderson@hashicorp.com>
2022-06-09 17:04:05 -07:00
Mark Anderson ac1f17326c Build date in config file
Signed-off-by: Mark Anderson <manderson@hashicorp.com>
2022-06-09 17:04:05 -07:00
R.B. Boyer d81d8468db
peering: mesh gateways are required for cross-peer service mesh communication (#13410)
Require use of mesh gateways in order for service mesh data plane
traffic to flow between peers.

This also adds plumbing for envoy integration tests involving peers, and
one starter peering test.
2022-06-09 11:05:18 -05:00
kisunji 1eff652620 Add missing index for read 2022-06-08 13:53:31 -04:00
kisunji 86bf8b8fb6 Add IntentionMatch tests for source peers 2022-06-08 13:53:31 -04:00
kisunji 7d9f413949 Update ServiceIntentionSourceIndex to handle peer 2022-06-08 13:53:31 -04:00
Chris S. Kim 3e71754e7a
Add SourcePeer fields to relevant Intentions types (#13390) 2022-06-08 13:24:10 -04:00
R.B. Boyer c1f20d17ee
peering: allow protobuf requests to populate the default partition or namespace (#13398) 2022-06-08 11:55:18 -05:00
Dhia Ayachi 80556c9ffc
Fix intentions wildcard dest (#13397)
* when enterprise meta are wildcard assume it's a service intention

* fix partition and namespace

* move kind outside the loops

* get the kind check outside the loop and add a comment

Co-authored-by: github-team-consul-core <github-team-consul-core@hashicorp.com>
2022-06-08 10:38:55 -04:00
R.B. Boyer bf647bc9d2
peering: avoid a race between peering establishment and termination (#13389) 2022-06-07 16:29:09 -05:00
Dhia Ayachi 3deaf767f2
Egress gtw/intention rpc endpoint (#13354)
* update gateway-services table with endpoints

* fix failing test

* remove unneeded config in test

* rename "endpoint" to "destination"

* more endpoint renaming to destination in tests

* update isDestination based on service-defaults config entry creation

* use a 3 state kind to be able to set the kind to unknown (when neither a service or a destination exist)

* set unknown state to empty to avoid modifying alot of tests

* fix logic to set the kind correctly on CRUD

* fix failing tests

* add missing tests and fix service delete

* fix failing test

* Apply suggestions from code review

Co-authored-by: Dan Stough <dan.stough@hashicorp.com>

* fix a bug with kind and add relevant test

* fix compile error

* fix failing tests

* add kind to clone

* fix failing tests

* fix failing tests in catalog endpoint

* fix service dump test

* Apply suggestions from code review

Co-authored-by: Dan Stough <dan.stough@hashicorp.com>

* remove duplicate tests

* first draft of destinations intention in connect proxy

* remove ServiceDestinationList

* fix failing tests

* fix agent/consul failing tests

* change to filter intentions in the state store instead of adding a field.

* fix failing tests

* fix comment

* fix comments

* store service kind destination and add relevant tests

* changes based on review

* filter on destinations when querying source match

* change state store API to get an IntentionTarget parameter

* add intentions tests

* add destination upstream endpoint

* fix failing test

* fix failing test and a bug with wildcard intentions

* fix failing test

* Apply suggestions from code review

Co-authored-by: alex <8968914+acpana@users.noreply.github.com>

* add missing test and clarify doc

* fix style

* gofmt intention.go

* fix merge introduced issue

Co-authored-by: Dan Stough <dan.stough@hashicorp.com>
Co-authored-by: alex <8968914+acpana@users.noreply.github.com>
Co-authored-by: github-team-consul-core <github-team-consul-core@hashicorp.com>
2022-06-07 15:55:02 -04:00
Dhia Ayachi 7602b6ebf2
Egress gtw/connect destination intentions (#13341)
* update gateway-services table with endpoints

* fix failing test

* remove unneeded config in test

* rename "endpoint" to "destination"

* more endpoint renaming to destination in tests

* update isDestination based on service-defaults config entry creation

* use a 3 state kind to be able to set the kind to unknown (when neither a service or a destination exist)

* set unknown state to empty to avoid modifying alot of tests

* fix logic to set the kind correctly on CRUD

* fix failing tests

* add missing tests and fix service delete

* fix failing test

* Apply suggestions from code review

Co-authored-by: Dan Stough <dan.stough@hashicorp.com>

* fix a bug with kind and add relevant test

* fix compile error

* fix failing tests

* add kind to clone

* fix failing tests

* fix failing tests in catalog endpoint

* fix service dump test

* Apply suggestions from code review

Co-authored-by: Dan Stough <dan.stough@hashicorp.com>

* remove duplicate tests

* first draft of destinations intention in connect proxy

* remove ServiceDestinationList

* fix failing tests

* fix agent/consul failing tests

* change to filter intentions in the state store instead of adding a field.

* fix failing tests

* fix comment

* fix comments

* store service kind destination and add relevant tests

* changes based on review

* filter on destinations when querying source match

* Apply suggestions from code review

Co-authored-by: alex <8968914+acpana@users.noreply.github.com>

* fix style

* Apply suggestions from code review

Co-authored-by: Dan Stough <dan.stough@hashicorp.com>

* rename destinationType to targetType.

Co-authored-by: Dan Stough <dan.stough@hashicorp.com>
Co-authored-by: alex <8968914+acpana@users.noreply.github.com>
Co-authored-by: github-team-consul-core <github-team-consul-core@hashicorp.com>
2022-06-07 15:03:59 -04:00
R.B. Boyer 0681f3571d
peering: allow mesh gateways to proxy L4 peered traffic (#13339)
Mesh gateways will now enable tcp connections with SNI names including peering information so that those connections may be proxied.

Note: this does not change the callers to use these mesh gateways.
2022-06-06 14:20:41 -05:00
Fulvio f155ff347c
UDP check for service stanza #12221 (#12722)
* UDP check for service stanza #12221

* add pass status on timeout condition

* delete useless files

* Update check_test.go

improve comment in test

* fix test

* fix requested changes and update TestRuntimeConfig_Sanitize.golden

* add freeport to TestCheckUDPCritical

* improve comment for CheckUDP struct

* fix requested changes

* fix requested changes

* fix requested changes

* add UDP to proto

* add UDP to proto and add a changelog

* add requested test on agent_endpoint_test.go

* add test for given endpoints

* fix failing tests

* add documentation for udp healthcheck

* regenerate proto using buf

* Update website/content/api-docs/agent/check.mdx

Co-authored-by: trujillo-adam <47586768+trujillo-adam@users.noreply.github.com>

* Update website/content/api-docs/agent/check.mdx

Co-authored-by: trujillo-adam <47586768+trujillo-adam@users.noreply.github.com>

* Update website/content/docs/discovery/checks.mdx

Co-authored-by: trujillo-adam <47586768+trujillo-adam@users.noreply.github.com>

* Update website/content/docs/ecs/configuration-reference.mdx

Co-authored-by: trujillo-adam <47586768+trujillo-adam@users.noreply.github.com>

* Update website/content/docs/ecs/configuration-reference.mdx

Co-authored-by: trujillo-adam <47586768+trujillo-adam@users.noreply.github.com>

* add debug echo

* add debug circle-ci

* add debug circle-ci bash

* use echo instead of status_stage

* remove debug and status from devtools script and use echo instead

* Update website/content/api-docs/agent/check.mdx

Co-authored-by: Jared Kirschner <85913323+jkirschner-hashicorp@users.noreply.github.com>

* fix test

* replace status_stage with status

* replace functions with echo

Co-authored-by: Dhia Ayachi <dhia@hashicorp.com>
Co-authored-by: trujillo-adam <47586768+trujillo-adam@users.noreply.github.com>
Co-authored-by: Jared Kirschner <85913323+jkirschner-hashicorp@users.noreply.github.com>
2022-06-06 15:13:19 -04:00
alex ff2ad3ba0c
peering: send leader addr (#13342)
Signed-off-by: acpana <8968914+acpana@users.noreply.github.com>
2022-06-06 10:00:38 -07:00
Dan Upton 5cd31933d1
xds: remove HTTPCheckFetcher dependency (#13366)
This is the OSS portion of enterprise PR 1994

Rather than directly interrogating the agent-local state for HTTP
checks using the `HTTPCheckFetcher` interface, we now rely on the
config snapshot containing the checks.

This reduces the number of changes required to support server xDS
sessions.

It's not clear why the fetching approach was introduced in
931d167ebb2300839b218d08871f22323c60175d.
2022-06-06 15:15:33 +01:00
R.B. Boyer 4c781d1e15
peering: update how cross-peer upstreams and represented in proxycfg and rendered in xds (#13362)
This removes unnecessary, vestigal remnants of discovery chains.
2022-06-03 16:42:50 -05:00
cskh bd3a4dfeab
Add isLeader metric to track if a server is a leader (#13304)
CTIA-21: sdd is_leader metric to track if a server is a leader

Co-authored-by: alex <8968914+acpana@users.noreply.github.com>
2022-06-03 13:07:37 -04:00
Freddy 36322c49cf
Merge pull request #13340 from hashicorp/peering/public-listener 2022-06-02 15:15:29 -06:00
Chris S. Kim 8a7dfbaa7d
Fix KVSGet method to handle QueryOptions properly (#13344) 2022-06-02 12:26:18 -04:00
Freddy 137bfbb02b Update public listener with SPIFFE Validator
Envoy's SPIFFE certificate validation extension allows for us to
validate against different root certificates depending on the trust
domain of the dialing proxy.

If there are any trust bundles from peers in the config snapshot then we
use the SPIFFE validator as the validation context, rather than the
usual TrustedCA.

The injected validation config includes the local root certificates as
well.
2022-06-01 17:06:33 -06:00
freddygv ad6dbe081a Add agent cache-type for TrustBundleListByService
There are a handful of changes in this commit:
* When querying trust bundles for a service we need to be able to
  specify the namespace of the service.
* The endpoint needs to track the index because the cache watches use
  it.
* Extracted bulk of the endpoint's logic to a state store function
  so that index tracking could be tested more easily.
* Removed check for service existence, deferring that sort of work to ACL authz
* Added the cache type
2022-06-01 17:05:10 -06:00
freddygv 073c9e3a91 Update assumptions around exported-service config
Given that the exported-services config entry can use wildcards, the
precedence for wildcards is handled as with intentions. The most exact
match is the match that applies for any given service. We do not take
the union of all that apply.

Another update that was made was to reflect that only one
exported-services config entry applies to any given service in a
partition. This is a pre-existing constraint that gets enforced by
the Normalize() method on that config entry type.
2022-06-01 17:03:51 -06:00
Freddy 6ef38eaea7
Configure upstream TLS context with peer root certs (#13321)
For mTLS to work between two proxies in peered clusters with different root CAs,
proxies need to configure their outbound listener to use different root certificates
for validation.

Up until peering was introduced proxies would only ever use one set of root certificates
to validate all mesh traffic, both inbound and outbound. Now an upstream proxy
may have a leaf certificate signed by a CA that's different from the dialing proxy's.

This PR makes changes to proxycfg and xds so that the upstream TLS validation
uses different root certificates depending on which cluster is being dialed.
2022-06-01 15:53:52 -06:00
R.B. Boyer addfef5065
test: regenerate golden files (#13336)
make envoy-regen
    go test ./agent/config -update
2022-06-01 15:17:03 -05:00
Chris S. Kim 58ffa0488d
Revert getPathSuffixUnescaped (#13256) 2022-06-01 13:17:14 -04:00
Dan Upton e6dc26e087
proxycfg: replace direct agent cache usage with interfaces (#13320)
This is the OSS portion of enterprise PRs 1904, 1905, 1906, 1907, 1949,
and 1971.

It replaces the proxycfg manager's direct dependency on the agent cache
with interfaces that will be implemented differently when serving xDS
sessions from a Consul server.
2022-06-01 16:18:06 +01:00
Chris S. Kim 44a318ef73
Reimplement fs.FileInfo interface (#13315)
Co-authored-by: R.B. Boyer <4903+rboyer@users.noreply.github.com>
2022-06-01 11:09:51 -04:00
Dhia Ayachi d4a04457e1
update gateway-services table with endpoints (#13217)
* update gateway-services table with endpoints

* fix failing test

* remove unneeded config in test

* rename "endpoint" to "destination"

* more endpoint renaming to destination in tests

* update isDestination based on service-defaults config entry creation

* use a 3 state kind to be able to set the kind to unknown (when neither a service or a destination exist)

* set unknown state to empty to avoid modifying alot of tests

* fix logic to set the kind correctly on CRUD

* fix failing tests

* add missing tests and fix service delete

* fix failing test

* Apply suggestions from code review

Co-authored-by: Dan Stough <dan.stough@hashicorp.com>

* fix a bug with kind and add relevant test

* fix compile error

* fix failing tests

* add kind to clone

* fix failing tests

* fix failing tests in catalog endpoint

* fix service dump test

* Apply suggestions from code review

Co-authored-by: Dan Stough <dan.stough@hashicorp.com>

* remove duplicate tests

* rename consts and fix kind when no destination is defined in the service-defaults.

* rename Kind to ServiceKind and change switch to use .(type)

Co-authored-by: Dan Stough <dan.stough@hashicorp.com>
2022-05-31 16:20:12 -04:00