Commit graph

4826 commits

Author SHA1 Message Date
alex 4ff097c4cf
peering: track exported services (#13784)
Signed-off-by: acpana <8968914+acpana@users.noreply.github.com>
2022-07-18 10:20:04 -07:00
R.B. Boyer bec4df0679
peerstream: require a resource subscription to receive updates of that type (#13767)
This mimics xDS's discovery protocol where you must request a resource
explicitly for the exporting side to send those events to you.

As part of this I aligned the overall ResourceURL with the TypeURL that
gets embedded into the encoded protobuf Any construct. The
CheckServiceNodes is now wrapped in a better named "ExportedService"
struct now.
2022-07-15 15:03:40 -05:00
R.B. Boyer 7da65c02a6
peerstream: fix test assertions (#13780) 2022-07-15 14:43:24 -05:00
Luke Kysow 3968f21339
Add docs for peerStreamServer vs peeringServer. (#13781) 2022-07-15 12:23:05 -07:00
Luke Kysow a8721c33c5
peerstream: dialer should reconnect when stream closes (#13745)
* peerstream: dialer should reconnect when stream closes

If the stream is closed unexpectedly (i.e. when we haven't received
a terminated message), the dialer should attempt to re-establish the
stream.

Previously, the `HandleStream` would return `nil` when the stream
was closed. The caller then assumed the stream was terminated on purpose
and so didn't reconnect when instead it was stopped unexpectedly and
the dialer should have attempted to reconnect.
2022-07-15 11:58:33 -07:00
R.B. Boyer 61ebb38092
server: ensure peer replication can successfully use TLS over external gRPC (#13733)
Ensure that the peer stream replication rpc can successfully be used with TLS activated.

Also:

- If key material is configured for the gRPC port but HTTPS is not
  enabled now TLS will still be activated for the gRPC port.

- peerstream replication stream opened by the establishing-side will now
  ignore grpc.WithBlock so that TLS errors will bubble up instead of
  being awkwardly delayed or suppressed
2022-07-15 13:15:50 -05:00
alex 70ad4804b6
peering: track imported services (#13718) 2022-07-15 10:20:43 -07:00
Matt Keeler 7ae0c69729
Use Node Name for peering healthSnapshot instead of ID (#13773)
A Node ID is not a required field with Consul’s data model. Therefore we cannot reliably expect all uses to have it. However the node name is required and must be unique so its equally as good of a key for the internal healthSnapshot node tracking.
2022-07-15 10:51:38 -04:00
Matt Keeler 17565a4fca
Enable partition support for peering establishment (#13772)
Prior to this the dialing side of the peering would only ever work within the default partition. This commit allows properly parsing the partition field out of the API struct request body, query param and header.
2022-07-15 10:07:07 -04:00
Dan Stough 084f9d7084 feat: connect proxy xDS for destinations
Signed-off-by: Dhia Ayachi <dhia@hashicorp.com>
2022-07-14 15:27:02 -04:00
Daniel Upton 7f69e27926 proxycfg-glue: server-local implementation of FederationStateListMeshGateways
This is the OSS portion of enterprise PR 2265.

This PR provides a server-local implementation of the
proxycfg.FederationStateListMeshGateways interface based on blocking queries.
2022-07-14 18:22:12 +01:00
Daniel Upton a5a6102a3b proxycfg-glue: server-local implementation of GatewayServices
This is the OSS portion of enterprise PR 2259.

This PR provides a server-local implementation of the proxycfg.GatewayServices
interface based on blocking queries.
2022-07-14 18:22:12 +01:00
Daniel Upton a280c9a10b proxycfg-glue: server-local implementation of TrustBundle and TrustBundleList
This is the OSS portion of enterprise PR 2250.

This PR provides server-local implementations of the proxycfg.TrustBundle and
proxycfg.TrustBundleList interfaces, based on local blocking queries.
2022-07-14 18:22:12 +01:00
Daniel Upton 70f29942f4 proxycfg-glue: server-local implementation of the Health interface
This is the OSS portion of enterprise PR 2249.

This PR introduces an implementation of the proxycfg.Health interface based on a
local materialized view of the health events.

It reuses the view and request machinery from agent/rpcclient/health, which made
it super straightforward.
2022-07-14 18:22:12 +01:00
Daniel Upton 688dfe3138 proxycfg-glue: server-local implementation of ServiceList
This is the OSS portion of enterprise PR 2242.

This PR introduces a server-local implementation of the proxycfg.ServiceList
interface, backed by streaming events and a local materializer.
2022-07-14 18:22:12 +01:00
Daniel Upton 599f5e2207 proxycfg-glue: server-local compiled discovery chain data source
This is the OSS portion of enterprise PR 2236.

Adds a local blocking query-based implementation of the proxycfg.CompiledDiscoveryChain interface.
2022-07-14 18:22:12 +01:00
Chris S. Kim d12b3d286e Check if an upstream is implicit from either intentions or peered services 2022-07-13 16:53:20 -04:00
Chris S. Kim 5d890cdbb2 Use new maps for proxycfg peered data 2022-07-13 16:05:10 -04:00
Chris S. Kim 34c0093d44 Add new watch.Map type to refactor proxycfg 2022-07-13 16:05:10 -04:00
Chris S. Kim 0936942b2d Scrub VirtualIPs before exporting 2022-07-13 16:05:10 -04:00
Kyle Havlovitz a7ea6cb771
Merge pull request #13699 from hashicorp/tgate-http2-upstream
Respect http2 protocol for upstreams of terminating gateways
2022-07-13 09:41:15 -07:00
Dan Upton 34140ff3e0
grpc: rename public/private directories to external/internal (#13721)
Previously, public referred to gRPC services that are both exposed on
the dedicated gRPC port and have their definitions in the proto-public
directory (so were considered usable by 3rd parties). Whereas private
referred to services on the multiplexed server port that are only usable
by agents and other servers.

Now, we're splitting these definitions, such that external/internal
refers to the port and public/private refers to whether they can be used
by 3rd parties.

This is necessary because the peering replication API needs to be
exposed on the dedicated port, but is not (yet) suitable for use by 3rd
parties.
2022-07-13 16:33:48 +01:00
R.B. Boyer c880728ab4
peerstream: some cosmetic refactors to make this easier to follow (#13732)
- Use some protobuf construction helper methods for brevity.
- Rename a local variable to avoid later shadowing.
- Rename the Nonce field to be more like xDS's naming.
- Be more explicit about which PeerID fields are empty.
2022-07-13 10:00:35 -05:00
Kyle Havlovitz 0ac7de3bae Use protocol from resolved config entry, not gateway service 2022-07-12 16:23:40 -07:00
Kyle Havlovitz 54d8fe9032 Enable http2 options for grpc protocol 2022-07-12 14:38:44 -07:00
R.B. Boyer 81764a5650
peering: always send the mesh gateway SpiffeID even for tcp services (#13728)
If someone were to switch a peer-exported service from L4 to L7 there
would be a brief SAN validation hiccup as traffic shifted to the mesh
gateway for termination.

This PR sends the mesh gateway SpiffeID down all the time so the clients
always expect a switch.
2022-07-12 11:38:13 -05:00
R.B. Boyer ee5eb5a960
state: prohibit changing an exported tcp discovery chain in a way that would break SAN validation (#13727)
For L4/tcp exported services the mesh gateways will not be terminating
TLS. A caller in one peer will be directly establishing TLS connections
to the ultimate exported service in the other peer.

The caller will be doing SAN validation using the replicated SpiffeID
values shipped from the exporting side. There are a class of discovery
chain edits that could be done on the exporting side that would cause
the introduction of a new SpiffeID value. In between the time of the
config entry update on the exporting side and the importing side getting
updated peer stream data requests to the exported service would fail due
to SAN validation errors.

This is unacceptable so instead prohibit the exporting peer from making
changes that would break peering in this way.
2022-07-12 11:17:33 -05:00
R.B. Boyer 2c329475ce
state: prohibit exported discovery chains to have cross-datacenter or cross-partition references (#13726)
Because peerings are pairwise, between two tuples of (datacenter,
partition) having any exported reference via a discovery chain that
crosses out of the peered datacenter or partition will ultimately not be
able to work for various reasons. The biggest one is that there is no
way in the ultimate destination to configure an intention that can allow
an external SpiffeID to access a service.

This PR ensures that a user simply cannot do this, so they won't run
into weird situations like this.
2022-07-12 11:03:41 -05:00
Chris S. Kim 9f5ab3ec10
Return error if ServerAddresses is empty (#13714) 2022-07-12 11:09:00 -04:00
Kyle Havlovitz 616a2da835 Respect http2 protocol for upstreams of terminating gateways 2022-07-08 14:30:45 -07:00
R.B. Boyer 5b801db24b
peering: move peer replication to the external gRPC port (#13698)
Peer replication is intended to be between separate Consul installs and
effectively should be considered "external". This PR moves the peer
stream replication bidirectional RPC endpoint to the external gRPC
server and ensures that things continue to function.
2022-07-08 12:01:13 -05:00
R.B. Boyer 40c5c7eee2
server: broadcast the public grpc port using lan serf and update the consul service in the catalog with the same data (#13687)
Currently servers exchange information about their WAN serf port
and RPC port with serf tags, so that they all learn of each other's
addressing information. We intend to make larger use of the new
public-facing gRPC port exposed on all of the servers, so this PR
addresses that by passing around the gRPC port via serf tags and
then ensuring the generated consul service in the catalog has
metadata about that new port as well for ease of non-serf-based lookup.
2022-07-07 13:55:41 -05:00
Freddy ed9808c4f1
Parse peer name for virtual IP DNS queries (#13602)
This commit updates the DNS query locality parsing so that the virtual
IP for an imported service can be queried.

Note that:
- Support for parsing a peer in other service discovery queries was not
  added.
- Querying another datacenter for a virtual IP is not supported. This
  was technically allowed in 1.11 but is being rolled back for 1.13
  because it is not a use-case we intended to support. Virtual IPs in
  different datacenters are going to collide because they are allocated
  sequentially.
2022-07-06 10:30:04 -06:00
R.B. Boyer 4ce9651421
test: update mockery use to put mocks into test files (#13656)
--testonly doesn't do anything anymore so switch to --filename instead
2022-07-05 16:57:15 -05:00
Chris S. Kim 0910c41d95
Revise possible states for a peering. (#13661)
These changes are primarily for Consul's UI, where we want to be more
specific about the state a peering is in.

- The "initial" state was renamed to pending, and no longer applies to
  peerings being established from a peering token.

- Upon request to establish a peering from a peering token, peerings
  will be set as "establishing". This will help distinguish between the
  two roles: the cluster that generates the peering token and the
  cluster that establishes the peering.

- When marked for deletion, peering state will be set to "deleting".
  This way the UI determines the deletion via the state rather than the
  "DeletedAt" field.

Co-authored-by: freddygv <freddy@hashicorp.com>
2022-07-04 10:47:58 -04:00
Daniel Upton e1d0aff462 proxycfg: server-local intention upstreams data source
This is the OSS portion of enterprise PR 2157.

It builds on the local blocking query work in #13438 to implement the
proxycfg.IntentionUpstreams interface using server-local data.

Also moves the ACL filtering logic from agent/consul into the acl/filter
package so that it can be reused here.
2022-07-04 10:48:36 +01:00
Daniel Upton 21ea217b1d proxycfg: server-local intentions data source
This is the OSS portion of enterprise PR 2141.

This commit provides a server-local implementation of the `proxycfg.Intentions`
interface that sources data from streaming events.

It adds events for the `service-intentions` config entry type, and then consumes
event streams (via materialized views) for the service's explicit intentions and
any applicable wildcard intentions, merging them into a single list of intentions.

An alternative approach I considered was to consume _all_ intention events (via
`SubjectWildcard`) and filter out the irrelevant ones. This would admittedly
remove some complexity in the `agent/proxycfg-glue` package but at the expense
of considerable overhead from waking potentially many thousands of connect
proxies every time any intention is updated.
2022-07-04 10:48:36 +01:00
Daniel Upton 497df1ca3b proxycfg: server-local config entry data sources
This is the OSS portion of enterprise PR 2056.

This commit provides server-local implementations of the proxycfg.ConfigEntry
and proxycfg.ConfigEntryList interfaces, that source data from streaming events.

It makes use of the LocalMaterializer type introduced for peering replication,
adding the necessary support for authorization.

It also adds support for "wildcard" subscriptions (within a topic) to the event
publisher, as this is needed to fetch service-resolvers for all services when
configuring mesh gateways.

Currently, events will be emitted for just the ingress-gateway, service-resolver,
and mesh config entry types, as these are the only entries required by proxycfg
— the events will be emitted on topics named IngressGateway, ServiceResolver,
and MeshConfig topics respectively.

Though these events will only be consumed "locally" for now, they can also be
consumed via the gRPC endpoint (confirmed using grpcurl) so using them from
client agents should be a case of swapping the LocalMaterializer for an
RPCMaterializer.
2022-07-04 10:48:36 +01:00
alex 90577810cc
peering: add imported/exported counts to peering (#13644)
Signed-off-by: acpana <8968914+acpana@users.noreply.github.com>
Co-authored-by: Chris S. Kim <ckim@hashicorp.com>
2022-06-29 14:07:30 -07:00
Chris S. Kim 0dc1b11e42
Fix ENT drift in files (#13647) 2022-06-29 16:53:22 -04:00
Chris S. Kim 25aec40e74
Add internal endpoint to fetch peered upstream candidates from VirtualIP table (#13642)
For initial cluster peering TProxy support we consider all imported services of a partition to be potential upstreams.

We leverage the VirtualIP table because it stores plain service names (e.g. "api", not "api-sidecar-proxy").
2022-06-29 16:34:58 -04:00
Eric Haberkorn f3bba7c963
Fix spelling mistake in serverless patcher (#13607)
passhthrough -> passthrough
2022-06-29 15:21:21 -04:00
alex 04ec093130
no 1.9 style metrics (#13532)
Signed-off-by: acpana <8968914+acpana@users.noreply.github.com>
2022-06-29 09:46:37 -07:00
alex a8ae8de20e
peering: reconcile/ hint active state for list (#13619)
Signed-off-by: acpana <8968914+acpana@users.noreply.github.com>
2022-06-29 09:43:50 -07:00
R.B. Boyer 7672532b05
xds: modify rbac rules to use the XFCC header for peered L7 enforcement (#13629)
When the protocol is http-like, and an intention has a peered source
then the normal RBAC mTLS SAN field check is replaces with a joint combo
of:

    mTLS SAN field must be the service's local mesh gateway leaf cert
      AND
    the first XFCC header (from the MGW) must have a URI field that matches the original intention source

Also:

- Update the regex program limit to be much higher than the teeny
  defaults, since the RBAC regex constructions are more complicated now.

- Fix a few stray panics in xds generation.
2022-06-29 10:29:54 -05:00
R.B. Boyer 3445c6b09a
xds: have mesh gateways forward peered SpiffeIDs using the XFCC header (#13625) 2022-06-28 15:32:42 -05:00
R.B. Boyer 115000144b
xds: mesh gateways now correctly load up peer-exported discovery chains using L7 protocols (#13624)
A mesh gateway will now configure the filter chains for L7 exported
services using the correct discovery chain information.
2022-06-28 14:52:25 -05:00
R.B. Boyer 2dba16be52
peering: replicate all SpiffeID values necessary for the importing side to do SAN validation (#13612)
When traversing an exported peered service, the discovery chain
evaluation at the other side may re-route the request to a variety of
endpoints. Furthermore we intend to terminate mTLS at the mesh gateway
for arriving peered traffic that is http-like (L7), so the caller needs
to know the mesh gateway's SpiffeID in that case as well.

The following new SpiffeID values will be shipped back in the peerstream
replication:

- tcp: all possible SpiffeIDs resulting from the service-resolver
        component of the exported discovery chain

- http-like: the SpiffeID of the mesh gateway
2022-06-27 14:37:18 -05:00
Max Bowsher 728cd03e24
Merge branch 'main' into fix-kv_entries-metric 2022-06-27 18:57:03 +01:00
alex 4333312be9
peering, internal: support UIServices, UINodes, UINodeInfo (#13577) 2022-06-24 15:17:35 -07:00
Chris S. Kim a5f9994128
Add new index for PeeredServiceName and ServiceVirtualIP (#13582)
For TProxy we will be leveraging the VirtualIP table, which needs to become peer-aware
2022-06-24 14:38:39 -04:00
alex 2c837a21df
Merge pull request #13570 from hashicorp/acpance/peering-oss-intentions
oss: peering, http: get peer service intentions (#2098)
2022-06-23 08:15:59 -07:00
Will Jordan 25f4c44268
Add per-node max indexes (#12399)
Adds fine-grained node.[node] entries to the index table, allowing blocking queries to return fine-grained indexes that prevent them from returning immediately when unrelated nodes/services are updated.

Co-authored-by: kisunji <ckim@hashicorp.com>
2022-06-23 11:13:25 -04:00
Chris S. Kim aaf3c051f2
Make memdb indexers generic (#13558)
We have many indexer functions in Consul which take interface{} and type assert before building the index. We can use generics to get rid of the initial plumbing and pass around functions with better defined signatures. This has two benefits: 1) Less verbosity; 2) Developers can parse the argument types to memdb schemas without having to introspect the function for the type assertion.
2022-06-23 11:07:19 -04:00
Matt Keeler dc19b9f46f
Port over the index 0 -> 1 code that lived in the old rpc setQueryMeta function. (#13561) 2022-06-23 09:34:47 -04:00
acpana 07cd838e77
oss: peering, http: get peer service intentions (#2098)
Signed-off-by: acpana <8968914+acpana@users.noreply.github.com>
2022-06-22 16:25:09 -07:00
R.B. Boyer e7a7232a6b
state: peering ID assignment cannot happen inside of the state store (#13525)
Move peering ID assignment outisde of the FSM, so that the ID is written
to the raft log and the same ID is used by all voters, and after
restarts.
2022-06-21 13:04:08 -05:00
Matt Keeler c2421825f3
Add server local blocking queries and watches (#13438)
Co-authored-by: Dan Upton <daniel@floppy.co>
2022-06-21 13:36:49 -04:00
Chris S. Kim 247b56def3
Pass trust domain to RBAC to validate and fix use of wrong peer trust bundles (#13508) 2022-06-20 22:47:14 -04:00
Max Bowsher 884a63431e Delete definition of metric consul.acl.blocked.node.registration
Although the metric is defined, there is no code which ever sets its
value - the code in question is genuinely asymmetric - there are 3 types
of object for which registration can be tracked, but only 2 for which
deregistration can be tracked.
2022-06-19 17:38:04 +01:00
Max Bowsher 3f93b3cc88 Fix incorrect name and doc for kv_entries metric
The name of the metric as registered with the metrics library to provide
the help string, was incorrect compared with the actual code that sets
the metric value - bring them into sync.

Also, the help message was incorrect. Rather than copy the help message
from telemetry.mdx, which was correct, but felt a bit unnatural in the
way it was worded, update both of them to a new wording.
2022-06-19 11:58:23 +01:00
Dan Upton 989b22425c
Move ACLResolveResult into acl/resolver package (#13467)
Having this type live in the agent/consul package makes it difficult to
put anything that relies on token resolution (e.g. the new gRPC services)
in separate packages without introducing import cycles.

For example, if package foo imports agent/consul for the ACLResolveResult
type it means that agent/consul cannot import foo to register its service.

We've previously worked around this by wrapping the ACLResolver to
"downgrade" its return type to an acl.Authorizer - aside from the
added complexity, this also loses the resolved identity information.

In the future, we may want to move the whole ACLResolver into the
acl/resolver package. For now, putting the result type there at least,
fixes the immediate import cycle issues.
2022-06-17 10:24:43 +01:00
DanStough 37694eefb5 feat: tgtwy xDS generation for destinations
Signed-off-by: Dhia Ayachi <dhia@hashicorp.com>
2022-06-16 16:17:49 -04:00
alex d73adfef81
peering: block Intention.Apply ops (#13451)
Signed-off-by: acpana <8968914+acpana@users.noreply.github.com>
2022-06-16 12:07:28 -07:00
alex ba1f235d70
peering, state: account for peer intentions (#13443)
Signed-off-by: acpana <8968914+acpana@users.noreply.github.com>
2022-06-16 10:27:31 -07:00
R.B. Boyer 9c5d818546
xds: begin refactor to always pass test snapshots through all xDS types (#13461) 2022-06-15 14:58:28 -05:00
R.B. Boyer 93611819e2
xds: mesh gateways now have their own leaf certificate when involved in a peering (#13460)
This is only configured in xDS when a service with an L7 protocol is
exported.

They also load any relevant trust bundles for the peered services to
eventually use for L7 SPIFFE validation during mTLS termination.
2022-06-15 14:36:18 -05:00
Riddhi Shah 414bb7e34e
[OSS] Support merge-central-config option in node services list API (#13450)
Adds the merge-central-config query param option to the /catalog/node-services/:node-name API,
to get a service definition in the response that is merged with central defaults (proxy-defaults/service-defaults).

Updated the consul connect envoy command to use this option when
retrieving the proxy service details so as to render the bootstrap configuration correctly.
2022-06-15 08:30:31 -07:00
Evan Culver ca7acd2970
connect: Use Envoy 1.22.2 instead of 1.22.1 (#13444) 2022-06-14 15:29:41 -07:00
freddygv a288d0c388 Avoid deleting peerings marked as terminated.
When our peer deletes the peering it is locally marked as terminated.
This termination should kick off deleting all imported data, but should
not delete the peering object itself.

Keeping peerings marked as terminated acts as a signal that the action
took place.
2022-06-14 15:37:09 -06:00
freddygv a5283e4361 Add leader routine to clean up peerings
Once a peering is marked for deletion a new leader routine will now
clean up all imported resources and then the peering itself.

A lot of the logic was grabbed from the namespace/partitions deferred
deletions but with a handful of simplifications:
- The rate limiting is not configurable.

- Deleting imported nodes/services/checks is done by deleting nodes with
  the Txn API. The services and checks are deleted as a side-effect.

- There is no "round rate limiter" like with namespaces and partitions.
  This is because peerings are purely local, and deleting a peering in
  the datacenter does not depend on deleting data from other DCs like
  with WAN-federated namespaces. All rate limiting is handled by the
  Raft rate limiter.
2022-06-14 15:36:50 -06:00
Evan Culver 2adb9f7c8a
connect: Update Envoy support matrix to latest patch releases (#13431) 2022-06-14 13:19:09 -07:00
alex 6dbcb1d88e
peering: intentions list test (#13435) 2022-06-14 10:59:53 -07:00
freddygv dbcbf3978f Fixup stream tear-down steps.
1. Fix a bug where the peering leader routine would not track all active
   peerings in the "stored" reconciliation map. This could lead to
   tearing down streams where the token was generated, since the
   ConnectedStreams() method used for reconciliation returns all streams
   and not just the ones initiated by this leader routine.

2. Fix a race where stream contexts were being canceled before
   termination messages were being processed by a peer.

   Previously the leader routine would tear down streams by canceling
   their context right after the termination message was sent. This
   context cancelation could be propagated to the server side faster
   than the termination message. Now there is a change where the
   dialing peer uses CloseSend() to signal when no more messages will
   be sent. Eventually the server peer will read an EOF after receiving
   and processing the preceding termination message.

   Using CloseSend() is actually not enough to address the issue
   mentioned, since it doesn't wait for the server peer to finish
   processing messages. Because of this now the dialing peer also reads
   from the stream until an error signals that there are no more
   messages. Receiving an EOF from our peer indicates that they
   processed the termination message and have no additional work to do.

   Given that the stream is being closed, all the messages received by
   Recv are discarded. We only check for errors to avoid importing new
   data.
2022-06-13 12:10:42 -06:00
freddygv 6d368b5eed Update peering state and RPC for deferred deletion
When deleting a peering we do not want to delete the peering and all
imported data in a single operation, since deleting a large amount of
data at once could overload Consul.

Instead we defer deletion of peerings so that:

1. When a peering deletion request is received via gRPC the peering is
   marked for deletion by setting the DeletedAt field.

2. A leader routine will monitor for peerings that are marked for
   deletion and kick off a throttled deletion of all imported resources
   before deleting the peering itself.

This commit mostly addresses point #1 by modifying the peering service
to mark peerings for deletion. Another key change is to add a
PeeringListDeleted state store function which can return all peerings
marked for deletion. This function is what will be watched by the
deferred deletion leader routine.
2022-06-13 12:10:32 -06:00
Freddy 9eeb9e4ee3
Clean up imported nodes/services/checks as needed (#13367)
Previously, imported data would never be deleted. As
nodes/services/checks were registered and deregistered, resources
deleted from the exporting cluster would accumulate in the imported
cluster.

This commit makes updates to replication so that whenever an update is
received for a service name we reconcile what was present in the catalog
against what was received.

This handleUpdateService method can handle both updates and deletions.
2022-06-13 11:52:28 -06:00
Mark Anderson 6cb05666e6
Merge pull request #13357 from hashicorp/ma/add-build-date-oss
Add build date (oss)
2022-06-13 08:43:20 -07:00
Chris S. Kim 4cb251497f
Update RBAC to handle imported services (#13404)
When converting from Consul intentions to xds RBAC rules, services imported from other peers must encode additional data like partition (from the remote cluster) and trust domain.

This PR updates the PeeringTrustBundle to hold the sending side's local partition as ExportedPartition. It also updates RBAC code to encode SpiffeIDs of imported services with the ExportedPartition and TrustDomain.
2022-06-10 17:15:22 -04:00
R.B. Boyer 4626b65124
xds: allow for peered upstreams to use tagged addresses that are hostnames (#13422)
Mesh gateways can use hostnames in their tagged addresses (#7999). This is useful
if you were to expose a mesh gateway using a cloud networking load balancer appliance
that gives you a DNS name but no reliable static IPs.

Envoy cannot accept hostnames via EDS and those must be configured using CDS.
There was already logic when configuring gateways in other locations in the code, but
given the illusions in play for peering the downstream of a peered service wasn't aware
that it should be doing that.

Also:
- ensuring that we always try to use wan-like addresses to cross peer boundaries.
2022-06-10 16:11:40 -05:00
Kyle Havlovitz cfc90ea2d5 Add dns node lookup support in partitions 2022-06-10 11:23:51 -07:00
R.B. Boyer 33b497e7c9
peering: rename initiate to establish in the context of the APIs (#13419) 2022-06-10 11:10:46 -05:00
Mark Anderson 8fc25aefcf Change default dates
Signed-off-by: Mark Anderson <manderson@hashicorp.com>
2022-06-09 17:07:41 -07:00
Mark Anderson 41446f449a Fixup some more tests
Signed-off-by: Mark Anderson <manderson@hashicorp.com>
2022-06-09 17:04:05 -07:00
Mark Anderson 18d3cee29e Add build date to self endpoint
Signed-off-by: Mark Anderson <manderson@hashicorp.com>
2022-06-09 17:04:05 -07:00
Mark Anderson ac1f17326c Build date in config file
Signed-off-by: Mark Anderson <manderson@hashicorp.com>
2022-06-09 17:04:05 -07:00
R.B. Boyer d81d8468db
peering: mesh gateways are required for cross-peer service mesh communication (#13410)
Require use of mesh gateways in order for service mesh data plane
traffic to flow between peers.

This also adds plumbing for envoy integration tests involving peers, and
one starter peering test.
2022-06-09 11:05:18 -05:00
Alessandro De Blasis a0708045b4 lint: conversion 2022-06-09 16:17:20 +01:00
Alessandro De Blasis 72ed2c176b tests: removed redundant probe test 2022-06-09 15:49:45 +01:00
Alessandro De Blasis b53bb6f70e tests: added syscall mocking and tests for Check_OSService 2022-06-09 15:48:34 +01:00
kisunji 1eff652620 Add missing index for read 2022-06-08 13:53:31 -04:00
kisunji 86bf8b8fb6 Add IntentionMatch tests for source peers 2022-06-08 13:53:31 -04:00
kisunji 7d9f413949 Update ServiceIntentionSourceIndex to handle peer 2022-06-08 13:53:31 -04:00
Chris S. Kim 3e71754e7a
Add SourcePeer fields to relevant Intentions types (#13390) 2022-06-08 13:24:10 -04:00
R.B. Boyer c1f20d17ee
peering: allow protobuf requests to populate the default partition or namespace (#13398) 2022-06-08 11:55:18 -05:00
Dhia Ayachi 80556c9ffc
Fix intentions wildcard dest (#13397)
* when enterprise meta are wildcard assume it's a service intention

* fix partition and namespace

* move kind outside the loops

* get the kind check outside the loop and add a comment

Co-authored-by: github-team-consul-core <github-team-consul-core@hashicorp.com>
2022-06-08 10:38:55 -04:00
R.B. Boyer bf647bc9d2
peering: avoid a race between peering establishment and termination (#13389) 2022-06-07 16:29:09 -05:00
Dhia Ayachi 3deaf767f2
Egress gtw/intention rpc endpoint (#13354)
* update gateway-services table with endpoints

* fix failing test

* remove unneeded config in test

* rename "endpoint" to "destination"

* more endpoint renaming to destination in tests

* update isDestination based on service-defaults config entry creation

* use a 3 state kind to be able to set the kind to unknown (when neither a service or a destination exist)

* set unknown state to empty to avoid modifying alot of tests

* fix logic to set the kind correctly on CRUD

* fix failing tests

* add missing tests and fix service delete

* fix failing test

* Apply suggestions from code review

Co-authored-by: Dan Stough <dan.stough@hashicorp.com>

* fix a bug with kind and add relevant test

* fix compile error

* fix failing tests

* add kind to clone

* fix failing tests

* fix failing tests in catalog endpoint

* fix service dump test

* Apply suggestions from code review

Co-authored-by: Dan Stough <dan.stough@hashicorp.com>

* remove duplicate tests

* first draft of destinations intention in connect proxy

* remove ServiceDestinationList

* fix failing tests

* fix agent/consul failing tests

* change to filter intentions in the state store instead of adding a field.

* fix failing tests

* fix comment

* fix comments

* store service kind destination and add relevant tests

* changes based on review

* filter on destinations when querying source match

* change state store API to get an IntentionTarget parameter

* add intentions tests

* add destination upstream endpoint

* fix failing test

* fix failing test and a bug with wildcard intentions

* fix failing test

* Apply suggestions from code review

Co-authored-by: alex <8968914+acpana@users.noreply.github.com>

* add missing test and clarify doc

* fix style

* gofmt intention.go

* fix merge introduced issue

Co-authored-by: Dan Stough <dan.stough@hashicorp.com>
Co-authored-by: alex <8968914+acpana@users.noreply.github.com>
Co-authored-by: github-team-consul-core <github-team-consul-core@hashicorp.com>
2022-06-07 15:55:02 -04:00
Dhia Ayachi 7602b6ebf2
Egress gtw/connect destination intentions (#13341)
* update gateway-services table with endpoints

* fix failing test

* remove unneeded config in test

* rename "endpoint" to "destination"

* more endpoint renaming to destination in tests

* update isDestination based on service-defaults config entry creation

* use a 3 state kind to be able to set the kind to unknown (when neither a service or a destination exist)

* set unknown state to empty to avoid modifying alot of tests

* fix logic to set the kind correctly on CRUD

* fix failing tests

* add missing tests and fix service delete

* fix failing test

* Apply suggestions from code review

Co-authored-by: Dan Stough <dan.stough@hashicorp.com>

* fix a bug with kind and add relevant test

* fix compile error

* fix failing tests

* add kind to clone

* fix failing tests

* fix failing tests in catalog endpoint

* fix service dump test

* Apply suggestions from code review

Co-authored-by: Dan Stough <dan.stough@hashicorp.com>

* remove duplicate tests

* first draft of destinations intention in connect proxy

* remove ServiceDestinationList

* fix failing tests

* fix agent/consul failing tests

* change to filter intentions in the state store instead of adding a field.

* fix failing tests

* fix comment

* fix comments

* store service kind destination and add relevant tests

* changes based on review

* filter on destinations when querying source match

* Apply suggestions from code review

Co-authored-by: alex <8968914+acpana@users.noreply.github.com>

* fix style

* Apply suggestions from code review

Co-authored-by: Dan Stough <dan.stough@hashicorp.com>

* rename destinationType to targetType.

Co-authored-by: Dan Stough <dan.stough@hashicorp.com>
Co-authored-by: alex <8968914+acpana@users.noreply.github.com>
Co-authored-by: github-team-consul-core <github-team-consul-core@hashicorp.com>
2022-06-07 15:03:59 -04:00
Alessandro De Blasis e38abe68af feat: windows service health checks 2022-06-07 18:27:14 +01:00
R.B. Boyer 0681f3571d
peering: allow mesh gateways to proxy L4 peered traffic (#13339)
Mesh gateways will now enable tcp connections with SNI names including peering information so that those connections may be proxied.

Note: this does not change the callers to use these mesh gateways.
2022-06-06 14:20:41 -05:00
Fulvio f155ff347c
UDP check for service stanza #12221 (#12722)
* UDP check for service stanza #12221

* add pass status on timeout condition

* delete useless files

* Update check_test.go

improve comment in test

* fix test

* fix requested changes and update TestRuntimeConfig_Sanitize.golden

* add freeport to TestCheckUDPCritical

* improve comment for CheckUDP struct

* fix requested changes

* fix requested changes

* fix requested changes

* add UDP to proto

* add UDP to proto and add a changelog

* add requested test on agent_endpoint_test.go

* add test for given endpoints

* fix failing tests

* add documentation for udp healthcheck

* regenerate proto using buf

* Update website/content/api-docs/agent/check.mdx

Co-authored-by: trujillo-adam <47586768+trujillo-adam@users.noreply.github.com>

* Update website/content/api-docs/agent/check.mdx

Co-authored-by: trujillo-adam <47586768+trujillo-adam@users.noreply.github.com>

* Update website/content/docs/discovery/checks.mdx

Co-authored-by: trujillo-adam <47586768+trujillo-adam@users.noreply.github.com>

* Update website/content/docs/ecs/configuration-reference.mdx

Co-authored-by: trujillo-adam <47586768+trujillo-adam@users.noreply.github.com>

* Update website/content/docs/ecs/configuration-reference.mdx

Co-authored-by: trujillo-adam <47586768+trujillo-adam@users.noreply.github.com>

* add debug echo

* add debug circle-ci

* add debug circle-ci bash

* use echo instead of status_stage

* remove debug and status from devtools script and use echo instead

* Update website/content/api-docs/agent/check.mdx

Co-authored-by: Jared Kirschner <85913323+jkirschner-hashicorp@users.noreply.github.com>

* fix test

* replace status_stage with status

* replace functions with echo

Co-authored-by: Dhia Ayachi <dhia@hashicorp.com>
Co-authored-by: trujillo-adam <47586768+trujillo-adam@users.noreply.github.com>
Co-authored-by: Jared Kirschner <85913323+jkirschner-hashicorp@users.noreply.github.com>
2022-06-06 15:13:19 -04:00
alex ff2ad3ba0c
peering: send leader addr (#13342)
Signed-off-by: acpana <8968914+acpana@users.noreply.github.com>
2022-06-06 10:00:38 -07:00
Dan Upton 5cd31933d1
xds: remove HTTPCheckFetcher dependency (#13366)
This is the OSS portion of enterprise PR 1994

Rather than directly interrogating the agent-local state for HTTP
checks using the `HTTPCheckFetcher` interface, we now rely on the
config snapshot containing the checks.

This reduces the number of changes required to support server xDS
sessions.

It's not clear why the fetching approach was introduced in
931d167ebb2300839b218d08871f22323c60175d.
2022-06-06 15:15:33 +01:00
R.B. Boyer 4c781d1e15
peering: update how cross-peer upstreams and represented in proxycfg and rendered in xds (#13362)
This removes unnecessary, vestigal remnants of discovery chains.
2022-06-03 16:42:50 -05:00
cskh bd3a4dfeab
Add isLeader metric to track if a server is a leader (#13304)
CTIA-21: sdd is_leader metric to track if a server is a leader

Co-authored-by: alex <8968914+acpana@users.noreply.github.com>
2022-06-03 13:07:37 -04:00
Freddy 36322c49cf
Merge pull request #13340 from hashicorp/peering/public-listener 2022-06-02 15:15:29 -06:00
Chris S. Kim 8a7dfbaa7d
Fix KVSGet method to handle QueryOptions properly (#13344) 2022-06-02 12:26:18 -04:00
Freddy 137bfbb02b Update public listener with SPIFFE Validator
Envoy's SPIFFE certificate validation extension allows for us to
validate against different root certificates depending on the trust
domain of the dialing proxy.

If there are any trust bundles from peers in the config snapshot then we
use the SPIFFE validator as the validation context, rather than the
usual TrustedCA.

The injected validation config includes the local root certificates as
well.
2022-06-01 17:06:33 -06:00
freddygv ad6dbe081a Add agent cache-type for TrustBundleListByService
There are a handful of changes in this commit:
* When querying trust bundles for a service we need to be able to
  specify the namespace of the service.
* The endpoint needs to track the index because the cache watches use
  it.
* Extracted bulk of the endpoint's logic to a state store function
  so that index tracking could be tested more easily.
* Removed check for service existence, deferring that sort of work to ACL authz
* Added the cache type
2022-06-01 17:05:10 -06:00
freddygv 073c9e3a91 Update assumptions around exported-service config
Given that the exported-services config entry can use wildcards, the
precedence for wildcards is handled as with intentions. The most exact
match is the match that applies for any given service. We do not take
the union of all that apply.

Another update that was made was to reflect that only one
exported-services config entry applies to any given service in a
partition. This is a pre-existing constraint that gets enforced by
the Normalize() method on that config entry type.
2022-06-01 17:03:51 -06:00
Freddy 6ef38eaea7
Configure upstream TLS context with peer root certs (#13321)
For mTLS to work between two proxies in peered clusters with different root CAs,
proxies need to configure their outbound listener to use different root certificates
for validation.

Up until peering was introduced proxies would only ever use one set of root certificates
to validate all mesh traffic, both inbound and outbound. Now an upstream proxy
may have a leaf certificate signed by a CA that's different from the dialing proxy's.

This PR makes changes to proxycfg and xds so that the upstream TLS validation
uses different root certificates depending on which cluster is being dialed.
2022-06-01 15:53:52 -06:00
R.B. Boyer addfef5065
test: regenerate golden files (#13336)
make envoy-regen
    go test ./agent/config -update
2022-06-01 15:17:03 -05:00
Chris S. Kim 58ffa0488d
Revert getPathSuffixUnescaped (#13256) 2022-06-01 13:17:14 -04:00
Dan Upton e6dc26e087
proxycfg: replace direct agent cache usage with interfaces (#13320)
This is the OSS portion of enterprise PRs 1904, 1905, 1906, 1907, 1949,
and 1971.

It replaces the proxycfg manager's direct dependency on the agent cache
with interfaces that will be implemented differently when serving xDS
sessions from a Consul server.
2022-06-01 16:18:06 +01:00
Chris S. Kim 44a318ef73
Reimplement fs.FileInfo interface (#13315)
Co-authored-by: R.B. Boyer <4903+rboyer@users.noreply.github.com>
2022-06-01 11:09:51 -04:00
Dhia Ayachi d4a04457e1
update gateway-services table with endpoints (#13217)
* update gateway-services table with endpoints

* fix failing test

* remove unneeded config in test

* rename "endpoint" to "destination"

* more endpoint renaming to destination in tests

* update isDestination based on service-defaults config entry creation

* use a 3 state kind to be able to set the kind to unknown (when neither a service or a destination exist)

* set unknown state to empty to avoid modifying alot of tests

* fix logic to set the kind correctly on CRUD

* fix failing tests

* add missing tests and fix service delete

* fix failing test

* Apply suggestions from code review

Co-authored-by: Dan Stough <dan.stough@hashicorp.com>

* fix a bug with kind and add relevant test

* fix compile error

* fix failing tests

* add kind to clone

* fix failing tests

* fix failing tests in catalog endpoint

* fix service dump test

* Apply suggestions from code review

Co-authored-by: Dan Stough <dan.stough@hashicorp.com>

* remove duplicate tests

* rename consts and fix kind when no destination is defined in the service-defaults.

* rename Kind to ServiceKind and change switch to use .(type)

Co-authored-by: Dan Stough <dan.stough@hashicorp.com>
2022-05-31 16:20:12 -04:00
Chris S. Kim ea1e4aa52d
Update repo to use go:embed (#10996)
Replace bindata packages with stdlib go:embed.
Modernize some uiserver code with newer interfaces introduced in go 1.16 (mainly working with fs.File instead of http.File.
Remove steps that are no longer used from our build files.
Add Github Action to detect differences in agent/uiserver/dist and verify that the files are correct (by compiling UI assets and comparing contents).
2022-05-31 15:33:56 -04:00
Riddhi Shah d558914a0f
[OSS] Fix merge central config tests (#13309)
Setting the right enterprise meta to fix the merge central config tests.
Re-added the tests that were failing on the OSS to ENT merge.
2022-05-31 12:04:19 -07:00
freddygv 14bff4fba6 Use embedded SpiffeID for peered upstreams 2022-05-31 09:55:37 -06:00
freddygv 4d3e09e8f8 Remove intermediate representation of SPIFFE IDs
xDS only ever uses the string representation, so we can avoid passing
around connect.SpiffeIDService objects around.
2022-05-31 09:55:37 -06:00
freddygv 5cd5108075 Return SPIFFE ID for connect proxies in PeerMeta
Proxies dialing exporting services need to know the SPIFFE ID of
services dialed so that the upstream's SANs can be validated.

This commit attaches the SPIFFE ID to all connect proxies exported over
the peering stream so that they are available to importing clusters.

The data in the SPIFFE ID cannot be re-constructed in peer clusters
because the partition of exported services is overwritten on imports.
2022-05-31 09:55:37 -06:00
Freddy a75af9d94a
[OSS] Add grpc endpoint to fetch a specific trust bundle (#13292)
Co-authored-by: R.B. Boyer <rb@hashicorp.com>
2022-05-31 09:54:40 -06:00
Matt Keeler b9e8b5c692
Fix a flaky test (#13282)
At the end of this test we were trying to ensure that updating a service in the local state causes it to re-register the service with the config manager.

The config manager in the same method will also call RegisteredProxies to determine if any need to be removed. This portion of the test is not attempting to verify that behavior.

Because the test is only blocked waiting for the Register event before it can end and assert all the mock expectations were met, we may not see the call to RegisteredProxies. This is especially apparent when tests are run with the race detector.

As we don’t actually care if that method is executed before the end of the test we can simply transition from expecting it to be called exactly once to a 0 or 1 times assertion.
2022-05-27 13:25:08 -04:00
Dan Upton a6a6d5a8ee
Enable servers to configure arbitrary proxies from the catalog (#13244)
OSS port of enterprise PR 1822

Includes the necessary changes to the `proxycfg` and `xds` packages to enable
Consul servers to configure arbitrary proxies using catalog data.

Broadly, `proxycfg.Manager` now has public methods for registering,
deregistering, and listing registered proxies — the existing local agent
state-sync behavior has been moved into a separate component that makes use of
these methods.

When an xDS session is started for a proxy service in the catalog, a goroutine
will be spawned to watch the service in the server's state store and
re-register it with the `proxycfg.Manager` whenever it is updated (and clean
it up when the client goes away).
2022-05-27 12:38:52 +01:00
alex 2d8664d384
monitor leadership in peering service (#13257)
Signed-off-by: acpana <8968914+acpana@users.noreply.github.com>

Co-authored-by: Chris S. Kim <ckim@hashicorp.com>
Co-authored-by: Freddy <freddygv@users.noreply.github.com>
2022-05-26 17:55:16 -07:00
Riddhi Shah 8714ade534
Termporarily disable validation of merge central config response (#13266)
Temporarily disabling the validation of merge central config response since
it is breaking OSS to ENT merging.
A follow up PR will patch the fixes.
2022-05-26 13:49:40 -07:00
Chris S. Kim d73a9522cb
Add support for streaming CA roots to peers (#13260)
Sender watches for changes to CA roots and sends
them through the replication stream. Receiver saves
CA roots to tablePeeringTrustBundle
2022-05-26 15:24:09 -04:00
Riddhi Shah 6f57acc1bf
Remove tests failing on ent (#13255)
Will follow up with the fixed version of these tests that passes in ent.
2022-05-26 10:17:59 -07:00
John Cowen bf5f1482fd
Export top-level HCP Enabled go-template variable for UI (#13165)
* Update ui template data to export HCPEnabled at the top level
2022-05-26 17:23:56 +01:00
DanStough 65ca7e0bfb fix: multiple grpc/http2 services for ingress listeners 2022-05-26 10:43:58 -04:00
Riddhi Shah e5f1d8dce4
Add support for merge-central-config query param (#13001)
Adds a new query param merge-central-config for use with the below endpoints:

/catalog/service/:service
/catalog/connect/:service
/health/service/:service
/health/connect/:service

If set on the request, the response will include a fully resolved service definition which is merged with the proxy-defaults/global and service-defaults/:service config entries (on-demand style). This is useful to view the full service definition for a mesh service (connect-proxy kind or gateway kind) which might not be merged before being written into the catalog (example: in case of services in the agentless model).
2022-05-25 13:20:17 -07:00
R.B. Boyer 4f9a9bb851
remove a source of test panics (#13227) 2022-05-25 14:33:00 -05:00
R.B. Boyer dae47101fa
api: ensure peering API endpoints do not use protobufs (#13204)
I noticed that the JSON api endpoints for peerings json encodes protobufs directly, rather than converting them into their `api` package equivalents before marshal/unmarshaling them.

I updated this and used `mog` to do the annoying part in the middle. 

Other changes:
- the status enum was converted into the friendlier string form of the enum for readability with tools like `curl`
- some of the `api` library functions were slightly modified to match other similar endpoints in UX (cc: @ndhanushkodi )
- peeringRead returns `nil` if not found
- partitions are NOT inferred from the agent's partition (matching 1.11-style logic)
2022-05-25 13:43:35 -05:00
R.B. Boyer bc10055edc
peering: replicate expected SNI, SPIFFE, and service protocol to peers (#13218)
The importing peer will need to know what SNI and SPIFFE name
corresponds to each exported service. Additionally it will need to know
at a high level the protocol in use (L4/L7) to generate the appropriate
connection pool and local metrics.

For replicated connect synthetic entities we edit the `Connect{}` part
of a `NodeService` to have a new section:

    {
      "PeerMeta": {
        "SNI": [
          "web.default.default.owt.external.183150d5-1033-3672-c426-c29205a576b8.consul"
        ],
        "SpiffeID": [
          "spiffe://183150d5-1033-3672-c426-c29205a576b8.consul/ns/default/dc/dc1/svc/web"
        ],
        "Protocol": "tcp"
      }
    }

This data is then replicated and saved as-is at the importing side. Both
SNI and SpiffeID are slices for now until I can be sure we don't need
them for how mesh gateways will ultimately work.
2022-05-25 12:37:44 -05:00
R.B. Boyer 69191fc0da
peering: disable requirement for mesh gateways initially (#13213) 2022-05-25 10:13:23 -05:00
Kyle Havlovitz cebf7b23f6
Merge pull request #13143 from hashicorp/envoy-connection-limit
Add connection limit setting to service defaults
2022-05-25 07:48:50 -07:00
Kyle Havlovitz f5f949d486 Fix proto lint errors after version bump 2022-05-24 18:44:54 -07:00
Kyle Havlovitz 749591ec98 Specify go_package explicitly 2022-05-24 10:22:53 -07:00
cskh b7eec4c05b
fix: non-leader agents return 404 on Get Intention exact api (#13179)
* fix: non-leader agents return 404 on Get Intention exact api

- rpc call method appends extra error message, so change == to
  "Strings.Contains"

Co-authored-by: Chris S. Kim <ckim@hashicorp.com>
2022-05-24 13:21:15 -04:00
Kyle Havlovitz 03dea180ad Add connection limit setting to service defaults 2022-05-24 10:13:38 -07:00
DanStough 2c8ca25d8a chore(test): Update bats version 2022-05-24 11:56:08 -04:00
DanStough df59d8ab0d feat: add endpoint struct to ServiceConfigEntry 2022-05-24 11:56:08 -04:00
alex 451dc50f4f
peering: expose IsLeader, hung up on dialer if follower (#13164)
Signed-off-by: acpana <8968914+acpana@users.noreply.github.com>

Co-authored-by: R.B. Boyer <4903+rboyer@users.noreply.github.com>
2022-05-23 11:30:58 -07:00
Matt Keeler 1fd02a13c2
Migrate from protoc to buf (#12841)
* Install `buf` instead of `protoc`
* Created `buf.yaml` and `buf.gen.yaml` files in the two proto directories to control how `buf` generates/lints proto code.
* Invoke `buf` instead of `protoc`
* Added a `proto-format` make target.
* Committed the reformatted proto files.
* Added a `proto-lint` make target.
* Integrated proto linting with CI
* Fixed tons of proto linter warnings.
* Got rid of deprecated builtin protoc-gen-go grpc plugin usage. Moved to direct usage of protoc-gen-go-grpc.
* Unified all proto directories / go packages around using pb prefixes but ensuring all proto packages do not have the prefix.
2022-05-23 10:37:52 -04:00
cskh 39cb731988
Upgrade golangci-lint for go v1.18 (#13176) 2022-05-23 10:26:45 -04:00
R.B. Boyer 3b12a5179f
test: fix flaky test TestEventBufferFuzz (#13175) 2022-05-23 09:22:30 -05:00
Matt Keeler c629e89289
Fix tests broken in #13173 (#13178)
I changed the error type returned in a situation but didn’t update the tests to expect that error.
2022-05-23 10:00:06 -04:00
Matt Keeler 8a968299dd
Fix flaky tests in the agent/grpc/public/services/serverdiscovery package (#13173)
Occasionally we had seen the TestWatchServers_ACLToken_PermissionDenied be flagged as flaky in circleci. This change should fix that.

Why it fixes it is complicated. The test was failing with a panic when a mocked ACL Resolver was being called more times than expected. I struggled for a while to determine how that could be. This test should call authorize once and only once and the error returned should cause the stream to be terminated and the error returned to the gRPC client. Another oddity was no amount of running this test locally seemed to be able to reproduce the issue. I ran the test hundreds of thousands of time and it always passed.

It turns out that there is nothing wrong with the test. It just so happens that the panic from unexpected invocation of a mocked call happened during the test but was caused by a previous test (specifically the TestWatchServers_StreamLifecycle test)

The stream from the previous test remained open after all the test Cleanup functions were run and it just so happened that when the EventPublisher eventually picked up that the context was cancelled during cleanup, it force closes all subscriptions which causes some loops to be re-entered and the streams to be reauthorized. Its that looping in response to forced subscription closures that causes the mock to eventually panic. All the components, publisher, server, client all operate based on contexts. We cancel all those contexts but there is no syncrhonous way to know when they are stopped.

We could have implemented a syncrhonous stop but in the context of an actual running Consul, context cancellation + async stopping is perfectly fine. What we (Dan and I) eventually thought was that the behavior of grpc streams such as this when a server was shutting down wasn’t super helpful. What we would want is for a client to be able to distinguish between subscription closed because something may have changed requiring re-authentication and subscription closed because the server is shutting down. That way we can send back appropriate error messages to detail that the server is shutting down and not confuse users with potentially needing to resubscribe.

So thats what this PR does. We have introduced a shutting down state to our event subscriptions and the various streaming gRPC services that rely on the event publisher will all just behave correctly and actually stop the stream (not attempt transparent reauthorization) if this particular error is the one we get from the stream. Additionally the error that gets transmitted back through gRPC when this does occur indicates to the consumer that the server is going away. That is more helpful so that a client can then attempt to reconnect to another server.
2022-05-23 08:59:13 -04:00
R.B. Boyer 69d3e729a4
agent: allow for service discovery queries involving peer name to use streaming (#13168) 2022-05-20 15:27:01 -05:00
Dan Upton 30775ed54d
proxycfg: remove dependency on cache.UpdateEvent (#13144)
OSS portion of enterprise PR 1857.

This removes (most) references to the `cache.UpdateEvent` type in the
`proxycfg` package.

As we're going to be direct usage of the agent cache with interfaces that
can be satisfied by alternative server-local datasources, it doesn't make
sense to depend on this type everywhere anymore (particularly on the
`state.ch` channel).

We also plan to extract `proxycfg` out of Consul into a shared library in
the future, which would require removing this dependency.

Aside from a fairly rote find-and-replace, the main change is that the
`cache.Cache` and `health.Client` types now accept a callback function
parameter, rather than a `chan<- cache.UpdateEvents`. This allows us to
do the type conversion without running another goroutine.
2022-05-20 15:47:40 +01:00
R.B. Boyer 63a9175bd6
peering: accept replication stream of discovery chain information at the importing side (#13151) 2022-05-19 16:37:52 -05:00
R.B. Boyer 68789effeb
test: TestServer_RPC_MetricsIntercept should use a concurrency-safe metrics store (#13157) 2022-05-19 15:39:28 -05:00
cskh df27fa0c84
Retry on bad dogstatsd connection (#13091)
- Introduce a new telemetry configurable parameter retry_failed_connection. User can set the value to true to let consul agent continue its start process on failed connection to datadog server. When set to false, agent will stop on failed start. The default behavior is true.

Co-authored-by: Dan Upton <daniel@floppy.co>
Co-authored-by: Evan Culver <eculver@users.noreply.github.com>
2022-05-19 16:03:46 -04:00
R.B. Boyer 91691eca87 peering: replicate discovery chains information to importing peers
Treat each exported service as a "discovery chain" and replicate one
synthetic CheckServiceNode for each chain and remote mesh gateway.

The health will be a flattened generated check of the checks for that
mesh gateway node.
2022-05-19 14:21:44 -05:00
R.B. Boyer bf05e8c1f1 prefactor some functions out of the monolithic file 2022-05-19 14:21:29 -05:00
R.B. Boyer 09861a2792
test: fix incorrect use of t instead of r in retry test (#13146) 2022-05-19 14:00:07 -05:00
Dan Upton 7492357b43
config: prevent top-level verify_incoming enabling mTLS on gRPC port (#13118)
Fixes #13088

This is a backwards-compatibility bug introduced in 1.12.
2022-05-18 16:15:57 +01:00
Freddy 6c868b6c0e
Patches to peering initiation for POC demo (#13076)
Co-authored-by: R.B. Boyer <rb@hashicorp.com>
2022-05-13 13:01:00 -06:00
Dhia Ayachi 70b93ea693
When a host header is defined override req.Host in the metrics ui (#13071)
* When a host header is defined override the req.Host in the metrics ui endpoint.

* add changelog
2022-05-13 14:05:22 -04:00
Freddy 160acdf876
Actually block when syncing subscriptions (#13066)
By changing to use WatchCtx we will actually block for changes to the peering list. WatchCh creates a goroutine to collect errors from WatchCtx and returns immediately.

The existing behavior wouldn't result in a tight loop because of the rate limiting in the surrounding function, but it would still lead to more work than is necessary.
2022-05-12 17:36:14 -06:00
Evan Culver 535e811020
peering: add TrustBundleListByService endpoint (#13048) 2022-05-12 15:58:22 -07:00
Freddy 8894365c5a
[OSS] Add upsert handling for receiving CheckServiceNode (#13061) 2022-05-12 15:04:44 -06:00
Matt Keeler 42aec5caf4
Watch the singular service resolver instead of the list + filtering to 1 (#13012)
* Watch the singular service resolver instead of the list + filtering to 1

* Rename the ConfigEntries cache type to ConfigEntryList
2022-05-12 16:34:17 -04:00
R.B. Boyer 26cfdc1967
structs: add convenience methods to sort slices of ServiceName values (#13038) 2022-05-12 10:08:50 -05:00
R.B. Boyer b932d0dabc
test: ensure this package uses freeport for port allocation (#13036) 2022-05-11 14:20:50 -05:00
R.B. Boyer c855df87ec
remove remaining shim runStep functions (#13015)
Wraps up the refactor from #13013
2022-05-10 16:24:45 -05:00
R.B. Boyer 9ad10318cd
add general runstep test helper instead of copying it all over the place (#13013) 2022-05-10 15:25:51 -05:00
Jared Kirschner a18d7092cb
Merge pull request #12463 from hashicorp/docs/consistency-mode-improvements
Improve consistency mode docs
2022-05-09 23:04:00 -04:00
Jared Kirschner bc69d70a55 docs: clarify consistency mode operation
Changes include:
- Add diagrams of the operation of different consistency modes
- Note that only stale reads benefit from horizontal scaling
- Increase scannability with headings
- Document consistency mode defaults and how to override for
  DNS and HTTP API interfaces
- Document X-Consul-Effective-Consistency response header
2022-05-09 16:39:48 -07:00
FFMMM ea083062d2
add err msg on PeeringRead not found (#12986)
Signed-off-by: FFMMM <FFMMM@users.noreply.github.com>

Co-authored-by: Freddy <freddygv@users.noreply.github.com>
2022-05-09 15:22:42 -07:00
FFMMM 76a6647700
expose meta tags for peering (#12964) 2022-05-09 13:47:37 -07:00
Mark Anderson 785a6d9c8e Add oss test
Signed-off-by: Mark Anderson <manderson@hashicorp.com>
2022-05-09 10:07:19 -07:00
Mark Anderson be059184dd Fix up enterprise version tag.
Changes to how the version string was handled created small regression with the release of consul 1.12.0 enterprise.

Many tools use the Config:Version field reported by the agent/self resource to determine whether Consul is an enterprise or OSS instance, expect something like 1.12.0+ent for enterprise and simply 1.12.0 for OSS. This was accidentally broken during the runup to 1.12.x

This work fixes the value returned by both the self endpoint in ["Config"]["Version"] and the metrics consul.version field.

Signed-off-by: Mark Anderson <manderson@hashicorp.com>
2022-05-09 10:07:19 -07:00
Evan Culver d64726c8e9
peering: add store.PeeringsForService implementation (#12957) 2022-05-06 12:35:31 -07:00
Eric Haberkorn fa6e17bdfd
Merge pull request #12956 from hashicorp/suport-lambda-connect-proxy
Support Invoking Lambdas from Sidecar Proxies
2022-05-06 08:17:38 -04:00
Eric 999fc2dbd5 Support making requests to lambda from connect proxies. 2022-05-05 17:42:30 -04:00
FFMMM 5a6d1cad4e
api: add PeeeringList, polish (#12934) 2022-05-05 14:15:42 -07:00
Riddhi Shah a9c96b6975
Validate port on mesh service registration (#12881)
Add validation to ensure connect native services have a port or socketpath specified on catalog registration.
This was the only missing piece to ensure all mesh services are validated for a port (or socketpath) specification on catalog registration.
2022-05-05 09:13:30 -07:00
Mark Anderson 18193f2916
Support vault namespaces in connect CA (#12904)
* Support vault namespaces in connect CA

Follow on to some missed items from #12655

From an internal ticket "Support standard "Vault namespace in the
path" semantics for Connect Vault CA Provider"

Vault allows the namespace to be specified as a prefix in the path of
a PKI definition, but our usage of the Vault API includes calls that
don't support a namespaced key. In particular the sys.* family of
calls simply appends the key, instead of prefixing the namespace in
front of the path.

Unfortunately it is difficult to reliably parse a path with a
namespace; only vault knows what namespaces are present, and the '/'
separator can be inside a key name, as well as separating path
elements. This is in use in the wild; for example
'dc1/intermediate-key' is a relatively common naming schema.

Instead we add two new fields: RootPKINamespace and
IntermediatePKINamespace, which are the absolute namespace paths
'prefixed' in front of the respective PKI Paths.

Signed-off-by: Mark Anderson <manderson@hashicorp.com>
2022-05-04 19:41:55 -07:00
Chris S. Kim e55aac9d30
Default discovery chain when upstream targets a DestinationPeer (#12942) 2022-05-04 16:25:25 -04:00
Mark Anderson 1497421b65
Merge pull request #12878 from hashicorp/ma/x-forwarded-client-cert
Support x-forwarded-client-cert
2022-05-04 11:05:44 -07:00
Dan Upton 6bfdb48560
acl: gRPC login and logout endpoints (#12935)
Introduces two new public gRPC endpoints (`Login` and `Logout`) and
includes refactoring of the equivalent net/rpc endpoints to enable the
majority of logic to be reused (i.e. by extracting the `Binder` and
`TokenWriter` types).

This contains the OSS portions of the following enterprise commits:

- 75fcdbfcfa6af21d7128cb2544829ead0b1df603
- bce14b714151af74a7f0110843d640204082630a
- cc508b70fbf58eda144d9af3d71bd0f483985893
2022-05-04 17:38:45 +01:00
Mark Anderson 13f5a1f6a8 Fix tests for APPEND_FORWARD change
Signed-off-by: Mark Anderson <manderson@hashicorp.com>
2022-05-04 08:50:59 -07:00
Mark Anderson 69c129c73f Change to use APPEND_FORWARD for terminating gateway
Signed-off-by: Mark Anderson <manderson@hashicorp.com>
2022-05-04 08:50:59 -07:00
Mark Anderson db0c61303f Update mesh config tests
Signed-off-by: Mark Anderson <manderson@hashicorp.com>
2022-05-04 08:50:59 -07:00
Mark Anderson e6282c7c64 Docs and changelog edits
Signed-off-by: Mark Anderson <manderson@hashicorp.com>
2022-05-04 08:50:59 -07:00
Mark Anderson c6dbc34172 Fixup missed config entry
Signed-off-by: Mark Anderson <manderson@hashicorp.com>
2022-05-04 08:50:59 -07:00
Mark Anderson d8f4cc5537 Add x-forwarded-client-cert headers
Description
Add x-fowarded-client-cert information on trusted incoming connections.

Envoy provides support forwarding and annotating the
x-forwarded-client-cert header via the forward_client_cert_details
set_current_client_cert_details filter fields. It would be helpful for
consul to support this directly in its config. The escape hatches are
a bit cumbersome for this purpose.

This has been implemented on incoming connections to envoy. Outgoing
(from the local service through the sidecar) will not have a
certificate, and so are left alone.

A service on an incoming connection will now get headers something like this:

```
X-Forwarded-Client-Cert:[By=spiffe://efad7282-d9b2-3298-f6d8-38b37fb58df3.consul/ns/default/dc/dc1/svc/counting;Hash=61ad5cbdfcb50f5a3ec0ca60923d61613c149a9d4495010a64175c05a0268ab2;Cert="-----BEGIN%20CERTIFICATE-----%0AMIICHDCCAcOgAwIBAgIBCDAKBggqhkjOPQQDAjAxMS8wLQYDVQQDEyZwcmktMTli%0AYXdyb2YuY29uc3VsLmNhLmVmYWQ3MjgyLmNvbnN1bDAeFw0yMjA0MjkwMzE0NTBa%0AFw0yMjA1MDIwMzE0NTBaMAAwWTATBgcqhkjOPQIBBggqhkjOPQMBBwNCAARVIZ7Y%0AZEXfbOGBfxGa7Vuok1MIng%2FuzLQK2xLVlSTIPDbO5hstTGP%2B%2FGx182PYFP3jYqk5%0Aq6rYWe1wiPNMA30Io4H8MIH5MA4GA1UdDwEB%2FwQEAwIDuDAdBgNVHSUEFjAUBggr%0ABgEFBQcDAgYIKwYBBQUHAwEwDAYDVR0TAQH%2FBAIwADApBgNVHQ4EIgQgrp4q50oX%0AHHghMbxz5Bk8OJFWMdfgH0Upr350WlhyxvkwKwYDVR0jBCQwIoAgUe6uERAIj%2FLM%0AyuFzDc3Wbp9TGAKBJYAwyhF14ToOQCMwYgYDVR0RAQH%2FBFgwVoZUc3BpZmZlOi8v%0AZWZhZDcyODItZDliMi0zMjk4LWY2ZDgtMzhiMzdmYjU4ZGYzLmNvbnN1bC9ucy9k%0AZWZhdWx0L2RjL2RjMS9zdmMvZGFzaGJvYXJkMAoGCCqGSM49BAMCA0cAMEQCIDwb%0AFlchufggNTijnQ5SUcvTZrWlZyq%2FrdVC20nbbmWLAiAVshNNv1xBqJI1NmY2HI9n%0AgRMfb8aEPVSuxEHhqy57eQ%3D%3D%0A-----END%20CERTIFICATE-----%0A";Chain="-----BEGIN%20CERTIFICATE-----%0AMIICHDCCAcOgAwIBAgIBCDAKBggqhkjOPQQDAjAxMS8wLQYDVQQDEyZwcmktMTli%0AYXdyb2YuY29uc3VsLmNhLmVmYWQ3MjgyLmNvbnN1bDAeFw0yMjA0MjkwMzE0NTBa%0AFw0yMjA1MDIwMzE0NTBaMAAwWTATBgcqhkjOPQIBBggqhkjOPQMBBwNCAARVIZ7Y%0AZEXfbOGBfxGa7Vuok1MIng%2FuzLQK2xLVlSTIPDbO5hstTGP%2B%2FGx182PYFP3jYqk5%0Aq6rYWe1wiPNMA30Io4H8MIH5MA4GA1UdDwEB%2FwQEAwIDuDAdBgNVHSUEFjAUBggr%0ABgEFBQcDAgYIKwYBBQUHAwEwDAYDVR0TAQH%2FBAIwADApBgNVHQ4EIgQgrp4q50oX%0AHHghMbxz5Bk8OJFWMdfgH0Upr350WlhyxvkwKwYDVR0jBCQwIoAgUe6uERAIj%2FLM%0AyuFzDc3Wbp9TGAKBJYAwyhF14ToOQCMwYgYDVR0RAQH%2FBFgwVoZUc3BpZmZlOi8v%0AZWZhZDcyODItZDliMi0zMjk4LWY2ZDgtMzhiMzdmYjU4ZGYzLmNvbnN1bC9ucy9k%0AZWZhdWx0L2RjL2RjMS9zdmMvZGFzaGJvYXJkMAoGCCqGSM49BAMCA0cAMEQCIDwb%0AFlchufggNTijnQ5SUcvTZrWlZyq%2FrdVC20nbbmWLAiAVshNNv1xBqJI1NmY2HI9n%0AgRMfb8aEPVSuxEHhqy57eQ%3D%3D%0A-----END%20CERTIFICATE-----%0A";Subject="";URI=spiffe://efad7282-d9b2-3298-f6d8-38b37fb58df3.consul/ns/default/dc/dc1/svc/dashboard]
```

Closes #12852
2022-05-04 08:50:58 -07:00
Kyle Havlovitz 369f4848e3
Merge pull request #12885 from hashicorp/acl-err-cache
Store and return RPC error in ACL cache entries
2022-05-03 10:44:22 -07:00
Kyle Havlovitz 3bd001fb29 Return ACLRemoteError from cache and test it correctly 2022-05-03 10:05:26 -07:00
FFMMM 4cd68b4534
[sync oss] api: add peering api module (#12911) 2022-05-02 11:49:05 -07:00
Chris S. Kim 829554c706
peering: Make Upstream peer-aware (#12900)
Adds DestinationPeer field to Upstream.
Adds Peer field to UpstreamID and its string conversion functions.
2022-04-29 18:12:51 -04:00
Chris S. Kim 33bfaf5671
Cleanup peering files that used error types that were removed (#12892) 2022-04-29 14:02:26 -04:00
Mathew Estafanous 893b740dff
Unify various status errors into one HTTP error type. (#12594)
Replaces specific error types for HTTP Status codes with 
a generic HTTPError type.

Co-authored-by: Chris S. Kim <ckim@hashicorp.com>
2022-04-29 13:42:49 -04:00
Kyle Havlovitz f84ed5f70b Store and return rpc error in acl cache entries 2022-04-28 09:08:55 -07:00
R.B. Boyer 642b75b60b
health: ensure /v1/health/service/:service endpoint returns the most recent results when a filter is used with streaming (#12640)
The primary bug here is in the streaming subsystem that makes the overall v1/health/service/:service request behave incorrectly when servicing a blocking request with a filter provided.

There is a secondary non-streaming bug being fixed here that is much less obvious related to when to update the `reply` variable in a `blockingQuery` evaluation. It is unlikely that it is triggerable in practical environments and I could not actually get the bug to manifest, but I fixed it anyway while investigating the original issue.

Simple reproduction (streaming):

1. Register a service with a tag.

        curl -sL --request PUT 'http://localhost:8500/v1/agent/service/register' \
            --header 'Content-Type: application/json' \
            --data-raw '{ "ID": "ID1", "Name": "test", "Tags":[ "a" ], "EnableTagOverride": true }'

2. Do an initial filter query that matches on the tag.

        curl -sLi --get 'http://localhost:8500/v1/health/service/test' --data-urlencode 'filter=a in Service.Tags'

3. Note you get one result. Use the `X-Consul-Index` header to establish
   a blocking query in another terminal, this should not return yet.

        curl -sLi --get 'http://localhost:8500/v1/health/service/test?index=$INDEX' --data-urlencode 'filter=a in Service.Tags'

4. Re-register that service with a different tag.

        curl -sL --request PUT 'http://localhost:8500/v1/agent/service/register' \
            --header 'Content-Type: application/json' \
            --data-raw '{ "ID": "ID1", "Name": "test", "Tags":[ "b" ], "EnableTagOverride": true }'

5. Your blocking query from (3) should return with a header
   `X-Consul-Query-Backend: streaming` and empty results if it works
   correctly `[]`.

Attempts to reproduce with non-streaming failed (where you add `&near=_agent` to the read queries and ensure `X-Consul-Query-Backend: blocking-query` shows up in the results).
2022-04-27 10:39:45 -05:00
R.B. Boyer c949de9f50
structs: ensure exported-services PeerName field can be addressed as peer_name (#12862) 2022-04-27 10:27:21 -05:00
Dhia Ayachi 9dc5200155
update raft to v1.3.8 (#12844)
* update raft to v1.3.7

* add changelog

* fix compilation error

* fix HeartbeatTimeout

* fix ElectionTimeout to reload only if value is valid

* fix default values for `ElectionTimeout` and `HeartbeatTimeout`

* fix test defaults

* bump raft to v1.3.8
2022-04-25 10:19:26 -04:00
R.B. Boyer 809344a6f5
peering: initial sync (#12842)
- Add endpoints related to peering: read, list, generate token, initiate peering
- Update node/service/check table indexing to account for peers
- Foundational changes for pushing service updates to a peer
- Plumb peer name through Health.ServiceNodes path

see: ENT-1765, ENT-1280, ENT-1283, ENT-1283, ENT-1756, ENT-1739, ENT-1750, ENT-1679,
     ENT-1709, ENT-1704, ENT-1690, ENT-1689, ENT-1702, ENT-1701, ENT-1683, ENT-1663,
     ENT-1650, ENT-1678, ENT-1628, ENT-1658, ENT-1640, ENT-1637, ENT-1597, ENT-1634,
     ENT-1613, ENT-1616, ENT-1617, ENT-1591, ENT-1588, ENT-1596, ENT-1572, ENT-1555

Co-authored-by: R.B. Boyer <rb@hashicorp.com>
Co-authored-by: freddygv <freddy@hashicorp.com>
Co-authored-by: Chris S. Kim <ckim@hashicorp.com>
Co-authored-by: Evan Culver <eculver@hashicorp.com>
Co-authored-by: Nitya Dhanushkodi <nitya@hashicorp.com>
2022-04-21 17:34:40 -05:00
Will Jordan 45ffdc360e
Add timeout to Client RPC calls (#11500)
Adds a timeout (deadline) to client RPC calls, so that streams will no longer hang indefinitely in unstable network conditions.

Co-authored-by: kisunji <ckim@hashicorp.com>
2022-04-21 16:21:35 -04:00
Matt Keeler f49adfaaf0
Implement the ServerDiscovery.WatchServers gRPC endpoint (#12819)
* Implement the ServerDiscovery.WatchServers gRPC endpoint
* Fix the ConnectCA.Sign gRPC endpoints metadata forwarding.
* Unify public gRPC endpoints around the public.TraceID function for request_id logging
2022-04-21 12:56:18 -04:00
Blake Covarrubias 2beea7eb7c
acl: Clarify node/service identities must be lowercase (#12807)
Modify ACL error message for invalid node/service identities names to
clearly state only lowercase alphanumeric characters are supported.
2022-04-21 09:29:16 -07:00
R.B. Boyer bbd38e95ce
chore: upgrade mockery to v2 and regenerate (#12836) 2022-04-21 09:48:21 -05:00
R.B. Boyer cf0c5110be
ca: fix a bug that caused a non blocking leaf cert query after a blocking leaf cert query to block (#12820)
Fixes #12048

Fixes #12319

Regression introduced in #11693

Local reproduction steps:
1. `consul agent -dev`
2. `curl -sLiv 'localhost:8500/v1/agent/connect/ca/leaf/web'`
3. make note of the `X-Consul-Index` header returned 
4. `curl -sLi 'localhost:8500/v1/agent/connect/ca/leaf/web?index=<VALUE_FROM_STEP_3>'`
5. Kill the above curl when it hangs with Ctrl-C
6. Repeat (2) and it should not hang.
2022-04-20 12:21:47 -05:00
Riddhi Shah 1d49f5c84e
[OSS] gRPC call to get envoy bootstrap params (#12825)
Adds a new gRPC endpoint to get envoy bootstrap params. The new consul-dataplane service will use this
endpoint to generate an envoy bootstrap configuration.
2022-04-19 17:24:21 -07:00
Matt Keeler 3badd4c35c
Add event generation for autopilot state updates (#12626)
Whenever autopilot updates its state it notifies Consul. That notification will then trigger Consul to extract out the ready server information. If the ready servers have changed, then an event will be published to notify any subscribers of the full set of ready servers.

All these ready server event things are contained within an autopilotevents package instead of the consul package to make importing them into the grpc related packages possible
2022-04-19 13:03:03 -04:00
Evan Culver 9d0b5bf8e9
connect: Add Envoy 1.22 to integration tests, remove Envoy 1.18 (#12805)
Co-authored-by: R.B. Boyer <rb@hashicorp.com>
2022-04-18 09:36:07 -07:00
DanStough a050aa39b9 Update go version to 1.18.1 2022-04-18 11:41:10 -04:00
Kyle Havlovitz 449a0f855c
Add an internal env var for managed cluster config in the ui (#12796) 2022-04-15 09:55:52 -07:00
John Murret 1e118a5410 set vault namespaces on vault client prior to logging in with the vault auth method 2022-04-14 12:18:06 -06:00
Evan Culver e62745c82c
connect: Add Envoy 1.21.1 to support matrix, remove 1.17.4 (#12777) 2022-04-14 10:44:42 -07:00
Dan Upton 769d1d6e8e
ConnectCA.Sign gRPC Endpoint (#12787)
Introduces a gRPC endpoint for signing Connect leaf certificates. It's also
the first of the public gRPC endpoints to perform leader-forwarding, so
establishes the pattern of forwarding over the multiplexed internal RPC port.
2022-04-14 14:26:14 +01:00
Kyle Havlovitz 199f1c7200
Fix namespace default field names in expanded token output 2022-04-13 16:46:39 -07:00
Paul Glass 5eea62b47a
acl: Adjust region handling in AWS IAM auth method (#12774)
* acl: Adjust region handling in AWS IAM auth method
2022-04-13 14:31:37 -05:00
Eric Haberkorn 7961901153
Merge pull request #12773 from hashicorp/fix-lambda-intentions-and-routing
Implement Routing and Intentions for Lambdas
2022-04-13 13:01:15 -04:00
Eric b062f8c2aa Implement routing and intentions for AWS Lambdas 2022-04-13 11:45:25 -04:00
Karl Cardenas b0b197964c
Merge pull request #12562 from hashicorp/docs/blake-agent-config
docs: Agent configuration hierarchy reorganization
2022-04-12 12:33:42 -07:00
FFMMM cf7e6484aa
add more labels to RequestRecorder (#12727)
Co-authored-by: Daniel Nephin <dnephin@hashicorp.com>
Signed-off-by: FFMMM <FFMMM@users.noreply.github.com>
2022-04-12 10:50:25 -07:00
Matt Keeler 2a4ca71d3f
Move to using a shared EventPublisher (#12673)
Previously we had 1 EventPublisher per state.Store. When a state store was closed/abandoned such as during a consul snapshot restore, this had the behavior of force closing subscriptions for that topic and evicting event snapshots from the cache.

The intention of this commit is to keep all that behavior. To that end, the shared EventPublisher now supports the ability to refresh a topic. That will perform the force close + eviction. The FSM upon abandoning the previous state.Store will call RefreshTopic for all the topics with events generated by the state.Store.
2022-04-12 09:47:42 -04:00
Blake Covarrubias 3175bf6b1b Remove .html extensions from docs URLs 2022-04-11 17:38:49 -07:00
Natalie Smith cd17e98800 docs: fix yet more references to agent/options 2022-04-11 17:38:49 -07:00
R.B. Boyer f5a882f66c
fix broken test (#12741) 2022-04-11 10:56:57 -05:00
Jared Kirschner 9a81a77264
Merge pull request #12725 from hashicorp/clarify-service-deregister-after-critical-message
improve error msg for deregister critical service
2022-04-07 18:01:54 -04:00
R.B. Boyer f4eac06b21
xds: ensure that all connect timeout configs can apply equally to tproxy direct dial connections (#12711)
Just like standard upstreams the order of applicability in descending precedence:

1. caller's `service-defaults` upstream override for destination
2. caller's `service-defaults` upstream defaults
3. destination's `service-resolver` ConnectTimeout
4. system default of 5s

Co-authored-by: mrspanishviking <kcardenas@hashicorp.com>
2022-04-07 16:58:21 -05:00
Jared Kirschner 1835e761fd improve error msg for deregister critical service
If a service is automatically registered because it has a critical health check
for longer than deregister_critical_service_after, the error message will now
include:
- mention of the deregister_critical_service_after option
- the value of deregister_critical_service_after for that check
2022-04-07 14:50:02 -07:00
Kyle Havlovitz a2c28526a7
Merge pull request #12685 from hashicorp/http-check-redirect-option
Add a field to disable following redirects on http checks
2022-04-07 11:29:27 -07:00
Matt Keeler 3447880091
Enable running autopilot state updates on all servers (#12617)
* Fixes a lint warning about t.Errorf not supporting %w

* Enable running autopilot on all servers

On the non-leader servers all they do is update the state and do not attempt any modifications.

* Fix the RPC conn limiting tests

Technically they were relying on racey behavior before. Now they should be reliable.
2022-04-07 10:48:48 -04:00
FFMMM 0f68bf879a
[rpc/middleware][consul] plumb intercept off, add server level happy test (#12692) 2022-04-06 14:33:05 -07:00
FFMMM 6bdde40d5e
lower log to trace (#12708) 2022-04-06 11:37:08 -07:00
Kyle Havlovitz b3db499c74 Add a field to disable following redirects on http checks 2022-04-05 16:12:18 -07:00
Mark Anderson ed3e42296d Fixup acl.EnterpriseMeta
Signed-off-by: Mark Anderson <manderson@hashicorp.com>
2022-04-05 15:11:49 -07:00
Mark Anderson 0905c1d83d Manual Structs fixup
Change things by hand that I couldn't figure out how to automate

Signed-off-by: Mark Anderson <manderson@hashicorp.com>
2022-04-05 14:51:10 -07:00
Mark Anderson 497b300c76 add new entmeta stuff.
Signed-off-by: Mark Anderson <manderson@hashicorp.com>
2022-04-05 14:49:31 -07:00
R.B. Boyer 12523197f9
syncing changes back from enterprise (#12701) 2022-04-05 15:46:56 -05:00
Riddhi Shah 0e5d46e9c4
Merge pull request #12695 from hashicorp/feature-negotiation-grpc-api-oss
[OSS] Supported dataplane features gRPC endpoint
2022-04-05 11:26:33 -07:00
Dan Upton e3d2b91e34
ca: move ConnectCA.Sign authorization logic to CAManager (#12609)
OSS sync of enterprise changes at 8d6fd125
2022-04-05 13:16:20 -05:00
Kyle Havlovitz 9380343689
Merge pull request #12672 from hashicorp/tgate-san-validation
Respect SNI with terminating gateways and log a warning if it isn't set alongside TLS
2022-04-05 11:15:59 -07:00
Riddhi Shah f053279c4e [OSS] Supported dataplane features gRPC endpoint
Adds a new gRPC service and endpoint to return the list of supported
consul dataplane features. The Consul Dataplane will use this API to
customize its interaction with that particular server.
2022-04-05 07:38:58 -07:00
Dan Upton e48c1611ee
WatchRoots gRPC endpoint (#12678)
Adds a new gRPC streaming endpoint (WatchRoots) that dataplane clients will
use to fetch the current list of active Connect CA roots and receive new
lists whenever the roots are rotated.
2022-04-05 15:26:14 +01:00
Dhia Ayachi cdcb249449
add a rate limiter to config auto-reload (#12490)
* add config watcher to the config package

* add logging to watcher

* add test and refactor to add WatcherEvent.

* add all API calls and fix a bug with recreated files

* add tests for watcher

* remove the unnecessary use of context

* Add debug log and a test for file rename

* use inode to detect if the file is recreated/replaced and only listen to create events.

* tidy ups (#1535)

* tidy ups

* Add tests for inode reconcile

* fix linux vs windows syscall

* fix linux vs windows syscall

* fix windows compile error

* increase timeout

* use ctime ID

* remove remove/creation test as it's a use case that fail in linux

* fix linux/windows to use Ino/CreationTime

* fix the watcher to only overwrite current file id

* fix linter error

* fix remove/create test

* set reconcile loop to 200 Milliseconds

* fix watcher to not trigger event on remove, add more tests

* on a remove event try to add the file back to the watcher and trigger the handler if success

* fix race condition

* fix flaky test

* fix race conditions

* set level to info

* fix when file is removed and get an event for it after

* fix to trigger handler when we get a remove but re-add fail

* fix error message

* add tests for directory watch and fixes

* detect if a file is a symlink and return an error on Add

* rename Watcher to FileWatcher and remove symlink deref

* add fsnotify@v1.5.1

* fix go mod

* do not reset timer on errors, rename OS specific files

* rename New func

* events trigger on write and rename

* add missing test

* fix flaking tests

* fix flaky test

* check reconcile when removed

* delete invalid file

* fix test to create files with different mod time.

* back date file instead of sleeping

* add watching file in agent command.

* fix watcher call to use new API

* add configuration and stop watcher when server stop

* add certs as watched files

* move FileWatcher to the agent start instead of the command code

* stop watcher before replacing it

* save watched files in agent

* add add and remove interfaces to the file watcher

* fix remove to not return an error

* use `Add` and `Remove` to update certs files

* fix tests

* close events channel on the file watcher even when the context is done

* extract `NotAutoReloadableRuntimeConfig` is a separate struct

* fix linter errors

* add Ca configs and outgoing verify to the not auto reloadable config

* add some logs and fix to use background context

* add tests to auto-config reload

* remove stale test

* add tests to changes to config files

* add check to see if old cert files still trigger updates

* rename `NotAutoReloadableRuntimeConfig` to `StaticRuntimeConfig`

* fix to re add both key and cert file. Add test to cover this case.

* review suggestion

Co-authored-by: R.B. Boyer <4903+rboyer@users.noreply.github.com>

* add check to static runtime config changes

* fix test

* add changelog file

* fix review comments

* Apply suggestions from code review

Co-authored-by: R.B. Boyer <4903+rboyer@users.noreply.github.com>

* update flag description

Co-authored-by: FFMMM <FFMMM@users.noreply.github.com>

* fix compilation error

* add static runtime config support

* fix test

* fix review comments

* fix log test

* Update .changelog/12329.txt

Co-authored-by: Dan Upton <daniel@floppy.co>

* transfer tests to runtime_test.go

* fix filewatcher Replace to not deadlock.

* avoid having lingering locks

Co-authored-by: R.B. Boyer <4903+rboyer@users.noreply.github.com>

* split ReloadConfig func

* fix warning message

Co-authored-by: R.B. Boyer <4903+rboyer@users.noreply.github.com>

* convert `FileWatcher` into an interface

* fix compilation errors

* fix tests

* extract func for adding and removing files

* add a coalesceTimer with a very small timer

* extract coaelsce Timer and add a shim for testing

* add tests to coalesceTimer fix to send remaining events

* set `coalesceTimer` to 1 Second

* support symlink, fix a nil deref.

* fix compile error

* fix compile error

* refactor file watcher rate limiting to be a Watcher implementation

* fix linter issue

* fix runtime config

* fix runtime test

* fix flaky tests

* fix compile error

* Apply suggestions from code review

Co-authored-by: R.B. Boyer <4903+rboyer@users.noreply.github.com>

* fix agent New to return an error if File watcher New return an error

* quit timer loop if ctx is canceled

* Apply suggestions from code review

Co-authored-by: Chris S. Kim <ckim@hashicorp.com>

Co-authored-by: Ashwin Venkatesh <ashwin@hashicorp.com>
Co-authored-by: R.B. Boyer <4903+rboyer@users.noreply.github.com>
Co-authored-by: FFMMM <FFMMM@users.noreply.github.com>
Co-authored-by: Daniel Upton <daniel@floppy.co>
Co-authored-by: Chris S. Kim <ckim@hashicorp.com>
2022-04-04 11:31:39 -04:00
Eric Haberkorn 66c40e073a
Merge pull request #12681 from hashicorp/lambda-patching-tweaks
Tweak the Lambda Envoy configuration generated by the serverless patcher
2022-04-01 19:59:30 -04:00
FFMMM b0cba2ec03
mark disable_compat_1.9 to deprecate in 1.13, change default to true (#12675)
Co-authored-by: R.B. Boyer <4903+rboyer@users.noreply.github.com>
2022-04-01 10:35:56 -07:00
R.B. Boyer 3060b5cb8f
xds: errors from the xds serverless plugin are fatal (#12682) 2022-04-01 10:30:26 -05:00
Eric 66391186ce Tweak the Lambda Envoy configuration generated by the serverless patcher
- Move from `strip_matching_host_port` to `strip_any_host_port`
- Remove `auto_host_rewrite` since it conflicts with `strip_any_host_port`
2022-04-01 11:13:44 -04:00
Eric Haberkorn 9b291d362a
Merge pull request #12676 from hashicorp/implement-lambda-patching
Implement Lambda Patching in the Serverless Plugin
2022-04-01 09:58:56 -04:00
Mark Anderson aa29324a24
Avoid using sys/mounts to enable namespaces (#12655)
* Avoid doing list of /sys/mounts

From an internal ticket "Support standard "Vault namespace in the path" semantics for Connect Vault CA Provider"

Vault allows the namespace to be specified as a prefix in the path of
a PKI definition, but this doesn't currently work for
```IntermediatePKIPath``` specifications, because we attempt to list
all of the paths to check if ours is already defined. This doesn't
really work in a namespaced world.

This changes the IntermediatePKIPath code to follow the same pattern
as the root key, where we directly get the key rather than listing.

This code is difficult to write automated tests for because it relies
on features of Vault Enterprise, which isn't currently part of our
test framework, so it was tested manually.

Signed-off-by: Mark Anderson <manderson@hashicorp.com>

* add changelog

Signed-off-by: Mark Anderson <manderson@hashicorp.com>
2022-03-31 23:35:38 -07:00
Kyle Havlovitz 116b6c57cb Use the GatewayService SNI field for upstream SAN validation 2022-03-31 13:54:25 -07:00
Eric c7204528c5 Implement Lambda Patching in the Serverless Plugin 2022-03-31 16:45:32 -04:00
Kyle Havlovitz c2a48990cf
Merge pull request #12670 from hashicorp/token-read-expanded
oss: Add expanded token read flag and endpoint option
2022-03-31 12:24:11 -07:00
Kyle Havlovitz 4974d8471b Log a warning when a terminating gateway service has TLS but not SNI configured 2022-03-31 12:18:40 -07:00
Dhia Ayachi 8552efa955
auto-reload configuration when config files change (#12329)
* add config watcher to the config package

* add logging to watcher

* add test and refactor to add WatcherEvent.

* add all API calls and fix a bug with recreated files

* add tests for watcher

* remove the unnecessary use of context

* Add debug log and a test for file rename

* use inode to detect if the file is recreated/replaced and only listen to create events.

* tidy ups (#1535)

* tidy ups

* Add tests for inode reconcile

* fix linux vs windows syscall

* fix linux vs windows syscall

* fix windows compile error

* increase timeout

* use ctime ID

* remove remove/creation test as it's a use case that fail in linux

* fix linux/windows to use Ino/CreationTime

* fix the watcher to only overwrite current file id

* fix linter error

* fix remove/create test

* set reconcile loop to 200 Milliseconds

* fix watcher to not trigger event on remove, add more tests

* on a remove event try to add the file back to the watcher and trigger the handler if success

* fix race condition

* fix flaky test

* fix race conditions

* set level to info

* fix when file is removed and get an event for it after

* fix to trigger handler when we get a remove but re-add fail

* fix error message

* add tests for directory watch and fixes

* detect if a file is a symlink and return an error on Add

* rename Watcher to FileWatcher and remove symlink deref

* add fsnotify@v1.5.1

* fix go mod

* do not reset timer on errors, rename OS specific files

* rename New func

* events trigger on write and rename

* add missing test

* fix flaking tests

* fix flaky test

* check reconcile when removed

* delete invalid file

* fix test to create files with different mod time.

* back date file instead of sleeping

* add watching file in agent command.

* fix watcher call to use new API

* add configuration and stop watcher when server stop

* add certs as watched files

* move FileWatcher to the agent start instead of the command code

* stop watcher before replacing it

* save watched files in agent

* add add and remove interfaces to the file watcher

* fix remove to not return an error

* use `Add` and `Remove` to update certs files

* fix tests

* close events channel on the file watcher even when the context is done

* extract `NotAutoReloadableRuntimeConfig` is a separate struct

* fix linter errors

* add Ca configs and outgoing verify to the not auto reloadable config

* add some logs and fix to use background context

* add tests to auto-config reload

* remove stale test

* add tests to changes to config files

* add check to see if old cert files still trigger updates

* rename `NotAutoReloadableRuntimeConfig` to `StaticRuntimeConfig`

* fix to re add both key and cert file. Add test to cover this case.

* review suggestion

Co-authored-by: R.B. Boyer <4903+rboyer@users.noreply.github.com>

* add check to static runtime config changes

* fix test

* add changelog file

* fix review comments

* Apply suggestions from code review

Co-authored-by: R.B. Boyer <4903+rboyer@users.noreply.github.com>

* update flag description

Co-authored-by: FFMMM <FFMMM@users.noreply.github.com>

* fix compilation error

* add static runtime config support

* fix test

* fix review comments

* fix log test

* Update .changelog/12329.txt

Co-authored-by: Dan Upton <daniel@floppy.co>

* transfer tests to runtime_test.go

* fix filewatcher Replace to not deadlock.

* avoid having lingering locks

Co-authored-by: R.B. Boyer <4903+rboyer@users.noreply.github.com>

* split ReloadConfig func

* fix warning message

Co-authored-by: R.B. Boyer <4903+rboyer@users.noreply.github.com>

* convert `FileWatcher` into an interface

* fix compilation errors

* fix tests

* extract func for adding and removing files

Co-authored-by: Ashwin Venkatesh <ashwin@hashicorp.com>
Co-authored-by: R.B. Boyer <4903+rboyer@users.noreply.github.com>
Co-authored-by: FFMMM <FFMMM@users.noreply.github.com>
Co-authored-by: Daniel Upton <daniel@floppy.co>
2022-03-31 15:11:49 -04:00
Kyle Havlovitz 9a2474381a Add expanded token read flag and endpoint option 2022-03-31 10:49:49 -07:00
FFMMM 8b184197b3
polish rpc.service.call metric behavior (#12624) 2022-03-31 10:49:37 -07:00
Paul Glass aae6d8080d
Add IAM Auth Method (#12583)
This adds an aws-iam auth method type which supports authenticating to Consul using AWS IAM identities.

Co-authored-by: R.B. Boyer <4903+rboyer@users.noreply.github.com>
2022-03-31 10:18:48 -05:00
Eric Haberkorn 9751626828
Merge pull request #12659 from hashicorp/bump-go-control-plane
Bump Go Control Plane
2022-03-30 15:07:47 -04:00
R.B. Boyer e9230e93d8
xds: adding control of the mesh-wide min/max TLS versions and cipher suites from the mesh config entry (#12601)
- `tls.incoming`: applies to the inbound mTLS targeting the public
  listener on `connect-proxy` and `terminating-gateway` envoy instances

- `tls.outgoing`: applies to the outbound mTLS dialing upstreams from
  `connect-proxy` and `ingress-gateway` envoy instances

Fixes #11966
2022-03-30 13:43:59 -05:00
R.B. Boyer ee11dff5a5 similar bump 2022-03-30 13:28:00 -05:00
R.B. Boyer 47693e3ebf fail on error and use ptypes.MarshalAny for now instead of anypb.New 2022-03-30 13:27:49 -05:00
Eric 91a493efe9 Bump go-control-plane
* `go get cloud.google.com/go@v0.59.0`
* `go get github.com/envoyproxy/go-control-plane@v0.9.9`
* `make envoy-library`
* Bumpprotoc to 3.15.8
2022-03-30 13:11:27 -04:00
R.B. Boyer d4e80b8800
server: ensure that service-defaults meta is incorporated into the discovery chain response (#12511)
Also add a new "Default" field to the discovery chain response to clients
2022-03-30 10:04:18 -05:00
FFMMM 0fd6cdc900
introduce EmptyReadRequest for status_endpoint (#12653)
Co-authored-by: Daniel Nephin <dnephin@hashicorp.com>
2022-03-29 18:05:45 -07:00
Fulvio d1938482e7
remove DualStack field from check TCP #12629 (#12630) 2022-03-29 14:56:01 -04:00
Eric ab5b5e85f5 remove the rest of gogo 2022-03-28 17:34:41 -04:00
Eric 8fd73ede3e remove gogo from acl protobufs 2022-03-28 16:20:56 -04:00
Connor f8fc317731
Fix leaked Vault LifetimeRenewers (#12607)
* Fix leaked Vault LifetimeRenewers

When the Vault CA Provider is reconfigured we do not stop the
LifetimeRenewers which can cause them to leak until the Consul processes
recycles. On Configure execute stopWatcher if it exists and is not nil
before starting a new renewal

* Add jitter before restarting the LifetimeWatcher

If we fail to login to Vault or our token is no longer valid we can
overwhelm a Vault instance with many requests very quickly by restarting
the LifetimeWatcher. Before restarting the LifetimeWatcher provide a
backoff time of 1 second or less.

* Use a retry.Waiter instead of RandomStagger

* changelog

* gofmt'd

* Swap out bool for atomic.Unit32 in test

* Provide some extra clarification in comment and changelog
2022-03-28 09:58:16 -05:00
Eric 523e054c81 assorted changes required to remove gogo 2022-03-25 09:55:36 -04:00
FFMMM 560f8cbc89
fix bad oss sync, use gauges not counters (#12611) 2022-03-24 14:41:30 -07:00
Kyle Havlovitz d9f31345e0
Merge pull request #12596 from hashicorp/overview-endpoint
oss: Add overview UI internal endpoint
2022-03-24 14:27:54 -07:00
Mike Morris 8020fb2098
agent: convert listener config to TLS types (#12522)
* tlsutil: initial implementation of types/TLSVersion

tlsutil: add test for parsing deprecated agent TLS version strings

tlsutil: return TLSVersionInvalid with error

tlsutil: start moving tlsutil cipher suite lookups over to types/tls

tlsutil: rename tlsLookup to ParseTLSVersion, add cipherSuiteLookup

agent: attempt to use types in runtime config

agent: implement b.tlsVersion validation in config builder

agent: fix tlsVersion nil check in builder

tlsutil: update to renamed ParseTLSVersion and goTLSVersions

tlsutil: fixup TestConfigurator_CommonTLSConfigTLSMinVersion

tlsutil: disable invalid config parsing tests

tlsutil: update tests

auto_config: lookup old config strings from base.TLSMinVersion

auto_config: update endpoint tests to use TLS types

agent: update runtime_test to use TLS types

agent: update TestRuntimeCinfig_Sanitize.golden

agent: update config runtime tests to expect TLS types

* website: update Consul agent tls_min_version values

* agent: fixup TLS parsing and compilation errors

* test: fixup lint issues in agent/config_runtime_test and tlsutil/config_test

* tlsutil: add CHACHA20_POLY1305 cipher suites to goTLSCipherSuites

* test: revert autoconfig tls min version fixtures to old format

* types: add TLSVersions public function

* agent: add warning for deprecated TLS version strings

* agent: move agent config specific logic from tlsutil.ParseTLSVersion into agent config builder

* tlsutil(BREAKING): change default TLS min version to TLS 1.2

* agent: move ParseCiphers logic from tlsutil into agent config builder

* tlsutil: remove unused CipherString function

* agent: fixup import for types package

* Revert "tlsutil: remove unused CipherString function"

This reverts commit 6ca7f6f58d268e617501b7db9500113c13bae70c.

* agent: fixup config builder and runtime tests

* tlsutil: fixup one remaining ListenerConfig -> ProtocolConfig

* test: move TLS cipher suites parsing test from tlsutil into agent config builder tests

* agent: remove parseCiphers helper from auto_config_endpoint_test

* test: remove unused imports from tlsutil

* agent: remove resolved FIXME comment

* tlsutil: remove TODO and FIXME in cipher suite validation

* agent: prevent setting inherited cipher suite config when TLS 1.3 is specified

* changelog: add entry for converting agent config to TLS types

* agent: remove FIXME in runtime test, this is covered in builder tests with invalid tls9 value now

* tlsutil: remove config tests for values checked at agent config builder boundary

* tlsutil: remove tls version check from loadProtocolConfig

* tlsutil: remove tests and TODOs for logic checked in TestBuilder_tlsVersion and TestBuilder_tlsCipherSuites

* website: update search link for supported Consul agent cipher suites

* website: apply review suggestions for tls_min_version description

* website: attempt to clean up markdown list formatting for tls_min_version

* website: moar linebreaks to fix tls_min_version formatting

* Revert "website: moar linebreaks to fix tls_min_version formatting"

This reverts commit 38585927422f73ebf838a7663e566ac245f2a75c.

* autoconfig: translate old values for TLSMinVersion

* agent: rename var for translated value of deprecated TLS version value

* Update agent/config/deprecated.go

Co-authored-by: Dan Upton <daniel@floppy.co>

* agent: fix lint issue

* agent: fixup deprecated config test assertions for updated warning

Co-authored-by: Dan Upton <daniel@floppy.co>
2022-03-24 15:32:25 -04:00
Kyle Havlovitz 0d5cbf6f30 Sort by partition/ns/servicename instead of the reverse 2022-03-24 12:16:05 -07:00
FFMMM 8f98bbda75
[metrics][rpc]: add basic prefix filter test for new rpc metric (#12598)
Signed-off-by: FFMMM <FFMMM@users.noreply.github.com>
2022-03-23 13:29:12 -07:00
Kyle Havlovitz 1b654c9807 Clean up ent meta id usage in overview summary 2022-03-23 12:47:12 -07:00
Eric 98b733e41a remove gogo from pbservice 2022-03-23 12:18:01 -04:00
Mark Anderson 28c925f6d0 Fixup dropped SecretID usage
Looks like something got munged at some point. Not sure how it slipped in, but my best guess is that because TestTxn_Apply_ACLDeny is marked flaky we didn't block merge because it failed.

Signed-off-by: Mark Anderson <manderson@hashicorp.com>
2022-03-22 21:20:03 -07:00
Kyle Havlovitz 04f1d9bcc9 oss: Add overview UI internal endpoint 2022-03-22 17:05:09 -07:00
Dhia Ayachi 5cb24b9bf8
split pbcommon to pbcommon and pbcommongogo (#12587)
* mogify needed pbcommon structs

* mogify needed pbconnect structs

* fix compilation errors and make config_translate_test pass

* add missing file

* remove redundant oss func declaration

* fix EnterpriseMeta to copy the right data for enterprise

* rename pbcommon package to pbcommongogo

* regenerate proto and mog files

* add missing mog files

* add pbcommon package

* pbcommon no mog

* fix enterprise meta code generation

* fix enterprise meta code generation (pbcommongogo)

* fix mog generation for gogo

* use `protoc-go-inject-tag` to inject tags

* rename proto package

* pbcommon no mog

* use `protoc-go-inject-tag` to inject tags

* add non gogo proto to make file

* fix proto get
2022-03-22 16:30:00 -04:00
Dan Upton 2fe06f663b
streaming: emit events when Connect CA Roots change (#12590)
OSS sync of enterprise changes at 614f786d
2022-03-22 19:13:59 +00:00
FFMMM 76d8798590
factor out recording func, add unit tests (#12585)
Signed-off-by: FFMMM <FFMMM@users.noreply.github.com>
2022-03-22 09:31:54 -07:00
Dan Upton fb441e323a
Restructure gRPC server setup (#12586)
OSS sync of enterprise changes at 0b44395e
2022-03-22 12:40:24 +00:00
FFMMM 08f2838b78
pre register new rpc metric, rename metric (#12582) 2022-03-21 17:26:32 -07:00
Mark Anderson 2b367626f0
Add source of authority annotations to the PermissionDeniedError output. (#12567)
This extends the acl.AllowAuthorizer with source of authority information.

The next step is to unify the AllowAuthorizer and ACLResolveResult structures; that will be done in a separate PR.

Part of #12481

Signed-off-by: Mark Anderson <manderson@hashicorp.com>
2022-03-18 10:32:25 -07:00
Dan Upton 57f0f42733
Support per-listener TLS configuration ⚙️ (#12504)
Introduces the capability to configure TLS differently for Consul's
listeners/ports (i.e. HTTPS, gRPC, and the internal multiplexed RPC
port) which is useful in scenarios where you may want the HTTPS or
gRPC interfaces to present a certificate signed by a well-known/public
CA, rather than the certificate used for internal communication which
must have a SAN in the form `server.<dc>.consul`.
2022-03-18 10:46:58 +00:00
Evan Culver 27711fe5c7
lib: add validation package + DNS label validation (#12535)
Co-authored-by: Chris S. Kim <ckim@hashicorp.com>
2022-03-17 18:31:28 -07:00
FFMMM 3c08843847
[sync oss] add net/rpc interceptor implementation (#12573)
* sync ent changes from 866dcb0667

Signed-off-by: FFMMM <FFMMM@users.noreply.github.com>

* update oss go.mod

Signed-off-by: FFMMM <FFMMM@users.noreply.github.com>
2022-03-17 16:02:26 -07:00
Jared Kirschner 038fd90925
Merge pull request #11821 from hashicorp/error-if-get-request-has-body
http: error if GET request has non-empty body
2022-03-16 18:34:27 -04:00
Jared Kirschner 13712de2e7 http: WARN if GET request has non-empty body
Give the user a hint that they might be doing something wrong if their GET
request has a non-empty body, which can easily happen using curl's
--data-urlencode if specifying request type via "--request GET" rather than
"--get". See https://github.com/hashicorp/consul/issues/11471.
2022-03-16 14:19:50 -07:00
Eric ae1cdc85b1 Remove the stdduration gogo extension 2022-03-16 12:12:29 -04:00
mrspanishviking 1ae820ea0a
Revert "[Docs] Agent configuration hierarchy " 2022-03-15 16:13:58 -07:00
trujillo-adam 667976c94f fixing merge conflicts part 3 2022-03-15 15:25:03 -07:00
Eric Haberkorn 90ecd7b5fe
Merge pull request #12556 from hashicorp/wire-up-serverless-patcher
Create and wire up the serverless patcher
2022-03-15 14:05:40 -04:00
Eric Haberkorn d58f230bb8
Merge pull request #12557 from hashicorp/remove-healthcheck-gogo-stdduration
Remove Gogo Stdduration From the Healthcheck Protobufs
2022-03-15 13:20:49 -04:00
Eric fa13ce3e45 Remove gogo stdduration from the healthcheck protobufs 2022-03-15 10:51:40 -04:00
Eric 07bf0c5ce8 Create and wire up the serverless patcher 2022-03-15 10:12:57 -04:00
trujillo-adam 60a88bb40f merging new hierarchy for agent configuration 2022-03-14 15:44:41 -07:00
Mark Anderson ab099e5fcb
Refactor config checks oss (#12550)
Currently the config_entry.go subsystem delegates authorization decisions via the ConfigEntry interface CanRead and CanWrite code. Unfortunately this returns a true/false value and loses the details of the source.

This is not helpful, especially since it the config subsystem can be more complex to understand, since it covers so many domains.

This refactors CanRead/CanWrite to return a structured error message (PermissionDenied or the like) with more details about the reason for denial.

Part of #12241

Signed-off-by: Mark Anderson <manderson@hashicorp.com>
2022-03-11 13:45:51 -08:00
Eric Haberkorn 3297bc6988
Merge pull request #12536 from hashicorp/add-serverless-config
Add the `connect.enable_serverless_plugin` configuration option
2022-03-11 09:39:36 -05:00
Eric Haberkorn 35cb0c1312
Merge pull request #12539 from hashicorp/make-xds-lib
Make the xdscommon package
2022-03-11 09:21:10 -05:00
Eric 2d27f9dc47 Add the connect.enable_serverless_plugin configuration option. 2022-03-11 09:16:00 -05:00
Mark Anderson 5591cb1e11
Bulk acl message fixup oss (#12470)
* First pass for helper for bulk changes

Signed-off-by: Mark Anderson <manderson@hashicorp.com>

* Convert ACLRead and ACLWrite to new form

Signed-off-by: Mark Anderson <manderson@hashicorp.com>

* AgentRead and AgentWRite

Signed-off-by: Mark Anderson <manderson@hashicorp.com>

* Fix EventWrite

Signed-off-by: Mark Anderson <manderson@hashicorp.com>

* KeyRead, KeyWrite, KeyList

Signed-off-by: Mark Anderson <manderson@hashicorp.com>

* KeyRing

Signed-off-by: Mark Anderson <manderson@hashicorp.com>

* NodeRead NodeWrite

Signed-off-by: Mark Anderson <manderson@hashicorp.com>

* OperatorRead and OperatorWrite

Signed-off-by: Mark Anderson <manderson@hashicorp.com>

* PreparedQuery

Signed-off-by: Mark Anderson <manderson@hashicorp.com>

* Intention partial

Signed-off-by: Mark Anderson <manderson@hashicorp.com>

* Fix ServiceRead, Write ,etc

Signed-off-by: Mark Anderson <manderson@hashicorp.com>

* Error check ServiceRead?

Signed-off-by: Mark Anderson <manderson@hashicorp.com>

* Fix Sessionread/Write

Signed-off-by: Mark Anderson <manderson@hashicorp.com>

* Fixup snapshot ACL

Signed-off-by: Mark Anderson <manderson@hashicorp.com>

* Error fixups for txn

Signed-off-by: Mark Anderson <manderson@hashicorp.com>

* Add changelog

Signed-off-by: Mark Anderson <manderson@hashicorp.com>

* Fixup review comments

Signed-off-by: Mark Anderson <manderson@hashicorp.com>
2022-03-10 18:48:27 -08:00
Eric 3f9b84e6a4 Make an xdscommon package that will be shared between Consul and Envoy plugins 2022-03-08 14:57:23 -05:00
Eric Haberkorn b4276e971e
Merge pull request #12529 from hashicorp/add-meta-to-service-config-response
Add `Meta` to `ServiceConfigResponse`
2022-03-07 16:35:21 -05:00
Eric Haberkorn 45312886fe Code review changes 2022-03-07 14:39:33 -05:00
R.B. Boyer cc4733e60d
proxycfg: change how various proxycfg test helpers for making ConfigSnapshot copies works to be more correct and less error prone (#12531)
Prior to this PR for the envoy xDS golden tests in the agent/xds package we
were hand-creating a proxycfg.ConfigSnapshot structure in the proper format for
input to the xDS generator. Over time this intermediate structure has gotten
trickier to build correctly for the various tests.

This PR proposes to switch to using the existing mechanism for turning a
structs.NodeService and a sequence of cache.UpdateEvent copies into a
proxycfg.ConfigSnapshot, as that is less error prone to construct and aligns
more with how the data arrives.

NOTE: almost all of this is in test-related code. I tried super hard to craft
correct event inputs to get the golden files to be the same, or similar enough
after construction to feel ok that i recreated the spirit of the original test
cases.
2022-03-07 11:47:14 -06:00
Eric 3d46f9ef7c Add Meta to ServiceConfigResponse 2022-03-07 10:05:18 -05:00
R.B. Boyer b63a0f3909
reduce flakiness/raciness of errNotFound and errNotChanged blocking query tests (#12518)
Improves tests from #12362

These tests try to setup the following concurrent scenario:

1. (goroutine 1) execute read RPC with index=0
2. (goroutine 1) get response from (1) @ index=10
3. (goroutine 1) execute read RPC with index=10 and block
4. (goroutine 2) WHILE (3) is blocking, start slamming the system with stray writes that will cause the WatchSet to wakeup
5. (goroutine 2) after doing all writes, shut down the reader above
6. (goroutine 1) stops reading, double checks that it only ever woke up once (from 1)
2022-03-04 11:20:01 -06:00
R.B. Boyer 07b92a2855
server: fix spurious blocking query suppression for discovery chains (#12512)
Minor fix for behavior in #12362

IsDefault sometimes returns true even if there was a proxy-defaults or service-defaults config entry that was consulted. This PR fixes that.
2022-03-03 16:54:41 -06:00
Daniel Nephin 8f4b6af68a
Merge pull request #12298 from jorgemarey/b-persistnewrootandconfig
Avoid raft change when no config is provided on persistNewRootAndConfig
2022-03-03 11:03:50 -05:00
Daniel Nephin 2082bdc286 ca: make sure the test fails without the fix
Also change the path used for the secondary so that both primary and secondary do not overwrite each other.
2022-03-02 18:22:49 -05:00
R.B. Boyer 679cea7171
raft: upgrade to v1.3.6 (#12496)
Add additional protections on the Consul side to prevent NonVoters from bootstrapping raft.

This should un-flake TestServer_Expect_NonVoters
2022-03-02 17:00:02 -06:00
Daniel Nephin 849d86e7f5
Merge pull request #12467 from hashicorp/dnephin/ci-vault-test-safer
ca: require that tests that use Vault are named correctly
2022-03-01 12:54:02 -05:00
R.B. Boyer 033e0ed13f
test: parallelize more of TestLeader_ReapOrLeftMember_IgnoreSelf (#12468)
before:

    $ go test ./agent/consul -run TestLeader_ReapOrLeftMember_IgnoreSelf
    ok  	github.com/hashicorp/consul/agent/consul	21.147s

after:

    $ go test ./agent/consul -run TestLeader_ReapOrLeftMember_IgnoreSelf
    ok  	github.com/hashicorp/consul/agent/consul	5.402s
2022-03-01 10:30:06 -06:00
Jorge Marey aba9e724a8 Fix vault test with suggested changes 2022-03-01 10:20:00 +01:00
Jorge Marey 8b1b264b6f Add test case to verify #12298 2022-03-01 09:25:52 +01:00
Jorge Marey 2ca00df0d8 Avoid raft change when no config is provided on CAmanager
- This avoids a change to the raft store when no roots or config
are provided to persistNewRootAndConfig
2022-03-01 09:25:52 +01:00
Daniel Nephin dd565aa5e4 ca: fix a test
This test does not use Vault, so does not need ca.SkipIfVaultNotPresent
2022-02-28 16:26:18 -05:00
Daniel Nephin bb7f2f15b3 ca: require that tests that use Vault are named correctly
Previously we were using two different criteria to decide where to run a
test.  The main `go-test` job would skip Vault tests based on the
presence of the `vault` binary, but the `test-connect-ca-providers` job
would run tests based on the name.

This led to a scenario where a test may never run in CI.

To fix this problem I added a name check to the function we use to skip
the test. This should ensure that any test that requires vault is named
correctly to be run as part of the `test-connect-ca-providers` job.

At the same time I relaxed the regex we use. I verified this runs the
same tests using `go test --list Vault`.  I made this change because a
bunch of tests in `agent/connect/ca` used `Vault` in the name, without
the underscores. Instead of changing a bunch of test names, this seemed
easier.

With this approach, the worst case is that we run a few extra tests in
the `test-connect-ca-providers` job, which doesn't seem like a problem.
2022-02-28 16:13:53 -05:00
R.B. Boyer 3804677570
server: suppress spurious blocking query returns where multiple config entries are involved (#12362)
Starting from and extending the mechanism introduced in #12110 we can specially handle the 3 main special Consul RPC endpoints that react to many config entries in a single blocking query in Connect:

- `DiscoveryChain.Get`
- `ConfigEntry.ResolveServiceConfig`
- `Intentions.Match`

All of these will internally watch for many config entries, and at least one of those will likely be not found in any given query. Because these are blends of multiple reads the exact solution from #12110 isn't perfectly aligned, but we can tweak the approach slightly and regain the utility of that mechanism.

### No Config Entries Found

In this case, despite looking for many config entries none may be found at all. Unlike #12110 in this scenario we do not return an empty reply to the caller, but instead synthesize a struct from default values to return. This can be handled nearly identically to #12110 with the first 1-2 replies being non-empty payloads followed by the standard spurious wakeup suppression mechanism from #12110.

### No Change Since Last Wakeup

Once a blocking query loop on the server has completed and slept at least once, there is a further optimization we can make here to detect if any of the config entries that were present at specific versions for the prior execution of the loop are identical for the loop we just woke up for. In that scenario we can return a slightly different internal sentinel error and basically externally handle it similar to #12110.

This would mean that even if 20 discovery chain read RPC handling goroutines wakeup due to the creation of an unrelated config entry, the only ones that will terminate and reply with a blob of data are those that genuinely have new data to report.

### Extra Endpoints

Since this pattern is pretty reusable, other key config-entry-adjacent endpoints used by `agent/proxycfg` also were updated:

- `ConfigEntry.List`
- `Internal.IntentionUpstreams` (tproxy)
2022-02-25 15:46:34 -06:00
Chris S. Kim aea00f10ae
Merge pull request #12442 from danieleva/12422-keyring
Allows keyring operations on client agents
2022-02-25 16:28:56 -05:00
Evan Culver 7889071385
connect: Update supported Envoy versions to include 1.19.3 and 1.18.6 2022-02-24 16:59:33 -08:00
Evan Culver 9f4d9f3f74
connect: Upgrade Envoy 1.20 to 1.20.2 (#12443) 2022-02-24 16:19:39 -08:00
R.B. Boyer 4b0f657b31
fix flaky test panic (#12446) 2022-02-24 17:35:46 -06:00
R.B. Boyer a97d20cf63
catalog: compare node names case insensitively in more places (#12444)
Many places in consul already treated node names case insensitively.
The state store indexes already do it, but there are a few places that
did a direct byte comparison which have now been corrected.

One place of particular consideration is ensureCheckIfNodeMatches
which is executed during snapshot restore (among other places). If a
node check used a slightly different casing than the casing of the node
during register then the snapshot restore here would deterministically
fail. This has been fixed.

Primary approach:

    git grep -i "node.*[!=]=.*node" -- ':!*_test.go' ':!docs'
    git grep -i '\[[^]]*member[^]]*\]
    git grep -i '\[[^]]*\(member\|name\|node\)[^]]*\]' -- ':!*_test.go' ':!website' ':!ui' ':!agent/proxycfg/testing.go:' ':!*.md'
2022-02-24 16:54:47 -06:00
Daniele Vazzola 397b5ed957 Allows keyring operations on client agents 2022-02-24 17:24:57 +00:00
R.B. Boyer d860384731
server: partly fix config entry replication issue that prevents replication in some circumstances (#12307)
There are some cross-config-entry relationships that are enforced during
"graph validation" at persistence time that are required to be
maintained. This means that config entries may form a digraph at times.

Config entry replication procedes in a particular sorted order by kind
and name.

Occasionally there are some fixups to these digraphs that end up
replicating in the wrong order and replicating the leaves
(ingress-gateway) before the roots (service-defaults) leading to
replication halting due to a graph validation error related to things
like mismatched service protocol requirements.

This PR changes replication to give each computed change (upsert/delete)
a fair shot at being applied before deciding to terminate that round of
replication in error. In the case where we've simply tried to do the
operations in the wrong order at least ONE of the outstanding requests
will complete in the right order, leading the subsequent round to have
fewer operations to do, with a smaller likelihood of graph validation
errors.

This does not address all scenarios, but for scenarios where the edits
are being applied in the wrong order this should avoid replication
halting.

Fixes #9319

The scenario that is NOT ADDRESSED by this PR is as follows:

1. create: service-defaults: name=new-web, protocol=http
2. create: service-defaults: name=old-web, protocol=http
3. create: service-resolver: name=old-web, redirect-to=new-web
4. delete: service-resolver: name=old-web
5. update: service-defaults: name=old-web, protocol=grpc
6. update: service-defaults: name=new-web, protocol=grpc
7. create: service-resolver: name=old-web, redirect-to=new-web

If you shutdown dc2 just before (4) and turn it back on after (7)
replication is impossible as there is no single edit you can make to
make forward progress.
2022-02-23 17:27:48 -06:00
Chris S. Kim 4b528edbe6
Merge pull request #12430 from hashicorp/ci/main-assetfs-build
auto-updated agent/uiserver/bindata_assetfs.go from commit 73b6687c5
2022-02-23 18:19:30 -05:00
Daniel Nephin 3639f4b551
Merge pull request #11910 from hashicorp/dnephin/ca-provider-interface-for-ica-in-primary
ca: add support for an external trusted CA
2022-02-22 13:14:52 -05:00
R.B. Boyer 11fdc70b34
configentry: make a new package to hold shared config entry structs that aren't used for RPC or the FSM (#12384)
First two candidates are ConfigEntryKindName and DiscoveryChainConfigEntries.
2022-02-22 10:36:36 -06:00
Dhia Ayachi 378f688a6a
file watcher to be used for configuration auto-reload feature (#12301)
* add config watcher to the config package

* add logging to watcher

* add test and refactor to add WatcherEvent.

* add all API calls and fix a bug with recreated files

* add tests for watcher

* remove the unnecessary use of context

* Add debug log and a test for file rename

* use inode to detect if the file is recreated/replaced and only listen to create events.

* tidy ups (#1535)

* tidy ups

* Add tests for inode reconcile

* fix linux vs windows syscall

* fix linux vs windows syscall

* fix windows compile error

* increase timeout

* use ctime ID

* remove remove/creation test as it's a use case that fail in linux

* fix linux/windows to use Ino/CreationTime

* fix the watcher to only overwrite current file id

* fix linter error

* fix remove/create test

* set reconcile loop to 200 Milliseconds

* fix watcher to not trigger event on remove, add more tests

* on a remove event try to add the file back to the watcher and trigger the handler if success

* fix race condition

* fix flaky test

* fix race conditions

* set level to info

* fix when file is removed and get an event for it after

* fix to trigger handler when we get a remove but re-add fail

* fix error message

* add tests for directory watch and fixes

* detect if a file is a symlink and return an error on Add

* rename Watcher to FileWatcher and remove symlink deref

* add fsnotify@v1.5.1

* fix go mod

* fix flaky test

* Apply suggestions from code review

Co-authored-by: Ashwin Venkatesh <ashwin@hashicorp.com>

* fix a possible stack overflow

* do not reset timer on errors, rename OS specific files

* start the watcher when creating it

* fix data race in tests

* rename New func

* do not call handler when a remove event happen

* events trigger on write and rename

* fix watcher tests

* make handler async

* remove recursive call

* do not produce events for sub directories

* trim "/" at the end of a directory when adding

* add missing test

* fix logging

* add todo

* fix failing test

* fix flaking tests

* fix flaky test

* add logs

* fix log text

* increase timeout

* reconcile when remove

* check reconcile when removed

* fix reconcile move test

* fix logging

* delete invalid file

* Apply suggestions from code review

Co-authored-by: R.B. Boyer <4903+rboyer@users.noreply.github.com>

* fix review comments

* fix is watched to properly catch a remove

* change test timeout

* fix test and rename id

* fix test to create files with different mod time.

* fix deadlock when stopping watcher

* Apply suggestions from code review

Co-authored-by: R.B. Boyer <4903+rboyer@users.noreply.github.com>

* fix a deadlock when calling stop while emitting event is blocked

* make sure to close the event channel after the event loop is done

* add go doc

* back date file instead of sleeping

* Apply suggestions from code review

Co-authored-by: R.B. Boyer <4903+rboyer@users.noreply.github.com>

* check error

Co-authored-by: Ashwin Venkatesh <ashwin@hashicorp.com>
Co-authored-by: R.B. Boyer <4903+rboyer@users.noreply.github.com>
2022-02-21 11:36:52 -05:00
hc-github-team-consul-core ef5b6c8415 auto-updated agent/uiserver/bindata_assetfs.go from commit 73b6687c5 2022-02-21 12:27:52 +00:00
Evan Culver 067223337d
checks: populate interval and timeout when registering services (#11138) 2022-02-18 12:05:33 -08:00
Kyle Havlovitz 9c03b5dc3d
Merge pull request #12385 from hashicorp/tproxy-http-upstream-fix
xds: respect chain protocol on default discovery chain
2022-02-18 10:08:59 -08:00
Daniel Nephin cb1a80184f rpc: set response to nil when not found
Otherwise when the query times out we might incorrectly send a value for
the reply, when we should send an empty reply.

Also document errNotFound and how to handle the result in that case.
2022-02-18 12:26:06 -05:00
Daniel Nephin 79820738cc ca: test that original certs from secondary still verify
There's a chance this could flake if the secondary hasn't received the
update yet, but running this test many times doesn't show any flakes
yet.
2022-02-17 18:45:16 -05:00
Daniel Nephin ca4e60e09b Update TODOs to reference an issue with more details
And remove a no longer needed TODO
2022-02-17 18:21:30 -05:00
Daniel Nephin 0abaf29c10 ca: add test cases for rotating external trusted CA 2022-02-17 18:21:30 -05:00
Daniel Nephin aacc40012f ca: add a test for secondary with external CA 2022-02-17 18:21:30 -05:00
Daniel Nephin 471b2098bb ca: examine the full chain in newCARoot
make TestNewCARoot much more strict
compare the full result instead of only a few fields.
add a test case with 2 and 3 certificates in the pem
2022-02-17 18:21:30 -05:00
Daniel Nephin fc6c0ec139 ca: small docs improvements 2022-02-17 18:21:30 -05:00
Daniel Nephin af651eaaad ca: cleanup validateSetIntermediate 2022-02-17 18:21:30 -05:00
Daniel Nephin ef03f7be73 ca: only return the leaf cert from Sign in vault provider
The interface is documented as 'Sign will only return the leaf', and the other providers
only return the leaf. It seems like this was added during the initial implementation, so
is likely just something we missed. It doesn't break anything , but it does cause confusing cert chains
in the API response which could break something in the future.
2022-02-17 18:21:30 -05:00
Daniel Nephin 2d5254a73b
Merge pull request #12110 from hashicorp/dnephin/blocking-queries-not-found
rpc: make blocking queries for non-existent items more efficient
2022-02-17 18:09:39 -05:00
Ashwin Venkatesh 39be071264
Parse datacenter from request (#12370)
* Parse datacenter from request
- Parse the value of the datacenter from the create/delete requests for AuthMethods and BindingRules so that they can be created in and deleted from the datacenters specified in the request.
2022-02-17 16:41:27 -05:00
Kyle Havlovitz 58172c260b xds: respect chain protocol on default discovery chain 2022-02-17 11:47:20 -08:00
Florian Apolloner 895da50986
Support for connect native services in topology view. (#12098) 2022-02-16 16:51:54 -05:00
Chris S. Kim 18096fd2fb
Move IndexEntryName helpers to common files (#12365) 2022-02-16 12:56:38 -05:00
Daniel Nephin 06657e5be0 rpc: add errNotFound to all Get queries
Any query that returns a list of items is not part of this commit.
2022-02-15 18:24:34 -05:00
Daniel Nephin bdafa24c50 Make blockingQuery efficient with 'not found' results.
By using the query results as state.

Blocking queries are efficient when the query matches some results,
because the ModifyIndex of those results, returned as queryMeta.Mindex,
will never change unless the items themselves change.

Blocking queries for non-existent items are not efficient because the
queryMeta.Index can (and often does) change when other entities are
written.

This commit reduces the churn of these queries by using a different
comparison for "has changed". Instead of using the modified index, we
use the existence of the results. If the previous result was "not found"
and the new result is still "not found", we know we can ignore the
modified index and continue to block.

This is done by setting the minQueryIndex to the returned
queryMeta.Index, which prevents the query from returning before a state
change is observed.
2022-02-15 18:24:33 -05:00
Daniel Nephin 6e73df7dc2 Add a test for blocking query on non-existent entry
This test shows how blocking queries are not efficient when the query
returns no results.  The test fails with 100+ calls instead of the
expected 2.

This test is still a bit flaky because it depends on the timing of the
writes. It can sometimes return 3 calls.

A future commit should fix this and make blocking queries even more
optimal for not-found results.
2022-02-15 18:23:17 -05:00
Daniel Nephin a4e1c59cd8 rpc: improve docs for blockingQuery
Follow the Go convention of accepting a small interface that documents
the methods used by the function.

Clarify the rules for implementing a query function passed to
blockingQuery.
2022-02-15 14:20:14 -05:00
R.B. Boyer b216d52b66
server: conditionally avoid writing a config entry to raft if it was already the same (#12321)
This will both save on unnecessary raft operations as well as
unnecessarily incrementing the raft modify index of config entries
subject to no-op updates.
2022-02-14 14:39:12 -06:00
FFMMM 1f8fb17be7
Vendor in rpc mono repo for net/rpc fork, go-msgpack, msgpackrpc. (#12311)
This commit syncs ENT changes to the OSS repo.

Original commit details in ENT:

```
commit 569d25f7f4578981c3801e6e067295668210f748
Author: FFMMM <FFMMM@users.noreply.github.com>
Date:   Thu Feb 10 10:23:33 2022 -0800

    Vendor fork net rpc (#1538)

    * replace net/rpc w consul-net-rpc/net/rpc

    Signed-off-by: FFMMM <FFMMM@users.noreply.github.com>

    * replace msgpackrpc and go-msgpack with fork from mono repo

    Signed-off-by: FFMMM <FFMMM@users.noreply.github.com>

    * gofmt all files touched

    Signed-off-by: FFMMM <FFMMM@users.noreply.github.com>
```

Signed-off-by: FFMMM <FFMMM@users.noreply.github.com>
2022-02-14 09:45:45 -08:00
R.B. Boyer d54a3e6aa1
missed this test adjustment (#12331) 2022-02-14 11:39:00 -06:00
R.B. Boyer 0b80f70a39
local: fixes a data race in anti-entropy sync (#12324)
The race detector noticed this initially in `TestAgentConfigWatcherSidecarProxy` but it is not restricted to just tests.

The two main changes here were:

- ensure that before we mutate the internal `agent/local` representation of a Service (for tags or VIPs) we clone those fields
- ensure that there's no function argument joint ownership between the caller of a function and the local state when calling `AddService`, `AddCheck`, and related using `copystructure` for now.
2022-02-14 10:41:33 -06:00
Dao Thanh Tung 0519a9240e
URL-encode/decode resource names for HTTP API part 5 (#12297) 2022-02-14 10:47:06 -05:00
Mark Anderson fa95afdcf6 Refactor to make ACL errors more structured. (#12308)
* First phase of refactoring PermissionDeniedError

Add extended type PermissionDeniedByACLError that captures information
about the accessor, particular permission type and the object and name
of the thing being checked.

It may be worth folding the test and error return into a single helper
function, that can happen at a later date.

Signed-off-by: Mark Anderson <manderson@hashicorp.com>
2022-02-11 12:53:23 -08:00
Freddy f45bec7779
Merge pull request #12223 from hashicorp/proxycfg/passthrough-cleanup 2022-02-10 17:35:51 -07:00
freddygv 8eaca35df1 Account for upstream targets in another DC.
Transparent proxies typically cannot dial upstreams in remote
datacenters. However, if their upstream configures a redirect to a
remote DC then the upstream targets will be in another datacenter.

In that sort of case we should use the WAN address for the passthrough.
2022-02-10 17:01:57 -07:00
freddygv 7fba7456ec Fix race of upstreams with same passthrough ip
Due to timing, a transparent proxy could have two upstreams to dial
directly with the same address.

For example:
- The orders service can dial upstreams shipping and payment directly.
- An instance of shipping at address 10.0.0.1 is deregistered.
- Payments is scaled up and scheduled to have address 10.0.0.1.
- The orders service receives the event for the new payments instance
before seeing the deregistration for the shipping instance. At this
point two upstreams have the same passthrough address and Envoy will
reject the listener configuration.

To disambiguate this commit considers the Raft index when storing
passthrough addresses. In the example above, 10.0.0.1 would only be
associated with the newer payments service instance.
2022-02-10 17:01:57 -07:00
freddygv d5a2eb677f Ensure passthrough addresses get cleaned up
Transparent proxies can set up filter chains that allow direct
connections to upstream service instances. Services that can be dialed
directly are stored in the PassthroughUpstreams map of the proxycfg
snapshot.

Previously these addresses were not being cleaned up based on new
service health data. The list of addresses associated with an upstream
service would only ever grow.

As services scale up and down, eventually they will have instances
assigned to an IP that was previously assigned to a different service.
When IP addresses are duplicated across filter chain match rules the
listener config will be rejected by Envoy.

This commit updates the proxycfg snapshot management so that passthrough
addresses can get cleaned up when no longer associated with a given
upstream.

There is still the possibility of a race condition here where due to
timing an address is shared between multiple passthrough upstreams.
That concern is mitigated by #12195, but will be further addressed
in a follow-up.
2022-02-10 17:01:57 -07:00
Freddy bb129384b7
Prevent xDS tight loop on cfg errors (#12195) 2022-02-10 15:37:36 -07:00
Dhia Ayachi de7598f064
fix race when starting a service while the agent serviceManager is … (#12302)
* fix race when starting a service while the agent `serviceManager` is stopping

* add changelog
2022-02-10 13:30:49 -05:00
Daniel Nephin db4675bd1a
Merge pull request #12277 from hashicorp/dnephin/panic-in-service-register
catalog: initialize the refs map to prevent a nil panic
2022-02-09 19:48:22 -05:00
Daniel Nephin 6376141464 config-entry: fix a panic when registering a service or ingress gateway 2022-02-09 18:49:48 -05:00
R.B. Boyer 0cd0d505fa
xds: allow only one outstanding delta request at a time (#12236)
Fixes #11876

This enforces that multiple xDS mutations are not issued on the same ADS connection at once, so that we can 100% control the order that they are applied. The original code made assumptions about the way multiple in-flight mutations were applied on the Envoy side that was incorrect.
2022-02-08 10:36:48 -06:00
Daniel Nephin c20412ab14
Merge pull request #12265 from hashicorp/dnephin/logging-in-tests
sdk: add TestLogLevel for setting log level in tests
2022-02-07 16:11:23 -05:00
Daniel Nephin 5a0e6700c1 A test to reproduce the issue 2022-02-04 14:04:12 -05:00
Daniel Nephin 7b466a024b Make test more readable
And fix typo
2022-02-03 18:44:09 -05:00
Daniel Nephin 6721c1246d ca: relax and move private key type/bit validation for vault
This commit makes two changes to the validation.

Previously we would call this validation in GenerateRoot, which happens
both on initialization (when a follower becomes leader), and when a
configuration is updated. We only want to do this validation during
config update so the logic was moved to the UpdateConfiguration
function.

Previously we would compare the config values against the actual cert.
This caused problems when the cert was created manually in Vault (not
created by Consul).  Now we compare the new config against the previous
config. Using a already created CA cert should never error now.

Adding the key bit and types to the config should only error when
the previous values were not the defaults.
2022-02-03 17:21:20 -05:00
Daniel Nephin 3b78f81f9a ca: small cleanup of TestConnectCAConfig_Vault_TriggerRotation_Fails
Before adding more test cases
2022-02-03 17:21:20 -05:00
Daniel Nephin f6d7a0f7b2 testing: fix test failures caused by new log level
These two tests require debug logging enabled, because they look for log lines.

Also switched to testify assertions because the previous errors were not clear.
2022-02-03 17:07:39 -05:00
Daniel Nephin 1a9a656a7f sdk: add TestLogLevel for setting log level in tests
And default log level to WARN.
2022-02-03 13:42:28 -05:00
Daniel Nephin 44f9229b96 ca: add a test that uses an intermediate CA as the primary CA
This test found a bug in the secondary. We were appending the root cert
to the PEM, but that cert was already appended. This was failing
validation in Vault here:
https://github.com/hashicorp/vault/blob/sdk/v0.3.0/sdk/helper/certutil/types.go#L329

Previously this worked because self signed certs have the same
SubjectKeyID and AuthorityKeyID. So having the same self-signed cert
repeated doesn't fail that check.

However with an intermediate that is not self-signed, those values are
different, and so we fail the check. A test I added in a previous commit
should show that this continues to work with self-signed root certs as
well.
2022-02-02 13:41:35 -05:00
Daniel Nephin d00a9abca2 acl: un-embed ACLIdentity
This is safer than embedding two interface because there are a number of
places where we check the concrete type. If we check the concrete type
on the top-level interface it will fail. So instead expose the
ACLIdentity from a method.
2022-02-02 12:07:31 -05:00
Daniel Nephin 18ff00f985
Merge pull request #12167 from hashicorp/dnephin/acl-resolve-token-3
acl: rename ResolveTokenToIdentityAndAuthorizer to ResolveToken
2022-01-31 19:21:06 -05:00
Daniel Nephin ff64c13c3e
Merge pull request #12166 from hashicorp/dnephin/acl-resolve-token-2
acl: remove ResolveTokenToIdentity
2022-01-31 19:19:21 -05:00
Daniel Nephin aa4dbe2a17 acl: rename ResolveTokenToIdentityAndAuthorizer to ResolveToken
This change allows us to remove one of the last remaining duplicate
resolve token methods (Server.ResolveToken).

With this change we are down to only 2, where the second one also
handles setting the default EnterpriseMeta from the token.
2022-01-31 18:04:19 -05:00
Daniel Nephin 57eac90cae acl: remove unused methods on fakes, and add changelog
Also document the metric that was removed in a previous commit.
2022-01-31 17:53:53 -05:00
Daniel Nephin 1fb2d49826
Merge pull request #12165 from hashicorp/dnephin/acl-resolve-token
acl: remove some of the duplicate resolve token methods
2022-01-31 13:27:49 -05:00
Mathew Estafanous 1113a7533c
Change error-handling across handlers. (#12225) 2022-01-31 11:17:35 -05:00
Fulvio eff69b484b
URL-encode/decode resource names for HTTP API part 4 (#12190) 2022-01-28 15:01:47 -05:00
Dan Upton ebdda4848f
streaming: split event buffer by key (#12080) 2022-01-28 12:27:00 +00:00
freddygv 68dea758dd Add failing test
The updated test fails because passthrough upstream addresses are not
being cleaned up.
2022-01-27 18:56:47 -07:00
Daniel Nephin fa8ff28a63 ca/provider: remove ActiveRoot from Provider 2022-01-27 13:07:37 -05:00
Daniel Nephin 722e3a6ac4 ca: update MockProvider for new interface 2022-01-27 12:51:35 -05:00
Daniel Nephin 80f215675c ca: update GenerateRoot godoc 2022-01-27 12:51:35 -05:00
Daniel Nephin d56a1dfb2c
Merge pull request #11663 from hashicorp/dnephin/ca-remove-one-call-to-active-root-2
ca: remove second call to Provider.ActiveRoot
2022-01-27 12:41:05 -05:00
Daniel Nephin d3324d0d27
Merge pull request #12109 from hashicorp/dnephin/blocking-query-1
rpc: make blockingQuery easier to read
2022-01-26 18:13:55 -05:00
Daniel Nephin 6fe2311ce0 acl: Remove a call to aclAccessorID
I missed this on the first pass, we no longer need to look up this ID, because we have it
from the Authorizer.
2022-01-26 17:21:45 -05:00
Daniel Nephin 14a40fab1a
Merge pull request #11221 from hashicorp/dnephin/acl-resolver-5
acl: extract a backend type for the ACLResolverBackend
2022-01-26 16:57:03 -05:00
Dao Thanh Tung 42d6c61b62
URL-encode/decode resource names for HTTP API part 3 (#12103) 2022-01-26 13:12:42 -05:00
Daniel Nephin 74dc9925cc Apply suggestions from code review
Co-authored-by: Chris S. Kim <ckim@hashicorp.com>
2022-01-26 12:24:13 -05:00
Daniel Nephin 2c311161cc acl: extract a backend type for the ACLResolverBackend
This is a small step to isolate the functionality that is used for the
ACLResolver from the large Client and Server structs.
2022-01-26 12:24:10 -05:00
R.B. Boyer b999b3edfc
xds: fix for delta xDS reconnect bug in LDS/CDS (#12174)
When a wildcard xDS type (LDS/CDS/SRDS) reconnects from a delta xDS stream,
prior to envoy `1.19.0` it would populate the `ResourceNamesSubscribe` field
with the full list of currently subscribed items, instead of simply omitting it
to infer that it wanted everything (which is what wildcard mode means).

This upstream issue was filed in envoyproxy/envoy#16063 and fixed in
envoyproxy/envoy#16153 which went out in Envoy `1.19.0` and is fixed in later
versions (later refactored in envoyproxy/envoy#16855).

This PR conditionally forces LDS/CDS to be wildcard-only even when the
connected Envoy requests a non-wildcard subscription, but only does so on
versions prior to `1.19.0`, as we should not need to do this on later versions.

This fixes the failure case as described here: #11833 (comment)

Co-authored-by: Huan Wang <fredwanghuan@gmail.com>
2022-01-25 11:24:27 -06:00
Daniel Nephin c1da07e2ea acl: remove calls to ResolveIdentityFromToken
We already have an ACLResolveResult, so we can get the accessor ID from
it.
2022-01-22 15:05:42 -05:00
Daniel Nephin ed1cc5f255 acl: remove ResolveTokenToIdentity
By exposing the AccessorID from the primary ResolveToken method we can
remove this duplication.
2022-01-22 14:47:59 -05:00
Daniel Nephin 26f0ebd96f acl: return a resposne from ResolveToken that includes the ACLIdentity
So that we can duplicate duplicate methods.
2022-01-22 14:33:09 -05:00
Daniel Nephin 314614f073 acl: remove duplicate methods
Now that ACLResolver is embedded we don't need ResolveTokenToIdentity on
Client and Server.

Moving ResolveTokenAndDefaultMeta to ACLResolver removes the duplicate
implementation.
2022-01-22 14:12:08 -05:00
Daniel Nephin 62c09b2d0a acl: embed ACLResolver in Client and Server
In preparation for removing duplicate resolve token methods.
2022-01-22 14:07:26 -05:00
Chris S. Kim 9ef448dedd
Generate bindata_assetfs.go (#12146) 2022-01-21 16:06:44 -05:00
R.B. Boyer 05c7373a28 bulk rewrite using this script
set -euo pipefail

    unset CDPATH

    cd "$(dirname "$0")"

    for f in $(git grep '\brequire := require\.New(' | cut -d':' -f1 | sort -u); do
        echo "=== require: $f ==="
        sed -i '/require := require.New(t)/d' $f
        # require.XXX(blah) but not require.XXX(tblah) or require.XXX(rblah)
        sed -i 's/\brequire\.\([a-zA-Z0-9_]*\)(\([^tr]\)/require.\1(t,\2/g' $f
        # require.XXX(tblah) but not require.XXX(t, blah)
        sed -i 's/\brequire\.\([a-zA-Z0-9_]*\)(\(t[^,]\)/require.\1(t,\2/g' $f
        # require.XXX(rblah) but not require.XXX(r, blah)
        sed -i 's/\brequire\.\([a-zA-Z0-9_]*\)(\(r[^,]\)/require.\1(t,\2/g' $f
        gofmt -s -w $f
    done

    for f in $(git grep '\bassert := assert\.New(' | cut -d':' -f1 | sort -u); do
        echo "=== assert: $f ==="
        sed -i '/assert := assert.New(t)/d' $f
        # assert.XXX(blah) but not assert.XXX(tblah) or assert.XXX(rblah)
        sed -i 's/\bassert\.\([a-zA-Z0-9_]*\)(\([^tr]\)/assert.\1(t,\2/g' $f
        # assert.XXX(tblah) but not assert.XXX(t, blah)
        sed -i 's/\bassert\.\([a-zA-Z0-9_]*\)(\(t[^,]\)/assert.\1(t,\2/g' $f
        # assert.XXX(rblah) but not assert.XXX(r, blah)
        sed -i 's/\bassert\.\([a-zA-Z0-9_]*\)(\(r[^,]\)/assert.\1(t,\2/g' $f
        gofmt -s -w $f
    done
2022-01-20 10:46:23 -06:00
R.B. Boyer c12b0ee3d2 test: normalize require.New and assert.New syntax 2022-01-20 10:45:56 -06:00
R.B. Boyer baf886c6f3
proxycfg: introduce explicit UpstreamID in lieu of bare string (#12125)
The gist here is that now we use a value-type struct proxycfg.UpstreamID
as the map key in ConfigSnapshot maps where we used to use "upstream
id-ish" strings. These are internal only and used just for bidirectional
trips through the agent cache keyspace (like the discovery chain target
struct).

For the few places where the upstream id needs to be projected into xDS,
that's what (proxycfg.UpstreamID).EnvoyID() is for. This lets us ALWAYS
inject the partition and namespace into these things without making
stuff like the golden testdata diverge.
2022-01-20 10:12:04 -06:00
Dan Upton 088ba2edaf
[OSS] Remove remaining references to master (#11827) 2022-01-20 12:47:50 +00:00
VictorBac 145703972a
Add GRPC and GRPCUseTLS to api.HealthCheckDefinition (#12108)
* Add GRPC to HealthCheckDefinition

* add GRPC and GRPCUseTLS
2022-01-19 16:09:15 -05:00
Evan Culver ec65890f01
connect: Upgrade Envoy 1.20 to 1.20.1 (#11895) 2022-01-18 14:35:27 -05:00
Daniel Nephin 59206e38c7 rpc: cleanup exit and blocking condition logic in blockingQuery
Remove some unnecessary comments around query_blocking metric. The only
line that needs any comments in the atomic decrement.

Cleanup the block and return comments and logic. The old comment about
AbandonCh may have been relevant before, but it is expected behaviour
now.

The logic was simplified by inverting the err condition.
2022-01-17 16:59:25 -05:00
Daniel Nephin a28d1268cb rpc: extract rpcQueryTimeout method
This helps keep the logic in blockingQuery more focused. In the future we
may have a separate struct for RPC queries which may allow us to move this
off of Server.
2022-01-17 16:59:25 -05:00
Daniel Nephin 751bc2e7d3 rpc: move the index defaulting to setQueryMeta.
This safeguard should be safe to apply in general. We are already
applying it to non-blocking queries that call blockingQuery, so it
should be fine to apply it to others.
2022-01-17 16:59:25 -05:00
Daniel Nephin 95e471052b rpc: add subtests to blockingQuery test 2022-01-17 16:59:25 -05:00
Daniel Nephin 6bf8efe607 rpc: refactor blocking query
To remove the TODO, and make it more readable.

In general this reduces the scope of variables, making them easier to reason about.
It also introduces more early returns so that we can see the flow from the structure
of the function.
2022-01-17 16:58:47 -05:00
Daniel Nephin 1971a58b29
Merge pull request #11661 from hashicorp/dnephin/ca-remove-one-call-to-active-root
ca: remove one call to Provider.ActiveRoot
2022-01-13 16:48:12 -05:00
Kyle Havlovitz 2ba76486d0 Add virtual IP generation for term gateway backed services 2022-01-12 12:08:49 -08:00
Chris S. Kim 4330a6a21a
Fix race with tags (#12041) 2022-01-12 11:24:51 -05:00
Chris S. Kim 4f0a3a997c
Fix races in anti-entropy tests (#12028) 2022-01-11 14:28:51 -05:00
Mike Morris 277c41d336
ingress: allow setting TLS min version and cipher suites in ingress gateway config entries (#11576)
* xds: refactor ingress listener SDS configuration

* xds: update resolveListenerSDS call args in listeners_test

* ingress: add TLS min, max and cipher suites to GatewayTLSConfig

* xds: implement envoyTLSVersions and envoyTLSCipherSuites

* xds: merge TLS config

* xds: configure TLS parameters with ingress TLS context from leaf

* xds: nil check in resolveListenerTLSConfig validation

* xds: nil check in makeTLSParameters* functions

* changelog: add entry for TLS params on ingress config entries

* xds: remove indirection for TLS params in TLSConfig structs

* xds: return tlsContext, nil instead of ambiguous err

Co-authored-by: Chris S. Kim <ckim@hashicorp.com>

* xds: switch zero checks to types.TLSVersionUnspecified

* ingress: add validation for ingress config entry TLS params

* ingress: validate listener TLS config

* xds: add basic ingress with TLS params tests

* xds: add ingress listeners mixed TLS min version defaults precedence test

* xds: add more explicit tests for ingress listeners inheriting gateway defaults

* xds: add test for single TLS listener on gateway without TLS defaults

* xds: regen golden files for TLSVersionInvalid zero value, add TLSVersionAuto listener test

* types/tls: change TLSVersion to string

* types/tls: update TLSCipherSuite to string type

* types/tls: implement validation functions for TLSVersion and TLSCipherSuites, make some maps private

* api: add TLS params to GatewayTLSConfig, add tests

* api: add TLSMinVersion to ingress gateway config entry test JSON

* xds: switch to Envoy TLS cipher suite encoding from types package

* xds: fixup validation for TLSv1_3 min version with cipher suites

* add some kitchen sink tests and add a missing struct tag

* xds: check if mergedCfg.TLSVersion is in TLSVersionsWithConfigurableCipherSuites

* xds: update connectTLSEnabled comment

* xds: remove unsued resolveGatewayServiceTLSConfig function

 * xds: add makeCommonTLSContextFromLeafWithoutParams

* types/tls: add LessThan comparator function for concrete values

* types/tls: change tlsVersions validation map from string to TLSVersion keys

* types/tls: remove unused envoyTLSCipherSuites

* types/tls: enable chacha20 cipher suites for Consul agent

* types/tls: remove insecure cipher suites from allowed config

TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA256 and TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256 are both explicitly listed as insecure and disabled in the Go source.

Refs https://cs.opensource.google/go/go/+/refs/tags/go1.17.3:src/crypto/tls/cipher_suites.go;l=329-330

* types/tls: add ValidateConsulAgentCipherSuites function, make direct lookup map private

* types/tls: return all unmatched cipher suites in validation errors

* xds: check that Envoy API value matching TLS version is found when building TlsParameters

* types/tls: check that value is found in map before appending to slice in MarshalEnvoyTLSCipherSuiteStrings

* types/tls: cast to string rather than fmt.Printf in TLSCihperSuite.String()

* xds: add TLSVersionUnspecified to list of configurable cipher suites

* structs: update note about config entry warning

* xds: remove TLS min version cipher suite unconfigurable test placeholder

* types/tls: update tests to remove assumption about private map values

Co-authored-by: R.B. Boyer <rb@hashicorp.com>
2022-01-11 11:46:42 -05:00
Dao Thanh Tung 217e2dc656
URL-encode/decode resource names for HTTP API part 2 (#11957) 2022-01-11 08:52:45 -05:00
Daniel Nephin 262898e561 ca: remove unnecessary var, and slightly reduce cyclo complexity
`newIntermediate` is always equal to `needsNewIntermediate`, so we can
remove the extra variable and use the original directly.

Also remove the `activeRoot.ID != newActiveRoot.ID` case from an if,
because that case is already checked above, and `needsNewIntermediate` will
already be true in that case.

This condition now reads a lot better:

> Persist a new root if we did not have one before, or if generated a new intermediate.
2022-01-06 16:56:49 -05:00
Daniel Nephin d406f78c5c ca: remove unused provider.ActiveRoot call
In the previous commit the single use of this storedRoot was removed.

In this commit the original objective is completed. The
Provider.ActiveRoot is being removed because

1. the secondary should get the active root from the Consul primary DC,
   not the provider, so that secondary DCs do not need to communicate
   with a provider instance in a different DC.
2. so that the Provider.ActiveRoot interface can be changed without
   impacting other code paths.
2022-01-06 16:56:48 -05:00
Daniel Nephin 4d15e8a9ec ca: extract the lookup of the active primary CA
This method had only one caller, which always looked for the active
root. This commit moves the lookup into the method to reduce the logic
in the one caller.

This is being done in preparation for a larger change. Keeping this
separate so it is easier to see.

The `storedRootID != primaryRoots.ActiveRootID` is being removed because
these can never be different.

The `storedRootID` comes from `provider.ActiveRoot`, the
`primaryRoots.ActiveRootID` comes from the store `CARoot` from the
primary. In both cases the source of the data is the primary DC.

Technically they could be different if someone modified the provider
outside of Consul, but that would break many things, so is not a
supported flow.

If these were out of sync because of ordering of events then the
secondary will soon receive an update to `primaryRoots` and everything
will be sorted out again.
2022-01-06 16:56:48 -05:00
Daniel Nephin 37b09df427 ca: update godoc
To clarify what to expect from the data stored in this field, and the
behaviour of this function.
2022-01-06 16:56:48 -05:00
Daniel Nephin 1f670c22f5 ca: remove one call to provider.ActiveRoot
ActiveRoot should not be called from the secondary DC, because there
should not be a requirement to run the same Vault instance in a
secondary DC. SignIntermediate is called in a secondary DC, so it should
not call ActiveRoot

We would also like to change the interface of ActiveRoot so that we can
support using an intermediate cert as the primary CA in Consul. In
preparation for making that change I am reducing the number of calls to
ActiveRoot, so that there are fewer code paths to modify when the
interface changes.

This change required a change to the mockCAServerDelegate we use in
tests. It was returning the RootCert for SignIntermediate, but that is
not an accurate fake of production. In production this would also be a
separate cert.
2022-01-06 16:55:50 -05:00
Daniel Nephin 1f66120c20 ca: remove redundant append of an intermediate cert
Immediately above this line we are already appending the full list of
intermediates. The `provider.ActiveIntermediate` MUST be in this list of
intermediates because it must be available to all the other non-leader
Servers.  If it was not in this list of intermediates then any proxy
that received data from a non-leader would have the wrong certs.

This is being removed now because we are planning on changing the
`Provider.ActiveIntermediate` interface, and removing these extra calls ahead of
time helps make that change easier.
2022-01-06 16:55:50 -05:00
Daniel Nephin b66d259c1a ca: only generate a single private key for the whole test case
Using tracing and cpu profiling I found that the majority of the time in
these test cases is spent generating a private key. We really don't need
separate private keys, so we can generate only one and use it for all
cases.

With this change the test runs much faster.
2022-01-06 16:55:50 -05:00
Daniel Nephin 92a054cfa6 ca: cleanup a test
Fix the name to match the function it is testing

Remove unused code

Fix the signature, instead of returning (error, string) which should be (string, error)
accept a testing.T to emit errors.

Handle the error from encode.
2022-01-06 16:55:49 -05:00
Daniel Nephin 9ec7e07db4 ca: use the new leaf signing lookup func in leader metrics 2022-01-06 16:55:49 -05:00
Blake Covarrubias b13fb553ac
api: Return 404 when deregistering a non-existent check (#11950)
Update the `/agent/check/deregister/` API endpoint to return a 404
HTTP response code when an attempt is made to de-register a check ID
that does not exist on the agent.

This brings the behavior of /agent/check/deregister/ in line with the
behavior of /agent/service/deregister/ which was changed in #10632 to
similarly return a 404 when de-registering non-existent services.

Fixes #5821
2022-01-06 12:38:37 -08:00
Dhia Ayachi 7e0b8354a5
clone the service under lock to avoid a data race (#11940)
* clone the service under lock to avoid a data race

* add change log

* create a struct and copy the pointer to mutate it to avoid a data race

* fix failing test

* revert added space

* add comments, to clarify the data race.
2022-01-06 14:33:06 -05:00
Daniel Nephin d05264041e
Merge pull request #11918 from hashicorp/dnephin/tob-followup
Fix a few small bugs
2022-01-05 18:50:48 -05:00
Daniel Nephin 4983c27703 snapshot: return the error from replyFn
The only function passed to SnapshotRPC today always returns a nil error, so there's no
way to exercise this bug in practice. This change is being made for correctness so that
it doesn't become a problem in the future, if we ever pass a different function to
SnapshotRPC.
2022-01-05 17:51:03 -05:00
Daniel Nephin affe97e22d config: correctly capture all errors.
Some calls to multierror.Append were not using the existing b.err, which meant we
were losing all previous errors.
2022-01-05 17:51:03 -05:00
Chris S. Kim f7f5aca058
Fix test for ENT (#11946) 2022-01-05 15:18:08 -05:00
Chris S. Kim 407b0b8963
Fix test for ENT (#11941) 2022-01-05 12:24:44 -05:00
Dhia Ayachi 5f6bf369af
reset coalesceTimer to nil as soon as the event is consumed (#11924)
* reset `coalesceTimer` to nil as soon as the event is consumed

* add change log

* refactor to add relevant test.

* fix linter

* Apply suggestions from code review

Co-authored-by: Freddy <freddygv@users.noreply.github.com>

* remove non needed check

Co-authored-by: Freddy <freddygv@users.noreply.github.com>
2022-01-05 12:17:47 -05:00
Mathew Estafanous dc18933cc2
Ensure consistency with error-handling across all handlers. (#11599) 2022-01-05 12:11:03 -05:00
Jared Kirschner a9371f18e5 Clarify service and check error messages (use ID)
Error messages related to service and check operations previously included
the following substrings:
- service %q
- check %q

From this error message, it isn't clear that the expected field is the ID for
the entity, not the name. For example, if the user has a service named test,
the error message would read 'Unknown service "test"'. This is misleading -
a service with that *name* does exist, but not with that *ID*.

The substrings above have been modified to make it clear that ID is needed,
not name:
- service with ID %q
- check with ID %q
2022-01-04 11:42:37 -08:00
Jared Kirschner fc076c02c7
Merge pull request #11335 from littlestar642/url-encoded-args
URL-encode/decode resource names for HTTP API
2022-01-04 14:00:14 -05:00
Chris S. Kim d87fe70a82
testing: Revert assertion for virtual IP flag (#11932) 2022-01-04 11:24:56 -05:00
Jared Kirschner d26f8e4529
Merge pull request #11820 from hashicorp/improve-ui-disabled-api-response
http: improve UI not enabled response message
2022-01-03 12:00:01 -05:00
littlestar642 7d1f2157eb add path escape and unescape to path params 2022-01-03 08:18:32 -08:00
Daniel Nephin 48d123e241
Merge pull request #11796 from hashicorp/dnephin/cleanup-test-server
testing: stop using an old version in testServer
2021-12-22 16:04:04 -05:00
freddygv d7975586d6 Purge chain if it shouldn't be there 2021-12-13 18:56:44 -07:00
freddygv be85ae11ca additional test fixes 2021-12-13 18:56:44 -07:00
freddygv e1d4797561 Account for new upstreams constraint in tests 2021-12-13 18:56:28 -07:00
freddygv 16d3efc4b5 Check ingress upstreams when gating chain watches 2021-12-13 18:56:28 -07:00
freddygv f4ddb5432c Use ptr receiver in all Upstream methods 2021-12-13 18:56:14 -07:00
freddygv d647141a7d Avoid storing chain without an upstream 2021-12-13 18:56:14 -07:00
freddygv 9e0958f1d2 Clean up chains separately from their watches 2021-12-13 18:56:14 -07:00
freddygv b704d4e2dd Validate chains are associated with upstreams
Previously we could get into a state where discovery chain entries were
not cleaned up after the associated watch was cancelled. These changes
add handling for that case where stray chain references are encountered.
2021-12-13 18:56:13 -07:00
freddygv ea26a7b7cf Store intention upstreams in snapshot 2021-12-13 18:56:13 -07:00
R.B. Boyer 72a81cfc4a
proxycfg: ensure all of the watches are canceled if they are cancelable (#11824) 2021-12-13 15:56:17 -06:00