Commit Graph

4608 Commits

Author SHA1 Message Date
freddygv 14bff4fba6 Use embedded SpiffeID for peered upstreams 2022-05-31 09:55:37 -06:00
freddygv 4d3e09e8f8 Remove intermediate representation of SPIFFE IDs
xDS only ever uses the string representation, so we can avoid passing
around connect.SpiffeIDService objects around.
2022-05-31 09:55:37 -06:00
freddygv 5cd5108075 Return SPIFFE ID for connect proxies in PeerMeta
Proxies dialing exporting services need to know the SPIFFE ID of
services dialed so that the upstream's SANs can be validated.

This commit attaches the SPIFFE ID to all connect proxies exported over
the peering stream so that they are available to importing clusters.

The data in the SPIFFE ID cannot be re-constructed in peer clusters
because the partition of exported services is overwritten on imports.
2022-05-31 09:55:37 -06:00
Freddy a75af9d94a
[OSS] Add grpc endpoint to fetch a specific trust bundle (#13292)
Co-authored-by: R.B. Boyer <rb@hashicorp.com>
2022-05-31 09:54:40 -06:00
Matt Keeler b9e8b5c692
Fix a flaky test (#13282)
At the end of this test we were trying to ensure that updating a service in the local state causes it to re-register the service with the config manager.

The config manager in the same method will also call RegisteredProxies to determine if any need to be removed. This portion of the test is not attempting to verify that behavior.

Because the test is only blocked waiting for the Register event before it can end and assert all the mock expectations were met, we may not see the call to RegisteredProxies. This is especially apparent when tests are run with the race detector.

As we don’t actually care if that method is executed before the end of the test we can simply transition from expecting it to be called exactly once to a 0 or 1 times assertion.
2022-05-27 13:25:08 -04:00
Dan Upton a6a6d5a8ee
Enable servers to configure arbitrary proxies from the catalog (#13244)
OSS port of enterprise PR 1822

Includes the necessary changes to the `proxycfg` and `xds` packages to enable
Consul servers to configure arbitrary proxies using catalog data.

Broadly, `proxycfg.Manager` now has public methods for registering,
deregistering, and listing registered proxies — the existing local agent
state-sync behavior has been moved into a separate component that makes use of
these methods.

When an xDS session is started for a proxy service in the catalog, a goroutine
will be spawned to watch the service in the server's state store and
re-register it with the `proxycfg.Manager` whenever it is updated (and clean
it up when the client goes away).
2022-05-27 12:38:52 +01:00
alex 2d8664d384
monitor leadership in peering service (#13257)
Signed-off-by: acpana <8968914+acpana@users.noreply.github.com>

Co-authored-by: Chris S. Kim <ckim@hashicorp.com>
Co-authored-by: Freddy <freddygv@users.noreply.github.com>
2022-05-26 17:55:16 -07:00
Riddhi Shah 8714ade534
Termporarily disable validation of merge central config response (#13266)
Temporarily disabling the validation of merge central config response since
it is breaking OSS to ENT merging.
A follow up PR will patch the fixes.
2022-05-26 13:49:40 -07:00
Chris S. Kim d73a9522cb
Add support for streaming CA roots to peers (#13260)
Sender watches for changes to CA roots and sends
them through the replication stream. Receiver saves
CA roots to tablePeeringTrustBundle
2022-05-26 15:24:09 -04:00
Riddhi Shah 6f57acc1bf
Remove tests failing on ent (#13255)
Will follow up with the fixed version of these tests that passes in ent.
2022-05-26 10:17:59 -07:00
John Cowen bf5f1482fd
Export top-level HCP Enabled go-template variable for UI (#13165)
* Update ui template data to export HCPEnabled at the top level
2022-05-26 17:23:56 +01:00
DanStough 65ca7e0bfb fix: multiple grpc/http2 services for ingress listeners 2022-05-26 10:43:58 -04:00
Riddhi Shah e5f1d8dce4
Add support for merge-central-config query param (#13001)
Adds a new query param merge-central-config for use with the below endpoints:

/catalog/service/:service
/catalog/connect/:service
/health/service/:service
/health/connect/:service

If set on the request, the response will include a fully resolved service definition which is merged with the proxy-defaults/global and service-defaults/:service config entries (on-demand style). This is useful to view the full service definition for a mesh service (connect-proxy kind or gateway kind) which might not be merged before being written into the catalog (example: in case of services in the agentless model).
2022-05-25 13:20:17 -07:00
R.B. Boyer 4f9a9bb851
remove a source of test panics (#13227) 2022-05-25 14:33:00 -05:00
R.B. Boyer dae47101fa
api: ensure peering API endpoints do not use protobufs (#13204)
I noticed that the JSON api endpoints for peerings json encodes protobufs directly, rather than converting them into their `api` package equivalents before marshal/unmarshaling them.

I updated this and used `mog` to do the annoying part in the middle. 

Other changes:
- the status enum was converted into the friendlier string form of the enum for readability with tools like `curl`
- some of the `api` library functions were slightly modified to match other similar endpoints in UX (cc: @ndhanushkodi )
- peeringRead returns `nil` if not found
- partitions are NOT inferred from the agent's partition (matching 1.11-style logic)
2022-05-25 13:43:35 -05:00
R.B. Boyer bc10055edc
peering: replicate expected SNI, SPIFFE, and service protocol to peers (#13218)
The importing peer will need to know what SNI and SPIFFE name
corresponds to each exported service. Additionally it will need to know
at a high level the protocol in use (L4/L7) to generate the appropriate
connection pool and local metrics.

For replicated connect synthetic entities we edit the `Connect{}` part
of a `NodeService` to have a new section:

    {
      "PeerMeta": {
        "SNI": [
          "web.default.default.owt.external.183150d5-1033-3672-c426-c29205a576b8.consul"
        ],
        "SpiffeID": [
          "spiffe://183150d5-1033-3672-c426-c29205a576b8.consul/ns/default/dc/dc1/svc/web"
        ],
        "Protocol": "tcp"
      }
    }

This data is then replicated and saved as-is at the importing side. Both
SNI and SpiffeID are slices for now until I can be sure we don't need
them for how mesh gateways will ultimately work.
2022-05-25 12:37:44 -05:00
R.B. Boyer 69191fc0da
peering: disable requirement for mesh gateways initially (#13213) 2022-05-25 10:13:23 -05:00
Kyle Havlovitz cebf7b23f6
Merge pull request #13143 from hashicorp/envoy-connection-limit
Add connection limit setting to service defaults
2022-05-25 07:48:50 -07:00
Kyle Havlovitz f5f949d486 Fix proto lint errors after version bump 2022-05-24 18:44:54 -07:00
Kyle Havlovitz 749591ec98 Specify go_package explicitly 2022-05-24 10:22:53 -07:00
cskh b7eec4c05b
fix: non-leader agents return 404 on Get Intention exact api (#13179)
* fix: non-leader agents return 404 on Get Intention exact api

- rpc call method appends extra error message, so change == to
  "Strings.Contains"

Co-authored-by: Chris S. Kim <ckim@hashicorp.com>
2022-05-24 13:21:15 -04:00
Kyle Havlovitz 03dea180ad Add connection limit setting to service defaults 2022-05-24 10:13:38 -07:00
DanStough 2c8ca25d8a chore(test): Update bats version 2022-05-24 11:56:08 -04:00
DanStough df59d8ab0d feat: add endpoint struct to ServiceConfigEntry 2022-05-24 11:56:08 -04:00
alex 451dc50f4f
peering: expose IsLeader, hung up on dialer if follower (#13164)
Signed-off-by: acpana <8968914+acpana@users.noreply.github.com>

Co-authored-by: R.B. Boyer <4903+rboyer@users.noreply.github.com>
2022-05-23 11:30:58 -07:00
Matt Keeler 1fd02a13c2
Migrate from `protoc` to `buf` (#12841)
* Install `buf` instead of `protoc`
* Created `buf.yaml` and `buf.gen.yaml` files in the two proto directories to control how `buf` generates/lints proto code.
* Invoke `buf` instead of `protoc`
* Added a `proto-format` make target.
* Committed the reformatted proto files.
* Added a `proto-lint` make target.
* Integrated proto linting with CI
* Fixed tons of proto linter warnings.
* Got rid of deprecated builtin protoc-gen-go grpc plugin usage. Moved to direct usage of protoc-gen-go-grpc.
* Unified all proto directories / go packages around using pb prefixes but ensuring all proto packages do not have the prefix.
2022-05-23 10:37:52 -04:00
cskh 39cb731988
Upgrade golangci-lint for go v1.18 (#13176) 2022-05-23 10:26:45 -04:00
R.B. Boyer 3b12a5179f
test: fix flaky test TestEventBufferFuzz (#13175) 2022-05-23 09:22:30 -05:00
Matt Keeler c629e89289
Fix tests broken in #13173 (#13178)
I changed the error type returned in a situation but didn’t update the tests to expect that error.
2022-05-23 10:00:06 -04:00
Matt Keeler 8a968299dd
Fix flaky tests in the agent/grpc/public/services/serverdiscovery package (#13173)
Occasionally we had seen the TestWatchServers_ACLToken_PermissionDenied be flagged as flaky in circleci. This change should fix that.

Why it fixes it is complicated. The test was failing with a panic when a mocked ACL Resolver was being called more times than expected. I struggled for a while to determine how that could be. This test should call authorize once and only once and the error returned should cause the stream to be terminated and the error returned to the gRPC client. Another oddity was no amount of running this test locally seemed to be able to reproduce the issue. I ran the test hundreds of thousands of time and it always passed.

It turns out that there is nothing wrong with the test. It just so happens that the panic from unexpected invocation of a mocked call happened during the test but was caused by a previous test (specifically the TestWatchServers_StreamLifecycle test)

The stream from the previous test remained open after all the test Cleanup functions were run and it just so happened that when the EventPublisher eventually picked up that the context was cancelled during cleanup, it force closes all subscriptions which causes some loops to be re-entered and the streams to be reauthorized. Its that looping in response to forced subscription closures that causes the mock to eventually panic. All the components, publisher, server, client all operate based on contexts. We cancel all those contexts but there is no syncrhonous way to know when they are stopped.

We could have implemented a syncrhonous stop but in the context of an actual running Consul, context cancellation + async stopping is perfectly fine. What we (Dan and I) eventually thought was that the behavior of grpc streams such as this when a server was shutting down wasn’t super helpful. What we would want is for a client to be able to distinguish between subscription closed because something may have changed requiring re-authentication and subscription closed because the server is shutting down. That way we can send back appropriate error messages to detail that the server is shutting down and not confuse users with potentially needing to resubscribe.

So thats what this PR does. We have introduced a shutting down state to our event subscriptions and the various streaming gRPC services that rely on the event publisher will all just behave correctly and actually stop the stream (not attempt transparent reauthorization) if this particular error is the one we get from the stream. Additionally the error that gets transmitted back through gRPC when this does occur indicates to the consumer that the server is going away. That is more helpful so that a client can then attempt to reconnect to another server.
2022-05-23 08:59:13 -04:00
R.B. Boyer 69d3e729a4
agent: allow for service discovery queries involving peer name to use streaming (#13168) 2022-05-20 15:27:01 -05:00
Dan Upton 30775ed54d
proxycfg: remove dependency on `cache.UpdateEvent` (#13144)
OSS portion of enterprise PR 1857.

This removes (most) references to the `cache.UpdateEvent` type in the
`proxycfg` package.

As we're going to be direct usage of the agent cache with interfaces that
can be satisfied by alternative server-local datasources, it doesn't make
sense to depend on this type everywhere anymore (particularly on the
`state.ch` channel).

We also plan to extract `proxycfg` out of Consul into a shared library in
the future, which would require removing this dependency.

Aside from a fairly rote find-and-replace, the main change is that the
`cache.Cache` and `health.Client` types now accept a callback function
parameter, rather than a `chan<- cache.UpdateEvents`. This allows us to
do the type conversion without running another goroutine.
2022-05-20 15:47:40 +01:00
R.B. Boyer 63a9175bd6
peering: accept replication stream of discovery chain information at the importing side (#13151) 2022-05-19 16:37:52 -05:00
R.B. Boyer 68789effeb
test: TestServer_RPC_MetricsIntercept should use a concurrency-safe metrics store (#13157) 2022-05-19 15:39:28 -05:00
cskh df27fa0c84
Retry on bad dogstatsd connection (#13091)
- Introduce a new telemetry configurable parameter retry_failed_connection. User can set the value to true to let consul agent continue its start process on failed connection to datadog server. When set to false, agent will stop on failed start. The default behavior is true.

Co-authored-by: Dan Upton <daniel@floppy.co>
Co-authored-by: Evan Culver <eculver@users.noreply.github.com>
2022-05-19 16:03:46 -04:00
R.B. Boyer 91691eca87 peering: replicate discovery chains information to importing peers
Treat each exported service as a "discovery chain" and replicate one
synthetic CheckServiceNode for each chain and remote mesh gateway.

The health will be a flattened generated check of the checks for that
mesh gateway node.
2022-05-19 14:21:44 -05:00
R.B. Boyer bf05e8c1f1 prefactor some functions out of the monolithic file 2022-05-19 14:21:29 -05:00
R.B. Boyer 09861a2792
test: fix incorrect use of t instead of r in retry test (#13146) 2022-05-19 14:00:07 -05:00
Dan Upton 7492357b43
config: prevent top-level `verify_incoming` enabling mTLS on gRPC port (#13118)
Fixes #13088

This is a backwards-compatibility bug introduced in 1.12.
2022-05-18 16:15:57 +01:00
Freddy 6c868b6c0e
Patches to peering initiation for POC demo (#13076)
Co-authored-by: R.B. Boyer <rb@hashicorp.com>
2022-05-13 13:01:00 -06:00
Dhia Ayachi 70b93ea693
When a host header is defined override `req.Host` in the metrics ui (#13071)
* When a host header is defined override the req.Host in the metrics ui endpoint.

* add changelog
2022-05-13 14:05:22 -04:00
Freddy 160acdf876
Actually block when syncing subscriptions (#13066)
By changing to use WatchCtx we will actually block for changes to the peering list. WatchCh creates a goroutine to collect errors from WatchCtx and returns immediately.

The existing behavior wouldn't result in a tight loop because of the rate limiting in the surrounding function, but it would still lead to more work than is necessary.
2022-05-12 17:36:14 -06:00
Evan Culver 535e811020
peering: add TrustBundleListByService endpoint (#13048) 2022-05-12 15:58:22 -07:00
Freddy 8894365c5a
[OSS] Add upsert handling for receiving CheckServiceNode (#13061) 2022-05-12 15:04:44 -06:00
Matt Keeler 42aec5caf4
Watch the singular service resolver instead of the list + filtering to 1 (#13012)
* Watch the singular service resolver instead of the list + filtering to 1

* Rename the ConfigEntries cache type to ConfigEntryList
2022-05-12 16:34:17 -04:00
R.B. Boyer 26cfdc1967
structs: add convenience methods to sort slices of ServiceName values (#13038) 2022-05-12 10:08:50 -05:00
R.B. Boyer b932d0dabc
test: ensure this package uses freeport for port allocation (#13036) 2022-05-11 14:20:50 -05:00
R.B. Boyer c855df87ec
remove remaining shim runStep functions (#13015)
Wraps up the refactor from #13013
2022-05-10 16:24:45 -05:00
R.B. Boyer 9ad10318cd
add general runstep test helper instead of copying it all over the place (#13013) 2022-05-10 15:25:51 -05:00
Jared Kirschner a18d7092cb
Merge pull request #12463 from hashicorp/docs/consistency-mode-improvements
Improve consistency mode docs
2022-05-09 23:04:00 -04:00
Jared Kirschner bc69d70a55 docs: clarify consistency mode operation
Changes include:
- Add diagrams of the operation of different consistency modes
- Note that only stale reads benefit from horizontal scaling
- Increase scannability with headings
- Document consistency mode defaults and how to override for
  DNS and HTTP API interfaces
- Document X-Consul-Effective-Consistency response header
2022-05-09 16:39:48 -07:00
FFMMM ea083062d2
add err msg on PeeringRead not found (#12986)
Signed-off-by: FFMMM <FFMMM@users.noreply.github.com>

Co-authored-by: Freddy <freddygv@users.noreply.github.com>
2022-05-09 15:22:42 -07:00
FFMMM 76a6647700
expose meta tags for peering (#12964) 2022-05-09 13:47:37 -07:00
Mark Anderson 785a6d9c8e Add oss test
Signed-off-by: Mark Anderson <manderson@hashicorp.com>
2022-05-09 10:07:19 -07:00
Mark Anderson be059184dd Fix up enterprise version tag.
Changes to how the version string was handled created small regression with the release of consul 1.12.0 enterprise.

Many tools use the Config:Version field reported by the agent/self resource to determine whether Consul is an enterprise or OSS instance, expect something like 1.12.0+ent for enterprise and simply 1.12.0 for OSS. This was accidentally broken during the runup to 1.12.x

This work fixes the value returned by both the self endpoint in ["Config"]["Version"] and the metrics consul.version field.

Signed-off-by: Mark Anderson <manderson@hashicorp.com>
2022-05-09 10:07:19 -07:00
Evan Culver d64726c8e9
peering: add store.PeeringsForService implementation (#12957) 2022-05-06 12:35:31 -07:00
Eric Haberkorn fa6e17bdfd
Merge pull request #12956 from hashicorp/suport-lambda-connect-proxy
Support Invoking Lambdas from Sidecar Proxies
2022-05-06 08:17:38 -04:00
Eric 999fc2dbd5 Support making requests to lambda from connect proxies. 2022-05-05 17:42:30 -04:00
FFMMM 5a6d1cad4e
api: add PeeeringList, polish (#12934) 2022-05-05 14:15:42 -07:00
Riddhi Shah a9c96b6975
Validate port on mesh service registration (#12881)
Add validation to ensure connect native services have a port or socketpath specified on catalog registration.
This was the only missing piece to ensure all mesh services are validated for a port (or socketpath) specification on catalog registration.
2022-05-05 09:13:30 -07:00
Mark Anderson 18193f2916
Support vault namespaces in connect CA (#12904)
* Support vault namespaces in connect CA

Follow on to some missed items from #12655

From an internal ticket "Support standard "Vault namespace in the
path" semantics for Connect Vault CA Provider"

Vault allows the namespace to be specified as a prefix in the path of
a PKI definition, but our usage of the Vault API includes calls that
don't support a namespaced key. In particular the sys.* family of
calls simply appends the key, instead of prefixing the namespace in
front of the path.

Unfortunately it is difficult to reliably parse a path with a
namespace; only vault knows what namespaces are present, and the '/'
separator can be inside a key name, as well as separating path
elements. This is in use in the wild; for example
'dc1/intermediate-key' is a relatively common naming schema.

Instead we add two new fields: RootPKINamespace and
IntermediatePKINamespace, which are the absolute namespace paths
'prefixed' in front of the respective PKI Paths.

Signed-off-by: Mark Anderson <manderson@hashicorp.com>
2022-05-04 19:41:55 -07:00
Chris S. Kim e55aac9d30
Default discovery chain when upstream targets a DestinationPeer (#12942) 2022-05-04 16:25:25 -04:00
Mark Anderson 1497421b65
Merge pull request #12878 from hashicorp/ma/x-forwarded-client-cert
Support x-forwarded-client-cert
2022-05-04 11:05:44 -07:00
Dan Upton 6bfdb48560
acl: gRPC login and logout endpoints (#12935)
Introduces two new public gRPC endpoints (`Login` and `Logout`) and
includes refactoring of the equivalent net/rpc endpoints to enable the
majority of logic to be reused (i.e. by extracting the `Binder` and
`TokenWriter` types).

This contains the OSS portions of the following enterprise commits:

- 75fcdbfcfa6af21d7128cb2544829ead0b1df603
- bce14b714151af74a7f0110843d640204082630a
- cc508b70fbf58eda144d9af3d71bd0f483985893
2022-05-04 17:38:45 +01:00
Mark Anderson 13f5a1f6a8 Fix tests for APPEND_FORWARD change
Signed-off-by: Mark Anderson <manderson@hashicorp.com>
2022-05-04 08:50:59 -07:00
Mark Anderson 69c129c73f Change to use APPEND_FORWARD for terminating gateway
Signed-off-by: Mark Anderson <manderson@hashicorp.com>
2022-05-04 08:50:59 -07:00
Mark Anderson db0c61303f Update mesh config tests
Signed-off-by: Mark Anderson <manderson@hashicorp.com>
2022-05-04 08:50:59 -07:00
Mark Anderson e6282c7c64 Docs and changelog edits
Signed-off-by: Mark Anderson <manderson@hashicorp.com>
2022-05-04 08:50:59 -07:00
Mark Anderson c6dbc34172 Fixup missed config entry
Signed-off-by: Mark Anderson <manderson@hashicorp.com>
2022-05-04 08:50:59 -07:00
Mark Anderson d8f4cc5537 Add x-forwarded-client-cert headers
Description
Add x-fowarded-client-cert information on trusted incoming connections.

Envoy provides support forwarding and annotating the
x-forwarded-client-cert header via the forward_client_cert_details
set_current_client_cert_details filter fields. It would be helpful for
consul to support this directly in its config. The escape hatches are
a bit cumbersome for this purpose.

This has been implemented on incoming connections to envoy. Outgoing
(from the local service through the sidecar) will not have a
certificate, and so are left alone.

A service on an incoming connection will now get headers something like this:

```
X-Forwarded-Client-Cert:[By=spiffe://efad7282-d9b2-3298-f6d8-38b37fb58df3.consul/ns/default/dc/dc1/svc/counting;Hash=61ad5cbdfcb50f5a3ec0ca60923d61613c149a9d4495010a64175c05a0268ab2;Cert="-----BEGIN%20CERTIFICATE-----%0AMIICHDCCAcOgAwIBAgIBCDAKBggqhkjOPQQDAjAxMS8wLQYDVQQDEyZwcmktMTli%0AYXdyb2YuY29uc3VsLmNhLmVmYWQ3MjgyLmNvbnN1bDAeFw0yMjA0MjkwMzE0NTBa%0AFw0yMjA1MDIwMzE0NTBaMAAwWTATBgcqhkjOPQIBBggqhkjOPQMBBwNCAARVIZ7Y%0AZEXfbOGBfxGa7Vuok1MIng%2FuzLQK2xLVlSTIPDbO5hstTGP%2B%2FGx182PYFP3jYqk5%0Aq6rYWe1wiPNMA30Io4H8MIH5MA4GA1UdDwEB%2FwQEAwIDuDAdBgNVHSUEFjAUBggr%0ABgEFBQcDAgYIKwYBBQUHAwEwDAYDVR0TAQH%2FBAIwADApBgNVHQ4EIgQgrp4q50oX%0AHHghMbxz5Bk8OJFWMdfgH0Upr350WlhyxvkwKwYDVR0jBCQwIoAgUe6uERAIj%2FLM%0AyuFzDc3Wbp9TGAKBJYAwyhF14ToOQCMwYgYDVR0RAQH%2FBFgwVoZUc3BpZmZlOi8v%0AZWZhZDcyODItZDliMi0zMjk4LWY2ZDgtMzhiMzdmYjU4ZGYzLmNvbnN1bC9ucy9k%0AZWZhdWx0L2RjL2RjMS9zdmMvZGFzaGJvYXJkMAoGCCqGSM49BAMCA0cAMEQCIDwb%0AFlchufggNTijnQ5SUcvTZrWlZyq%2FrdVC20nbbmWLAiAVshNNv1xBqJI1NmY2HI9n%0AgRMfb8aEPVSuxEHhqy57eQ%3D%3D%0A-----END%20CERTIFICATE-----%0A";Chain="-----BEGIN%20CERTIFICATE-----%0AMIICHDCCAcOgAwIBAgIBCDAKBggqhkjOPQQDAjAxMS8wLQYDVQQDEyZwcmktMTli%0AYXdyb2YuY29uc3VsLmNhLmVmYWQ3MjgyLmNvbnN1bDAeFw0yMjA0MjkwMzE0NTBa%0AFw0yMjA1MDIwMzE0NTBaMAAwWTATBgcqhkjOPQIBBggqhkjOPQMBBwNCAARVIZ7Y%0AZEXfbOGBfxGa7Vuok1MIng%2FuzLQK2xLVlSTIPDbO5hstTGP%2B%2FGx182PYFP3jYqk5%0Aq6rYWe1wiPNMA30Io4H8MIH5MA4GA1UdDwEB%2FwQEAwIDuDAdBgNVHSUEFjAUBggr%0ABgEFBQcDAgYIKwYBBQUHAwEwDAYDVR0TAQH%2FBAIwADApBgNVHQ4EIgQgrp4q50oX%0AHHghMbxz5Bk8OJFWMdfgH0Upr350WlhyxvkwKwYDVR0jBCQwIoAgUe6uERAIj%2FLM%0AyuFzDc3Wbp9TGAKBJYAwyhF14ToOQCMwYgYDVR0RAQH%2FBFgwVoZUc3BpZmZlOi8v%0AZWZhZDcyODItZDliMi0zMjk4LWY2ZDgtMzhiMzdmYjU4ZGYzLmNvbnN1bC9ucy9k%0AZWZhdWx0L2RjL2RjMS9zdmMvZGFzaGJvYXJkMAoGCCqGSM49BAMCA0cAMEQCIDwb%0AFlchufggNTijnQ5SUcvTZrWlZyq%2FrdVC20nbbmWLAiAVshNNv1xBqJI1NmY2HI9n%0AgRMfb8aEPVSuxEHhqy57eQ%3D%3D%0A-----END%20CERTIFICATE-----%0A";Subject="";URI=spiffe://efad7282-d9b2-3298-f6d8-38b37fb58df3.consul/ns/default/dc/dc1/svc/dashboard]
```

Closes #12852
2022-05-04 08:50:58 -07:00
Kyle Havlovitz 369f4848e3
Merge pull request #12885 from hashicorp/acl-err-cache
Store and return RPC error in ACL cache entries
2022-05-03 10:44:22 -07:00
Kyle Havlovitz 3bd001fb29 Return ACLRemoteError from cache and test it correctly 2022-05-03 10:05:26 -07:00
FFMMM 4cd68b4534
[sync oss] api: add peering api module (#12911) 2022-05-02 11:49:05 -07:00
Chris S. Kim 829554c706
peering: Make Upstream peer-aware (#12900)
Adds DestinationPeer field to Upstream.
Adds Peer field to UpstreamID and its string conversion functions.
2022-04-29 18:12:51 -04:00
Chris S. Kim 33bfaf5671
Cleanup peering files that used error types that were removed (#12892) 2022-04-29 14:02:26 -04:00
Mathew Estafanous 893b740dff
Unify various status errors into one HTTP error type. (#12594)
Replaces specific error types for HTTP Status codes with 
a generic HTTPError type.

Co-authored-by: Chris S. Kim <ckim@hashicorp.com>
2022-04-29 13:42:49 -04:00
Kyle Havlovitz f84ed5f70b Store and return rpc error in acl cache entries 2022-04-28 09:08:55 -07:00
R.B. Boyer 642b75b60b
health: ensure /v1/health/service/:service endpoint returns the most recent results when a filter is used with streaming (#12640)
The primary bug here is in the streaming subsystem that makes the overall v1/health/service/:service request behave incorrectly when servicing a blocking request with a filter provided.

There is a secondary non-streaming bug being fixed here that is much less obvious related to when to update the `reply` variable in a `blockingQuery` evaluation. It is unlikely that it is triggerable in practical environments and I could not actually get the bug to manifest, but I fixed it anyway while investigating the original issue.

Simple reproduction (streaming):

1. Register a service with a tag.

        curl -sL --request PUT 'http://localhost:8500/v1/agent/service/register' \
            --header 'Content-Type: application/json' \
            --data-raw '{ "ID": "ID1", "Name": "test", "Tags":[ "a" ], "EnableTagOverride": true }'

2. Do an initial filter query that matches on the tag.

        curl -sLi --get 'http://localhost:8500/v1/health/service/test' --data-urlencode 'filter=a in Service.Tags'

3. Note you get one result. Use the `X-Consul-Index` header to establish
   a blocking query in another terminal, this should not return yet.

        curl -sLi --get 'http://localhost:8500/v1/health/service/test?index=$INDEX' --data-urlencode 'filter=a in Service.Tags'

4. Re-register that service with a different tag.

        curl -sL --request PUT 'http://localhost:8500/v1/agent/service/register' \
            --header 'Content-Type: application/json' \
            --data-raw '{ "ID": "ID1", "Name": "test", "Tags":[ "b" ], "EnableTagOverride": true }'

5. Your blocking query from (3) should return with a header
   `X-Consul-Query-Backend: streaming` and empty results if it works
   correctly `[]`.

Attempts to reproduce with non-streaming failed (where you add `&near=_agent` to the read queries and ensure `X-Consul-Query-Backend: blocking-query` shows up in the results).
2022-04-27 10:39:45 -05:00
R.B. Boyer c949de9f50
structs: ensure exported-services PeerName field can be addressed as peer_name (#12862) 2022-04-27 10:27:21 -05:00
Dhia Ayachi 9dc5200155
update raft to v1.3.8 (#12844)
* update raft to v1.3.7

* add changelog

* fix compilation error

* fix HeartbeatTimeout

* fix ElectionTimeout to reload only if value is valid

* fix default values for `ElectionTimeout` and `HeartbeatTimeout`

* fix test defaults

* bump raft to v1.3.8
2022-04-25 10:19:26 -04:00
R.B. Boyer 809344a6f5
peering: initial sync (#12842)
- Add endpoints related to peering: read, list, generate token, initiate peering
- Update node/service/check table indexing to account for peers
- Foundational changes for pushing service updates to a peer
- Plumb peer name through Health.ServiceNodes path

see: ENT-1765, ENT-1280, ENT-1283, ENT-1283, ENT-1756, ENT-1739, ENT-1750, ENT-1679,
     ENT-1709, ENT-1704, ENT-1690, ENT-1689, ENT-1702, ENT-1701, ENT-1683, ENT-1663,
     ENT-1650, ENT-1678, ENT-1628, ENT-1658, ENT-1640, ENT-1637, ENT-1597, ENT-1634,
     ENT-1613, ENT-1616, ENT-1617, ENT-1591, ENT-1588, ENT-1596, ENT-1572, ENT-1555

Co-authored-by: R.B. Boyer <rb@hashicorp.com>
Co-authored-by: freddygv <freddy@hashicorp.com>
Co-authored-by: Chris S. Kim <ckim@hashicorp.com>
Co-authored-by: Evan Culver <eculver@hashicorp.com>
Co-authored-by: Nitya Dhanushkodi <nitya@hashicorp.com>
2022-04-21 17:34:40 -05:00
Will Jordan 45ffdc360e
Add timeout to Client RPC calls (#11500)
Adds a timeout (deadline) to client RPC calls, so that streams will no longer hang indefinitely in unstable network conditions.

Co-authored-by: kisunji <ckim@hashicorp.com>
2022-04-21 16:21:35 -04:00
Matt Keeler f49adfaaf0
Implement the ServerDiscovery.WatchServers gRPC endpoint (#12819)
* Implement the ServerDiscovery.WatchServers gRPC endpoint
* Fix the ConnectCA.Sign gRPC endpoints metadata forwarding.
* Unify public gRPC endpoints around the public.TraceID function for request_id logging
2022-04-21 12:56:18 -04:00
Blake Covarrubias 2beea7eb7c
acl: Clarify node/service identities must be lowercase (#12807)
Modify ACL error message for invalid node/service identities names to
clearly state only lowercase alphanumeric characters are supported.
2022-04-21 09:29:16 -07:00
R.B. Boyer bbd38e95ce
chore: upgrade mockery to v2 and regenerate (#12836) 2022-04-21 09:48:21 -05:00
R.B. Boyer cf0c5110be
ca: fix a bug that caused a non blocking leaf cert query after a blocking leaf cert query to block (#12820)
Fixes #12048

Fixes #12319

Regression introduced in #11693

Local reproduction steps:
1. `consul agent -dev`
2. `curl -sLiv 'localhost:8500/v1/agent/connect/ca/leaf/web'`
3. make note of the `X-Consul-Index` header returned 
4. `curl -sLi 'localhost:8500/v1/agent/connect/ca/leaf/web?index=<VALUE_FROM_STEP_3>'`
5. Kill the above curl when it hangs with Ctrl-C
6. Repeat (2) and it should not hang.
2022-04-20 12:21:47 -05:00
Riddhi Shah 1d49f5c84e
[OSS] gRPC call to get envoy bootstrap params (#12825)
Adds a new gRPC endpoint to get envoy bootstrap params. The new consul-dataplane service will use this
endpoint to generate an envoy bootstrap configuration.
2022-04-19 17:24:21 -07:00
Matt Keeler 3badd4c35c
Add event generation for autopilot state updates (#12626)
Whenever autopilot updates its state it notifies Consul. That notification will then trigger Consul to extract out the ready server information. If the ready servers have changed, then an event will be published to notify any subscribers of the full set of ready servers.

All these ready server event things are contained within an autopilotevents package instead of the consul package to make importing them into the grpc related packages possible
2022-04-19 13:03:03 -04:00
Evan Culver 9d0b5bf8e9
connect: Add Envoy 1.22 to integration tests, remove Envoy 1.18 (#12805)
Co-authored-by: R.B. Boyer <rb@hashicorp.com>
2022-04-18 09:36:07 -07:00
DanStough a050aa39b9 Update go version to 1.18.1 2022-04-18 11:41:10 -04:00
Kyle Havlovitz 449a0f855c
Add an internal env var for managed cluster config in the ui (#12796) 2022-04-15 09:55:52 -07:00
John Murret 1e118a5410 set vault namespaces on vault client prior to logging in with the vault auth method 2022-04-14 12:18:06 -06:00
Evan Culver e62745c82c
connect: Add Envoy 1.21.1 to support matrix, remove 1.17.4 (#12777) 2022-04-14 10:44:42 -07:00
Dan Upton 769d1d6e8e
ConnectCA.Sign gRPC Endpoint (#12787)
Introduces a gRPC endpoint for signing Connect leaf certificates. It's also
the first of the public gRPC endpoints to perform leader-forwarding, so
establishes the pattern of forwarding over the multiplexed internal RPC port.
2022-04-14 14:26:14 +01:00
Kyle Havlovitz 199f1c7200
Fix namespace default field names in expanded token output 2022-04-13 16:46:39 -07:00
Paul Glass 5eea62b47a
acl: Adjust region handling in AWS IAM auth method (#12774)
* acl: Adjust region handling in AWS IAM auth method
2022-04-13 14:31:37 -05:00
Eric Haberkorn 7961901153
Merge pull request #12773 from hashicorp/fix-lambda-intentions-and-routing
Implement Routing and Intentions for Lambdas
2022-04-13 13:01:15 -04:00
Eric b062f8c2aa Implement routing and intentions for AWS Lambdas 2022-04-13 11:45:25 -04:00
Karl Cardenas b0b197964c
Merge pull request #12562 from hashicorp/docs/blake-agent-config
docs: Agent configuration hierarchy reorganization
2022-04-12 12:33:42 -07:00
FFMMM cf7e6484aa
add more labels to RequestRecorder (#12727)
Co-authored-by: Daniel Nephin <dnephin@hashicorp.com>
Signed-off-by: FFMMM <FFMMM@users.noreply.github.com>
2022-04-12 10:50:25 -07:00
Matt Keeler 2a4ca71d3f
Move to using a shared EventPublisher (#12673)
Previously we had 1 EventPublisher per state.Store. When a state store was closed/abandoned such as during a consul snapshot restore, this had the behavior of force closing subscriptions for that topic and evicting event snapshots from the cache.

The intention of this commit is to keep all that behavior. To that end, the shared EventPublisher now supports the ability to refresh a topic. That will perform the force close + eviction. The FSM upon abandoning the previous state.Store will call RefreshTopic for all the topics with events generated by the state.Store.
2022-04-12 09:47:42 -04:00
Blake Covarrubias 3175bf6b1b Remove .html extensions from docs URLs 2022-04-11 17:38:49 -07:00
Natalie Smith cd17e98800 docs: fix yet more references to agent/options 2022-04-11 17:38:49 -07:00
R.B. Boyer f5a882f66c
fix broken test (#12741) 2022-04-11 10:56:57 -05:00
Jared Kirschner 9a81a77264
Merge pull request #12725 from hashicorp/clarify-service-deregister-after-critical-message
improve error msg for deregister critical service
2022-04-07 18:01:54 -04:00
R.B. Boyer f4eac06b21
xds: ensure that all connect timeout configs can apply equally to tproxy direct dial connections (#12711)
Just like standard upstreams the order of applicability in descending precedence:

1. caller's `service-defaults` upstream override for destination
2. caller's `service-defaults` upstream defaults
3. destination's `service-resolver` ConnectTimeout
4. system default of 5s

Co-authored-by: mrspanishviking <kcardenas@hashicorp.com>
2022-04-07 16:58:21 -05:00
Jared Kirschner 1835e761fd improve error msg for deregister critical service
If a service is automatically registered because it has a critical health check
for longer than deregister_critical_service_after, the error message will now
include:
- mention of the deregister_critical_service_after option
- the value of deregister_critical_service_after for that check
2022-04-07 14:50:02 -07:00
Kyle Havlovitz a2c28526a7
Merge pull request #12685 from hashicorp/http-check-redirect-option
Add a field to disable following redirects on http checks
2022-04-07 11:29:27 -07:00
Matt Keeler 3447880091
Enable running autopilot state updates on all servers (#12617)
* Fixes a lint warning about t.Errorf not supporting %w

* Enable running autopilot on all servers

On the non-leader servers all they do is update the state and do not attempt any modifications.

* Fix the RPC conn limiting tests

Technically they were relying on racey behavior before. Now they should be reliable.
2022-04-07 10:48:48 -04:00
FFMMM 0f68bf879a
[rpc/middleware][consul] plumb intercept off, add server level happy test (#12692) 2022-04-06 14:33:05 -07:00
FFMMM 6bdde40d5e
lower log to trace (#12708) 2022-04-06 11:37:08 -07:00
Kyle Havlovitz b3db499c74 Add a field to disable following redirects on http checks 2022-04-05 16:12:18 -07:00
Mark Anderson ed3e42296d Fixup acl.EnterpriseMeta
Signed-off-by: Mark Anderson <manderson@hashicorp.com>
2022-04-05 15:11:49 -07:00
Mark Anderson 0905c1d83d Manual Structs fixup
Change things by hand that I couldn't figure out how to automate

Signed-off-by: Mark Anderson <manderson@hashicorp.com>
2022-04-05 14:51:10 -07:00
Mark Anderson 497b300c76 add new entmeta stuff.
Signed-off-by: Mark Anderson <manderson@hashicorp.com>
2022-04-05 14:49:31 -07:00
R.B. Boyer 12523197f9
syncing changes back from enterprise (#12701) 2022-04-05 15:46:56 -05:00
Riddhi Shah 0e5d46e9c4
Merge pull request #12695 from hashicorp/feature-negotiation-grpc-api-oss
[OSS] Supported dataplane features gRPC endpoint
2022-04-05 11:26:33 -07:00
Dan Upton e3d2b91e34
ca: move ConnectCA.Sign authorization logic to CAManager (#12609)
OSS sync of enterprise changes at 8d6fd125
2022-04-05 13:16:20 -05:00
Kyle Havlovitz 9380343689
Merge pull request #12672 from hashicorp/tgate-san-validation
Respect SNI with terminating gateways and log a warning if it isn't set alongside TLS
2022-04-05 11:15:59 -07:00
Riddhi Shah f053279c4e [OSS] Supported dataplane features gRPC endpoint
Adds a new gRPC service and endpoint to return the list of supported
consul dataplane features. The Consul Dataplane will use this API to
customize its interaction with that particular server.
2022-04-05 07:38:58 -07:00
Dan Upton e48c1611ee
WatchRoots gRPC endpoint (#12678)
Adds a new gRPC streaming endpoint (WatchRoots) that dataplane clients will
use to fetch the current list of active Connect CA roots and receive new
lists whenever the roots are rotated.
2022-04-05 15:26:14 +01:00
Dhia Ayachi cdcb249449
add a rate limiter to config auto-reload (#12490)
* add config watcher to the config package

* add logging to watcher

* add test and refactor to add WatcherEvent.

* add all API calls and fix a bug with recreated files

* add tests for watcher

* remove the unnecessary use of context

* Add debug log and a test for file rename

* use inode to detect if the file is recreated/replaced and only listen to create events.

* tidy ups (#1535)

* tidy ups

* Add tests for inode reconcile

* fix linux vs windows syscall

* fix linux vs windows syscall

* fix windows compile error

* increase timeout

* use ctime ID

* remove remove/creation test as it's a use case that fail in linux

* fix linux/windows to use Ino/CreationTime

* fix the watcher to only overwrite current file id

* fix linter error

* fix remove/create test

* set reconcile loop to 200 Milliseconds

* fix watcher to not trigger event on remove, add more tests

* on a remove event try to add the file back to the watcher and trigger the handler if success

* fix race condition

* fix flaky test

* fix race conditions

* set level to info

* fix when file is removed and get an event for it after

* fix to trigger handler when we get a remove but re-add fail

* fix error message

* add tests for directory watch and fixes

* detect if a file is a symlink and return an error on Add

* rename Watcher to FileWatcher and remove symlink deref

* add fsnotify@v1.5.1

* fix go mod

* do not reset timer on errors, rename OS specific files

* rename New func

* events trigger on write and rename

* add missing test

* fix flaking tests

* fix flaky test

* check reconcile when removed

* delete invalid file

* fix test to create files with different mod time.

* back date file instead of sleeping

* add watching file in agent command.

* fix watcher call to use new API

* add configuration and stop watcher when server stop

* add certs as watched files

* move FileWatcher to the agent start instead of the command code

* stop watcher before replacing it

* save watched files in agent

* add add and remove interfaces to the file watcher

* fix remove to not return an error

* use `Add` and `Remove` to update certs files

* fix tests

* close events channel on the file watcher even when the context is done

* extract `NotAutoReloadableRuntimeConfig` is a separate struct

* fix linter errors

* add Ca configs and outgoing verify to the not auto reloadable config

* add some logs and fix to use background context

* add tests to auto-config reload

* remove stale test

* add tests to changes to config files

* add check to see if old cert files still trigger updates

* rename `NotAutoReloadableRuntimeConfig` to `StaticRuntimeConfig`

* fix to re add both key and cert file. Add test to cover this case.

* review suggestion

Co-authored-by: R.B. Boyer <4903+rboyer@users.noreply.github.com>

* add check to static runtime config changes

* fix test

* add changelog file

* fix review comments

* Apply suggestions from code review

Co-authored-by: R.B. Boyer <4903+rboyer@users.noreply.github.com>

* update flag description

Co-authored-by: FFMMM <FFMMM@users.noreply.github.com>

* fix compilation error

* add static runtime config support

* fix test

* fix review comments

* fix log test

* Update .changelog/12329.txt

Co-authored-by: Dan Upton <daniel@floppy.co>

* transfer tests to runtime_test.go

* fix filewatcher Replace to not deadlock.

* avoid having lingering locks

Co-authored-by: R.B. Boyer <4903+rboyer@users.noreply.github.com>

* split ReloadConfig func

* fix warning message

Co-authored-by: R.B. Boyer <4903+rboyer@users.noreply.github.com>

* convert `FileWatcher` into an interface

* fix compilation errors

* fix tests

* extract func for adding and removing files

* add a coalesceTimer with a very small timer

* extract coaelsce Timer and add a shim for testing

* add tests to coalesceTimer fix to send remaining events

* set `coalesceTimer` to 1 Second

* support symlink, fix a nil deref.

* fix compile error

* fix compile error

* refactor file watcher rate limiting to be a Watcher implementation

* fix linter issue

* fix runtime config

* fix runtime test

* fix flaky tests

* fix compile error

* Apply suggestions from code review

Co-authored-by: R.B. Boyer <4903+rboyer@users.noreply.github.com>

* fix agent New to return an error if File watcher New return an error

* quit timer loop if ctx is canceled

* Apply suggestions from code review

Co-authored-by: Chris S. Kim <ckim@hashicorp.com>

Co-authored-by: Ashwin Venkatesh <ashwin@hashicorp.com>
Co-authored-by: R.B. Boyer <4903+rboyer@users.noreply.github.com>
Co-authored-by: FFMMM <FFMMM@users.noreply.github.com>
Co-authored-by: Daniel Upton <daniel@floppy.co>
Co-authored-by: Chris S. Kim <ckim@hashicorp.com>
2022-04-04 11:31:39 -04:00
Eric Haberkorn 66c40e073a
Merge pull request #12681 from hashicorp/lambda-patching-tweaks
Tweak the Lambda Envoy configuration generated by the serverless patcher
2022-04-01 19:59:30 -04:00
FFMMM b0cba2ec03
mark disable_compat_1.9 to deprecate in 1.13, change default to true (#12675)
Co-authored-by: R.B. Boyer <4903+rboyer@users.noreply.github.com>
2022-04-01 10:35:56 -07:00
R.B. Boyer 3060b5cb8f
xds: errors from the xds serverless plugin are fatal (#12682) 2022-04-01 10:30:26 -05:00
Eric 66391186ce Tweak the Lambda Envoy configuration generated by the serverless patcher
- Move from `strip_matching_host_port` to `strip_any_host_port`
- Remove `auto_host_rewrite` since it conflicts with `strip_any_host_port`
2022-04-01 11:13:44 -04:00
Eric Haberkorn 9b291d362a
Merge pull request #12676 from hashicorp/implement-lambda-patching
Implement Lambda Patching in the Serverless Plugin
2022-04-01 09:58:56 -04:00
Mark Anderson aa29324a24
Avoid using sys/mounts to enable namespaces (#12655)
* Avoid doing list of /sys/mounts

From an internal ticket "Support standard "Vault namespace in the path" semantics for Connect Vault CA Provider"

Vault allows the namespace to be specified as a prefix in the path of
a PKI definition, but this doesn't currently work for
```IntermediatePKIPath``` specifications, because we attempt to list
all of the paths to check if ours is already defined. This doesn't
really work in a namespaced world.

This changes the IntermediatePKIPath code to follow the same pattern
as the root key, where we directly get the key rather than listing.

This code is difficult to write automated tests for because it relies
on features of Vault Enterprise, which isn't currently part of our
test framework, so it was tested manually.

Signed-off-by: Mark Anderson <manderson@hashicorp.com>

* add changelog

Signed-off-by: Mark Anderson <manderson@hashicorp.com>
2022-03-31 23:35:38 -07:00
Kyle Havlovitz 116b6c57cb Use the GatewayService SNI field for upstream SAN validation 2022-03-31 13:54:25 -07:00
Eric c7204528c5 Implement Lambda Patching in the Serverless Plugin 2022-03-31 16:45:32 -04:00
Kyle Havlovitz c2a48990cf
Merge pull request #12670 from hashicorp/token-read-expanded
oss: Add expanded token read flag and endpoint option
2022-03-31 12:24:11 -07:00
Kyle Havlovitz 4974d8471b Log a warning when a terminating gateway service has TLS but not SNI configured 2022-03-31 12:18:40 -07:00
Dhia Ayachi 8552efa955
auto-reload configuration when config files change (#12329)
* add config watcher to the config package

* add logging to watcher

* add test and refactor to add WatcherEvent.

* add all API calls and fix a bug with recreated files

* add tests for watcher

* remove the unnecessary use of context

* Add debug log and a test for file rename

* use inode to detect if the file is recreated/replaced and only listen to create events.

* tidy ups (#1535)

* tidy ups

* Add tests for inode reconcile

* fix linux vs windows syscall

* fix linux vs windows syscall

* fix windows compile error

* increase timeout

* use ctime ID

* remove remove/creation test as it's a use case that fail in linux

* fix linux/windows to use Ino/CreationTime

* fix the watcher to only overwrite current file id

* fix linter error

* fix remove/create test

* set reconcile loop to 200 Milliseconds

* fix watcher to not trigger event on remove, add more tests

* on a remove event try to add the file back to the watcher and trigger the handler if success

* fix race condition

* fix flaky test

* fix race conditions

* set level to info

* fix when file is removed and get an event for it after

* fix to trigger handler when we get a remove but re-add fail

* fix error message

* add tests for directory watch and fixes

* detect if a file is a symlink and return an error on Add

* rename Watcher to FileWatcher and remove symlink deref

* add fsnotify@v1.5.1

* fix go mod

* do not reset timer on errors, rename OS specific files

* rename New func

* events trigger on write and rename

* add missing test

* fix flaking tests

* fix flaky test

* check reconcile when removed

* delete invalid file

* fix test to create files with different mod time.

* back date file instead of sleeping

* add watching file in agent command.

* fix watcher call to use new API

* add configuration and stop watcher when server stop

* add certs as watched files

* move FileWatcher to the agent start instead of the command code

* stop watcher before replacing it

* save watched files in agent

* add add and remove interfaces to the file watcher

* fix remove to not return an error

* use `Add` and `Remove` to update certs files

* fix tests

* close events channel on the file watcher even when the context is done

* extract `NotAutoReloadableRuntimeConfig` is a separate struct

* fix linter errors

* add Ca configs and outgoing verify to the not auto reloadable config

* add some logs and fix to use background context

* add tests to auto-config reload

* remove stale test

* add tests to changes to config files

* add check to see if old cert files still trigger updates

* rename `NotAutoReloadableRuntimeConfig` to `StaticRuntimeConfig`

* fix to re add both key and cert file. Add test to cover this case.

* review suggestion

Co-authored-by: R.B. Boyer <4903+rboyer@users.noreply.github.com>

* add check to static runtime config changes

* fix test

* add changelog file

* fix review comments

* Apply suggestions from code review

Co-authored-by: R.B. Boyer <4903+rboyer@users.noreply.github.com>

* update flag description

Co-authored-by: FFMMM <FFMMM@users.noreply.github.com>

* fix compilation error

* add static runtime config support

* fix test

* fix review comments

* fix log test

* Update .changelog/12329.txt

Co-authored-by: Dan Upton <daniel@floppy.co>

* transfer tests to runtime_test.go

* fix filewatcher Replace to not deadlock.

* avoid having lingering locks

Co-authored-by: R.B. Boyer <4903+rboyer@users.noreply.github.com>

* split ReloadConfig func

* fix warning message

Co-authored-by: R.B. Boyer <4903+rboyer@users.noreply.github.com>

* convert `FileWatcher` into an interface

* fix compilation errors

* fix tests

* extract func for adding and removing files

Co-authored-by: Ashwin Venkatesh <ashwin@hashicorp.com>
Co-authored-by: R.B. Boyer <4903+rboyer@users.noreply.github.com>
Co-authored-by: FFMMM <FFMMM@users.noreply.github.com>
Co-authored-by: Daniel Upton <daniel@floppy.co>
2022-03-31 15:11:49 -04:00
Kyle Havlovitz 9a2474381a Add expanded token read flag and endpoint option 2022-03-31 10:49:49 -07:00
FFMMM 8b184197b3
polish rpc.service.call metric behavior (#12624) 2022-03-31 10:49:37 -07:00
Paul Glass aae6d8080d
Add IAM Auth Method (#12583)
This adds an aws-iam auth method type which supports authenticating to Consul using AWS IAM identities.

Co-authored-by: R.B. Boyer <4903+rboyer@users.noreply.github.com>
2022-03-31 10:18:48 -05:00
Eric Haberkorn 9751626828
Merge pull request #12659 from hashicorp/bump-go-control-plane
Bump Go Control Plane
2022-03-30 15:07:47 -04:00
R.B. Boyer e9230e93d8
xds: adding control of the mesh-wide min/max TLS versions and cipher suites from the mesh config entry (#12601)
- `tls.incoming`: applies to the inbound mTLS targeting the public
  listener on `connect-proxy` and `terminating-gateway` envoy instances

- `tls.outgoing`: applies to the outbound mTLS dialing upstreams from
  `connect-proxy` and `ingress-gateway` envoy instances

Fixes #11966
2022-03-30 13:43:59 -05:00
R.B. Boyer ee11dff5a5 similar bump 2022-03-30 13:28:00 -05:00
R.B. Boyer 47693e3ebf fail on error and use ptypes.MarshalAny for now instead of anypb.New 2022-03-30 13:27:49 -05:00
Eric 91a493efe9 Bump go-control-plane
* `go get cloud.google.com/go@v0.59.0`
* `go get github.com/envoyproxy/go-control-plane@v0.9.9`
* `make envoy-library`
* Bumpprotoc to 3.15.8
2022-03-30 13:11:27 -04:00
R.B. Boyer d4e80b8800
server: ensure that service-defaults meta is incorporated into the discovery chain response (#12511)
Also add a new "Default" field to the discovery chain response to clients
2022-03-30 10:04:18 -05:00
FFMMM 0fd6cdc900
introduce EmptyReadRequest for status_endpoint (#12653)
Co-authored-by: Daniel Nephin <dnephin@hashicorp.com>
2022-03-29 18:05:45 -07:00
Fulvio d1938482e7
remove DualStack field from check TCP #12629 (#12630) 2022-03-29 14:56:01 -04:00
Eric ab5b5e85f5 remove the rest of gogo 2022-03-28 17:34:41 -04:00
Eric 8fd73ede3e remove gogo from acl protobufs 2022-03-28 16:20:56 -04:00
Connor f8fc317731
Fix leaked Vault LifetimeRenewers (#12607)
* Fix leaked Vault LifetimeRenewers

When the Vault CA Provider is reconfigured we do not stop the
LifetimeRenewers which can cause them to leak until the Consul processes
recycles. On Configure execute stopWatcher if it exists and is not nil
before starting a new renewal

* Add jitter before restarting the LifetimeWatcher

If we fail to login to Vault or our token is no longer valid we can
overwhelm a Vault instance with many requests very quickly by restarting
the LifetimeWatcher. Before restarting the LifetimeWatcher provide a
backoff time of 1 second or less.

* Use a retry.Waiter instead of RandomStagger

* changelog

* gofmt'd

* Swap out bool for atomic.Unit32 in test

* Provide some extra clarification in comment and changelog
2022-03-28 09:58:16 -05:00
Eric 523e054c81 assorted changes required to remove gogo 2022-03-25 09:55:36 -04:00
FFMMM 560f8cbc89
fix bad oss sync, use gauges not counters (#12611) 2022-03-24 14:41:30 -07:00
Kyle Havlovitz d9f31345e0
Merge pull request #12596 from hashicorp/overview-endpoint
oss: Add overview UI internal endpoint
2022-03-24 14:27:54 -07:00
Mike Morris 8020fb2098
agent: convert listener config to TLS types (#12522)
* tlsutil: initial implementation of types/TLSVersion

tlsutil: add test for parsing deprecated agent TLS version strings

tlsutil: return TLSVersionInvalid with error

tlsutil: start moving tlsutil cipher suite lookups over to types/tls

tlsutil: rename tlsLookup to ParseTLSVersion, add cipherSuiteLookup

agent: attempt to use types in runtime config

agent: implement b.tlsVersion validation in config builder

agent: fix tlsVersion nil check in builder

tlsutil: update to renamed ParseTLSVersion and goTLSVersions

tlsutil: fixup TestConfigurator_CommonTLSConfigTLSMinVersion

tlsutil: disable invalid config parsing tests

tlsutil: update tests

auto_config: lookup old config strings from base.TLSMinVersion

auto_config: update endpoint tests to use TLS types

agent: update runtime_test to use TLS types

agent: update TestRuntimeCinfig_Sanitize.golden

agent: update config runtime tests to expect TLS types

* website: update Consul agent tls_min_version values

* agent: fixup TLS parsing and compilation errors

* test: fixup lint issues in agent/config_runtime_test and tlsutil/config_test

* tlsutil: add CHACHA20_POLY1305 cipher suites to goTLSCipherSuites

* test: revert autoconfig tls min version fixtures to old format

* types: add TLSVersions public function

* agent: add warning for deprecated TLS version strings

* agent: move agent config specific logic from tlsutil.ParseTLSVersion into agent config builder

* tlsutil(BREAKING): change default TLS min version to TLS 1.2

* agent: move ParseCiphers logic from tlsutil into agent config builder

* tlsutil: remove unused CipherString function

* agent: fixup import for types package

* Revert "tlsutil: remove unused CipherString function"

This reverts commit 6ca7f6f58d268e617501b7db9500113c13bae70c.

* agent: fixup config builder and runtime tests

* tlsutil: fixup one remaining ListenerConfig -> ProtocolConfig

* test: move TLS cipher suites parsing test from tlsutil into agent config builder tests

* agent: remove parseCiphers helper from auto_config_endpoint_test

* test: remove unused imports from tlsutil

* agent: remove resolved FIXME comment

* tlsutil: remove TODO and FIXME in cipher suite validation

* agent: prevent setting inherited cipher suite config when TLS 1.3 is specified

* changelog: add entry for converting agent config to TLS types

* agent: remove FIXME in runtime test, this is covered in builder tests with invalid tls9 value now

* tlsutil: remove config tests for values checked at agent config builder boundary

* tlsutil: remove tls version check from loadProtocolConfig

* tlsutil: remove tests and TODOs for logic checked in TestBuilder_tlsVersion and TestBuilder_tlsCipherSuites

* website: update search link for supported Consul agent cipher suites

* website: apply review suggestions for tls_min_version description

* website: attempt to clean up markdown list formatting for tls_min_version

* website: moar linebreaks to fix tls_min_version formatting

* Revert "website: moar linebreaks to fix tls_min_version formatting"

This reverts commit 38585927422f73ebf838a7663e566ac245f2a75c.

* autoconfig: translate old values for TLSMinVersion

* agent: rename var for translated value of deprecated TLS version value

* Update agent/config/deprecated.go

Co-authored-by: Dan Upton <daniel@floppy.co>

* agent: fix lint issue

* agent: fixup deprecated config test assertions for updated warning

Co-authored-by: Dan Upton <daniel@floppy.co>
2022-03-24 15:32:25 -04:00
Kyle Havlovitz 0d5cbf6f30 Sort by partition/ns/servicename instead of the reverse 2022-03-24 12:16:05 -07:00
FFMMM 8f98bbda75
[metrics][rpc]: add basic prefix filter test for new rpc metric (#12598)
Signed-off-by: FFMMM <FFMMM@users.noreply.github.com>
2022-03-23 13:29:12 -07:00
Kyle Havlovitz 1b654c9807 Clean up ent meta id usage in overview summary 2022-03-23 12:47:12 -07:00
Eric 98b733e41a remove gogo from pbservice 2022-03-23 12:18:01 -04:00
Mark Anderson 28c925f6d0 Fixup dropped SecretID usage
Looks like something got munged at some point. Not sure how it slipped in, but my best guess is that because TestTxn_Apply_ACLDeny is marked flaky we didn't block merge because it failed.

Signed-off-by: Mark Anderson <manderson@hashicorp.com>
2022-03-22 21:20:03 -07:00
Kyle Havlovitz 04f1d9bcc9 oss: Add overview UI internal endpoint 2022-03-22 17:05:09 -07:00
Dhia Ayachi 5cb24b9bf8
split `pbcommon` to `pbcommon` and `pbcommongogo` (#12587)
* mogify needed pbcommon structs

* mogify needed pbconnect structs

* fix compilation errors and make config_translate_test pass

* add missing file

* remove redundant oss func declaration

* fix EnterpriseMeta to copy the right data for enterprise

* rename pbcommon package to pbcommongogo

* regenerate proto and mog files

* add missing mog files

* add pbcommon package

* pbcommon no mog

* fix enterprise meta code generation

* fix enterprise meta code generation (pbcommongogo)

* fix mog generation for gogo

* use `protoc-go-inject-tag` to inject tags

* rename proto package

* pbcommon no mog

* use `protoc-go-inject-tag` to inject tags

* add non gogo proto to make file

* fix proto get
2022-03-22 16:30:00 -04:00
Dan Upton 2fe06f663b
streaming: emit events when Connect CA Roots change (#12590)
OSS sync of enterprise changes at 614f786d
2022-03-22 19:13:59 +00:00
FFMMM 76d8798590
factor out recording func, add unit tests (#12585)
Signed-off-by: FFMMM <FFMMM@users.noreply.github.com>
2022-03-22 09:31:54 -07:00
Dan Upton fb441e323a
Restructure gRPC server setup (#12586)
OSS sync of enterprise changes at 0b44395e
2022-03-22 12:40:24 +00:00
FFMMM 08f2838b78
pre register new rpc metric, rename metric (#12582) 2022-03-21 17:26:32 -07:00
Mark Anderson 2b367626f0
Add source of authority annotations to the PermissionDeniedError output. (#12567)
This extends the acl.AllowAuthorizer with source of authority information.

The next step is to unify the AllowAuthorizer and ACLResolveResult structures; that will be done in a separate PR.

Part of #12481

Signed-off-by: Mark Anderson <manderson@hashicorp.com>
2022-03-18 10:32:25 -07:00
Dan Upton 57f0f42733
Support per-listener TLS configuration ⚙️ (#12504)
Introduces the capability to configure TLS differently for Consul's
listeners/ports (i.e. HTTPS, gRPC, and the internal multiplexed RPC
port) which is useful in scenarios where you may want the HTTPS or
gRPC interfaces to present a certificate signed by a well-known/public
CA, rather than the certificate used for internal communication which
must have a SAN in the form `server.<dc>.consul`.
2022-03-18 10:46:58 +00:00
Evan Culver 27711fe5c7
lib: add validation package + DNS label validation (#12535)
Co-authored-by: Chris S. Kim <ckim@hashicorp.com>
2022-03-17 18:31:28 -07:00
FFMMM 3c08843847
[sync oss] add net/rpc interceptor implementation (#12573)
* sync ent changes from 866dcb0667

Signed-off-by: FFMMM <FFMMM@users.noreply.github.com>

* update oss go.mod

Signed-off-by: FFMMM <FFMMM@users.noreply.github.com>
2022-03-17 16:02:26 -07:00
Jared Kirschner 038fd90925
Merge pull request #11821 from hashicorp/error-if-get-request-has-body
http: error if GET request has non-empty body
2022-03-16 18:34:27 -04:00
Jared Kirschner 13712de2e7 http: WARN if GET request has non-empty body
Give the user a hint that they might be doing something wrong if their GET
request has a non-empty body, which can easily happen using curl's
--data-urlencode if specifying request type via "--request GET" rather than
"--get". See https://github.com/hashicorp/consul/issues/11471.
2022-03-16 14:19:50 -07:00
Eric ae1cdc85b1 Remove the stdduration gogo extension 2022-03-16 12:12:29 -04:00
mrspanishviking 1ae820ea0a
Revert "[Docs] Agent configuration hierarchy " 2022-03-15 16:13:58 -07:00
trujillo-adam 667976c94f fixing merge conflicts part 3 2022-03-15 15:25:03 -07:00
Eric Haberkorn 90ecd7b5fe
Merge pull request #12556 from hashicorp/wire-up-serverless-patcher
Create and wire up the serverless patcher
2022-03-15 14:05:40 -04:00
Eric Haberkorn d58f230bb8
Merge pull request #12557 from hashicorp/remove-healthcheck-gogo-stdduration
Remove Gogo Stdduration From the Healthcheck Protobufs
2022-03-15 13:20:49 -04:00
Eric fa13ce3e45 Remove gogo stdduration from the healthcheck protobufs 2022-03-15 10:51:40 -04:00
Eric 07bf0c5ce8 Create and wire up the serverless patcher 2022-03-15 10:12:57 -04:00
trujillo-adam 60a88bb40f merging new hierarchy for agent configuration 2022-03-14 15:44:41 -07:00
Mark Anderson ab099e5fcb
Refactor config checks oss (#12550)
Currently the config_entry.go subsystem delegates authorization decisions via the ConfigEntry interface CanRead and CanWrite code. Unfortunately this returns a true/false value and loses the details of the source.

This is not helpful, especially since it the config subsystem can be more complex to understand, since it covers so many domains.

This refactors CanRead/CanWrite to return a structured error message (PermissionDenied or the like) with more details about the reason for denial.

Part of #12241

Signed-off-by: Mark Anderson <manderson@hashicorp.com>
2022-03-11 13:45:51 -08:00
Eric Haberkorn 3297bc6988
Merge pull request #12536 from hashicorp/add-serverless-config
Add the `connect.enable_serverless_plugin` configuration option
2022-03-11 09:39:36 -05:00
Eric Haberkorn 35cb0c1312
Merge pull request #12539 from hashicorp/make-xds-lib
Make the xdscommon package
2022-03-11 09:21:10 -05:00
Eric 2d27f9dc47 Add the `connect.enable_serverless_plugin` configuration option. 2022-03-11 09:16:00 -05:00
Mark Anderson 5591cb1e11
Bulk acl message fixup oss (#12470)
* First pass for helper for bulk changes

Signed-off-by: Mark Anderson <manderson@hashicorp.com>

* Convert ACLRead and ACLWrite to new form

Signed-off-by: Mark Anderson <manderson@hashicorp.com>

* AgentRead and AgentWRite

Signed-off-by: Mark Anderson <manderson@hashicorp.com>

* Fix EventWrite

Signed-off-by: Mark Anderson <manderson@hashicorp.com>

* KeyRead, KeyWrite, KeyList

Signed-off-by: Mark Anderson <manderson@hashicorp.com>

* KeyRing

Signed-off-by: Mark Anderson <manderson@hashicorp.com>

* NodeRead NodeWrite

Signed-off-by: Mark Anderson <manderson@hashicorp.com>

* OperatorRead and OperatorWrite

Signed-off-by: Mark Anderson <manderson@hashicorp.com>

* PreparedQuery

Signed-off-by: Mark Anderson <manderson@hashicorp.com>

* Intention partial

Signed-off-by: Mark Anderson <manderson@hashicorp.com>

* Fix ServiceRead, Write ,etc

Signed-off-by: Mark Anderson <manderson@hashicorp.com>

* Error check ServiceRead?

Signed-off-by: Mark Anderson <manderson@hashicorp.com>

* Fix Sessionread/Write

Signed-off-by: Mark Anderson <manderson@hashicorp.com>

* Fixup snapshot ACL

Signed-off-by: Mark Anderson <manderson@hashicorp.com>

* Error fixups for txn

Signed-off-by: Mark Anderson <manderson@hashicorp.com>

* Add changelog

Signed-off-by: Mark Anderson <manderson@hashicorp.com>

* Fixup review comments

Signed-off-by: Mark Anderson <manderson@hashicorp.com>
2022-03-10 18:48:27 -08:00
Eric 3f9b84e6a4 Make an xdscommon package that will be shared between Consul and Envoy plugins 2022-03-08 14:57:23 -05:00
Eric Haberkorn b4276e971e
Merge pull request #12529 from hashicorp/add-meta-to-service-config-response
Add `Meta` to `ServiceConfigResponse`
2022-03-07 16:35:21 -05:00
Eric Haberkorn 45312886fe Code review changes 2022-03-07 14:39:33 -05:00
R.B. Boyer cc4733e60d
proxycfg: change how various proxycfg test helpers for making ConfigSnapshot copies works to be more correct and less error prone (#12531)
Prior to this PR for the envoy xDS golden tests in the agent/xds package we
were hand-creating a proxycfg.ConfigSnapshot structure in the proper format for
input to the xDS generator. Over time this intermediate structure has gotten
trickier to build correctly for the various tests.

This PR proposes to switch to using the existing mechanism for turning a
structs.NodeService and a sequence of cache.UpdateEvent copies into a
proxycfg.ConfigSnapshot, as that is less error prone to construct and aligns
more with how the data arrives.

NOTE: almost all of this is in test-related code. I tried super hard to craft
correct event inputs to get the golden files to be the same, or similar enough
after construction to feel ok that i recreated the spirit of the original test
cases.
2022-03-07 11:47:14 -06:00
Eric 3d46f9ef7c Add `Meta` to `ServiceConfigResponse` 2022-03-07 10:05:18 -05:00
R.B. Boyer b63a0f3909
reduce flakiness/raciness of errNotFound and errNotChanged blocking query tests (#12518)
Improves tests from #12362

These tests try to setup the following concurrent scenario:

1. (goroutine 1) execute read RPC with index=0
2. (goroutine 1) get response from (1) @ index=10
3. (goroutine 1) execute read RPC with index=10 and block
4. (goroutine 2) WHILE (3) is blocking, start slamming the system with stray writes that will cause the WatchSet to wakeup
5. (goroutine 2) after doing all writes, shut down the reader above
6. (goroutine 1) stops reading, double checks that it only ever woke up once (from 1)
2022-03-04 11:20:01 -06:00
R.B. Boyer 07b92a2855
server: fix spurious blocking query suppression for discovery chains (#12512)
Minor fix for behavior in #12362

IsDefault sometimes returns true even if there was a proxy-defaults or service-defaults config entry that was consulted. This PR fixes that.
2022-03-03 16:54:41 -06:00
Daniel Nephin 8f4b6af68a
Merge pull request #12298 from jorgemarey/b-persistnewrootandconfig
Avoid raft change when no config is provided on persistNewRootAndConfig
2022-03-03 11:03:50 -05:00
Daniel Nephin 2082bdc286 ca: make sure the test fails without the fix
Also change the path used for the secondary so that both primary and secondary do not overwrite each other.
2022-03-02 18:22:49 -05:00
R.B. Boyer 679cea7171
raft: upgrade to v1.3.6 (#12496)
Add additional protections on the Consul side to prevent NonVoters from bootstrapping raft.

This should un-flake TestServer_Expect_NonVoters
2022-03-02 17:00:02 -06:00
Daniel Nephin 849d86e7f5
Merge pull request #12467 from hashicorp/dnephin/ci-vault-test-safer
ca: require that tests that use Vault are named correctly
2022-03-01 12:54:02 -05:00
R.B. Boyer 033e0ed13f
test: parallelize more of TestLeader_ReapOrLeftMember_IgnoreSelf (#12468)
before:

    $ go test ./agent/consul -run TestLeader_ReapOrLeftMember_IgnoreSelf
    ok  	github.com/hashicorp/consul/agent/consul	21.147s

after:

    $ go test ./agent/consul -run TestLeader_ReapOrLeftMember_IgnoreSelf
    ok  	github.com/hashicorp/consul/agent/consul	5.402s
2022-03-01 10:30:06 -06:00
Jorge Marey aba9e724a8 Fix vault test with suggested changes 2022-03-01 10:20:00 +01:00
Jorge Marey 8b1b264b6f Add test case to verify #12298 2022-03-01 09:25:52 +01:00
Jorge Marey 2ca00df0d8 Avoid raft change when no config is provided on CAmanager
- This avoids a change to the raft store when no roots or config
are provided to persistNewRootAndConfig
2022-03-01 09:25:52 +01:00
Daniel Nephin dd565aa5e4 ca: fix a test
This test does not use Vault, so does not need ca.SkipIfVaultNotPresent
2022-02-28 16:26:18 -05:00
Daniel Nephin bb7f2f15b3 ca: require that tests that use Vault are named correctly
Previously we were using two different criteria to decide where to run a
test.  The main `go-test` job would skip Vault tests based on the
presence of the `vault` binary, but the `test-connect-ca-providers` job
would run tests based on the name.

This led to a scenario where a test may never run in CI.

To fix this problem I added a name check to the function we use to skip
the test. This should ensure that any test that requires vault is named
correctly to be run as part of the `test-connect-ca-providers` job.

At the same time I relaxed the regex we use. I verified this runs the
same tests using `go test --list Vault`.  I made this change because a
bunch of tests in `agent/connect/ca` used `Vault` in the name, without
the underscores. Instead of changing a bunch of test names, this seemed
easier.

With this approach, the worst case is that we run a few extra tests in
the `test-connect-ca-providers` job, which doesn't seem like a problem.
2022-02-28 16:13:53 -05:00
R.B. Boyer 3804677570
server: suppress spurious blocking query returns where multiple config entries are involved (#12362)
Starting from and extending the mechanism introduced in #12110 we can specially handle the 3 main special Consul RPC endpoints that react to many config entries in a single blocking query in Connect:

- `DiscoveryChain.Get`
- `ConfigEntry.ResolveServiceConfig`
- `Intentions.Match`

All of these will internally watch for many config entries, and at least one of those will likely be not found in any given query. Because these are blends of multiple reads the exact solution from #12110 isn't perfectly aligned, but we can tweak the approach slightly and regain the utility of that mechanism.

### No Config Entries Found

In this case, despite looking for many config entries none may be found at all. Unlike #12110 in this scenario we do not return an empty reply to the caller, but instead synthesize a struct from default values to return. This can be handled nearly identically to #12110 with the first 1-2 replies being non-empty payloads followed by the standard spurious wakeup suppression mechanism from #12110.

### No Change Since Last Wakeup

Once a blocking query loop on the server has completed and slept at least once, there is a further optimization we can make here to detect if any of the config entries that were present at specific versions for the prior execution of the loop are identical for the loop we just woke up for. In that scenario we can return a slightly different internal sentinel error and basically externally handle it similar to #12110.

This would mean that even if 20 discovery chain read RPC handling goroutines wakeup due to the creation of an unrelated config entry, the only ones that will terminate and reply with a blob of data are those that genuinely have new data to report.

### Extra Endpoints

Since this pattern is pretty reusable, other key config-entry-adjacent endpoints used by `agent/proxycfg` also were updated:

- `ConfigEntry.List`
- `Internal.IntentionUpstreams` (tproxy)
2022-02-25 15:46:34 -06:00
Chris S. Kim aea00f10ae
Merge pull request #12442 from danieleva/12422-keyring
Allows keyring operations on client agents
2022-02-25 16:28:56 -05:00
Evan Culver 7889071385
connect: Update supported Envoy versions to include 1.19.3 and 1.18.6 2022-02-24 16:59:33 -08:00
Evan Culver 9f4d9f3f74
connect: Upgrade Envoy 1.20 to 1.20.2 (#12443) 2022-02-24 16:19:39 -08:00
R.B. Boyer 4b0f657b31
fix flaky test panic (#12446) 2022-02-24 17:35:46 -06:00
R.B. Boyer a97d20cf63
catalog: compare node names case insensitively in more places (#12444)
Many places in consul already treated node names case insensitively.
The state store indexes already do it, but there are a few places that
did a direct byte comparison which have now been corrected.

One place of particular consideration is ensureCheckIfNodeMatches
which is executed during snapshot restore (among other places). If a
node check used a slightly different casing than the casing of the node
during register then the snapshot restore here would deterministically
fail. This has been fixed.

Primary approach:

    git grep -i "node.*[!=]=.*node" -- ':!*_test.go' ':!docs'
    git grep -i '\[[^]]*member[^]]*\]
    git grep -i '\[[^]]*\(member\|name\|node\)[^]]*\]' -- ':!*_test.go' ':!website' ':!ui' ':!agent/proxycfg/testing.go:' ':!*.md'
2022-02-24 16:54:47 -06:00
Daniele Vazzola 397b5ed957 Allows keyring operations on client agents 2022-02-24 17:24:57 +00:00
R.B. Boyer d860384731
server: partly fix config entry replication issue that prevents replication in some circumstances (#12307)
There are some cross-config-entry relationships that are enforced during
"graph validation" at persistence time that are required to be
maintained. This means that config entries may form a digraph at times.

Config entry replication procedes in a particular sorted order by kind
and name.

Occasionally there are some fixups to these digraphs that end up
replicating in the wrong order and replicating the leaves
(ingress-gateway) before the roots (service-defaults) leading to
replication halting due to a graph validation error related to things
like mismatched service protocol requirements.

This PR changes replication to give each computed change (upsert/delete)
a fair shot at being applied before deciding to terminate that round of
replication in error. In the case where we've simply tried to do the
operations in the wrong order at least ONE of the outstanding requests
will complete in the right order, leading the subsequent round to have
fewer operations to do, with a smaller likelihood of graph validation
errors.

This does not address all scenarios, but for scenarios where the edits
are being applied in the wrong order this should avoid replication
halting.

Fixes #9319

The scenario that is NOT ADDRESSED by this PR is as follows:

1. create: service-defaults: name=new-web, protocol=http
2. create: service-defaults: name=old-web, protocol=http
3. create: service-resolver: name=old-web, redirect-to=new-web
4. delete: service-resolver: name=old-web
5. update: service-defaults: name=old-web, protocol=grpc
6. update: service-defaults: name=new-web, protocol=grpc
7. create: service-resolver: name=old-web, redirect-to=new-web

If you shutdown dc2 just before (4) and turn it back on after (7)
replication is impossible as there is no single edit you can make to
make forward progress.
2022-02-23 17:27:48 -06:00
Chris S. Kim 4b528edbe6
Merge pull request #12430 from hashicorp/ci/main-assetfs-build
auto-updated agent/uiserver/bindata_assetfs.go from commit 73b6687c5
2022-02-23 18:19:30 -05:00
Daniel Nephin 3639f4b551
Merge pull request #11910 from hashicorp/dnephin/ca-provider-interface-for-ica-in-primary
ca: add support for an external trusted CA
2022-02-22 13:14:52 -05:00
R.B. Boyer 11fdc70b34
configentry: make a new package to hold shared config entry structs that aren't used for RPC or the FSM (#12384)
First two candidates are ConfigEntryKindName and DiscoveryChainConfigEntries.
2022-02-22 10:36:36 -06:00
Dhia Ayachi 378f688a6a
file watcher to be used for configuration auto-reload feature (#12301)
* add config watcher to the config package

* add logging to watcher

* add test and refactor to add WatcherEvent.

* add all API calls and fix a bug with recreated files

* add tests for watcher

* remove the unnecessary use of context

* Add debug log and a test for file rename

* use inode to detect if the file is recreated/replaced and only listen to create events.

* tidy ups (#1535)

* tidy ups

* Add tests for inode reconcile

* fix linux vs windows syscall

* fix linux vs windows syscall

* fix windows compile error

* increase timeout

* use ctime ID

* remove remove/creation test as it's a use case that fail in linux

* fix linux/windows to use Ino/CreationTime

* fix the watcher to only overwrite current file id

* fix linter error

* fix remove/create test

* set reconcile loop to 200 Milliseconds

* fix watcher to not trigger event on remove, add more tests

* on a remove event try to add the file back to the watcher and trigger the handler if success

* fix race condition

* fix flaky test

* fix race conditions

* set level to info

* fix when file is removed and get an event for it after

* fix to trigger handler when we get a remove but re-add fail

* fix error message

* add tests for directory watch and fixes

* detect if a file is a symlink and return an error on Add

* rename Watcher to FileWatcher and remove symlink deref

* add fsnotify@v1.5.1

* fix go mod

* fix flaky test

* Apply suggestions from code review

Co-authored-by: Ashwin Venkatesh <ashwin@hashicorp.com>

* fix a possible stack overflow

* do not reset timer on errors, rename OS specific files

* start the watcher when creating it

* fix data race in tests

* rename New func

* do not call handler when a remove event happen

* events trigger on write and rename

* fix watcher tests

* make handler async

* remove recursive call

* do not produce events for sub directories

* trim "/" at the end of a directory when adding

* add missing test

* fix logging

* add todo

* fix failing test

* fix flaking tests

* fix flaky test

* add logs

* fix log text

* increase timeout

* reconcile when remove

* check reconcile when removed

* fix reconcile move test

* fix logging

* delete invalid file

* Apply suggestions from code review

Co-authored-by: R.B. Boyer <4903+rboyer@users.noreply.github.com>

* fix review comments

* fix is watched to properly catch a remove

* change test timeout

* fix test and rename id

* fix test to create files with different mod time.

* fix deadlock when stopping watcher

* Apply suggestions from code review

Co-authored-by: R.B. Boyer <4903+rboyer@users.noreply.github.com>

* fix a deadlock when calling stop while emitting event is blocked

* make sure to close the event channel after the event loop is done

* add go doc

* back date file instead of sleeping

* Apply suggestions from code review

Co-authored-by: R.B. Boyer <4903+rboyer@users.noreply.github.com>

* check error

Co-authored-by: Ashwin Venkatesh <ashwin@hashicorp.com>
Co-authored-by: R.B. Boyer <4903+rboyer@users.noreply.github.com>
2022-02-21 11:36:52 -05:00
hc-github-team-consul-core ef5b6c8415 auto-updated agent/uiserver/bindata_assetfs.go from commit 73b6687c5 2022-02-21 12:27:52 +00:00
Evan Culver 067223337d
checks: populate interval and timeout when registering services (#11138) 2022-02-18 12:05:33 -08:00
Kyle Havlovitz 9c03b5dc3d
Merge pull request #12385 from hashicorp/tproxy-http-upstream-fix
xds: respect chain protocol on default discovery chain
2022-02-18 10:08:59 -08:00
Daniel Nephin cb1a80184f rpc: set response to nil when not found
Otherwise when the query times out we might incorrectly send a value for
the reply, when we should send an empty reply.

Also document errNotFound and how to handle the result in that case.
2022-02-18 12:26:06 -05:00
Daniel Nephin 79820738cc ca: test that original certs from secondary still verify
There's a chance this could flake if the secondary hasn't received the
update yet, but running this test many times doesn't show any flakes
yet.
2022-02-17 18:45:16 -05:00
Daniel Nephin ca4e60e09b Update TODOs to reference an issue with more details
And remove a no longer needed TODO
2022-02-17 18:21:30 -05:00
Daniel Nephin 0abaf29c10 ca: add test cases for rotating external trusted CA 2022-02-17 18:21:30 -05:00
Daniel Nephin aacc40012f ca: add a test for secondary with external CA 2022-02-17 18:21:30 -05:00
Daniel Nephin 471b2098bb ca: examine the full chain in newCARoot
make TestNewCARoot much more strict
compare the full result instead of only a few fields.
add a test case with 2 and 3 certificates in the pem
2022-02-17 18:21:30 -05:00
Daniel Nephin fc6c0ec139 ca: small docs improvements 2022-02-17 18:21:30 -05:00
Daniel Nephin af651eaaad ca: cleanup validateSetIntermediate 2022-02-17 18:21:30 -05:00
Daniel Nephin ef03f7be73 ca: only return the leaf cert from Sign in vault provider
The interface is documented as 'Sign will only return the leaf', and the other providers
only return the leaf. It seems like this was added during the initial implementation, so
is likely just something we missed. It doesn't break anything , but it does cause confusing cert chains
in the API response which could break something in the future.
2022-02-17 18:21:30 -05:00
Daniel Nephin 2d5254a73b
Merge pull request #12110 from hashicorp/dnephin/blocking-queries-not-found
rpc: make blocking queries for non-existent items more efficient
2022-02-17 18:09:39 -05:00
Ashwin Venkatesh 39be071264
Parse datacenter from request (#12370)
* Parse datacenter from request
- Parse the value of the datacenter from the create/delete requests for AuthMethods and BindingRules so that they can be created in and deleted from the datacenters specified in the request.
2022-02-17 16:41:27 -05:00
Kyle Havlovitz 58172c260b xds: respect chain protocol on default discovery chain 2022-02-17 11:47:20 -08:00
Florian Apolloner 895da50986
Support for connect native services in topology view. (#12098) 2022-02-16 16:51:54 -05:00
Chris S. Kim 18096fd2fb
Move IndexEntryName helpers to common files (#12365) 2022-02-16 12:56:38 -05:00
Daniel Nephin 06657e5be0 rpc: add errNotFound to all Get queries
Any query that returns a list of items is not part of this commit.
2022-02-15 18:24:34 -05:00
Daniel Nephin bdafa24c50 Make blockingQuery efficient with 'not found' results.
By using the query results as state.

Blocking queries are efficient when the query matches some results,
because the ModifyIndex of those results, returned as queryMeta.Mindex,
will never change unless the items themselves change.

Blocking queries for non-existent items are not efficient because the
queryMeta.Index can (and often does) change when other entities are
written.

This commit reduces the churn of these queries by using a different
comparison for "has changed". Instead of using the modified index, we
use the existence of the results. If the previous result was "not found"
and the new result is still "not found", we know we can ignore the
modified index and continue to block.

This is done by setting the minQueryIndex to the returned
queryMeta.Index, which prevents the query from returning before a state
change is observed.
2022-02-15 18:24:33 -05:00
Daniel Nephin 6e73df7dc2 Add a test for blocking query on non-existent entry
This test shows how blocking queries are not efficient when the query
returns no results.  The test fails with 100+ calls instead of the
expected 2.

This test is still a bit flaky because it depends on the timing of the
writes. It can sometimes return 3 calls.

A future commit should fix this and make blocking queries even more
optimal for not-found results.
2022-02-15 18:23:17 -05:00
Daniel Nephin a4e1c59cd8 rpc: improve docs for blockingQuery
Follow the Go convention of accepting a small interface that documents
the methods used by the function.

Clarify the rules for implementing a query function passed to
blockingQuery.
2022-02-15 14:20:14 -05:00
R.B. Boyer b216d52b66
server: conditionally avoid writing a config entry to raft if it was already the same (#12321)
This will both save on unnecessary raft operations as well as
unnecessarily incrementing the raft modify index of config entries
subject to no-op updates.
2022-02-14 14:39:12 -06:00
FFMMM 1f8fb17be7
Vendor in rpc mono repo for net/rpc fork, go-msgpack, msgpackrpc. (#12311)
This commit syncs ENT changes to the OSS repo.

Original commit details in ENT:

```
commit 569d25f7f4578981c3801e6e067295668210f748
Author: FFMMM <FFMMM@users.noreply.github.com>
Date:   Thu Feb 10 10:23:33 2022 -0800

    Vendor fork net rpc (#1538)

    * replace net/rpc w consul-net-rpc/net/rpc

    Signed-off-by: FFMMM <FFMMM@users.noreply.github.com>

    * replace msgpackrpc and go-msgpack with fork from mono repo

    Signed-off-by: FFMMM <FFMMM@users.noreply.github.com>

    * gofmt all files touched

    Signed-off-by: FFMMM <FFMMM@users.noreply.github.com>
```

Signed-off-by: FFMMM <FFMMM@users.noreply.github.com>
2022-02-14 09:45:45 -08:00
R.B. Boyer d54a3e6aa1
missed this test adjustment (#12331) 2022-02-14 11:39:00 -06:00
R.B. Boyer 0b80f70a39
local: fixes a data race in anti-entropy sync (#12324)
The race detector noticed this initially in `TestAgentConfigWatcherSidecarProxy` but it is not restricted to just tests.

The two main changes here were:

- ensure that before we mutate the internal `agent/local` representation of a Service (for tags or VIPs) we clone those fields
- ensure that there's no function argument joint ownership between the caller of a function and the local state when calling `AddService`, `AddCheck`, and related using `copystructure` for now.
2022-02-14 10:41:33 -06:00
Dao Thanh Tung 0519a9240e
URL-encode/decode resource names for HTTP API part 5 (#12297) 2022-02-14 10:47:06 -05:00
Mark Anderson fa95afdcf6 Refactor to make ACL errors more structured. (#12308)
* First phase of refactoring PermissionDeniedError

Add extended type PermissionDeniedByACLError that captures information
about the accessor, particular permission type and the object and name
of the thing being checked.

It may be worth folding the test and error return into a single helper
function, that can happen at a later date.

Signed-off-by: Mark Anderson <manderson@hashicorp.com>
2022-02-11 12:53:23 -08:00
Freddy f45bec7779
Merge pull request #12223 from hashicorp/proxycfg/passthrough-cleanup 2022-02-10 17:35:51 -07:00
freddygv 8eaca35df1 Account for upstream targets in another DC.
Transparent proxies typically cannot dial upstreams in remote
datacenters. However, if their upstream configures a redirect to a
remote DC then the upstream targets will be in another datacenter.

In that sort of case we should use the WAN address for the passthrough.
2022-02-10 17:01:57 -07:00
freddygv 7fba7456ec Fix race of upstreams with same passthrough ip
Due to timing, a transparent proxy could have two upstreams to dial
directly with the same address.

For example:
- The orders service can dial upstreams shipping and payment directly.
- An instance of shipping at address 10.0.0.1 is deregistered.
- Payments is scaled up and scheduled to have address 10.0.0.1.
- The orders service receives the event for the new payments instance
before seeing the deregistration for the shipping instance. At this
point two upstreams have the same passthrough address and Envoy will
reject the listener configuration.

To disambiguate this commit considers the Raft index when storing
passthrough addresses. In the example above, 10.0.0.1 would only be
associated with the newer payments service instance.
2022-02-10 17:01:57 -07:00
freddygv d5a2eb677f Ensure passthrough addresses get cleaned up
Transparent proxies can set up filter chains that allow direct
connections to upstream service instances. Services that can be dialed
directly are stored in the PassthroughUpstreams map of the proxycfg
snapshot.

Previously these addresses were not being cleaned up based on new
service health data. The list of addresses associated with an upstream
service would only ever grow.

As services scale up and down, eventually they will have instances
assigned to an IP that was previously assigned to a different service.
When IP addresses are duplicated across filter chain match rules the
listener config will be rejected by Envoy.

This commit updates the proxycfg snapshot management so that passthrough
addresses can get cleaned up when no longer associated with a given
upstream.

There is still the possibility of a race condition here where due to
timing an address is shared between multiple passthrough upstreams.
That concern is mitigated by #12195, but will be further addressed
in a follow-up.
2022-02-10 17:01:57 -07:00
Freddy bb129384b7
Prevent xDS tight loop on cfg errors (#12195) 2022-02-10 15:37:36 -07:00
Dhia Ayachi de7598f064
fix race when starting a service while the agent `serviceManager` is … (#12302)
* fix race when starting a service while the agent `serviceManager` is stopping

* add changelog
2022-02-10 13:30:49 -05:00
Daniel Nephin db4675bd1a
Merge pull request #12277 from hashicorp/dnephin/panic-in-service-register
catalog: initialize the refs map to prevent a nil panic
2022-02-09 19:48:22 -05:00
Daniel Nephin 6376141464 config-entry: fix a panic when registering a service or ingress gateway 2022-02-09 18:49:48 -05:00
R.B. Boyer 0cd0d505fa
xds: allow only one outstanding delta request at a time (#12236)
Fixes #11876

This enforces that multiple xDS mutations are not issued on the same ADS connection at once, so that we can 100% control the order that they are applied. The original code made assumptions about the way multiple in-flight mutations were applied on the Envoy side that was incorrect.
2022-02-08 10:36:48 -06:00
Daniel Nephin c20412ab14
Merge pull request #12265 from hashicorp/dnephin/logging-in-tests
sdk: add TestLogLevel for setting log level in tests
2022-02-07 16:11:23 -05:00
Daniel Nephin 5a0e6700c1 A test to reproduce the issue 2022-02-04 14:04:12 -05:00
Daniel Nephin 7b466a024b Make test more readable
And fix typo
2022-02-03 18:44:09 -05:00
Daniel Nephin 6721c1246d ca: relax and move private key type/bit validation for vault
This commit makes two changes to the validation.

Previously we would call this validation in GenerateRoot, which happens
both on initialization (when a follower becomes leader), and when a
configuration is updated. We only want to do this validation during
config update so the logic was moved to the UpdateConfiguration
function.

Previously we would compare the config values against the actual cert.
This caused problems when the cert was created manually in Vault (not
created by Consul).  Now we compare the new config against the previous
config. Using a already created CA cert should never error now.

Adding the key bit and types to the config should only error when
the previous values were not the defaults.
2022-02-03 17:21:20 -05:00
Daniel Nephin 3b78f81f9a ca: small cleanup of TestConnectCAConfig_Vault_TriggerRotation_Fails
Before adding more test cases
2022-02-03 17:21:20 -05:00
Daniel Nephin f6d7a0f7b2 testing: fix test failures caused by new log level
These two tests require debug logging enabled, because they look for log lines.

Also switched to testify assertions because the previous errors were not clear.
2022-02-03 17:07:39 -05:00
Daniel Nephin 1a9a656a7f sdk: add TestLogLevel for setting log level in tests
And default log level to WARN.
2022-02-03 13:42:28 -05:00
Daniel Nephin 44f9229b96 ca: add a test that uses an intermediate CA as the primary CA
This test found a bug in the secondary. We were appending the root cert
to the PEM, but that cert was already appended. This was failing
validation in Vault here:
https://github.com/hashicorp/vault/blob/sdk/v0.3.0/sdk/helper/certutil/types.go#L329

Previously this worked because self signed certs have the same
SubjectKeyID and AuthorityKeyID. So having the same self-signed cert
repeated doesn't fail that check.

However with an intermediate that is not self-signed, those values are
different, and so we fail the check. A test I added in a previous commit
should show that this continues to work with self-signed root certs as
well.
2022-02-02 13:41:35 -05:00
Daniel Nephin d00a9abca2 acl: un-embed ACLIdentity
This is safer than embedding two interface because there are a number of
places where we check the concrete type. If we check the concrete type
on the top-level interface it will fail. So instead expose the
ACLIdentity from a method.
2022-02-02 12:07:31 -05:00
Daniel Nephin 18ff00f985
Merge pull request #12167 from hashicorp/dnephin/acl-resolve-token-3
acl: rename ResolveTokenToIdentityAndAuthorizer to ResolveToken
2022-01-31 19:21:06 -05:00
Daniel Nephin ff64c13c3e
Merge pull request #12166 from hashicorp/dnephin/acl-resolve-token-2
acl: remove ResolveTokenToIdentity
2022-01-31 19:19:21 -05:00
Daniel Nephin aa4dbe2a17 acl: rename ResolveTokenToIdentityAndAuthorizer to ResolveToken
This change allows us to remove one of the last remaining duplicate
resolve token methods (Server.ResolveToken).

With this change we are down to only 2, where the second one also
handles setting the default EnterpriseMeta from the token.
2022-01-31 18:04:19 -05:00
Daniel Nephin 57eac90cae acl: remove unused methods on fakes, and add changelog
Also document the metric that was removed in a previous commit.
2022-01-31 17:53:53 -05:00
Daniel Nephin 1fb2d49826
Merge pull request #12165 from hashicorp/dnephin/acl-resolve-token
acl: remove some of the duplicate resolve token methods
2022-01-31 13:27:49 -05:00
Mathew Estafanous 1113a7533c
Change error-handling across handlers. (#12225) 2022-01-31 11:17:35 -05:00
Fulvio eff69b484b
URL-encode/decode resource names for HTTP API part 4 (#12190) 2022-01-28 15:01:47 -05:00
Dan Upton ebdda4848f
streaming: split event buffer by key (#12080) 2022-01-28 12:27:00 +00:00
freddygv 68dea758dd Add failing test
The updated test fails because passthrough upstream addresses are not
being cleaned up.
2022-01-27 18:56:47 -07:00
Daniel Nephin fa8ff28a63 ca/provider: remove ActiveRoot from Provider 2022-01-27 13:07:37 -05:00
Daniel Nephin 722e3a6ac4 ca: update MockProvider for new interface 2022-01-27 12:51:35 -05:00
Daniel Nephin 80f215675c ca: update GenerateRoot godoc 2022-01-27 12:51:35 -05:00
Daniel Nephin d56a1dfb2c
Merge pull request #11663 from hashicorp/dnephin/ca-remove-one-call-to-active-root-2
ca: remove second call to Provider.ActiveRoot
2022-01-27 12:41:05 -05:00
Daniel Nephin d3324d0d27
Merge pull request #12109 from hashicorp/dnephin/blocking-query-1
rpc: make blockingQuery easier to read
2022-01-26 18:13:55 -05:00
Daniel Nephin 6fe2311ce0 acl: Remove a call to aclAccessorID
I missed this on the first pass, we no longer need to look up this ID, because we have it
from the Authorizer.
2022-01-26 17:21:45 -05:00
Daniel Nephin 14a40fab1a
Merge pull request #11221 from hashicorp/dnephin/acl-resolver-5
acl: extract a backend type for the ACLResolverBackend
2022-01-26 16:57:03 -05:00
Dao Thanh Tung 42d6c61b62
URL-encode/decode resource names for HTTP API part 3 (#12103) 2022-01-26 13:12:42 -05:00
Daniel Nephin 74dc9925cc Apply suggestions from code review
Co-authored-by: Chris S. Kim <ckim@hashicorp.com>
2022-01-26 12:24:13 -05:00
Daniel Nephin 2c311161cc acl: extract a backend type for the ACLResolverBackend
This is a small step to isolate the functionality that is used for the
ACLResolver from the large Client and Server structs.
2022-01-26 12:24:10 -05:00
R.B. Boyer b999b3edfc
xds: fix for delta xDS reconnect bug in LDS/CDS (#12174)
When a wildcard xDS type (LDS/CDS/SRDS) reconnects from a delta xDS stream,
prior to envoy `1.19.0` it would populate the `ResourceNamesSubscribe` field
with the full list of currently subscribed items, instead of simply omitting it
to infer that it wanted everything (which is what wildcard mode means).

This upstream issue was filed in envoyproxy/envoy#16063 and fixed in
envoyproxy/envoy#16153 which went out in Envoy `1.19.0` and is fixed in later
versions (later refactored in envoyproxy/envoy#16855).

This PR conditionally forces LDS/CDS to be wildcard-only even when the
connected Envoy requests a non-wildcard subscription, but only does so on
versions prior to `1.19.0`, as we should not need to do this on later versions.

This fixes the failure case as described here: #11833 (comment)

Co-authored-by: Huan Wang <fredwanghuan@gmail.com>
2022-01-25 11:24:27 -06:00
Daniel Nephin c1da07e2ea acl: remove calls to ResolveIdentityFromToken
We already have an ACLResolveResult, so we can get the accessor ID from
it.
2022-01-22 15:05:42 -05:00
Daniel Nephin ed1cc5f255 acl: remove ResolveTokenToIdentity
By exposing the AccessorID from the primary ResolveToken method we can
remove this duplication.
2022-01-22 14:47:59 -05:00
Daniel Nephin 26f0ebd96f acl: return a resposne from ResolveToken that includes the ACLIdentity
So that we can duplicate duplicate methods.
2022-01-22 14:33:09 -05:00
Daniel Nephin 314614f073 acl: remove duplicate methods
Now that ACLResolver is embedded we don't need ResolveTokenToIdentity on
Client and Server.

Moving ResolveTokenAndDefaultMeta to ACLResolver removes the duplicate
implementation.
2022-01-22 14:12:08 -05:00
Daniel Nephin 62c09b2d0a acl: embed ACLResolver in Client and Server
In preparation for removing duplicate resolve token methods.
2022-01-22 14:07:26 -05:00
Chris S. Kim 9ef448dedd
Generate bindata_assetfs.go (#12146) 2022-01-21 16:06:44 -05:00
R.B. Boyer 05c7373a28 bulk rewrite using this script
set -euo pipefail

    unset CDPATH

    cd "$(dirname "$0")"

    for f in $(git grep '\brequire := require\.New(' | cut -d':' -f1 | sort -u); do
        echo "=== require: $f ==="
        sed -i '/require := require.New(t)/d' $f
        # require.XXX(blah) but not require.XXX(tblah) or require.XXX(rblah)
        sed -i 's/\brequire\.\([a-zA-Z0-9_]*\)(\([^tr]\)/require.\1(t,\2/g' $f
        # require.XXX(tblah) but not require.XXX(t, blah)
        sed -i 's/\brequire\.\([a-zA-Z0-9_]*\)(\(t[^,]\)/require.\1(t,\2/g' $f
        # require.XXX(rblah) but not require.XXX(r, blah)
        sed -i 's/\brequire\.\([a-zA-Z0-9_]*\)(\(r[^,]\)/require.\1(t,\2/g' $f
        gofmt -s -w $f
    done

    for f in $(git grep '\bassert := assert\.New(' | cut -d':' -f1 | sort -u); do
        echo "=== assert: $f ==="
        sed -i '/assert := assert.New(t)/d' $f
        # assert.XXX(blah) but not assert.XXX(tblah) or assert.XXX(rblah)
        sed -i 's/\bassert\.\([a-zA-Z0-9_]*\)(\([^tr]\)/assert.\1(t,\2/g' $f
        # assert.XXX(tblah) but not assert.XXX(t, blah)
        sed -i 's/\bassert\.\([a-zA-Z0-9_]*\)(\(t[^,]\)/assert.\1(t,\2/g' $f
        # assert.XXX(rblah) but not assert.XXX(r, blah)
        sed -i 's/\bassert\.\([a-zA-Z0-9_]*\)(\(r[^,]\)/assert.\1(t,\2/g' $f
        gofmt -s -w $f
    done
2022-01-20 10:46:23 -06:00
R.B. Boyer c12b0ee3d2 test: normalize require.New and assert.New syntax 2022-01-20 10:45:56 -06:00
R.B. Boyer baf886c6f3
proxycfg: introduce explicit UpstreamID in lieu of bare string (#12125)
The gist here is that now we use a value-type struct proxycfg.UpstreamID
as the map key in ConfigSnapshot maps where we used to use "upstream
id-ish" strings. These are internal only and used just for bidirectional
trips through the agent cache keyspace (like the discovery chain target
struct).

For the few places where the upstream id needs to be projected into xDS,
that's what (proxycfg.UpstreamID).EnvoyID() is for. This lets us ALWAYS
inject the partition and namespace into these things without making
stuff like the golden testdata diverge.
2022-01-20 10:12:04 -06:00
Dan Upton 088ba2edaf
[OSS] Remove remaining references to master (#11827) 2022-01-20 12:47:50 +00:00
VictorBac 145703972a
Add GRPC and GRPCUseTLS to api.HealthCheckDefinition (#12108)
* Add GRPC to HealthCheckDefinition

* add GRPC and GRPCUseTLS
2022-01-19 16:09:15 -05:00
Evan Culver ec65890f01
connect: Upgrade Envoy 1.20 to 1.20.1 (#11895) 2022-01-18 14:35:27 -05:00
Daniel Nephin 59206e38c7 rpc: cleanup exit and blocking condition logic in blockingQuery
Remove some unnecessary comments around query_blocking metric. The only
line that needs any comments in the atomic decrement.

Cleanup the block and return comments and logic. The old comment about
AbandonCh may have been relevant before, but it is expected behaviour
now.

The logic was simplified by inverting the err condition.
2022-01-17 16:59:25 -05:00
Daniel Nephin a28d1268cb rpc: extract rpcQueryTimeout method
This helps keep the logic in blockingQuery more focused. In the future we
may have a separate struct for RPC queries which may allow us to move this
off of Server.
2022-01-17 16:59:25 -05:00
Daniel Nephin 751bc2e7d3 rpc: move the index defaulting to setQueryMeta.
This safeguard should be safe to apply in general. We are already
applying it to non-blocking queries that call blockingQuery, so it
should be fine to apply it to others.
2022-01-17 16:59:25 -05:00
Daniel Nephin 95e471052b rpc: add subtests to blockingQuery test 2022-01-17 16:59:25 -05:00
Daniel Nephin 6bf8efe607 rpc: refactor blocking query
To remove the TODO, and make it more readable.

In general this reduces the scope of variables, making them easier to reason about.
It also introduces more early returns so that we can see the flow from the structure
of the function.
2022-01-17 16:58:47 -05:00
Daniel Nephin 1971a58b29
Merge pull request #11661 from hashicorp/dnephin/ca-remove-one-call-to-active-root
ca: remove one call to Provider.ActiveRoot
2022-01-13 16:48:12 -05:00
Kyle Havlovitz 2ba76486d0 Add virtual IP generation for term gateway backed services 2022-01-12 12:08:49 -08:00
Chris S. Kim 4330a6a21a
Fix race with tags (#12041) 2022-01-12 11:24:51 -05:00
Chris S. Kim 4f0a3a997c
Fix races in anti-entropy tests (#12028) 2022-01-11 14:28:51 -05:00
Mike Morris 277c41d336
ingress: allow setting TLS min version and cipher suites in ingress gateway config entries (#11576)
* xds: refactor ingress listener SDS configuration

* xds: update resolveListenerSDS call args in listeners_test

* ingress: add TLS min, max and cipher suites to GatewayTLSConfig

* xds: implement envoyTLSVersions and envoyTLSCipherSuites

* xds: merge TLS config

* xds: configure TLS parameters with ingress TLS context from leaf

* xds: nil check in resolveListenerTLSConfig validation

* xds: nil check in makeTLSParameters* functions

* changelog: add entry for TLS params on ingress config entries

* xds: remove indirection for TLS params in TLSConfig structs

* xds: return tlsContext, nil instead of ambiguous err

Co-authored-by: Chris S. Kim <ckim@hashicorp.com>

* xds: switch zero checks to types.TLSVersionUnspecified

* ingress: add validation for ingress config entry TLS params

* ingress: validate listener TLS config

* xds: add basic ingress with TLS params tests

* xds: add ingress listeners mixed TLS min version defaults precedence test

* xds: add more explicit tests for ingress listeners inheriting gateway defaults

* xds: add test for single TLS listener on gateway without TLS defaults

* xds: regen golden files for TLSVersionInvalid zero value, add TLSVersionAuto listener test

* types/tls: change TLSVersion to string

* types/tls: update TLSCipherSuite to string type

* types/tls: implement validation functions for TLSVersion and TLSCipherSuites, make some maps private

* api: add TLS params to GatewayTLSConfig, add tests

* api: add TLSMinVersion to ingress gateway config entry test JSON

* xds: switch to Envoy TLS cipher suite encoding from types package

* xds: fixup validation for TLSv1_3 min version with cipher suites

* add some kitchen sink tests and add a missing struct tag

* xds: check if mergedCfg.TLSVersion is in TLSVersionsWithConfigurableCipherSuites

* xds: update connectTLSEnabled comment

* xds: remove unsued resolveGatewayServiceTLSConfig function

 * xds: add makeCommonTLSContextFromLeafWithoutParams

* types/tls: add LessThan comparator function for concrete values

* types/tls: change tlsVersions validation map from string to TLSVersion keys

* types/tls: remove unused envoyTLSCipherSuites

* types/tls: enable chacha20 cipher suites for Consul agent

* types/tls: remove insecure cipher suites from allowed config

TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA256 and TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256 are both explicitly listed as insecure and disabled in the Go source.

Refs https://cs.opensource.google/go/go/+/refs/tags/go1.17.3:src/crypto/tls/cipher_suites.go;l=329-330

* types/tls: add ValidateConsulAgentCipherSuites function, make direct lookup map private

* types/tls: return all unmatched cipher suites in validation errors

* xds: check that Envoy API value matching TLS version is found when building TlsParameters

* types/tls: check that value is found in map before appending to slice in MarshalEnvoyTLSCipherSuiteStrings

* types/tls: cast to string rather than fmt.Printf in TLSCihperSuite.String()

* xds: add TLSVersionUnspecified to list of configurable cipher suites

* structs: update note about config entry warning

* xds: remove TLS min version cipher suite unconfigurable test placeholder

* types/tls: update tests to remove assumption about private map values

Co-authored-by: R.B. Boyer <rb@hashicorp.com>
2022-01-11 11:46:42 -05:00
Dao Thanh Tung 217e2dc656
URL-encode/decode resource names for HTTP API part 2 (#11957) 2022-01-11 08:52:45 -05:00
Daniel Nephin 262898e561 ca: remove unnecessary var, and slightly reduce cyclo complexity
`newIntermediate` is always equal to `needsNewIntermediate`, so we can
remove the extra variable and use the original directly.

Also remove the `activeRoot.ID != newActiveRoot.ID` case from an if,
because that case is already checked above, and `needsNewIntermediate` will
already be true in that case.

This condition now reads a lot better:

> Persist a new root if we did not have one before, or if generated a new intermediate.
2022-01-06 16:56:49 -05:00
Daniel Nephin d406f78c5c ca: remove unused provider.ActiveRoot call
In the previous commit the single use of this storedRoot was removed.

In this commit the original objective is completed. The
Provider.ActiveRoot is being removed because

1. the secondary should get the active root from the Consul primary DC,
   not the provider, so that secondary DCs do not need to communicate
   with a provider instance in a different DC.
2. so that the Provider.ActiveRoot interface can be changed without
   impacting other code paths.
2022-01-06 16:56:48 -05:00
Daniel Nephin 4d15e8a9ec ca: extract the lookup of the active primary CA
This method had only one caller, which always looked for the active
root. This commit moves the lookup into the method to reduce the logic
in the one caller.

This is being done in preparation for a larger change. Keeping this
separate so it is easier to see.

The `storedRootID != primaryRoots.ActiveRootID` is being removed because
these can never be different.

The `storedRootID` comes from `provider.ActiveRoot`, the
`primaryRoots.ActiveRootID` comes from the store `CARoot` from the
primary. In both cases the source of the data is the primary DC.

Technically they could be different if someone modified the provider
outside of Consul, but that would break many things, so is not a
supported flow.

If these were out of sync because of ordering of events then the
secondary will soon receive an update to `primaryRoots` and everything
will be sorted out again.
2022-01-06 16:56:48 -05:00
Daniel Nephin 37b09df427 ca: update godoc
To clarify what to expect from the data stored in this field, and the
behaviour of this function.
2022-01-06 16:56:48 -05:00
Daniel Nephin 1f670c22f5 ca: remove one call to provider.ActiveRoot
ActiveRoot should not be called from the secondary DC, because there
should not be a requirement to run the same Vault instance in a
secondary DC. SignIntermediate is called in a secondary DC, so it should
not call ActiveRoot

We would also like to change the interface of ActiveRoot so that we can
support using an intermediate cert as the primary CA in Consul. In
preparation for making that change I am reducing the number of calls to
ActiveRoot, so that there are fewer code paths to modify when the
interface changes.

This change required a change to the mockCAServerDelegate we use in
tests. It was returning the RootCert for SignIntermediate, but that is
not an accurate fake of production. In production this would also be a
separate cert.
2022-01-06 16:55:50 -05:00
Daniel Nephin 1f66120c20 ca: remove redundant append of an intermediate cert
Immediately above this line we are already appending the full list of
intermediates. The `provider.ActiveIntermediate` MUST be in this list of
intermediates because it must be available to all the other non-leader
Servers.  If it was not in this list of intermediates then any proxy
that received data from a non-leader would have the wrong certs.

This is being removed now because we are planning on changing the
`Provider.ActiveIntermediate` interface, and removing these extra calls ahead of
time helps make that change easier.
2022-01-06 16:55:50 -05:00
Daniel Nephin b66d259c1a ca: only generate a single private key for the whole test case
Using tracing and cpu profiling I found that the majority of the time in
these test cases is spent generating a private key. We really don't need
separate private keys, so we can generate only one and use it for all
cases.

With this change the test runs much faster.
2022-01-06 16:55:50 -05:00
Daniel Nephin 92a054cfa6 ca: cleanup a test
Fix the name to match the function it is testing

Remove unused code

Fix the signature, instead of returning (error, string) which should be (string, error)
accept a testing.T to emit errors.

Handle the error from encode.
2022-01-06 16:55:49 -05:00
Daniel Nephin 9ec7e07db4 ca: use the new leaf signing lookup func in leader metrics 2022-01-06 16:55:49 -05:00
Blake Covarrubias b13fb553ac
api: Return 404 when deregistering a non-existent check (#11950)
Update the `/agent/check/deregister/` API endpoint to return a 404
HTTP response code when an attempt is made to de-register a check ID
that does not exist on the agent.

This brings the behavior of /agent/check/deregister/ in line with the
behavior of /agent/service/deregister/ which was changed in #10632 to
similarly return a 404 when de-registering non-existent services.

Fixes #5821
2022-01-06 12:38:37 -08:00
Dhia Ayachi 7e0b8354a5
clone the service under lock to avoid a data race (#11940)
* clone the service under lock to avoid a data race

* add change log

* create a struct and copy the pointer to mutate it to avoid a data race

* fix failing test

* revert added space

* add comments, to clarify the data race.
2022-01-06 14:33:06 -05:00
Daniel Nephin d05264041e
Merge pull request #11918 from hashicorp/dnephin/tob-followup
Fix a few small bugs
2022-01-05 18:50:48 -05:00
Daniel Nephin 4983c27703 snapshot: return the error from replyFn
The only function passed to SnapshotRPC today always returns a nil error, so there's no
way to exercise this bug in practice. This change is being made for correctness so that
it doesn't become a problem in the future, if we ever pass a different function to
SnapshotRPC.
2022-01-05 17:51:03 -05:00
Daniel Nephin affe97e22d config: correctly capture all errors.
Some calls to multierror.Append were not using the existing b.err, which meant we
were losing all previous errors.
2022-01-05 17:51:03 -05:00
Chris S. Kim f7f5aca058
Fix test for ENT (#11946) 2022-01-05 15:18:08 -05:00
Chris S. Kim 407b0b8963
Fix test for ENT (#11941) 2022-01-05 12:24:44 -05:00
Dhia Ayachi 5f6bf369af
reset `coalesceTimer` to nil as soon as the event is consumed (#11924)
* reset `coalesceTimer` to nil as soon as the event is consumed

* add change log

* refactor to add relevant test.

* fix linter

* Apply suggestions from code review

Co-authored-by: Freddy <freddygv@users.noreply.github.com>

* remove non needed check

Co-authored-by: Freddy <freddygv@users.noreply.github.com>
2022-01-05 12:17:47 -05:00
Mathew Estafanous dc18933cc2
Ensure consistency with error-handling across all handlers. (#11599) 2022-01-05 12:11:03 -05:00
Jared Kirschner a9371f18e5 Clarify service and check error messages (use ID)
Error messages related to service and check operations previously included
the following substrings:
- service %q
- check %q

From this error message, it isn't clear that the expected field is the ID for
the entity, not the name. For example, if the user has a service named test,
the error message would read 'Unknown service "test"'. This is misleading -
a service with that *name* does exist, but not with that *ID*.

The substrings above have been modified to make it clear that ID is needed,
not name:
- service with ID %q
- check with ID %q
2022-01-04 11:42:37 -08:00
Jared Kirschner fc076c02c7
Merge pull request #11335 from littlestar642/url-encoded-args
URL-encode/decode resource names for HTTP API
2022-01-04 14:00:14 -05:00
Chris S. Kim d87fe70a82
testing: Revert assertion for virtual IP flag (#11932) 2022-01-04 11:24:56 -05:00
Jared Kirschner d26f8e4529
Merge pull request #11820 from hashicorp/improve-ui-disabled-api-response
http: improve UI not enabled response message
2022-01-03 12:00:01 -05:00
littlestar642 7d1f2157eb add path escape and unescape to path params 2022-01-03 08:18:32 -08:00
Daniel Nephin 48d123e241
Merge pull request #11796 from hashicorp/dnephin/cleanup-test-server
testing: stop using an old version in testServer
2021-12-22 16:04:04 -05:00
freddygv d7975586d6 Purge chain if it shouldn't be there 2021-12-13 18:56:44 -07:00
freddygv be85ae11ca additional test fixes 2021-12-13 18:56:44 -07:00
freddygv e1d4797561 Account for new upstreams constraint in tests 2021-12-13 18:56:28 -07:00
freddygv 16d3efc4b5 Check ingress upstreams when gating chain watches 2021-12-13 18:56:28 -07:00
freddygv f4ddb5432c Use ptr receiver in all Upstream methods 2021-12-13 18:56:14 -07:00
freddygv d647141a7d Avoid storing chain without an upstream 2021-12-13 18:56:14 -07:00
freddygv 9e0958f1d2 Clean up chains separately from their watches 2021-12-13 18:56:14 -07:00
freddygv b704d4e2dd Validate chains are associated with upstreams
Previously we could get into a state where discovery chain entries were
not cleaned up after the associated watch was cancelled. These changes
add handling for that case where stray chain references are encountered.
2021-12-13 18:56:13 -07:00
freddygv ea26a7b7cf Store intention upstreams in snapshot 2021-12-13 18:56:13 -07:00
R.B. Boyer 72a81cfc4a
proxycfg: ensure all of the watches are canceled if they are cancelable (#11824) 2021-12-13 15:56:17 -06:00
Jared Kirschner 7b78ded3c7
Merge pull request #11818 from hashicorp/improve-url-not-found-response
http: improve 404 Not Found response message
2021-12-13 16:08:50 -05:00
R.B. Boyer 3dccd14d31
proxycfg: use external addresses in tproxy when crossing partition boundaries (#11823) 2021-12-13 14:34:49 -06:00
Jared Kirschner 757236007a http: improve 404 Not Found response message
When a URL path is not found, return a non-empty message with the 404 status
code to help the user understand what went wrong. If the URL path was not
prefixed with '/v1/', suggest that may be the cause of the problem (which is a
common mistake).
2021-12-13 11:03:25 -08:00
Freddy f7eeffb98d
Use anonymousToken when querying by secret ID (#11813)
Co-authored-by: Chris S. Kim <ckim@hashicorp.com>
Co-authored-by: Dan Upton <daniel@floppy.co>

This query has been incorrectly querying by accessor ID since New ACLs
were added. However, the legacy token compat allowed this to continue to
work, since it made a fallback query for the anonymousToken ID.

PR #11184 removed this legacy token query, which means that the query by
accessor ID is now the only check for the anonymous token's existence.

This PR updates the GetBySecret call to use the secret ID of the token.
2021-12-13 10:56:09 -07:00
R.B. Boyer a0156785dd
various partition related todos (#11822) 2021-12-13 11:43:33 -06:00
Jared Kirschner 8b8c79ea72 http: improve UI not enabled response message
Response now clearly indicates:
- the UI is disabled
- how to enable the UI
2021-12-13 08:48:33 -08:00
Kyle Havlovitz b9e1dcde1c
Merge pull request #11812 from hashicorp/metrics-ui-acls
oss: use wildcard partition in metrics proxy ui endpoint
2021-12-10 16:24:47 -08:00
Kyle Havlovitz 9187070a93
Merge pull request #11798 from hashicorp/vip-goroutine-check
leader: move the virtual IP version check into a goroutine
2021-12-10 15:59:35 -08:00
Kyle Havlovitz ad9c104816 acl: use wildcard partition in metrics proxy ui endpoint 2021-12-10 15:58:17 -08:00
Kyle Havlovitz 45402dad63 state: fix freed VIP table id index 2021-12-10 14:41:45 -08:00
Kyle Havlovitz ccc119c549 Exit before starting the vip check routine if possible 2021-12-10 14:30:50 -08:00
Daniel Nephin 6444d1d4b3 testing: Deprecate functions for creating a server.
These helper functions actually end up hiding important setup details
that should be visible from the test case. We already have a convenient
way of setting this config when calling newTestServerWithConfig.
2021-12-09 20:09:29 -05:00
Daniel Nephin 74e92316de testing: remove old config.Build version
DefaultConfig already sets the version to version.Version, so by removing this
our tests will run with the version that matches the code.
2021-12-09 20:09:29 -05:00
Kyle Havlovitz 2a52630067 leader: move the virtual IP version check into a goroutine 2021-12-09 17:00:33 -08:00
FFMMM 336a234927
[sync ent] increase segment max limit to 4*64, make configurable (#1424) (#11795)
* commit b6eb27563e747a78b7647d2b5da405e46364cc46
Author: FFMMM <FFMMM@users.noreply.github.com>
Date:   Thu Dec 9 13:53:44 2021 -0800

    increase segment max limit to 4*64, make configurable (#1424)

    Signed-off-by: FFMMM <FFMMM@users.noreply.github.com>

* fix: rename ent changelog file

Signed-off-by: FFMMM <FFMMM@users.noreply.github.com>
2021-12-09 15:36:11 -08:00
Daniel Nephin ded49b3ab0
Merge pull request #11780 from hashicorp/dnephin/ca-test-vault-in-secondary
ca: improve test coverage for RenewIntermediate
2021-12-09 12:29:43 -05:00
R.B. Boyer 5f6bf4e756
agent: ensure service maintenance checks for matching partitions ahead of other errors (#11788)
This matches behavior in most other agent api endpoints.
2021-12-09 10:05:02 -06:00
Daniel Nephin e6615bdaa7 fix misleading errors on vault shutdown 2021-12-08 18:42:52 -05:00