Commit graph

225 commits

Author SHA1 Message Date
John Eikenberry 6d1d82e1ad
remove redundant vault api retry logic (#16143)
remove redundant vault api retry logic

We upgraded Vault API module version to a version that has built-in
retry logic. So this code is no longer necessary.
Also add mention of re-configuring the provider in comments.
2023-02-07 20:52:22 +00:00
John Eikenberry e0346dc81a
fix goroutine leak in renew testing (#16142)
fix goroutine leak in renew testing

Test overwrote the stopWatcher() function variable for the test without
keeping and calling the original value. The original value is the
function that stops the goroutine... so it needs to be called.
2023-02-03 22:09:34 +00:00
Chris Thain 892ef07715
Support Vault agent auth config for AWS/GCP CA provider auth (#15970) 2023-01-18 11:53:04 -08:00
Dan Upton 618deae657
xds: don't attempt to load-balance sessions for local proxies (#15789)
Previously, we'd begin a session with the xDS concurrency limiter
regardless of whether the proxy was registered in the catalog or in
the server's local agent state.

This caused problems for users who run `consul connect envoy` directly
against a server rather than a client agent, as the server's locally
registered proxies wouldn't be included in the limiter's capacity.

Now, the `ConfigSource` is responsible for beginning the session and we
only do so for services in the catalog.

Fixes: https://github.com/hashicorp/consul/issues/15753
2023-01-18 12:33:21 -06:00
Semir Patel 1f82e82e04
Pass remote addr of incoming HTTP requests through to RPC(..) calls (#15700) 2022-12-14 09:24:22 -06:00
R.B. Boyer ec0857075e
connect: use -dev-no-store-token for test vaults to reduce source of flakes (#15691)
It turns out that by default the dev mode vault server will attempt to interact with the
filesystem to store the provided root token. If multiple vault instances are running
they'll all awkwardly share the filesystem and if timing results in one server stopping
while another one is starting then the starting one will error with:

    Error initializing Dev mode: rename /home/circleci/.vault-token.tmp /home/circleci/.vault-token: no such file or directory

This change uses `-dev-no-store-token` to bypass that source of flakes. Also the
stdout/stderr from the vault process is included if the test fails.

The introduction of more `t.Parallel` use in https://github.com/hashicorp/consul/pull/15669
increased the likelihood of this failure, but any of the tests with multiple vaults in use
(or running multiple package tests in parallel that all use vault) were eventually going
to flake on this.
2022-12-06 13:15:13 -06:00
R.B. Boyer ba6b24babf
connect: ensure all vault connect CA tests use limited privilege tokens (#15669)
All of the current integration tests where Vault is the Connect CA now use non-root tokens for the test. This helps us detect privilege changes in the vault model so we can keep our guides up to date.

One larger change was that the RenewIntermediate function got refactored slightly so it could be used from a test, rather than the large duplicated function we were testing in a test which seemed error prone.
2022-12-06 10:06:36 -06:00
R.B. Boyer a88d1239e3
Detect Vault 1.11+ import in secondary datacenters and update default issuer (#15661)
The fix outlined and merged in #15253 fixed the issue as it occurs in the primary
DC. There is a similar issue that arises when vault is used as the Connect CA in a
secondary datacenter that is fixed by this PR.

Additionally: this PR adds support to run the existing suite of vault related integration
tests against the last 4 versions of vault (1.9, 1.10, 1.11, 1.12)
2022-12-05 15:39:21 -06:00
Chris S. Kim efffcd56d0
Fix Vault managed intermediate PKI bug (#15525) 2022-11-28 16:17:58 -05:00
Alexander Scheel 8ef3fe3812
Detect Vault 1.11+ import, update default issuer (#15253)
Consul used to rely on implicit issuer selection when calling Vault endpoints to issue new CSRs. Vault 1.11+ changed that behavior, which caused Consul to check the wrong (previous) issuer when renewing its Intermediate CA. This patch allows Consul to explicitly set a default issuer when it detects that the response from Vault is 1.11+.

Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
Co-authored-by: Chris S. Kim <ckim@hashicorp.com>
2022-11-17 16:29:49 -05:00
Kyle Havlovitz 7be442ee63
connect: strip port from DNS SANs for ingress gateway leaf cert (#15320)
* connect: strip port from DNS SANs for ingress gateway leaf cert

* connect: format DNS SANs in CreateCSR

* connect: Test wildcard case when formatting SANs
2022-11-14 10:27:03 -08:00
Kyle Schochenmaier 2b1e5f69e2
removes ioutil usage everywhere which was deprecated in go1.16 (#15297)
* update go version to 1.18 for api and sdk, go mod tidy
* removes ioutil usage everywhere which was deprecated in go1.16 in favour of io and os packages. Also introduces a lint rule which forbids use of ioutil going forward.
Co-authored-by: R.B. Boyer <4903+rboyer@users.noreply.github.com>
2022-11-10 10:26:01 -06:00
Chris S. Kim ae1646706f Regenerate files according to 1.19.2 formatter 2022-10-24 16:12:08 -04:00
Kyle Havlovitz 0a968e53b5 Warn instead of returning an error when intermediate mount tune permission is missing 2022-10-18 12:01:25 -07:00
freddygv a21e5799f7 Use internal server certificate for peering TLS
A previous commit introduced an internally-managed server certificate
to use for peering-related purposes.

Now the peering token has been updated to match that behavior:
- The server name matches the structure of the server cert
- The CA PEMs correspond to the Connect CA

Note that if Conect is disabled, and by extension the Connect CA, we
fall back to the previous behavior of returning the manually configured
certs and local server SNI.

Several tests were updated to use the gRPC TLS port since they enable
Connect by default. This means that the peering token will embed the
Connect CA, and the dialer will expect a TLS listener.
2022-10-07 09:05:32 -06:00
freddygv a33a014b9c Add handling in agent cache for server leaf certs 2022-09-16 17:54:34 -06:00
Kyle Havlovitz 33e616987c Update intermediate pki mount/role when reconfiguring Vault provider 2022-09-13 15:42:26 -07:00
Freddy a7f38384ae
Add SpiffeID for Consul server agents (#14485)
Co-authored-by: Eric Haberkorn <erichaberkorn@gmail.com>

By adding a SpiffeID for server agents, servers can now request a leaf
certificate from the Connect CA.

This new Spiffe ID has a key property: servers are identified by their
datacenter name and trust domain. All servers that share these
attributes will share a ServerURI.

The aim is to use these certificates to verify the server name of ANY
server in a Consul datacenter.
2022-09-06 17:58:13 -06:00
Eric Haberkorn 3d45306e1b
Cluster peering failover disco chain changes (#14296) 2022-08-23 09:13:43 -04:00
R.B. Boyer 93611819e2
xds: mesh gateways now have their own leaf certificate when involved in a peering (#13460)
This is only configured in xDS when a service with an L7 protocol is
exported.

They also load any relevant trust bundles for the peered services to
eventually use for L7 SPIFFE validation during mTLS termination.
2022-06-15 14:36:18 -05:00
Chris S. Kim 4cb251497f
Update RBAC to handle imported services (#13404)
When converting from Consul intentions to xds RBAC rules, services imported from other peers must encode additional data like partition (from the remote cluster) and trust domain.

This PR updates the PeeringTrustBundle to hold the sending side's local partition as ExportedPartition. It also updates RBAC code to encode SpiffeIDs of imported services with the ExportedPartition and TrustDomain.
2022-06-10 17:15:22 -04:00
Freddy 6ef38eaea7
Configure upstream TLS context with peer root certs (#13321)
For mTLS to work between two proxies in peered clusters with different root CAs,
proxies need to configure their outbound listener to use different root certificates
for validation.

Up until peering was introduced proxies would only ever use one set of root certificates
to validate all mesh traffic, both inbound and outbound. Now an upstream proxy
may have a leaf certificate signed by a CA that's different from the dialing proxy's.

This PR makes changes to proxycfg and xds so that the upstream TLS validation
uses different root certificates depending on which cluster is being dialed.
2022-06-01 15:53:52 -06:00
R.B. Boyer bc10055edc
peering: replicate expected SNI, SPIFFE, and service protocol to peers (#13218)
The importing peer will need to know what SNI and SPIFFE name
corresponds to each exported service. Additionally it will need to know
at a high level the protocol in use (L4/L7) to generate the appropriate
connection pool and local metrics.

For replicated connect synthetic entities we edit the `Connect{}` part
of a `NodeService` to have a new section:

    {
      "PeerMeta": {
        "SNI": [
          "web.default.default.owt.external.183150d5-1033-3672-c426-c29205a576b8.consul"
        ],
        "SpiffeID": [
          "spiffe://183150d5-1033-3672-c426-c29205a576b8.consul/ns/default/dc/dc1/svc/web"
        ],
        "Protocol": "tcp"
      }
    }

This data is then replicated and saved as-is at the importing side. Both
SNI and SpiffeID are slices for now until I can be sure we don't need
them for how mesh gateways will ultimately work.
2022-05-25 12:37:44 -05:00
Mark Anderson 18193f2916
Support vault namespaces in connect CA (#12904)
* Support vault namespaces in connect CA

Follow on to some missed items from #12655

From an internal ticket "Support standard "Vault namespace in the
path" semantics for Connect Vault CA Provider"

Vault allows the namespace to be specified as a prefix in the path of
a PKI definition, but our usage of the Vault API includes calls that
don't support a namespaced key. In particular the sys.* family of
calls simply appends the key, instead of prefixing the namespace in
front of the path.

Unfortunately it is difficult to reliably parse a path with a
namespace; only vault knows what namespaces are present, and the '/'
separator can be inside a key name, as well as separating path
elements. This is in use in the wild; for example
'dc1/intermediate-key' is a relatively common naming schema.

Instead we add two new fields: RootPKINamespace and
IntermediatePKINamespace, which are the absolute namespace paths
'prefixed' in front of the respective PKI Paths.

Signed-off-by: Mark Anderson <manderson@hashicorp.com>
2022-05-04 19:41:55 -07:00
Chris S. Kim 829554c706
peering: Make Upstream peer-aware (#12900)
Adds DestinationPeer field to Upstream.
Adds Peer field to UpstreamID and its string conversion functions.
2022-04-29 18:12:51 -04:00
R.B. Boyer bbd38e95ce
chore: upgrade mockery to v2 and regenerate (#12836) 2022-04-21 09:48:21 -05:00
John Murret 1e118a5410 set vault namespaces on vault client prior to logging in with the vault auth method 2022-04-14 12:18:06 -06:00
Dan Upton 769d1d6e8e
ConnectCA.Sign gRPC Endpoint (#12787)
Introduces a gRPC endpoint for signing Connect leaf certificates. It's also
the first of the public gRPC endpoints to perform leader-forwarding, so
establishes the pattern of forwarding over the multiplexed internal RPC port.
2022-04-14 14:26:14 +01:00
Mark Anderson ed3e42296d Fixup acl.EnterpriseMeta
Signed-off-by: Mark Anderson <manderson@hashicorp.com>
2022-04-05 15:11:49 -07:00
Mark Anderson aa29324a24
Avoid using sys/mounts to enable namespaces (#12655)
* Avoid doing list of /sys/mounts

From an internal ticket "Support standard "Vault namespace in the path" semantics for Connect Vault CA Provider"

Vault allows the namespace to be specified as a prefix in the path of
a PKI definition, but this doesn't currently work for
```IntermediatePKIPath``` specifications, because we attempt to list
all of the paths to check if ours is already defined. This doesn't
really work in a namespaced world.

This changes the IntermediatePKIPath code to follow the same pattern
as the root key, where we directly get the key rather than listing.

This code is difficult to write automated tests for because it relies
on features of Vault Enterprise, which isn't currently part of our
test framework, so it was tested manually.

Signed-off-by: Mark Anderson <manderson@hashicorp.com>

* add changelog

Signed-off-by: Mark Anderson <manderson@hashicorp.com>
2022-03-31 23:35:38 -07:00
Connor f8fc317731
Fix leaked Vault LifetimeRenewers (#12607)
* Fix leaked Vault LifetimeRenewers

When the Vault CA Provider is reconfigured we do not stop the
LifetimeRenewers which can cause them to leak until the Consul processes
recycles. On Configure execute stopWatcher if it exists and is not nil
before starting a new renewal

* Add jitter before restarting the LifetimeWatcher

If we fail to login to Vault or our token is no longer valid we can
overwhelm a Vault instance with many requests very quickly by restarting
the LifetimeWatcher. Before restarting the LifetimeWatcher provide a
backoff time of 1 second or less.

* Use a retry.Waiter instead of RandomStagger

* changelog

* gofmt'd

* Swap out bool for atomic.Unit32 in test

* Provide some extra clarification in comment and changelog
2022-03-28 09:58:16 -05:00
Dhia Ayachi 5cb24b9bf8
split pbcommon to pbcommon and pbcommongogo (#12587)
* mogify needed pbcommon structs

* mogify needed pbconnect structs

* fix compilation errors and make config_translate_test pass

* add missing file

* remove redundant oss func declaration

* fix EnterpriseMeta to copy the right data for enterprise

* rename pbcommon package to pbcommongogo

* regenerate proto and mog files

* add missing mog files

* add pbcommon package

* pbcommon no mog

* fix enterprise meta code generation

* fix enterprise meta code generation (pbcommongogo)

* fix mog generation for gogo

* use `protoc-go-inject-tag` to inject tags

* rename proto package

* pbcommon no mog

* use `protoc-go-inject-tag` to inject tags

* add non gogo proto to make file

* fix proto get
2022-03-22 16:30:00 -04:00
Daniel Nephin bb7f2f15b3 ca: require that tests that use Vault are named correctly
Previously we were using two different criteria to decide where to run a
test.  The main `go-test` job would skip Vault tests based on the
presence of the `vault` binary, but the `test-connect-ca-providers` job
would run tests based on the name.

This led to a scenario where a test may never run in CI.

To fix this problem I added a name check to the function we use to skip
the test. This should ensure that any test that requires vault is named
correctly to be run as part of the `test-connect-ca-providers` job.

At the same time I relaxed the regex we use. I verified this runs the
same tests using `go test --list Vault`.  I made this change because a
bunch of tests in `agent/connect/ca` used `Vault` in the name, without
the underscores. Instead of changing a bunch of test names, this seemed
easier.

With this approach, the worst case is that we run a few extra tests in
the `test-connect-ca-providers` job, which doesn't seem like a problem.
2022-02-28 16:13:53 -05:00
Daniel Nephin ca4e60e09b Update TODOs to reference an issue with more details
And remove a no longer needed TODO
2022-02-17 18:21:30 -05:00
Daniel Nephin aacc40012f ca: add a test for secondary with external CA 2022-02-17 18:21:30 -05:00
Daniel Nephin 471b2098bb ca: examine the full chain in newCARoot
make TestNewCARoot much more strict
compare the full result instead of only a few fields.
add a test case with 2 and 3 certificates in the pem
2022-02-17 18:21:30 -05:00
Daniel Nephin fc6c0ec139 ca: small docs improvements 2022-02-17 18:21:30 -05:00
Daniel Nephin af651eaaad ca: cleanup validateSetIntermediate 2022-02-17 18:21:30 -05:00
Daniel Nephin ef03f7be73 ca: only return the leaf cert from Sign in vault provider
The interface is documented as 'Sign will only return the leaf', and the other providers
only return the leaf. It seems like this was added during the initial implementation, so
is likely just something we missed. It doesn't break anything , but it does cause confusing cert chains
in the API response which could break something in the future.
2022-02-17 18:21:30 -05:00
FFMMM 1f8fb17be7
Vendor in rpc mono repo for net/rpc fork, go-msgpack, msgpackrpc. (#12311)
This commit syncs ENT changes to the OSS repo.

Original commit details in ENT:

```
commit 569d25f7f4578981c3801e6e067295668210f748
Author: FFMMM <FFMMM@users.noreply.github.com>
Date:   Thu Feb 10 10:23:33 2022 -0800

    Vendor fork net rpc (#1538)

    * replace net/rpc w consul-net-rpc/net/rpc

    Signed-off-by: FFMMM <FFMMM@users.noreply.github.com>

    * replace msgpackrpc and go-msgpack with fork from mono repo

    Signed-off-by: FFMMM <FFMMM@users.noreply.github.com>

    * gofmt all files touched

    Signed-off-by: FFMMM <FFMMM@users.noreply.github.com>
```

Signed-off-by: FFMMM <FFMMM@users.noreply.github.com>
2022-02-14 09:45:45 -08:00
Daniel Nephin 7b466a024b Make test more readable
And fix typo
2022-02-03 18:44:09 -05:00
Daniel Nephin 6721c1246d ca: relax and move private key type/bit validation for vault
This commit makes two changes to the validation.

Previously we would call this validation in GenerateRoot, which happens
both on initialization (when a follower becomes leader), and when a
configuration is updated. We only want to do this validation during
config update so the logic was moved to the UpdateConfiguration
function.

Previously we would compare the config values against the actual cert.
This caused problems when the cert was created manually in Vault (not
created by Consul).  Now we compare the new config against the previous
config. Using a already created CA cert should never error now.

Adding the key bit and types to the config should only error when
the previous values were not the defaults.
2022-02-03 17:21:20 -05:00
Daniel Nephin 44f9229b96 ca: add a test that uses an intermediate CA as the primary CA
This test found a bug in the secondary. We were appending the root cert
to the PEM, but that cert was already appended. This was failing
validation in Vault here:
https://github.com/hashicorp/vault/blob/sdk/v0.3.0/sdk/helper/certutil/types.go#L329

Previously this worked because self signed certs have the same
SubjectKeyID and AuthorityKeyID. So having the same self-signed cert
repeated doesn't fail that check.

However with an intermediate that is not self-signed, those values are
different, and so we fail the check. A test I added in a previous commit
should show that this continues to work with self-signed root certs as
well.
2022-02-02 13:41:35 -05:00
Daniel Nephin fa8ff28a63 ca/provider: remove ActiveRoot from Provider 2022-01-27 13:07:37 -05:00
Daniel Nephin 722e3a6ac4 ca: update MockProvider for new interface 2022-01-27 12:51:35 -05:00
Daniel Nephin 80f215675c ca: update GenerateRoot godoc 2022-01-27 12:51:35 -05:00
R.B. Boyer 05c7373a28 bulk rewrite using this script
set -euo pipefail

    unset CDPATH

    cd "$(dirname "$0")"

    for f in $(git grep '\brequire := require\.New(' | cut -d':' -f1 | sort -u); do
        echo "=== require: $f ==="
        sed -i '/require := require.New(t)/d' $f
        # require.XXX(blah) but not require.XXX(tblah) or require.XXX(rblah)
        sed -i 's/\brequire\.\([a-zA-Z0-9_]*\)(\([^tr]\)/require.\1(t,\2/g' $f
        # require.XXX(tblah) but not require.XXX(t, blah)
        sed -i 's/\brequire\.\([a-zA-Z0-9_]*\)(\(t[^,]\)/require.\1(t,\2/g' $f
        # require.XXX(rblah) but not require.XXX(r, blah)
        sed -i 's/\brequire\.\([a-zA-Z0-9_]*\)(\(r[^,]\)/require.\1(t,\2/g' $f
        gofmt -s -w $f
    done

    for f in $(git grep '\bassert := assert\.New(' | cut -d':' -f1 | sort -u); do
        echo "=== assert: $f ==="
        sed -i '/assert := assert.New(t)/d' $f
        # assert.XXX(blah) but not assert.XXX(tblah) or assert.XXX(rblah)
        sed -i 's/\bassert\.\([a-zA-Z0-9_]*\)(\([^tr]\)/assert.\1(t,\2/g' $f
        # assert.XXX(tblah) but not assert.XXX(t, blah)
        sed -i 's/\bassert\.\([a-zA-Z0-9_]*\)(\(t[^,]\)/assert.\1(t,\2/g' $f
        # assert.XXX(rblah) but not assert.XXX(r, blah)
        sed -i 's/\bassert\.\([a-zA-Z0-9_]*\)(\(r[^,]\)/assert.\1(t,\2/g' $f
        gofmt -s -w $f
    done
2022-01-20 10:46:23 -06:00
R.B. Boyer c12b0ee3d2 test: normalize require.New and assert.New syntax 2022-01-20 10:45:56 -06:00
Daniel Nephin e6615bdaa7 fix misleading errors on vault shutdown 2021-12-08 18:42:52 -05:00
Daniel Nephin be1ddc5942 ca: use the real FSM operation in tests
Previously we had a couple copies that reproduced the FSM operation.
These copies introduce risk that the test does not accurately match
production.

This PR removes the test versions of the FSM operation, and exports the
real production FSM operation so that it can be used in tests.

The consul provider tests did need to change because of this. Previously
we would return a hardcoded value of 2, but in production this value is
always incremented.
2021-12-08 17:29:44 -05:00