open-consul

Commit Graph

Author	SHA1	Message	Date
FFMMM	0fd6cdc900	introduce EmptyReadRequest for status_endpoint (#12653) Co-authored-by: Daniel Nephin <dnephin@hashicorp.com>	2022-03-29 18:05:45 -07:00
Eric	8fd73ede3e	remove gogo from acl protobufs	2022-03-28 16:20:56 -04:00
Kyle Havlovitz	d9f31345e0	Merge pull request #12596 from hashicorp/overview-endpoint oss: Add overview UI internal endpoint	2022-03-24 14:27:54 -07:00
Mike Morris	8020fb2098	agent: convert listener config to TLS types (#12522 ) * tlsutil: initial implementation of types/TLSVersion tlsutil: add test for parsing deprecated agent TLS version strings tlsutil: return TLSVersionInvalid with error tlsutil: start moving tlsutil cipher suite lookups over to types/tls tlsutil: rename tlsLookup to ParseTLSVersion, add cipherSuiteLookup agent: attempt to use types in runtime config agent: implement b.tlsVersion validation in config builder agent: fix tlsVersion nil check in builder tlsutil: update to renamed ParseTLSVersion and goTLSVersions tlsutil: fixup TestConfigurator_CommonTLSConfigTLSMinVersion tlsutil: disable invalid config parsing tests tlsutil: update tests auto_config: lookup old config strings from base.TLSMinVersion auto_config: update endpoint tests to use TLS types agent: update runtime_test to use TLS types agent: update TestRuntimeCinfig_Sanitize.golden agent: update config runtime tests to expect TLS types * website: update Consul agent tls_min_version values * agent: fixup TLS parsing and compilation errors * test: fixup lint issues in agent/config_runtime_test and tlsutil/config_test * tlsutil: add CHACHA20_POLY1305 cipher suites to goTLSCipherSuites * test: revert autoconfig tls min version fixtures to old format * types: add TLSVersions public function * agent: add warning for deprecated TLS version strings * agent: move agent config specific logic from tlsutil.ParseTLSVersion into agent config builder * tlsutil(BREAKING): change default TLS min version to TLS 1.2 * agent: move ParseCiphers logic from tlsutil into agent config builder * tlsutil: remove unused CipherString function * agent: fixup import for types package * Revert "tlsutil: remove unused CipherString function" This reverts commit 6ca7f6f58d268e617501b7db9500113c13bae70c. 
* agent: fixup config builder and runtime tests * tlsutil: fixup one remaining ListenerConfig -> ProtocolConfig * test: move TLS cipher suites parsing test from tlsutil into agent config builder tests * agent: remove parseCiphers helper from auto_config_endpoint_test * test: remove unused imports from tlsutil * agent: remove resolved FIXME comment * tlsutil: remove TODO and FIXME in cipher suite validation * agent: prevent setting inherited cipher suite config when TLS 1.3 is specified * changelog: add entry for converting agent config to TLS types * agent: remove FIXME in runtime test, this is covered in builder tests with invalid tls9 value now * tlsutil: remove config tests for values checked at agent config builder boundary * tlsutil: remove tls version check from loadProtocolConfig * tlsutil: remove tests and TODOs for logic checked in TestBuilder_tlsVersion and TestBuilder_tlsCipherSuites * website: update search link for supported Consul agent cipher suites * website: apply review suggestions for tls_min_version description * website: attempt to clean up markdown list formatting for tls_min_version * website: moar linebreaks to fix tls_min_version formatting * Revert "website: moar linebreaks to fix tls_min_version formatting" This reverts commit 38585927422f73ebf838a7663e566ac245f2a75c. * autoconfig: translate old values for TLSMinVersion * agent: rename var for translated value of deprecated TLS version value * Update agent/config/deprecated.go Co-authored-by: Dan Upton <daniel@floppy.co> * agent: fix lint issue * agent: fixup deprecated config test assertions for updated warning Co-authored-by: Dan Upton <daniel@floppy.co>	2022-03-24 15:32:25 -04:00
Kyle Havlovitz	0d5cbf6f30	Sort by partition/ns/servicename instead of the reverse	2022-03-24 12:16:05 -07:00
Kyle Havlovitz	1b654c9807	Clean up ent meta id usage in overview summary	2022-03-23 12:47:12 -07:00
Mark Anderson	28c925f6d0	Fixup dropped SecretID usage Looks like something got munged at some point. Not sure how it slipped in, but my best guess is that because TestTxn_Apply_ACLDeny is marked flaky we didn't block merge because it failed. Signed-off-by: Mark Anderson <manderson@hashicorp.com>	2022-03-22 21:20:03 -07:00
Kyle Havlovitz	04f1d9bcc9	oss: Add overview UI internal endpoint	2022-03-22 17:05:09 -07:00
Dan Upton	2fe06f663b	streaming: emit events when Connect CA Roots change (#12590) OSS sync of enterprise changes at 614f786d	2022-03-22 19:13:59 +00:00
Dan Upton	fb441e323a	Restructure gRPC server setup (#12586) OSS sync of enterprise changes at 0b44395e	2022-03-22 12:40:24 +00:00
Mark Anderson	2b367626f0	Add source of authority annotations to the PermissionDeniedError output. (#12567 ) This extends the acl.AllowAuthorizer with source of authority information. The next step is to unify the AllowAuthorizer and ACLResolveResult structures; that will be done in a separate PR. Part of #12481 Signed-off-by: Mark Anderson <manderson@hashicorp.com>	2022-03-18 10:32:25 -07:00
Dan Upton	57f0f42733	Support per-listener TLS configuration ⚙️ (#12504 ) Introduces the capability to configure TLS differently for Consul's listeners/ports (i.e. HTTPS, gRPC, and the internal multiplexed RPC port) which is useful in scenarios where you may want the HTTPS or gRPC interfaces to present a certificate signed by a well-known/public CA, rather than the certificate used for internal communication which must have a SAN in the form `server.<dc>.consul`.	2022-03-18 10:46:58 +00:00
FFMMM	3c08843847	[sync oss] add net/rpc interceptor implementation (#12573 ) * sync ent changes from 866dcb0667 Signed-off-by: FFMMM <FFMMM@users.noreply.github.com> * update oss go.mod Signed-off-by: FFMMM <FFMMM@users.noreply.github.com>	2022-03-17 16:02:26 -07:00
Eric	ae1cdc85b1	Remove the stdduration gogo extension	2022-03-16 12:12:29 -04:00
mrspanishviking	1ae820ea0a	Revert "[Docs] Agent configuration hierarchy "	2022-03-15 16:13:58 -07:00
trujillo-adam	667976c94f	fixing merge conflicts part 3	2022-03-15 15:25:03 -07:00
trujillo-adam	60a88bb40f	merging new hierarchy for agent configuration	2022-03-14 15:44:41 -07:00
Mark Anderson	ab099e5fcb	Refactor config checks oss (#12550) Currently the config_entry.go subsystem delegates authorization decisions via the ConfigEntry interface CanRead and CanWrite code. Unfortunately this returns a true/false value and loses the details of the source. This is not helpful, especially since the config subsystem can be more complex to understand, since it covers so many domains. This refactors CanRead/CanWrite to return a structured error message (PermissionDenied or the like) with more details about the reason for denial. Part of #12241 Signed-off-by: Mark Anderson <manderson@hashicorp.com>	2022-03-11 13:45:51 -08:00
Mark Anderson	5591cb1e11	Bulk acl message fixup oss (#12470 ) * First pass for helper for bulk changes Signed-off-by: Mark Anderson <manderson@hashicorp.com> * Convert ACLRead and ACLWrite to new form Signed-off-by: Mark Anderson <manderson@hashicorp.com> * AgentRead and AgentWRite Signed-off-by: Mark Anderson <manderson@hashicorp.com> * Fix EventWrite Signed-off-by: Mark Anderson <manderson@hashicorp.com> * KeyRead, KeyWrite, KeyList Signed-off-by: Mark Anderson <manderson@hashicorp.com> * KeyRing Signed-off-by: Mark Anderson <manderson@hashicorp.com> * NodeRead NodeWrite Signed-off-by: Mark Anderson <manderson@hashicorp.com> * OperatorRead and OperatorWrite Signed-off-by: Mark Anderson <manderson@hashicorp.com> * PreparedQuery Signed-off-by: Mark Anderson <manderson@hashicorp.com> * Intention partial Signed-off-by: Mark Anderson <manderson@hashicorp.com> * Fix ServiceRead, Write ,etc Signed-off-by: Mark Anderson <manderson@hashicorp.com> * Error check ServiceRead? Signed-off-by: Mark Anderson <manderson@hashicorp.com> * Fix Sessionread/Write Signed-off-by: Mark Anderson <manderson@hashicorp.com> * Fixup snapshot ACL Signed-off-by: Mark Anderson <manderson@hashicorp.com> * Error fixups for txn Signed-off-by: Mark Anderson <manderson@hashicorp.com> * Add changelog Signed-off-by: Mark Anderson <manderson@hashicorp.com> * Fixup review comments Signed-off-by: Mark Anderson <manderson@hashicorp.com>	2022-03-10 18:48:27 -08:00
Eric Haberkorn	45312886fe	Code review changes	2022-03-07 14:39:33 -05:00
Eric	3d46f9ef7c	Add `Meta` to `ServiceConfigResponse`	2022-03-07 10:05:18 -05:00
R.B. Boyer	b63a0f3909	reduce flakiness/raciness of errNotFound and errNotChanged blocking query tests (#12518 ) Improves tests from #12362 These tests try to setup the following concurrent scenario: 1. (goroutine 1) execute read RPC with index=0 2. (goroutine 1) get response from (1) @ index=10 3. (goroutine 1) execute read RPC with index=10 and block 4. (goroutine 2) WHILE (3) is blocking, start slamming the system with stray writes that will cause the WatchSet to wakeup 5. (goroutine 2) after doing all writes, shut down the reader above 6. (goroutine 1) stops reading, double checks that it only ever woke up once (from 1)	2022-03-04 11:20:01 -06:00
R.B. Boyer	07b92a2855	server: fix spurious blocking query suppression for discovery chains (#12512) Minor fix for behavior in #12362 IsDefault sometimes returns true even if there was a proxy-defaults or service-defaults config entry that was consulted. This PR fixes that.	2022-03-03 16:54:41 -06:00
Daniel Nephin	8f4b6af68a	Merge pull request #12298 from jorgemarey/b-persistnewrootandconfig Avoid raft change when no config is provided on persistNewRootAndConfig	2022-03-03 11:03:50 -05:00
Daniel Nephin	2082bdc286	ca: make sure the test fails without the fix Also change the path used for the secondary so that both primary and secondary do not overwrite each other.	2022-03-02 18:22:49 -05:00
R.B. Boyer	679cea7171	raft: upgrade to v1.3.6 (#12496) Add additional protections on the Consul side to prevent NonVoters from bootstrapping raft. This should un-flake TestServer_Expect_NonVoters	2022-03-02 17:00:02 -06:00
Daniel Nephin	849d86e7f5	Merge pull request #12467 from hashicorp/dnephin/ci-vault-test-safer ca: require that tests that use Vault are named correctly	2022-03-01 12:54:02 -05:00
R.B. Boyer	033e0ed13f	test: parallelize more of TestLeader_ReapOrLeftMember_IgnoreSelf (#12468 ) before: $ go test ./agent/consul -run TestLeader_ReapOrLeftMember_IgnoreSelf ok github.com/hashicorp/consul/agent/consul 21.147s after: $ go test ./agent/consul -run TestLeader_ReapOrLeftMember_IgnoreSelf ok github.com/hashicorp/consul/agent/consul 5.402s	2022-03-01 10:30:06 -06:00
Jorge Marey	aba9e724a8	Fix vault test with suggested changes	2022-03-01 10:20:00 +01:00
Jorge Marey	8b1b264b6f	Add test case to verify #12298	2022-03-01 09:25:52 +01:00
Jorge Marey	2ca00df0d8	Avoid raft change when no config is provided on CAmanager - This avoids a change to the raft store when no roots or config are provided to persistNewRootAndConfig	2022-03-01 09:25:52 +01:00
Daniel Nephin	dd565aa5e4	ca: fix a test This test does not use Vault, so does not need ca.SkipIfVaultNotPresent	2022-02-28 16:26:18 -05:00
R.B. Boyer	3804677570	server: suppress spurious blocking query returns where multiple config entries are involved (#12362 ) Starting from and extending the mechanism introduced in #12110 we can specially handle the 3 main special Consul RPC endpoints that react to many config entries in a single blocking query in Connect: - `DiscoveryChain.Get` - `ConfigEntry.ResolveServiceConfig` - `Intentions.Match` All of these will internally watch for many config entries, and at least one of those will likely be not found in any given query. Because these are blends of multiple reads the exact solution from #12110 isn't perfectly aligned, but we can tweak the approach slightly and regain the utility of that mechanism. ### No Config Entries Found In this case, despite looking for many config entries none may be found at all. Unlike #12110 in this scenario we do not return an empty reply to the caller, but instead synthesize a struct from default values to return. This can be handled nearly identically to #12110 with the first 1-2 replies being non-empty payloads followed by the standard spurious wakeup suppression mechanism from #12110. ### No Change Since Last Wakeup Once a blocking query loop on the server has completed and slept at least once, there is a further optimization we can make here to detect if any of the config entries that were present at specific versions for the prior execution of the loop are identical for the loop we just woke up for. In that scenario we can return a slightly different internal sentinel error and basically externally handle it similar to #12110. This would mean that even if 20 discovery chain read RPC handling goroutines wakeup due to the creation of an unrelated config entry, the only ones that will terminate and reply with a blob of data are those that genuinely have new data to report. 
### Extra Endpoints Since this pattern is pretty reusable, other key config-entry-adjacent endpoints used by `agent/proxycfg` also were updated: - `ConfigEntry.List` - `Internal.IntentionUpstreams` (tproxy)	2022-02-25 15:46:34 -06:00
R.B. Boyer	4b0f657b31	fix flaky test panic (#12446)	2022-02-24 17:35:46 -06:00
R.B. Boyer	a97d20cf63	catalog: compare node names case insensitively in more places (#12444) Many places in consul already treated node names case insensitively. The state store indexes already do it, but there are a few places that did a direct byte comparison which have now been corrected. One place of particular consideration is ensureCheckIfNodeMatches which is executed during snapshot restore (among other places). If a node check used a slightly different casing than the casing of the node during register then the snapshot restore here would deterministically fail. This has been fixed. Primary approach: git grep -i "node.[!=]=.node" -- ':!*_test.go' ':!docs' git grep -i '\[[^]]*member[^]]*\] git grep -i '\[[^]]*\(member\|name\|node\)[^]]*\]' -- ':!*_test.go' ':!website' ':!ui' ':!agent/proxycfg/testing.go:' ':!*.md'	2022-02-24 16:54:47 -06:00
R.B. Boyer	d860384731	server: partly fix config entry replication issue that prevents replication in some circumstances (#12307) There are some cross-config-entry relationships that are enforced during "graph validation" at persistence time that are required to be maintained. This means that config entries may form a digraph at times. Config entry replication proceeds in a particular sorted order by kind and name. Occasionally there are some fixups to these digraphs that end up replicating in the wrong order and replicating the leaves (ingress-gateway) before the roots (service-defaults) leading to replication halting due to a graph validation error related to things like mismatched service protocol requirements. This PR changes replication to give each computed change (upsert/delete) a fair shot at being applied before deciding to terminate that round of replication in error. In the case where we've simply tried to do the operations in the wrong order at least ONE of the outstanding requests will complete in the right order, leading the subsequent round to have fewer operations to do, with a smaller likelihood of graph validation errors. This does not address all scenarios, but for scenarios where the edits are being applied in the wrong order this should avoid replication halting. Fixes #9319 The scenario that is NOT ADDRESSED by this PR is as follows: 1. create: service-defaults: name=new-web, protocol=http 2. create: service-defaults: name=old-web, protocol=http 3. create: service-resolver: name=old-web, redirect-to=new-web 4. delete: service-resolver: name=old-web 5. update: service-defaults: name=old-web, protocol=grpc 6. update: service-defaults: name=new-web, protocol=grpc 7. create: service-resolver: name=old-web, redirect-to=new-web If you shutdown dc2 just before (4) and turn it back on after (7) replication is impossible as there is no single edit you can make to make forward progress.	2022-02-23 17:27:48 -06:00
Daniel Nephin	3639f4b551	Merge pull request #11910 from hashicorp/dnephin/ca-provider-interface-for-ica-in-primary ca: add support for an external trusted CA	2022-02-22 13:14:52 -05:00
R.B. Boyer	11fdc70b34	configentry: make a new package to hold shared config entry structs that aren't used for RPC or the FSM (#12384) First two candidates are ConfigEntryKindName and DiscoveryChainConfigEntries.	2022-02-22 10:36:36 -06:00
Daniel Nephin	cb1a80184f	rpc: set response to nil when not found Otherwise when the query times out we might incorrectly send a value for the reply, when we should send an empty reply. Also document errNotFound and how to handle the result in that case.	2022-02-18 12:26:06 -05:00
Daniel Nephin	79820738cc	ca: test that original certs from secondary still verify There's a chance this could flake if the secondary hasn't received the update yet, but running this test many times doesn't show any flakes yet.	2022-02-17 18:45:16 -05:00
Daniel Nephin	ca4e60e09b	Update TODOs to reference an issue with more details And remove a no longer needed TODO	2022-02-17 18:21:30 -05:00
Daniel Nephin	0abaf29c10	ca: add test cases for rotating external trusted CA	2022-02-17 18:21:30 -05:00
Daniel Nephin	aacc40012f	ca: add a test for secondary with external CA	2022-02-17 18:21:30 -05:00
Daniel Nephin	471b2098bb	ca: examine the full chain in newCARoot make TestNewCARoot much more strict compare the full result instead of only a few fields. add a test case with 2 and 3 certificates in the pem	2022-02-17 18:21:30 -05:00
Daniel Nephin	2d5254a73b	Merge pull request #12110 from hashicorp/dnephin/blocking-queries-not-found rpc: make blocking queries for non-existent items more efficient	2022-02-17 18:09:39 -05:00
Florian Apolloner	895da50986	Support for connect native services in topology view. (#12098)	2022-02-16 16:51:54 -05:00
Chris S. Kim	18096fd2fb	Move IndexEntryName helpers to common files (#12365)	2022-02-16 12:56:38 -05:00
Daniel Nephin	06657e5be0	rpc: add errNotFound to all Get queries Any query that returns a list of items is not part of this commit.	2022-02-15 18:24:34 -05:00
Daniel Nephin	bdafa24c50	Make blockingQuery efficient with 'not found' results. By using the query results as state. Blocking queries are efficient when the query matches some results, because the ModifyIndex of those results, returned as queryMeta.Mindex, will never change unless the items themselves change. Blocking queries for non-existent items are not efficient because the queryMeta.Index can (and often does) change when other entities are written. This commit reduces the churn of these queries by using a different comparison for "has changed". Instead of using the modified index, we use the existence of the results. If the previous result was "not found" and the new result is still "not found", we know we can ignore the modified index and continue to block. This is done by setting the minQueryIndex to the returned queryMeta.Index, which prevents the query from returning before a state change is observed.	2022-02-15 18:24:33 -05:00
Daniel Nephin	6e73df7dc2	Add a test for blocking query on non-existent entry This test shows how blocking queries are not efficient when the query returns no results. The test fails with 100+ calls instead of the expected 2. This test is still a bit flaky because it depends on the timing of the writes. It can sometimes return 3 calls. A future commit should fix this and make blocking queries even more optimal for not-found results.	2022-02-15 18:23:17 -05:00
Daniel Nephin	a4e1c59cd8	rpc: improve docs for blockingQuery Follow the Go convention of accepting a small interface that documents the methods used by the function. Clarify the rules for implementing a query function passed to blockingQuery.	2022-02-15 14:20:14 -05:00
R.B. Boyer	b216d52b66	server: conditionally avoid writing a config entry to raft if it was already the same (#12321) This will both save on unnecessary raft operations as well as unnecessarily incrementing the raft modify index of config entries subject to no-op updates.	2022-02-14 14:39:12 -06:00
FFMMM	1f8fb17be7	Vendor in rpc mono repo for net/rpc fork, go-msgpack, msgpackrpc. (#12311 ) This commit syncs ENT changes to the OSS repo. Original commit details in ENT: ``` commit 569d25f7f4578981c3801e6e067295668210f748 Author: FFMMM <FFMMM@users.noreply.github.com> Date: Thu Feb 10 10:23:33 2022 -0800 Vendor fork net rpc (#1538) * replace net/rpc w consul-net-rpc/net/rpc Signed-off-by: FFMMM <FFMMM@users.noreply.github.com> * replace msgpackrpc and go-msgpack with fork from mono repo Signed-off-by: FFMMM <FFMMM@users.noreply.github.com> * gofmt all files touched Signed-off-by: FFMMM <FFMMM@users.noreply.github.com> ``` Signed-off-by: FFMMM <FFMMM@users.noreply.github.com>	2022-02-14 09:45:45 -08:00
Mark Anderson	fa95afdcf6	Refactor to make ACL errors more structured. (#12308 ) * First phase of refactoring PermissionDeniedError Add extended type PermissionDeniedByACLError that captures information about the accessor, particular permission type and the object and name of the thing being checked. It may be worth folding the test and error return into a single helper function, that can happen at a later date. Signed-off-by: Mark Anderson <manderson@hashicorp.com>	2022-02-11 12:53:23 -08:00
Freddy	f45bec7779	Merge pull request #12223 from hashicorp/proxycfg/passthrough-cleanup	2022-02-10 17:35:51 -07:00
freddygv	7fba7456ec	Fix race of upstreams with same passthrough ip Due to timing, a transparent proxy could have two upstreams to dial directly with the same address. For example: - The orders service can dial upstreams shipping and payment directly. - An instance of shipping at address 10.0.0.1 is deregistered. - Payments is scaled up and scheduled to have address 10.0.0.1. - The orders service receives the event for the new payments instance before seeing the deregistration for the shipping instance. At this point two upstreams have the same passthrough address and Envoy will reject the listener configuration. To disambiguate this commit considers the Raft index when storing passthrough addresses. In the example above, 10.0.0.1 would only be associated with the newer payments service instance.	2022-02-10 17:01:57 -07:00
Daniel Nephin	db4675bd1a	Merge pull request #12277 from hashicorp/dnephin/panic-in-service-register catalog: initialize the refs map to prevent a nil panic	2022-02-09 19:48:22 -05:00
Daniel Nephin	6376141464	config-entry: fix a panic when registering a service or ingress gateway	2022-02-09 18:49:48 -05:00
Daniel Nephin	c20412ab14	Merge pull request #12265 from hashicorp/dnephin/logging-in-tests sdk: add TestLogLevel for setting log level in tests	2022-02-07 16:11:23 -05:00
Daniel Nephin	5a0e6700c1	A test to reproduce the issue	2022-02-04 14:04:12 -05:00
Daniel Nephin	7b466a024b	Make test more readable And fix typo	2022-02-03 18:44:09 -05:00
Daniel Nephin	6721c1246d	ca: relax and move private key type/bit validation for vault This commit makes two changes to the validation. Previously we would call this validation in GenerateRoot, which happens both on initialization (when a follower becomes leader), and when a configuration is updated. We only want to do this validation during config update so the logic was moved to the UpdateConfiguration function. Previously we would compare the config values against the actual cert. This caused problems when the cert was created manually in Vault (not created by Consul). Now we compare the new config against the previous config. Using an already created CA cert should never error now. Adding the key bit and types to the config should only error when the previous values were not the defaults.	2022-02-03 17:21:20 -05:00
Daniel Nephin	3b78f81f9a	ca: small cleanup of TestConnectCAConfig_Vault_TriggerRotation_Fails Before adding more test cases	2022-02-03 17:21:20 -05:00
Daniel Nephin	f6d7a0f7b2	testing: fix test failures caused by new log level These two tests require debug logging enabled, because they look for log lines. Also switched to testify assertions because the previous errors were not clear.	2022-02-03 17:07:39 -05:00
Daniel Nephin	1a9a656a7f	sdk: add TestLogLevel for setting log level in tests And default log level to WARN.	2022-02-03 13:42:28 -05:00
Daniel Nephin	44f9229b96	ca: add a test that uses an intermediate CA as the primary CA This test found a bug in the secondary. We were appending the root cert to the PEM, but that cert was already appended. This was failing validation in Vault here: https://github.com/hashicorp/vault/blob/sdk/v0.3.0/sdk/helper/certutil/types.go#L329 Previously this worked because self signed certs have the same SubjectKeyID and AuthorityKeyID. So having the same self-signed cert repeated doesn't fail that check. However with an intermediate that is not self-signed, those values are different, and so we fail the check. A test I added in a previous commit should show that this continues to work with self-signed root certs as well.	2022-02-02 13:41:35 -05:00
Daniel Nephin	d00a9abca2	acl: un-embed ACLIdentity This is safer than embedding two interfaces because there are a number of places where we check the concrete type. If we check the concrete type on the top-level interface it will fail. So instead expose the ACLIdentity from a method.	2022-02-02 12:07:31 -05:00
Daniel Nephin	18ff00f985	Merge pull request #12167 from hashicorp/dnephin/acl-resolve-token-3 acl: rename ResolveTokenToIdentityAndAuthorizer to ResolveToken	2022-01-31 19:21:06 -05:00
Daniel Nephin	ff64c13c3e	Merge pull request #12166 from hashicorp/dnephin/acl-resolve-token-2 acl: remove ResolveTokenToIdentity	2022-01-31 19:19:21 -05:00
Daniel Nephin	aa4dbe2a17	acl: rename ResolveTokenToIdentityAndAuthorizer to ResolveToken This change allows us to remove one of the last remaining duplicate resolve token methods (Server.ResolveToken). With this change we are down to only 2, where the second one also handles setting the default EnterpriseMeta from the token.	2022-01-31 18:04:19 -05:00
Daniel Nephin	57eac90cae	acl: remove unused methods on fakes, and add changelog Also document the metric that was removed in a previous commit.	2022-01-31 17:53:53 -05:00
Daniel Nephin	1fb2d49826	Merge pull request #12165 from hashicorp/dnephin/acl-resolve-token acl: remove some of the duplicate resolve token methods	2022-01-31 13:27:49 -05:00
Dan Upton	ebdda4848f	streaming: split event buffer by key (#12080)	2022-01-28 12:27:00 +00:00
Daniel Nephin	fa8ff28a63	ca/provider: remove ActiveRoot from Provider	2022-01-27 13:07:37 -05:00
Daniel Nephin	d56a1dfb2c	Merge pull request #11663 from hashicorp/dnephin/ca-remove-one-call-to-active-root-2 ca: remove second call to Provider.ActiveRoot	2022-01-27 12:41:05 -05:00
Daniel Nephin	d3324d0d27	Merge pull request #12109 from hashicorp/dnephin/blocking-query-1 rpc: make blockingQuery easier to read	2022-01-26 18:13:55 -05:00
Daniel Nephin	74dc9925cc	Apply suggestions from code review Co-authored-by: Chris S. Kim <ckim@hashicorp.com>	2022-01-26 12:24:13 -05:00
Daniel Nephin	2c311161cc	acl: extract a backend type for the ACLResolverBackend This is a small step to isolate the functionality that is used for the ACLResolver from the large Client and Server structs.	2022-01-26 12:24:10 -05:00
Daniel Nephin	c1da07e2ea	acl: remove calls to ResolveIdentityFromToken We already have an ACLResolveResult, so we can get the accessor ID from it.	2022-01-22 15:05:42 -05:00
Daniel Nephin	ed1cc5f255	acl: remove ResolveTokenToIdentity By exposing the AccessorID from the primary ResolveToken method we can remove this duplication.	2022-01-22 14:47:59 -05:00
Daniel Nephin	26f0ebd96f	acl: return a response from ResolveToken that includes the ACLIdentity So that we can remove duplicate methods.	2022-01-22 14:33:09 -05:00
Daniel Nephin	314614f073	acl: remove duplicate methods Now that ACLResolver is embedded we don't need ResolveTokenToIdentity on Client and Server. Moving ResolveTokenAndDefaultMeta to ACLResolver removes the duplicate implementation.	2022-01-22 14:12:08 -05:00
Daniel Nephin	62c09b2d0a	acl: embed ACLResolver in Client and Server In preparation for removing duplicate resolve token methods.	2022-01-22 14:07:26 -05:00
R.B. Boyer	05c7373a28	bulk rewrite using this script set -euo pipefail unset CDPATH cd "$(dirname "$0")" for f in $(git grep '\brequire := require\.New(' | cut -d':' -f1 | sort -u); do echo "=== require: $f ===" sed -i '/require := require.New(t)/d' $f # require.XXX(blah) but not require.XXX(tblah) or require.XXX(rblah) sed -i 's/\brequire\.\([a-zA-Z0-9_]*\)(\([^tr]\)/require.\1(t,\2/g' $f # require.XXX(tblah) but not require.XXX(t, blah) sed -i 's/\brequire\.\([a-zA-Z0-9_]*\)(\(t[^,]\)/require.\1(t,\2/g' $f # require.XXX(rblah) but not require.XXX(r, blah) sed -i 's/\brequire\.\([a-zA-Z0-9_]*\)(\(r[^,]\)/require.\1(t,\2/g' $f gofmt -s -w $f done for f in $(git grep '\bassert := assert\.New(' | cut -d':' -f1 | sort -u); do echo "=== assert: $f ===" sed -i '/assert := assert.New(t)/d' $f # assert.XXX(blah) but not assert.XXX(tblah) or assert.XXX(rblah) sed -i 's/\bassert\.\([a-zA-Z0-9_]*\)(\([^tr]\)/assert.\1(t,\2/g' $f # assert.XXX(tblah) but not assert.XXX(t, blah) sed -i 's/\bassert\.\([a-zA-Z0-9_]*\)(\(t[^,]\)/assert.\1(t,\2/g' $f # assert.XXX(rblah) but not assert.XXX(r, blah) sed -i 's/\bassert\.\([a-zA-Z0-9_]*\)(\(r[^,]\)/assert.\1(t,\2/g' $f gofmt -s -w $f done	2022-01-20 10:46:23 -06:00
R.B. Boyer	c12b0ee3d2	test: normalize require.New and assert.New syntax	2022-01-20 10:45:56 -06:00
Dan Upton	088ba2edaf	[OSS] Remove remaining references to master (#11827)	2022-01-20 12:47:50 +00:00
Daniel Nephin	59206e38c7	rpc: cleanup exit and blocking condition logic in blockingQuery Remove some unnecessary comments around query_blocking metric. The only line that needs any comments is the atomic decrement. Cleanup the block and return comments and logic. The old comment about AbandonCh may have been relevant before, but it is expected behaviour now. The logic was simplified by inverting the err condition.	2022-01-17 16:59:25 -05:00
Daniel Nephin	a28d1268cb	rpc: extract rpcQueryTimeout method This helps keep the logic in blockingQuery more focused. In the future we may have a separate struct for RPC queries which may allow us to move this off of Server.	2022-01-17 16:59:25 -05:00
Daniel Nephin	751bc2e7d3	rpc: move the index defaulting to setQueryMeta. This safeguard should be safe to apply in general. We are already applying it to non-blocking queries that call blockingQuery, so it should be fine to apply it to others.	2022-01-17 16:59:25 -05:00
Daniel Nephin	95e471052b	rpc: add subtests to blockingQuery test	2022-01-17 16:59:25 -05:00
Daniel Nephin	6bf8efe607	rpc: refactor blocking query To remove the TODO, and make it more readable. In general this reduces the scope of variables, making them easier to reason about. It also introduces more early returns so that we can see the flow from the structure of the function.	2022-01-17 16:58:47 -05:00
Daniel Nephin	1971a58b29	Merge pull request #11661 from hashicorp/dnephin/ca-remove-one-call-to-active-root ca: remove one call to Provider.ActiveRoot	2022-01-13 16:48:12 -05:00
Kyle Havlovitz	2ba76486d0	Add virtual IP generation for term gateway backed services	2022-01-12 12:08:49 -08:00
Daniel Nephin	262898e561	ca: remove unnecessary var, and slightly reduce cyclo complexity `newIntermediate` is always equal to `needsNewIntermediate`, so we can remove the extra variable and use the original directly. Also remove the `activeRoot.ID != newActiveRoot.ID` case from an if, because that case is already checked above, and `needsNewIntermediate` will already be true in that case. This condition now reads a lot better: > Persist a new root if we did not have one before, or if we generated a new intermediate.	2022-01-06 16:56:49 -05:00
Daniel Nephin	d406f78c5c	ca: remove unused provider.ActiveRoot call In the previous commit the single use of this storedRoot was removed. In this commit the original objective is completed. The Provider.ActiveRoot is being removed because 1. the secondary should get the active root from the Consul primary DC, not the provider, so that secondary DCs do not need to communicate with a provider instance in a different DC. 2. so that the Provider.ActiveRoot interface can be changed without impacting other code paths.	2022-01-06 16:56:48 -05:00
Daniel Nephin	4d15e8a9ec	ca: extract the lookup of the active primary CA This method had only one caller, which always looked for the active root. This commit moves the lookup into the method to reduce the logic in the one caller. This is being done in preparation for a larger change. Keeping this separate so it is easier to see. The `storedRootID != primaryRoots.ActiveRootID` is being removed because these can never be different. The `storedRootID` comes from `provider.ActiveRoot`, the `primaryRoots.ActiveRootID` comes from the store `CARoot` from the primary. In both cases the source of the data is the primary DC. Technically they could be different if someone modified the provider outside of Consul, but that would break many things, so is not a supported flow. If these were out of sync because of ordering of events then the secondary will soon receive an update to `primaryRoots` and everything will be sorted out again.	2022-01-06 16:56:48 -05:00
Daniel Nephin	37b09df427	ca: update godoc To clarify what to expect from the data stored in this field, and the behaviour of this function.	2022-01-06 16:56:48 -05:00
Daniel Nephin	1f670c22f5	ca: remove one call to provider.ActiveRoot ActiveRoot should not be called from the secondary DC, because there should not be a requirement to run the same Vault instance in a secondary DC. SignIntermediate is called in a secondary DC, so it should not call ActiveRoot. We would also like to change the interface of ActiveRoot so that we can support using an intermediate cert as the primary CA in Consul. In preparation for making that change I am reducing the number of calls to ActiveRoot, so that there are fewer code paths to modify when the interface changes. This change required a change to the mockCAServerDelegate we use in tests. It was returning the RootCert for SignIntermediate, but that is not an accurate fake of production. In production this would also be a separate cert.	2022-01-06 16:55:50 -05:00
Daniel Nephin	1f66120c20	ca: remove redundant append of an intermediate cert Immediately above this line we are already appending the full list of intermediates. The `provider.ActiveIntermediate` MUST be in this list of intermediates because it must be available to all the other non-leader Servers. If it was not in this list of intermediates then any proxy that received data from a non-leader would have the wrong certs. This is being removed now because we are planning on changing the `Provider.ActiveIntermediate` interface, and removing these extra calls ahead of time helps make that change easier.	2022-01-06 16:55:50 -05:00
Daniel Nephin	b66d259c1a	ca: only generate a single private key for the whole test case Using tracing and cpu profiling I found that the majority of the time in these test cases is spent generating a private key. We really don't need separate private keys, so we can generate only one and use it for all cases. With this change the test runs much faster.	2022-01-06 16:55:50 -05:00
Daniel Nephin	92a054cfa6	ca: cleanup a test Fix the name to match the function it is testing. Remove unused code. Fix the signature: instead of returning (error, string), which should be (string, error), accept a testing.T to emit errors. Handle the error from encode.	2022-01-06 16:55:49 -05:00
Daniel Nephin	9ec7e07db4	ca: use the new leaf signing lookup func in leader metrics	2022-01-06 16:55:49 -05:00
Daniel Nephin	4983c27703	snapshot: return the error from replyFn The only function passed to SnapshotRPC today always returns a nil error, so there's no way to exercise this bug in practice. This change is being made for correctness so that it doesn't become a problem in the future, if we ever pass a different function to SnapshotRPC.	2022-01-05 17:51:03 -05:00
Jared Kirschner	a9371f18e5	Clarify service and check error messages (use ID) Error messages related to service and check operations previously included the following substrings: - service %q - check %q From this error message, it isn't clear that the expected field is the ID for the entity, not the name. For example, if the user has a service named test, the error message would read 'Unknown service "test"'. This is misleading - a service with that name does exist, but not with that ID. The substrings above have been modified to make it clear that ID is needed, not name: - service with ID %q - check with ID %q	2022-01-04 11:42:37 -08:00
Chris S. Kim	d87fe70a82	testing: Revert assertion for virtual IP flag (#11932 )	2022-01-04 11:24:56 -05:00
Daniel Nephin	48d123e241	Merge pull request #11796 from hashicorp/dnephin/cleanup-test-server testing: stop using an old version in testServer	2021-12-22 16:04:04 -05:00
Freddy	f7eeffb98d	Use anonymousToken when querying by secret ID (#11813 ) Co-authored-by: Chris S. Kim <ckim@hashicorp.com> Co-authored-by: Dan Upton <daniel@floppy.co> This query has been incorrectly querying by accessor ID since New ACLs were added. However, the legacy token compat allowed this to continue to work, since it made a fallback query for the anonymousToken ID. PR #11184 removed this legacy token query, which means that the query by accessor ID is now the only check for the anonymous token's existence. This PR updates the GetBySecret call to use the secret ID of the token.	2021-12-13 10:56:09 -07:00
R.B. Boyer	a0156785dd	various partition related todos (#11822 )	2021-12-13 11:43:33 -06:00
Kyle Havlovitz	9187070a93	Merge pull request #11798 from hashicorp/vip-goroutine-check leader: move the virtual IP version check into a goroutine	2021-12-10 15:59:35 -08:00
Kyle Havlovitz	45402dad63	state: fix freed VIP table id index	2021-12-10 14:41:45 -08:00
Kyle Havlovitz	ccc119c549	Exit before starting the vip check routine if possible	2021-12-10 14:30:50 -08:00
Daniel Nephin	6444d1d4b3	testing: Deprecate functions for creating a server. These helper functions actually end up hiding important setup details that should be visible from the test case. We already have a convenient way of setting this config when calling newTestServerWithConfig.	2021-12-09 20:09:29 -05:00
Daniel Nephin	74e92316de	testing: remove old config.Build version DefaultConfig already sets the version to version.Version, so by removing this our tests will run with the version that matches the code.	2021-12-09 20:09:29 -05:00
Kyle Havlovitz	2a52630067	leader: move the virtual IP version check into a goroutine	2021-12-09 17:00:33 -08:00
Daniel Nephin	15c4de0c15	ca: prune some unnecessary lookups in the tests	2021-12-08 18:42:52 -05:00
Daniel Nephin	bf798094d5	ca: remove duplicate WaitFor function	2021-12-08 18:42:52 -05:00
Daniel Nephin	984986f007	ca: fix flakes in RenewIntermediate tests I suspect one problem was that we set structs.IntermediateCertRenewInterval to 1ms, which meant that in some cases the intermediate could renew before we stored the original value. Another problem was that the 'wait for intermediate' loop was calling the provider.ActiveIntermediate, but the comparison needs to use the RPC endpoint to accurately represent a user request. So changing the 'wait for' to use the state store ensures we don't race. Also moves the patching into a separate function. Removes the addition of ca.CertificateTimeDriftBuffer as part of calculating halfTime. This was added in a previous commit to attempt to fix the flake, but it did not appear to fix the problem. Adding the time here was making the tests fail when using the shared patch function. It's not clear to me why, but there's no reason we should be including this time in the halfTime calculation.	2021-12-08 18:42:52 -05:00
Daniel Nephin	bc7ec4455f	ca: improve RenewIntermediate tests Use the new verifyLearfCert to show the cert verifies with intermediates from both sources. This required using the RPC interface so that the leaf pem was constructed correctly. Add IndexedCARoots.Active since that is a common operation we see in a few places.	2021-12-08 18:42:52 -05:00
Daniel Nephin	0784073d5e	ca: add a test for Vault in secondary DC	2021-12-08 18:42:51 -05:00
Daniel Nephin	373f445db5	ca: Add CARoots.Active method Which will be used in the next commit.	2021-12-08 18:41:51 -05:00
R.B. Boyer	2f345cca33	acl: ensure that the agent recovery token is properly partitioned (#11782 )	2021-12-08 17:11:55 -06:00
Daniel Nephin	0f95a2c3b1	Merge pull request #11721 from hashicorp/dnephin/ca-export-fsm-operation ca: use the real FSM operation in tests	2021-12-08 17:49:00 -05:00
Daniel Nephin	be1ddc5942	ca: use the real FSM operation in tests Previously we had a couple copies that reproduced the FSM operation. These copies introduce risk that the test does not accurately match production. This PR removes the test versions of the FSM operation, and exports the real production FSM operation so that it can be used in tests. The consul provider tests did need to change because of this. Previously we would return a hardcoded value of 2, but in production this value is always incremented.	2021-12-08 17:29:44 -05:00
R.B. Boyer	957758cb61	test: test server should auto cleanup (#11779 )	2021-12-08 13:26:06 -06:00
Evan Culver	32a04317bf	rpc: Unset partition before forwarding to remote datacenter (#11758 )	2021-12-08 11:02:14 -08:00
Daniel Nephin	52c8b4994b	Merge remote-tracking branch 'origin/main' into serve-panic-recovery	2021-12-07 16:30:41 -05:00
Chris S. Kim	b74ddd7b70	Godocs updates for catalog endpoints (#11716 )	2021-12-07 10:18:28 -05:00
Dan Upton	8bc11b08dc	Rename `ACLMasterToken` => `ACLInitialManagementToken` (#11746 )	2021-12-07 12:39:28 +00:00
Dan Upton	0230ebb4ef	agent/token: rename `agent_master` to `agent_recovery` (internally) (#11744 )	2021-12-07 12:12:47 +00:00
R.B. Boyer	89e90d1ffc	return the max	2021-12-06 15:36:52 -06:00
freddygv	65875a7c69	Remove support for failover to partition Failing over to a partition is more similar to failing over to another datacenter than it is to failing over to a namespace. In a future release we should update how localities for failover are specified. We should be able to accept a list of localities which can include both partition and datacenter.	2021-12-06 12:32:24 -07:00
freddygv	a1c1e36be7	Allow cross-partition references in disco chain * Add partition fields to targets like service route destinations * Update validation to prevent cross-DC + cross-partition references * Handle partitions when reading config entries for disco chain * Encode partition in compiled targets	2021-12-06 12:32:19 -07:00
R.B. Boyer	5ea4b82940	light refactors to support making partitions and serf-based wan federation mutually exclusive (#11755 )	2021-12-06 13:18:02 -06:00
Freddy	d86b98c503	Merge pull request #11739 from hashicorp/ap/exports-rename	2021-12-06 08:20:50 -07:00
freddygv	a2fd30e514	Clean up additional refs to partition exports	2021-12-04 15:16:40 -07:00
freddygv	02fb323652	Rename partition-exports to exported-services Using a name less tied to partitions gives us more flexibility to use this config entry in OSS for exports between datacenters/meshes.	2021-12-03 17:47:31 -07:00
freddygv	fcfed67246	Update intention topology to use new table	2021-12-03 17:28:31 -07:00
freddygv	4acbdc4618	Avoid updating default decision from wildcard ixn Given that we do not allow wildcard partitions in intentions, no one ixn can override the DefaultAllow setting. Only the default ACL policy applies across all partitions.	2021-12-03 17:28:12 -07:00
freddygv	142d8193e5	Add a new table to query service names by kind This table purposefully does not index by partition/namespace. It's a global view into all service names. This table is intended to replace the current serviceListTxn watch in intentionTopologyTxn. For cross-partition transparent proxying we need to be able to calculate upstreams from intentions in any partition. This means that the existing serviceListTxn function is insufficient since it's scoped to a partition. Moving away from that function is also beneficial because it watches the main "services" table, so watchers will wake up when any instance is registered or deregistered.	2021-12-03 17:28:12 -07:00
Freddy	3eddf98e62	Merge pull request #11680 from hashicorp/ap/partition-exports-oss	2021-12-03 16:57:50 -07:00
Dan Upton	2f4b8d7a7d	internal: support `ResultsFilteredByACLs` flag/header (#11643 )	2021-12-03 23:04:24 +00:00
Dan Upton	43e28a3af6	query: support `ResultsFilteredByACLs` in query list endpoint (#11620 )	2021-12-03 23:04:09 +00:00
Dhia Ayachi	e38ccf0a22	port oss changes (#11736 )	2021-12-03 17:23:55 -05:00
Freddy	3791d6d7da	Merge pull request #11720 from hashicorp/bbolt	2021-12-03 14:44:36 -07:00
Dan Upton	1d694df02b	fedstate: support `ResultsFilteredByACLs` in `ListMeshGateways` endpoint (#11644 )	2021-12-03 20:56:55 +00:00
Dan Upton	0489ea187d	catalog: support `ResultsFilteredByACLs` flag/header (#11594 )	2021-12-03 20:56:14 +00:00
Dan Upton	8bb1b89554	coordinate: support `ResultsFilteredByACLs` flag/header (#11617 )	2021-12-03 20:51:02 +00:00
Dan Upton	a62aa3847d	sessions: support `ResultsFilteredByACLs` flag/header (#11606 )	2021-12-03 20:43:43 +00:00
Dan Upton	0a7ba5162e	txn: support `ResultsFilteredByACLs` flag in `Read` endpoint (#11632 )	2021-12-03 20:41:03 +00:00
Dhia Ayachi	a8874c65f7	sessions partitioning tests (#11734 ) * state: port KV and Tombstone tables to new pattern * go fmt'ed * handle wildcards for tombstones * Fix graveyard ent vs oss * fix oss compilation error * add partition to tombstones and kv state store indexes * refactor to use `indexWithEnterpriseIndexable` * Apply suggestions from code review Co-authored-by: Chris S. Kim <ckim@hashicorp.com> Co-authored-by: R.B. Boyer <4903+rboyer@users.noreply.github.com> * add `singleValueID` implementation assertions * partition `tableSessions` table * fix sessions to use UUID and fix prefix index * fix oss build * clean up unused functions * fix oss compilation * add a partition indexer for sessions * Fix oss to not have partition index * fix oss tests * remove unused operations_ent.go and operations_oss.go func * remove unused const * convert `IndexID` of `session_checks` table * convert `indexSession` of `session_checks` table * convert `indexNodeCheck` of `session_checks` table * partition `indexID` and `indexSession` of `tableSessionChecks` * fix oss linter * fix review comments * remove partition for Checks as it's always use the session partition * fix tests * fix tests * do not namespace nodeChecks index Co-authored-by: Daniel Nephin <dnephin@hashicorp.com> Co-authored-by: Chris S. Kim <ckim@hashicorp.com> Co-authored-by: R.B. Boyer <4903+rboyer@users.noreply.github.com>	2021-12-03 15:36:07 -05:00
Dan Upton	b10e69ffda	intention: support `ResultsFilteredByACLs` flag/header (#11612 )	2021-12-03 20:35:54 +00:00
Mark Anderson	e8f542030e	Cross port of ent #1383 (#11726 ) Cross port of ent #1383 "Reject non-default datacenter when making partitioned ACLs" On the OSS side this is a minor refactor to add some more checks that are only applicable to enterprise code. Signed-off-by: Mark Anderson <manderson@hashicorp.com>	2021-12-03 10:20:25 -08:00
Dan Upton	1d571bb503	config: support `ResultsFilteredByACLs` in list/list all endpoints (#11621 )	2021-12-03 17:39:47 +00:00
Dan Upton	44bc833318	kv: support `ResultsFilteredByACLs` in list/list keys (#11593 )	2021-12-03 17:31:48 +00:00
Dan Upton	3ad8540d23	health: support `ResultsFilteredByACLs` flag/header (#11602 )	2021-12-03 17:31:32 +00:00
Dan Upton	0efe478044	Groundwork for exposing when queries are filtered by ACLs (#11569 )	2021-12-03 17:11:26 +00:00
Kyle Havlovitz	dbb58b726a	Merge pull request #11724 from hashicorp/service-virtual-ips oss: add virtual IP generation for connect services	2021-12-02 16:16:57 -08:00
Kyle Havlovitz	db88f95fbe	consul: add virtual IP generation for connect services	2021-12-02 15:42:47 -08:00
R.B. Boyer	6ec84cfbe2	agent: add variation of force-leave that exclusively works on the WAN (#11722 ) Fixes #6548	2021-12-02 17:15:10 -06:00
Matt Keeler	68e629a476	Emit raft-boltdb metrics	2021-12-02 16:56:15 -05:00
Daniel Nephin	8e2c71528f	config: add NoFreelistSync option # Conflicts: # agent/config/testdata/TestRuntimeConfig_Sanitize-enterprise.golden # agent/consul/server.go	2021-12-02 16:56:15 -05:00
Matt Keeler	1f49738167	Use raft-boltdb/v2	2021-12-02 16:56:15 -05:00
Daniel Nephin	fa32c78429	ca: set the correct SigningKeyID after config update with Vault provider The test added in this commit shows the problem. Previously the SigningKeyID was set to the RootCert not the local leaf signing cert. This same bug was fixed in two other places back in 2019, but this last one was missed. While fixing this bug I noticed I had the same few lines of code in 3 places, so I extracted a new function for them. There would be 4 places, but currently the InitializeCA flow sets this SigningKeyID in a different way, so I've left that alone for now.	2021-12-02 16:07:11 -05:00
Daniel Nephin	a0014e13fd	Merge pull request #11713 from hashicorp/dnephin/ca-test-names ca: make test naming consistent	2021-12-02 16:05:42 -05:00
Daniel Nephin	720d782225	Merge pull request #11671 from hashicorp/dnephin/ca-fix-storing-vault-intermediate ca: fix storing the leaf signing cert with Vault provider	2021-12-02 16:02:24 -05:00
Daniel Nephin	a0160f7426	Merge pull request #11677 from hashicorp/dnephin/freeport-interface sdk: use t.Cleanup in freeport and remove unnecessary calls	2021-12-02 15:58:41 -05:00
Daniel Nephin	c1cb77b829	ca: make test naming consistent While working on the CA system it is important to be able to run all the tests related to the system, without having to wait for unrelated tests. There are many slow and unrelated tests in agent/consul, so we need some way to filter to only the relevant tests. This PR renames all the CA system related tests to start with either `TestCAMananger` for tests of internal operations that don't have an RPC endpoint, or `TestConnectCA` for tests of RPC endpoints. This allows us to run all the tests with: go test -run 'TestCAMananger\|TestConnectCA' ./agent/consul The test naming follows an undocumented convention of naming tests as follows: Test[<struct name>_]<function name>[_<test case description>] I tried to always keep Primary/Secondary at the end of the description, and _Vault_ has to be in the middle because of our regex to run those tests as a separate CI job. You may notice some of the test names changed quite a bit. I did my best to identify the underlying method being tested, but I may have been slightly off in some cases.	2021-12-02 14:57:09 -05:00
Daniel Nephin	460f8919c9	ca: make getLeafSigningCertFromRoot safer As a method on the struct type this would not be safe to call without first checking c.isIntermediateUsedToSignLeaf. So for now, move this logic to the CAMananger, so that it is always correct.	2021-12-02 12:42:49 -05:00
Daniel Nephin	64532ef636	ca: fix stored CARoot representation with Vault provider We were not adding the local signing cert to the CARoot. This commit fixes that bug, and also adds support for fixing existing CARoot on upgrade. Also update the tests for both primary and secondary to be more strict. Check the SigningKeyID is correct after initialization and rotation.	2021-12-02 12:42:49 -05:00
Chris S. Kim	67eacee31e	ENT to OSS sync (#11703 )	2021-12-01 14:56:10 -05:00
R.B. Boyer	70b143ddc5	auto-config: ensure the feature works properly with partitions (#11699 )	2021-12-01 13:32:34 -06:00
Daniel Nephin	963a9819d0	ca: add some godoc and func for finding leaf signing cert This will be used in a follow up commit.	2021-11-30 18:36:41 -05:00
Daniel Nephin	056a52ba64	sdk/freeport: rename Port to GetOne For better consistency with GetN	2021-11-30 17:32:41 -05:00
Chris S. Kim	e9c661db7f	Refactor test helper (#11689 ) Allow custom ACL root tokens to be passed	2021-11-30 13:22:07 -05:00
Chris S. Kim	0ec67cc2d1	acl: Fill authzContext from token in Coordinate endpoints (#11688 )	2021-11-30 13:17:41 -05:00
freddygv	76146dfc5b	Move ent config test to ent file	2021-11-29 12:15:17 -07:00
Daniel Nephin	4f0d092c95	testing: remove unnecessary calls to freeport Previously we believed it was necessary for all code that required ports to use freeport to prevent conflicts. https://github.com/dnephin/freeport-test shows that it is actually safe to use port 0 (`127.0.0.1:0`) as long as it is passed directly to `net.Listen`, and the listener holds the port for as long as it is needed. This works because freeport explicitly avoids the ephemeral port range, and port 0 always uses that range. As you can see from the test output of https://github.com/dnephin/freeport-test, the two systems never use overlapping ports. This commit converts all uses of freeport that were being passed directly to a net.Listen to use port 0 instead. This allows us to remove a bit of wrapping we had around httptest, in a couple places.	2021-11-29 12:19:43 -05:00
Daniel Nephin	20a8e11bf2	testing: use the new freeport interfaces	2021-11-27 15:39:46 -05:00
Daniel Nephin	2cf41e4dc8	go-sso: remove returnFunc now that freeport handles return	2021-11-27 15:29:38 -05:00
Daniel Nephin	772d8f7381	ca: clean up unnecessary raft.Apply response checking In d2ab767fef21244e9fe3b9887ea70fc177912381 raftApply was changed to handle this check in a single place, instead of having every caller check it. It looks like these few places were missed when I did that clean up. This commit removes the remaining resp.(error) checks, since they are all no-ops now.	2021-11-26 17:57:55 -05:00
Daniel Nephin	48954adfdc	Merge pull request #11339 from hashicorp/dnephin/ca-manager-isolate-secondary-2 ca: reduce use of state in the secondary	2021-11-26 14:41:45 -05:00
Daniel Nephin	8240286956	ca: remove state check in secondarySetPrimaryRoots This function is only ever called from operations that have already acquired the state lock, so checking the value of state can never fail. This change is being made in preparation for splitting out a separate type for the secondary logic. The state can't easily be shared, so really only the exported top-level functions should acquire the 'state lock'.	2021-11-26 14:14:47 -05:00
Daniel Nephin	877094e2fa	ca: remove actingSecondaryCA This commit removes the actingSecondaryCA field, and removes the stateLock around it. This field was acting as a proxy for providerRoot != nil, so replace it with that check instead. The two methods which called secondarySetCAConfigured already set the state, so checking the state again at this point will not catch runtime errors (only programming errors, which we can catch with tests). In general, handling state transitions should be done on the "entrypoint" methods where execution starts, not in every internal method. This is being done to remove some unnecessary references to c.state, in preparation for extracting types for primary/secondary.	2021-11-26 14:14:47 -05:00
Daniel Nephin	cd5f6b2dfb	ca: reduce consul provider backend interface a bit This makes it easier to fake, which will allow me to use the ConsulProvider as an 'external PKI' to test a customer setup where the actual root CA is not the root we use for the Consul CA. Replaces a call to the state store to fetch the clusterID with the clusterID field already available on the built-in provider.	2021-11-25 11:46:06 -05:00
Dhia Ayachi	f605689154	Partition/kv indexid sessions (#11639 ) * state: port KV and Tombstone tables to new pattern * go fmt'ed * handle wildcards for tombstones * Fix graveyard ent vs oss * fix oss compilation error * add partition to tombstones and kv state store indexes * refactor to use `indexWithEnterpriseIndexable` * Apply suggestions from code review Co-authored-by: Chris S. Kim <ckim@hashicorp.com> Co-authored-by: R.B. Boyer <4903+rboyer@users.noreply.github.com> * partition `tableSessions` table * fix sessions to use UUID and fix prefix index * fix oss build * clean up unused functions * fix oss compilation * add a partition indexer for sessions * Fix oss to not have partition index * fix oss tests * remove unused operations_ent.go and operations_oss.go func * convert `indexNodeCheck` of `session_checks` table * partition `indexID` and `indexSession` of `tableSessionChecks` * remove partition for Checks as it's always use the session partition * partition sessions index id table * fix rebase issues Co-authored-by: Daniel Nephin <dnephin@hashicorp.com> Co-authored-by: Chris S. Kim <ckim@hashicorp.com> Co-authored-by: R.B. Boyer <4903+rboyer@users.noreply.github.com>	2021-11-24 11:34:36 -05:00
Dhia Ayachi	b1c4be3da0	Partition session checks store (#11638 ) * state: port KV and Tombstone tables to new pattern * go fmt'ed * handle wildcards for tombstones * Fix graveyard ent vs oss * fix oss compilation error * add partition to tombstones and kv state store indexes * refactor to use `indexWithEnterpriseIndexable` * Apply suggestions from code review Co-authored-by: Chris S. Kim <ckim@hashicorp.com> Co-authored-by: R.B. Boyer <4903+rboyer@users.noreply.github.com> * add `singleValueID` implementation assertions * partition `tableSessions` table * fix sessions to use UUID and fix prefix index * fix oss build * clean up unused functions * fix oss compilation * add a partition indexer for sessions * Fix oss to not have partition index * fix oss tests * remove unused operations_ent.go and operations_oss.go func * remove unused const * convert `IndexID` of `session_checks` table * convert `indexSession` of `session_checks` table * convert `indexNodeCheck` of `session_checks` table * partition `indexID` and `indexSession` of `tableSessionChecks` * fix oss linter * fix review comments * remove partition for Checks as it's always use the session partition Co-authored-by: Daniel Nephin <dnephin@hashicorp.com> Co-authored-by: Chris S. Kim <ckim@hashicorp.com> Co-authored-by: R.B. Boyer <4903+rboyer@users.noreply.github.com>	2021-11-24 09:10:38 -05:00
Daniel Nephin	07a33a1526	ca: accept only the cluster ID to SpiffeIDSigningForCluster To make it more obvious where ClusterID is used, and remove the need to create a struct when only one field is used.	2021-11-16 16:57:21 -05:00
R.B. Boyer	83bf7ab3ff	re-run gofmt on 1.17 (#11579 ) This should let freshly recompiled golangci-lint binaries using Go 1.17 pass 'make lint'	2021-11-16 12:04:01 -06:00
R.B. Boyer	086ff42b56	partitions: various refactors to support partitioning the serf LAN pool (#11568 )	2021-11-15 09:51:14 -06:00
freddygv	5ac1ab359b	Move assertion to after config fetch	2021-11-10 10:50:08 -07:00
freddygv	2261d51515	Use ClusterID to check for readiness The TrustDomain is populated from the Host() method which includes the hard-coded "consul" domain. This means that despite having an empty cluster ID, the TrustDomain won't be empty.	2021-11-10 10:45:22 -07:00
freddygv	482d3bc610	Prevent replicating partition-exports	2021-11-09 16:42:42 -07:00
freddygv	739490df12	handle error scenario of empty local DC	2021-11-09 16:42:42 -07:00
freddygv	b9b41625b9	Restrict DC for partition-exports writes There are two restrictions: - Writes from the primary DC which explicitly target a secondary DC. - Writes to a secondary DC that do not explicitly target the primary DC. The first restriction is because the config entry is not supported in secondary datacenters. The second restriction is to prevent the scenario where a user writes the config entry to a secondary DC, the write gets forwarded to the primary, but then the config entry does not apply in the secondary. This makes the scope more explicit.	2021-11-09 16:42:42 -07:00
Freddy	0ad360fadf	Merge pull request #11514 from hashicorp/dnephin/ca-fix-secondary-init ca: properly handle the case where the secondary initializes after the primary	2021-11-08 17:16:16 -07:00
freddygv	e6622ab0ab	Avoid returning empty roots with uninitialized CA Currently getCARoots could return an empty object with an empty trust domain before the CA is initialized. This commit returns an error while there is no CA config or no trust domain. There could be a CA config and no trust domain because the CA config can be created in InitializeCA before initialization succeeds.	2021-11-08 16:51:49 -07:00
Dhia Ayachi	f61892393f	refactor session state store tables to use the new index pattern (#11525 ) * state: port KV and Tombstone tables to new pattern * go fmt'ed * handle wildcards for tombstones * Fix graveyard ent vs oss * fix oss compilation error * add partition to tombstones and kv state store indexes * refactor to use `indexWithEnterpriseIndexable` * Apply suggestions from code review Co-authored-by: Chris S. Kim <ckim@hashicorp.com> Co-authored-by: R.B. Boyer <4903+rboyer@users.noreply.github.com> * add `singleValueID` implementation assertions * partition `tableSessions` table * fix sessions to use UUID and fix prefix index * fix oss build * clean up unused functions * fix oss compilation * add a partition indexer for sessions * Fix oss to not have partition index * fix oss tests * remove unused func `prefixIndexFromServiceNameAsString` * fix test error check * remove unused operations_ent.go and operations_oss.go func * remove unused const Co-authored-by: Daniel Nephin <dnephin@hashicorp.com> Co-authored-by: Chris S. Kim <ckim@hashicorp.com> Co-authored-by: R.B. Boyer <4903+rboyer@users.noreply.github.com>	2021-11-08 16:20:50 -05:00
Dhia Ayachi	dfafd4e38c	KV refactoring, part 2 (#11512 ) * add partition to the kv get pretty print * fix failing test * add test for kvs RPC endpoint	2021-11-08 11:43:21 -05:00
Dhia Ayachi	17190c0076	KV state store refactoring and partitioning (#11510 ) * state: port KV and Tombstone tables to new pattern * go fmt'ed * handle wildcards for tombstones * Fix graveyard ent vs oss * fix oss compilation error * add partition to tombstones and kv state store indexes * refactor to use `indexWithEnterpriseIndexable` * partition kvs indexID table * add `partitionedIndexEntryName` in oss for test purpose * Apply suggestions from code review Co-authored-by: Chris S. Kim <ckim@hashicorp.com> Co-authored-by: R.B. Boyer <4903+rboyer@users.noreply.github.com> * add `singleValueID` implementation assertions * remove entmeta reference from oss Co-authored-by: R.B. Boyer <4903+rboyer@users.noreply.github.com> Co-authored-by: Daniel Nephin <dnephin@hashicorp.com> Co-authored-by: Chris S. Kim <ckim@hashicorp.com> Co-authored-by: R.B. Boyer <4903+rboyer@users.noreply.github.com>	2021-11-08 09:35:56 -05:00
Giulio Micheloni	10cdc0a5c8	Merge branch 'main' into serve-panic-recovery	2021-11-06 16:12:06 +01:00
Daniel Nephin	69ad7c0544	ca: Only initialize clusterID in the primary The secondary must get the clusterID from the primary	2021-11-05 18:08:44 -04:00
Daniel Nephin	3173582b75	ca: return an error when secondary fails to initialize Previously secondaryInitialize would return nil in this case, which prevented the deferred initialize from happening, and left the CA in an uninitialized state until a config update or root rotation. To fix this I extracted the common parts into the delegate implementation. However looking at this again, it seems like the handling in secondaryUpdateRoots is impossible, because that function should never be called before the secondary is initialized. I believe we can remove some of that logic in a follow up.	2021-11-05 18:02:51 -04:00
Daniel Nephin	db29ad346b	acl: remove id and revision from Policy constructors The fields were removed in a previous commit. Also remove an unused constructor for PolicyMerger	2021-11-05 15:45:08 -04:00
R.B. Boyer	1d8e7bb565	rename helper method to reflect the non-deprecated terminology (#11509)	2021-11-05 13:51:50 -05:00
FFMMM	27227c0fd2	add root_cert_ttl option for consul connect, vault ca providers (#11428) * add root_cert_ttl option for consul connect, vault ca providers Signed-off-by: FFMMM <FFMMM@users.noreply.github.com> * Apply suggestions from code review Co-authored-by: Chris S. Kim <ckim@hashicorp.com> * add changelog, pr feedback Signed-off-by: FFMMM <FFMMM@users.noreply.github.com> * Update .changelog/11428.txt, more docs Co-authored-by: Daniel Nephin <dnephin@hashicorp.com> * Update website/content/docs/agent/options.mdx Co-authored-by: Kyle Havlovitz <kylehav@gmail.com> Co-authored-by: Chris S. Kim <ckim@hashicorp.com> Co-authored-by: Daniel Nephin <dnephin@hashicorp.com> Co-authored-by: Kyle Havlovitz <kylehav@gmail.com>	2021-11-02 11:02:10 -07:00
Daniel Nephin	00ed2b243f	Merge pull request #10771 from hashicorp/dnephin/emit-telemetry-metrics-immediately telemetry: improve cert expiry metrics	2021-11-01 18:31:03 -04:00
Daniel Nephin	eaaceedf31	Merge pull request #11338 from hashicorp/dnephin/ca-manager-isolate-secondary ca: clearly identify methods that are primary-only or secondary-only	2021-11-01 14:10:31 -04:00
Daniel Upton	a620b6be2e	Support Check-And-Set deletion of config entries (#11419) Implements #11372	2021-11-01 16:42:01 +00:00
Dhia Ayachi	4d763ef9e6	regenerate expired certs (#11462) * regenerate expired certs * add documentation to generate tests certificates	2021-11-01 11:40:16 -04:00
R.B. Boyer	017e9d5ae4	agent: add a clone function for duplicating the serf lan configuration (#11443)	2021-10-28 16:11:26 -05:00
Daniel Nephin	0a19d7fd76	agent: move agent tls metric monitor to a more appropriate place And add a test for it	2021-10-27 16:26:09 -04:00
Daniel Nephin	1b2144c982	telemetry: set cert expiry metrics to NaN on start So that followers do not report 0, which would make alerting difficult.	2021-10-27 15:19:25 -04:00
Daniel Nephin	a7fcf14c5c	telemetry: fix cert expiry metrics by removing labels These labels should be set by whatever process scrapes Consul (for prometheus), or by the agent that receives them (for datadog/statsd). We need to remove them here because the labels are part of the "metric key", so we'd have to pre-declare the metrics with the labels. We could do that, but that is extra work for labels that should be added from elsewhere. Also renames the closure to be more descriptive.	2021-10-27 15:19:25 -04:00
Daniel Nephin	4300daa2e6	telemetry: only emit leader cert expiry metrics on the servers	2021-10-27 15:19:25 -04:00
Daniel Nephin	9de725c17d	telemetry: prevent stale values from cert monitors Prometheus scrapes metrics from each process, so when leadership transfers to a different node the previous leader would still be reporting the old cached value. By setting NaN, I believe we should zero-out the value, so that prometheus should only consider the value from the new leader.	2021-10-27 15:19:25 -04:00
Daniel Nephin	616cc9b6f8	telemetry: improve cert expiry metrics Emit the metric immediately so that after restarting an agent, the new expiry time will be emitted. This is particularly important when this metric is being monitored, because we want the alert to resolve itself immediately. Also fixed a bug that was exposed in one of these metrics. The CARoot can be nil, so we have to handle that case.	2021-10-27 15:19:25 -04:00
Daniel Nephin	24951f0c7e	subscribe: attempt to fix a flaky test TestSubscribeBackend_IntegrationWithServer_DeliversAllMessages has been flaking a few times. This commit cleans up the test a bit, and improves the failure output. I don't believe this actually fixes the flake, but I'm not able to reproduce it reliably. The failure appears to be that the event with Port=0 is being sent in both the snapshot and as the first event after the EndOfSnapshot event. Hopefully the improved logging will show us if these are really duplicate events, or actually different events with different indexes.	2021-10-27 15:09:09 -04:00
freddygv	592965d61e	Rework acl exports interface	2021-10-27 12:50:39 -06:00
Freddy	9bbeea0432	Merge pull request #11433 from hashicorp/exported-service-acls [OSS] acl: Expand ServiceRead and NodeRead to account for partition exports	2021-10-27 12:48:08 -06:00
Freddy	d8ae915160	Merge pull request #11431 from hashicorp/ap/exports-proxycfg [OSS] Update partitioned mesh gw handling for connect proxies	2021-10-27 11:27:43 -06:00
Freddy	8e23a6a0cc	Merge pull request #11416 from hashicorp/ap/exports-update Rename service-exports to partition-exports	2021-10-27 11:27:31 -06:00
freddygv	af662c8c1c	Avoid mixing named and unnamed params	2021-10-26 23:42:25 -06:00
freddygv	1de62bb0a2	Avoid passing nil config pointer	2021-10-26 23:42:25 -06:00
freddygv	4a2e40aa3c	Avoid panic on nil partitionAuthorizer config partitionAuthorizer.config can be nil if it wasn't provided on calls to newPartitionAuthorizer outside of the ACLResolver. This usage happens often in tests. This commit: adds a nil check when the config is going to be used, updates non-test usage of NewPolicyAuthorizerWithDefaults to pass a non-nil config, and detaches setEnterpriseConf from the ACLResolver.	2021-10-26 23:42:25 -06:00
freddygv	015d85cd74	Update NodeRead for partition-exports When issuing cross-partition service discovery requests, ACL filtering often checks for NodeRead privileges. This is because the common return type is a CheckServiceNode, which contains node data.	2021-10-26 23:42:11 -06:00
Kyle Havlovitz	afb0976eac	acl: pass PartitionInfo through ent ACLConfig	2021-10-26 23:41:52 -06:00
Kyle Havlovitz	56d1858c4a	acl: Expand ServiceRead logic to look at service-exports for cross-partition	2021-10-26 23:41:32 -06:00
freddygv	3966677aaf	Finish removing useInDatacenter	2021-10-26 23:36:01 -06:00
freddygv	feaebde1f1	Remove useInDatacenter from disco chain requests useInDatacenter was used to determine whether the mesh gateway mode of the upstream should be returned in the discovery chain target. This commit makes it so that the mesh gateway mode is returned every time, and it is up to the caller to decide whether mesh gateways should be watched or used.	2021-10-26 23:35:21 -06:00
R.B. Boyer	e27e58c6cc	agent: refactor the agent delegate interface to be partition friendly (#11429)	2021-10-26 15:08:55 -05:00
Chris S. Kim	27f8a85664	agent: Ensure partition is considered in agent endpoints (#11427)	2021-10-26 15:20:57 -04:00
freddygv	c3e381b4c1	Rename service-exports to partition-exports Existing config entries prefixed by service- are specific to individual services. Since this config entry applies to partitions it is being renamed. Additionally, the Partition label was changed to Name because using Partition at the top-level and in the enterprise meta was leading to the enterprise meta partition being dropped by msgpack.	2021-10-25 17:58:48 -06:00
Daniel Nephin	f24bad2a52	Merge pull request #11232 from hashicorp/dnephin/acl-legacy-remove-docs acl: add docs and changelog for the removal of the legacy ACL system	2021-10-25 18:38:00 -04:00
Daniel Nephin	f7cdd210fe	Update agent/consul/acl_client.go Co-authored-by: Freddy <freddygv@users.noreply.github.com>	2021-10-25 17:25:14 -04:00
Daniel Nephin	732b841dd7	state: remove support for updating legacy ACL tokens	2021-10-25 17:25:14 -04:00
Daniel Nephin	76b007dacd	acl: remove init check for legacy anon token This token should always already be migrated from a previous version.	2021-10-25 17:25:14 -04:00
Daniel Nephin	8ae6ee4e36	acl: remove legacy parameter to ACLDatacenter It is no longer used now that legacy ACLs have been removed.	2021-10-25 17:25:14 -04:00
Daniel Nephin	d778113773	acl: remove ACLTokenTypeManagement	2021-10-25 17:25:14 -04:00
Daniel Nephin	88c6aeea34	acl: remove legacy arg to store.ACLTokenSet And remove the tests for legacy=true	2021-10-25 17:25:14 -04:00
Daniel Nephin	b31a7fc498	acl: remove EmbeddedPolicy This method is no longer needed. It only existed for legacy tokens, which are no longer supported.	2021-10-25 17:25:14 -04:00
Daniel Nephin	ceaa36f983	acl: remove tests for resolving legacy tokens The code for this was already removed, which suggests this is not actually testing what it claims. I'm guessing these are still resolving because the tokens are converted to non-legacy tokens?	2021-10-25 17:25:14 -04:00
Daniel Nephin	a46e3bd2fc	acl: stop replication on leadership lost It seems like this was missing. Previously this was only called by init of ACLs during an upgrade. Now that legacy ACLs are removed, nothing was calling stop. Also remove an unused method from client.	2021-10-25 17:24:12 -04:00
Daniel Nephin	15cd8c7ab8	Remove incorrect TODO	2021-10-25 17:20:06 -04:00
Daniel Nephin	589b238374	acl: move the legacy ACL struct to the one package where it is used It is now only used for restoring snapshots. We can remove it in phase 2.	2021-10-25 17:20:06 -04:00
Daniel Nephin	0ba5d0afcd	acl: remove most of the rest of structs/acl_legacy.go	2021-10-25 17:20:06 -04:00
FFMMM	6433a57d3c	fix autopilot_failure_tolerance, add autopilot metrics test case (#11399) Signed-off-by: FFMMM <FFMMM@users.noreply.github.com>	2021-10-25 10:55:59 -07:00
Dhia Ayachi	75f69a98a2	fix leadership transfer on leave suggestions (#11387) * add suggestions * set isLeader to false when leadership transfer succeed	2021-10-21 14:02:26 -04:00
Dhia Ayachi	2d1ac1f7d0	try to perform a leadership transfer when leaving (#11376) * try to perform a leadership transfer when leaving * add a changelog	2021-10-21 12:44:31 -04:00
Kyle Havlovitz	752a285552	Add new service-exports config entry	2021-10-20 12:24:18 -07:00
Giulio Micheloni	10814d934e	Merge branch 'main' of https://github.com/hashicorp/consul into hashicorp-main	2021-10-16 16:59:32 +01:00

1931 Commits