open-consul

Author	SHA1	Message	Date
Daniel Nephin	ca4e60e09b	Update TODOs to reference an issue with more details And remove a no longer needed TODO	2022-02-17 18:21:30 -05:00
Daniel Nephin	471b2098bb	ca: examine the full chain in newCARoot make TestNewCARoot much more strict compare the full result instead of only a few fields. add a test case with 2 and 3 certificates in the pem	2022-02-17 18:21:30 -05:00
Daniel Nephin	6721c1246d	ca: relax and move private key type/bit validation for vault This commit makes two changes to the validation. Previously we would call this validation in GenerateRoot, which happens both on initialization (when a follower becomes leader), and when a configuration is updated. We only want to do this validation during config update so the logic was moved to the UpdateConfiguration function. Previously we would compare the config values against the actual cert. This caused problems when the cert was created manually in Vault (not created by Consul). Now we compare the new config against the previous config. Using an already created CA cert should never error now. Adding the key bit and types to the config should only error when the previous values were not the defaults.	2022-02-03 17:21:20 -05:00
Daniel Nephin	44f9229b96	ca: add a test that uses an intermediate CA as the primary CA This test found a bug in the secondary. We were appending the root cert to the PEM, but that cert was already appended. This was failing validation in Vault here: https://github.com/hashicorp/vault/blob/sdk/v0.3.0/sdk/helper/certutil/types.go#L329 Previously this worked because self signed certs have the same SubjectKeyID and AuthorityKeyID. So having the same self-signed cert repeated doesn't fail that check. However with an intermediate that is not self-signed, those values are different, and so we fail the check. A test I added in a previous commit should show that this continues to work with self-signed root certs as well.	2022-02-02 13:41:35 -05:00
Daniel Nephin	fa8ff28a63	ca/provider: remove ActiveRoot from Provider	2022-01-27 13:07:37 -05:00
Daniel Nephin	262898e561	ca: remove unnecessary var, and slightly reduce cyclo complexity `newIntermediate` is always equal to `needsNewIntermediate`, so we can remove the extra variable and use the original directly. Also remove the `activeRoot.ID != newActiveRoot.ID` case from an if, because that case is already checked above, and `needsNewIntermediate` will already be true in that case. This condition now reads a lot better: > Persist a new root if we did not have one before, or if generated a new intermediate.	2022-01-06 16:56:49 -05:00
Daniel Nephin	d406f78c5c	ca: remove unused provider.ActiveRoot call In the previous commit the single use of this storedRoot was removed. In this commit the original objective is completed. The Provider.ActiveRoot is being removed because 1. the secondary should get the active root from the Consul primary DC, not the provider, so that secondary DCs do not need to communicate with a provider instance in a different DC. 2. so that the Provider.ActiveRoot interface can be changed without impacting other code paths.	2022-01-06 16:56:48 -05:00
Daniel Nephin	4d15e8a9ec	ca: extract the lookup of the active primary CA This method had only one caller, which always looked for the active root. This commit moves the lookup into the method to reduce the logic in the one caller. This is being done in preparation for a larger change. Keeping this separate so it is easier to see. The `storedRootID != primaryRoots.ActiveRootID` is being removed because these can never be different. The `storedRootID` comes from `provider.ActiveRoot`, the `primaryRoots.ActiveRootID` comes from the store `CARoot` from the primary. In both cases the source of the data is the primary DC. Technically they could be different if someone modified the provider outside of Consul, but that would break many things, so is not a supported flow. If these were out of sync because of ordering of events then the secondary will soon receive an update to `primaryRoots` and everything will be sorted out again.	2022-01-06 16:56:48 -05:00
Daniel Nephin	37b09df427	ca: update godoc To clarify what to expect from the data stored in this field, and the behaviour of this function.	2022-01-06 16:56:48 -05:00
Daniel Nephin	1f670c22f5	ca: remove one call to provider.ActiveRoot ActiveRoot should not be called from the secondary DC, because there should not be a requirement to run the same Vault instance in a secondary DC. SignIntermediate is called in a secondary DC, so it should not call ActiveRoot. We would also like to change the interface of ActiveRoot so that we can support using an intermediate cert as the primary CA in Consul. In preparation for making that change I am reducing the number of calls to ActiveRoot, so that there are fewer code paths to modify when the interface changes. This change required a change to the mockCAServerDelegate we use in tests. It was returning the RootCert for SignIntermediate, but that is not an accurate fake of production. In production this would also be a separate cert.	2022-01-06 16:55:50 -05:00
Daniel Nephin	1f66120c20	ca: remove redundant append of an intermediate cert Immediately above this line we are already appending the full list of intermediates. The `provider.ActiveIntermediate` MUST be in this list of intermediates because it must be available to all the other non-leader Servers. If it was not in this list of intermediates then any proxy that received data from a non-leader would have the wrong certs. This is being removed now because we are planning on changing the `Provider.ActiveIntermediate` interface, and removing these extra calls ahead of time helps make that change easier.	2022-01-06 16:55:50 -05:00
Daniel Nephin	984986f007	ca: fix flakes in RenewIntermediate tests I suspect one problem was that we set structs.IntermediateCertRenewInterval to 1ms, which meant that in some cases the intermediate could renew before we stored the original value. Another problem was that the 'wait for intermediate' loop was calling the provider.ActiveIntermediate, but the comparison needs to use the RPC endpoint to accurately represent a user request. So changing the 'wait for' to use the state store ensures we don't race. Also moves the patching into a separate function. Removes the addition of ca.CertificateTimeDriftBuffer as part of calculating halfTime. This was added in a previous commit to attempt to fix the flake, but it did not appear to fix the problem. Adding the time here was making the tests fail when using the shared patch function. It's not clear to me why, but there's no reason we should be including this time in the halfTime calculation.	2021-12-08 18:42:52 -05:00
Daniel Nephin	fa32c78429	ca: set the correct SigningKeyID after config update with Vault provider The test added in this commit shows the problem. Previously the SigningKeyID was set to the RootCert not the local leaf signing cert. This same bug was fixed in two other places back in 2019, but this last one was missed. While fixing this bug I noticed I had the same few lines of code in 3 places, so I extracted a new function for them. There would be 4 places, but currently the InitializeCA flow sets this SigningKeyID in a different way, so I've left that alone for now.	2021-12-02 16:07:11 -05:00
Daniel Nephin	a0014e13fd	Merge pull request #11713 from hashicorp/dnephin/ca-test-names ca: make test naming consistent	2021-12-02 16:05:42 -05:00
Daniel Nephin	720d782225	Merge pull request #11671 from hashicorp/dnephin/ca-fix-storing-vault-intermediate ca: fix storing the leaf signing cert with Vault provider	2021-12-02 16:02:24 -05:00
Daniel Nephin	c1cb77b829	ca: make test naming consistent While working on the CA system it is important to be able to run all the tests related to the system, without having to wait for unrelated tests. There are many slow and unrelated tests in agent/consul, so we need some way to filter to only the relevant tests. This PR renames all the CA system related tests to start with either `TestCAMananger` for tests of internal operations that don't have RPC endpoint, or `TestConnectCA` for tests of RPC endpoints. This allows us to run all the tests with: go test -run 'TestCAMananger\|TestConnectCA' ./agent/consul The test naming follows an undocumented convention of naming tests as follows: Test[<struct name>_]<function name>[_<test case description>] I tried to always keep Primary/Secondary at the end of the description, and _Vault_ has to be in the middle because of our regex to run those tests as a separate CI job. You may notice some of the test names changed quite a bit. I did my best to identify the underlying method being tested, but I may have been slightly off in some cases.	2021-12-02 14:57:09 -05:00
Daniel Nephin	460f8919c9	ca: make getLeafSigningCertFromRoot safer As a method on the struct type this would not be safe to call without first checking c.isIntermediateUsedToSignLeaf. So for now, move this logic to the CAMananger, so that it is always correct.	2021-12-02 12:42:49 -05:00
Daniel Nephin	64532ef636	ca: fix stored CARoot representation with Vault provider We were not adding the local signing cert to the CARoot. This commit fixes that bug, and also adds support for fixing existing CARoot on upgrade. Also update the tests for both primary and secondary to be more strict. Check the SigningKeyID is correct after initialization and rotation.	2021-12-02 12:42:49 -05:00
Daniel Nephin	963a9819d0	ca: add some godoc and func for finding leaf signing cert This will be used in a follow up commit.	2021-11-30 18:36:41 -05:00
Daniel Nephin	772d8f7381	ca: clean up unnecessary raft.Apply response checking In d2ab767fef21244e9fe3b9887ea70fc177912381 raftApply was changed to handle this check in a single place, instead of having every caller check it. It looks like these few places were missed when I did that clean up. This commit removes the remaining resp.(error) checks, since they are all no-ops now.	2021-11-26 17:57:55 -05:00
Daniel Nephin	48954adfdc	Merge pull request #11339 from hashicorp/dnephin/ca-manager-isolate-secondary-2 ca: reduce use of state in the secondary	2021-11-26 14:41:45 -05:00
Daniel Nephin	8240286956	ca: remove state check in secondarySetPrimaryRoots This function is only ever called from operations that have already acquired the state lock, so checking the value of state can never fail. This change is being made in preparation for splitting out a separate type for the secondary logic. The state can't easily be shared, so really only the exported top-level functions should acquire the 'state lock'.	2021-11-26 14:14:47 -05:00
Daniel Nephin	877094e2fa	ca: remove actingSecondaryCA This commit removes the actingSecondaryCA field, and removes the stateLock around it. This field was acting as a proxy for providerRoot != nil, so replace it with that check instead. The two methods which called secondarySetCAConfigured already set the state, so checking the state again at this point will not catch runtime errors (only programming errors, which we can catch with tests). In general, handling state transitions should be done on the "entrypoint" methods where execution starts, not in every internal method. This is being done to remove some unnecessary references to c.state, in preparation for extracting types for primary/secondary.	2021-11-26 14:14:47 -05:00
Daniel Nephin	cd5f6b2dfb	ca: reduce consul provider backend interface a bit This makes it easier to fake, which will allow me to use the ConsulProvider as an 'external PKI' to test a customer setup where the actual root CA is not the root we use for the Consul CA. Replaces a call to the state store to fetch the clusterID with the clusterID field already available on the built-in provider.	2021-11-25 11:46:06 -05:00
Daniel Nephin	07a33a1526	ca: accept only the cluster ID to SpiffeIDSigningForCluster To make it more obvious where ClusterID is used, and remove the need to create a struct when only one field is used.	2021-11-16 16:57:21 -05:00
Daniel Nephin	69ad7c0544	ca: Only initialize clusterID in the primary The secondary must get the clusterID from the primary	2021-11-05 18:08:44 -04:00
Daniel Nephin	3173582b75	ca: return an error when secondary fails to initialize Previously secondaryInitialize would return nil in this case, which prevented the deferred initialize from happening, and left the CA in an uninitialized state until a config update or root rotation. To fix this I extracted the common parts into the delegate implementation. However looking at this again, it seems like the handling in secondaryUpdateRoots is impossible, because that function should never be called before the secondary is initialized. I believe we can remove some of that logic in a follow up.	2021-11-05 18:02:51 -04:00
Daniel Nephin	eaaceedf31	Merge pull request #11338 from hashicorp/dnephin/ca-manager-isolate-secondary ca: clearly identify methods that are primary-only or secondary-only	2021-11-01 14:10:31 -04:00
freddygv	bdf3e951f8	Ensure partition is handled by auto-encrypt	2021-10-14 08:32:45 -06:00
Daniel Nephin	571acb872e	ca: extract primaryUpdateRootCA This function is only run when the CAManager is a primary. Extracting this function makes it clear which parts of UpdateConfiguration are run only in the primary and also makes the cleanup logic simpler. Instead of both a defer and a local var we can call the cleanup function in two places.	2021-10-10 15:26:55 -04:00
Daniel Nephin	a65594d8ec	ca: rename functions to use a primary or secondary prefix This commit renames functions to use a consistent pattern for identifying the functions that can only be called when the Manager is run as the primary or secondary. This is a step toward eventually creating separate types and moving these methods off of CAManager.	2021-10-10 15:26:55 -04:00
Daniel Nephin	20f0efd8c1	ca: make receiver variable name consistent Every other method uses c not ca	2021-10-10 15:26:55 -04:00
R.B. Boyer	6b5a58de50	acl: some acl authz refactors for nodes (#10909)	2021-08-25 13:43:11 -05:00
Dhia Ayachi	40baf98159	defer setting the state before returning to avoid stuck in `INITIALIZING` state (#10630) * defer setting the state before returning to avoid being stuck in `INITIALIZING` state * add changelog * move comment with the right if statement * ca: report state transition error from setSTate * update comment to reflect state transition Co-authored-by: Daniel Nephin <dnephin@hashicorp.com>	2021-08-05 14:51:19 -04:00
R.B. Boyer	62ac98b564	agent/structs: add a bunch more EnterpriseMeta helper functions to help with partitioning (#10669)	2021-07-22 13:20:45 -05:00
Dhia Ayachi	53b45a8441	check expiry date of the root/intermediate before using it to sign a leaf (#10500 ) * ca: move provider creation into CAManager This further decouples the CAManager from Server. It reduces the interface between them and removes the need for the SetLogger method on providers. * ca: move SignCertificate to CAManager To reduce the scope of Server, and keep all the CA logic together * ca: move SignCertificate to the file where it is used * auto-config: move autoConfigBackend impl off of Server Most of these methods are used exclusively for the AutoConfig RPC endpoint. This PR uses a pattern that we've used in other places as an incremental step to reducing the scope of Server. * fix linter issues * check error when `raftApplyMsgpack` * ca: move SignCertificate to CAManager To reduce the scope of Server, and keep all the CA logic together * check expiry date of the intermediate before using it to sign a leaf * fix typo in comment Co-authored-by: Kyle Havlovitz <kylehav@gmail.com> * Fix test name * do not check cert start date * wrap error to mention it is the intermediate expired * Fix failing test * update comment Co-authored-by: Daniel Nephin <dnephin@hashicorp.com> * use shim to avoid sleep in test * add root cert validation * remove duplicate code * Revert "fix linter issues" This reverts commit 6356302b54f06c8f2dee8e59740409d49e84ef24. * fix import issue * gofmt leader_connect_ca * add changelog entry * update error message Co-authored-by: Freddy <freddygv@users.noreply.github.com> * fix error message in test Co-authored-by: Daniel Nephin <dnephin@hashicorp.com> Co-authored-by: Kyle Havlovitz <kylehav@gmail.com> Co-authored-by: Freddy <freddygv@users.noreply.github.com>	2021-07-13 12:15:06 -04:00
Daniel Nephin	58cf5767a8	Merge pull request #10479 from hashicorp/dnephin/ca-provider-explore-2 ca: move Server.SignIntermediate to CAManager	2021-07-12 19:03:43 -04:00
Daniel Nephin	fdb0ba8041	ca: use provider constructors to be more consistent Adds a constructor for the one provider that did not have one.	2021-07-12 14:04:34 -04:00
Dhia Ayachi	3eac4ffda4	check error when `raftApplyMsgpack`	2021-07-12 13:42:51 -04:00
Daniel Nephin	34c8585b29	auto-config: move autoConfigBackend impl off of Server Most of these methods are used exclusively for the AutoConfig RPC endpoint. This PR uses a pattern that we've used in other places as an incremental step to reducing the scope of Server.	2021-07-12 13:42:40 -04:00
Daniel Nephin	605275b4dc	ca: move SignCertificate to the file where it is used	2021-07-12 13:42:39 -04:00
Daniel Nephin	c2e85f25d4	ca: move SignCertificate to CAManager To reduce the scope of Server, and keep all the CA logic together	2021-07-12 13:42:39 -04:00
Dhia Ayachi	a0320169fe	add missing state reset when stopping ca manager	2021-07-12 09:32:36 -04:00
Daniel Nephin	68d5f7769a	ca: fix mockCAServerDelegate to work with the new interface raftApply was removed so ApplyCARequest needs to handle all the possible operations Also set the providerShim to use the mock provider. other changes are small test improvements that were necessary to debug the failures.	2021-07-12 09:32:36 -04:00
Daniel Nephin	6d4b0ce194	ca: remove unused method and small refactor to getCAProvider so that GoLand is less confused about what it is doing. Previously it was reporting that the for condition was always true, which was not the case.	2021-07-12 09:32:35 -04:00
Daniel Nephin	4330122d9a	ca: remove raftApply from delegate interface After moving ca.ConsulProviderStateDelegate into the interface we now have the ApplyCARequest method which does the same thing. Use this more specific method instead of raftApply.	2021-07-12 09:32:35 -04:00
Daniel Nephin	fae0a8f851	ca: move generateCASignRequest to the delegate This method on Server was only used by the caDelegateWithState, so move it there until we can move it entirely into CAManager.	2021-07-12 09:32:35 -04:00
Daniel Nephin	d4bb9fd97a	ca: move provider creation into CAManager This further decouples the CAManager from Server. It reduces the interface between them and removes the need for the SetLogger method on providers.	2021-07-12 09:32:33 -04:00
Daniel Nephin	fc629d9eaa	ca-manager: move provider shutdown into CAManager Reducing the coupling between Server and CAManager	2021-07-12 09:27:28 -04:00
Daniel Nephin	72b30174fa	ca: replace ca.PrimaryIntermediateProviders With an optional interface that providers can use to indicate if they use an intermediate cert in the primary DC. This removes the need to look up the provider config when renewing the intermediate.	2021-06-23 15:47:30 -04:00

62 commits