Commit Graph

437 Commits

Author SHA1 Message Date
Kyle Havlovitz 880eccb502
fsm: add missing CA config to snapshot/restore logic 2018-08-16 11:58:50 -07:00
Kyle Havlovitz fd83063686
autopilot: don't follow the normal server removal rules for nonvoters 2018-08-14 14:24:51 -07:00
Kyle Havlovitz aa19559cc7
Fix stats fetcher healthcheck RPCs not being independent 2018-08-14 14:23:52 -07:00
Pierre Souchay a16f34058b Display more information about check being not properly added when it fails (#4405)
* Display more information about check being not properly added when it fails

It follows an incident where we add lots of error messages:

  [WARN] consul.fsm: EnsureRegistration failed: failed inserting check: Missing service registration

That seems related to Consul failing to restart on respective agents.

Having Node information as well as service information would help diagnose the issue.

* Renamed ensureCheckIfNodeMatches() as requested by @banks
2018-08-14 17:45:33 +01:00
Pierre Souchay 821a91ca31 Allow to rename nodes with IDs, will fix #3974 and #4413 (#4415)
* Allow to rename nodes with IDs, will fix #3974 and #4413

This change allow to rename any well behaving recent agent with an
ID to be renamed safely, ie: without taking the name of another one
with case insensitive comparison.

Deprecated behaviour warning
----------------------------

Due to asceding compatibility, it is still possible however to
"take" the name of another name by not providing any ID.

Note that when not providing any ID, it is possible to have 2 nodes
having similar names with case differences, ie: myNode and mynode
which might lead to DB corruption on Consul server side and
lead to server not properly restarting.

See #3983 and #4399 for Context about this change.

Disabling registration of nodes without IDs as specified in #4414
should probably be the way to go eventually.

* Removed the case-insensitive search when adding a node within the else
block since it breaks the test TestAgentAntiEntropy_Services

While the else case is probably legit, it will be fixed with #4414 in
a later release.

* Added again the test in the else to avoid duplicated names, but
enforce this test only for nodes having IDs.

Thus most tests without any ID will work, and allows us fixing

* Added more tests regarding request with/without IDs.

`TestStateStore_EnsureNode` now test registration and renaming with IDs

`TestStateStore_EnsureNodeDeprecated` tests registration without IDs
and tests removing an ID from a node as well as updated a node
without its ID (deprecated behaviour kept for backwards compatibility)

* Do not allow renaming in case of conflict, including when other node has no ID

* Fixed function GetNodeID that was not working due to wrong type when searching node from its ID

Thus, all tests about renaming were not working properly.

Added the full test cas that allowed me to detect it.

* Better error messages, more tests when nodeID is not a valid UUID in GetNodeID()

* Added separate TestStateStore_GetNodeID to test GetNodeID.

More complete test coverage for GetNodeID

* Added new unit test `TestStateStore_ensureNoNodeWithSimilarNameTxn`

Also fixed comments to be clearer after remarks from @banks

* Fixed error message in unit test to match test case

* Use uuid.ParseUUID to parse Node.ID as requested by @mkeeler
2018-08-10 11:30:45 -04:00
Siva Prasad d98d02777f
PR to fix TestAgent_IndexChurn and TestPreparedQuery_Wrapper. (#4512)
* Fixes TestAgent_IndexChurn

* Fixes TestPreparedQuery_Wrapper

* Increased sleep in agent_test for IndexChurn to 500ms

* Made the comment about joinWAN operation much less of a cliffhanger
2018-08-09 12:40:07 -04:00
Armon Dadgar a343392f63 consul: Update buffer sizes 2018-08-08 10:26:58 -07:00
Siva Prasad cfa436dc16
Revert "CA initialization while boostrapping and TestLeader_ChangeServerID fix." (#4497)
* Revert "BUGFIX: Unit test relying on WaitForLeader() did not work due to wrong test (#4472)"

This reverts commit cec5d7239621e0732b3f70158addb1899442acb3.

* Revert "CA initialization while boostrapping and TestLeader_ChangeServerID fix. (#4493)"

This reverts commit 589b589b53e56af38de25db9b56967bdf1f2c069.
2018-08-07 08:29:48 -04:00
Pierre Souchay fd927ea110 BUGFIX: Unit test relying on WaitForLeader() did not work due to wrong test (#4472)
- Improve resilience of testrpc.WaitForLeader()

- Add additionall retry to CI

- Increase "go test" timeout to 8m

- Add wait for cluster leader to several tests in the agent package

- Add retry to some tests in the api and command packages
2018-08-06 19:46:09 -04:00
Siva Prasad 29c181f5fa
CA initialization while boostrapping and TestLeader_ChangeServerID fix. (#4493)
* connect: fix an issue with Consul CA bootstrapping being interrupted

* streamline change server id test
2018-08-06 16:15:24 -04:00
Kyle Havlovitz 42ab07b398
fix inconsistency in TestConnectCAConfig_GetSet 2018-07-26 07:46:47 -07:00
Kyle Havlovitz ecc02c6aee
Merge pull request #4400 from hashicorp/leaf-cert-ttl
Add configurable leaf cert TTL to Connect CA
2018-07-25 17:53:25 -07:00
Paul Banks 217137b775
Fixes #4421: General solution to stop blocking queries with index 0 (#4437)
* Fix theoretical cache collision bug if/when we use more cache types with same result type

* Generalized fix for blocking query handling when state store methods return zero index

* Refactor test retry to only affect CI

* Undo make file merge

* Add hint to error message returned to end-user requests if Connect is not enabled when they try to request cert

* Explicit error for Roots endpoint if connect is disabled

* Fix tests that were asserting old behaviour
2018-07-25 20:26:27 +01:00
Kyle Havlovitz a125735d76
connect/ca: check LeafCertTTL when rotating expired roots 2018-07-20 16:04:04 -07:00
Kyle Havlovitz 45ec8849f3
connect/ca: add configurable leaf cert TTL 2018-07-16 13:33:37 -07:00
Matt Keeler cc46d59269
Merge pull request #4379 from hashicorp/persist-intermediates
connect: persist intermediate CAs on leader change
2018-07-12 12:09:13 -04:00
Matt Keeler 965fc9cf62
Revert "Allow changing Node names since Node now have IDs" 2018-07-12 11:19:21 -04:00
Matt Keeler d8a4d9137b Fixup formatting 2018-07-12 10:14:26 -04:00
Matt Keeler d63c5807cf Revert PR 4294 - Catalog Register: Generate UUID for services registered without one
UUID auto-generation here causes trouble in a few cases. The biggest being older
nodes reregistering will fail when the UUIDs are different and the names match

This reverts commit 0f700340828f464449c2e0d5a82db0bc5456d385.
This reverts commit d1a8f9cb3f6f48dd9c8d0bc858031ff6ccff51d0.
This reverts commit cf69ec42a418ab6594a6654e9545e12160f30970.
2018-07-12 10:06:50 -04:00
Kyle Havlovitz 2a40f93ac8
connect: use reflect.DeepEqual instead for test 2018-07-11 13:10:58 -07:00
Matt Keeler 42729d5aff
Merge pull request #3983 from pierresouchay/node_renaming
Allow changing Node names since Node now have IDs
2018-07-11 16:03:02 -04:00
Kyle Havlovitz f9a35a9338
connect: add provider state to snapshots 2018-07-11 11:34:49 -07:00
Kyle Havlovitz 9c21cc7ac9
connect: update leader initializeCA comment 2018-07-11 10:00:42 -07:00
Kyle Havlovitz db254f0991
connect: persist intermediate CAs on leader change 2018-07-11 09:44:30 -07:00
Pierre Souchay 3d0a960470 When renaming a node, ensure the name is not taken by another node.
Since DNS is case insensitive and DB as issues when similar names with different
cases are added, check for unicity based on case insensitivity.

Following another big incident we had in our cluster, we also validate
that adding/renaming a not does not conflicts with case insensitive
matches.

We had the following error once:

 - one node called: mymachine.MYDC.mydomain was shut off
 - another node (different ID) was added with name: mymachine.mydc.mydomain before
   72 hours

When restarting the consul server of domain, the consul server restarted failed
to start since it detected an issue in RAFT database because
mymachine.MYDC.mydomain and mymachine.mydc.mydomain had the same names.

Checking at registration time with case insensitivity should definitly fix
those issues and avoid Consul DB corruption.
2018-07-11 14:42:54 +02:00
Matt Keeler 22c5951ec4
Merge pull request #4303 from pierresouchay/non_blocking_acl
Only send one single ACL cache refresh across network when TTL is over
2018-07-10 08:57:33 -04:00
MagnumOpus21 0b50b84429 Agent/Proxy: Formatting and test cases fix 2018-07-09 12:46:10 -04:00
Kyle Havlovitz 883b2a518a
Store the time CARoot is rotated out instead of when to prune 2018-07-06 16:05:25 -07:00
Kyle Havlovitz 3c520019e9
connect/ca: add logic for pruning old stale RootCA entries 2018-07-02 10:35:05 -07:00
Pierre Souchay 95a0ab9f99 Updated swith case to use same branch for async-cache and extend-cache 2018-07-02 17:39:34 +02:00
Pierre Souchay 6dfbbf1350 Updated documentation and adding more test case for async-cache 2018-07-01 23:50:30 +02:00
Pierre Souchay 382bec0897 Added async-cache with similar behaviour as extend-cache but asynchronously 2018-07-01 23:50:30 +02:00
Pierre Souchay da9c91fd3d Only send one single ACL cache refresh across network when TTL is over
It will allow the following:

 * when connectivity is limited (saturated linnks between DCs), only one
   single request to refresh an ACL will be sent to ACL master DC instead
   of statcking ACL refresh queries
 * when extend-cache is used for ACL, do not wait for result, but refresh
   the ACL asynchronously, so no delay is not impacting slave DC
 * When extend-cache is not used, keep the existing blocking mechanism,
   but only send a single refresh request.

This will fix https://github.com/hashicorp/consul/issues/3524
2018-07-01 23:50:30 +02:00
Matt Keeler 02719c52ff
Move starting enterprise functionality 2018-06-29 17:38:29 -04:00
Matt Keeler 66af873639 Move default uuid test into the consul package 2018-06-27 09:21:58 -04:00
Matt Keeler dbc407cec9 go fmt changes 2018-06-27 09:07:22 -04:00
Matt Keeler 95291ec5ed Make sure to generate UUIDs when services are registered without one
This makes the behavior line up with the docs and expected behavior
2018-06-26 17:04:08 -04:00
mkeeler 1da3c42867 Merge remote-tracking branch 'connect/f-connect' 2018-06-25 19:42:51 +00:00
Kyle Havlovitz 859eaea5c4 connect/ca: pull the cluster ID from config during a rotation 2018-06-25 12:25:42 -07:00
Kyle Havlovitz fcc5dc6110 connect/ca: leave blank root key/cert out of the default config (unnecessary) 2018-06-25 12:25:42 -07:00
Kyle Havlovitz f3089a6647 connect/ca: undo the interface changes and use sign-self-issued in Vault 2018-06-25 12:25:42 -07:00
Kyle Havlovitz cea94d0bcf connect/ca: update Consul provider to use new cross-sign CSR method 2018-06-25 12:25:41 -07:00
Kyle Havlovitz 7b0845ccde connect/ca: fix vault provider URI SANs and test 2018-06-25 12:25:41 -07:00
Kyle Havlovitz a98b85b25c connect/ca: add the Vault CA provider 2018-06-25 12:25:41 -07:00
Paul Banks b4fbeb0453 Note leadership issues in comments 2018-06-25 12:25:41 -07:00
Paul Banks 824a9b4943 Actually return Intermediate certificates bundled with a leaf! 2018-06-25 12:25:40 -07:00
Paul Banks 81bd1b43a3 Fix hot loop in cache for RPC returning zero index. 2018-06-25 12:25:37 -07:00
Paul Banks d6b13463ed Fix misc test failures (some from other PRs) 2018-06-25 12:25:13 -07:00
Paul Banks 1283373a64 Only set precedence on write path 2018-06-25 12:25:13 -07:00
Paul Banks 22b95283e9 Fix some tests failures caused by the sorting change and some cuased by previous UpdatePrecedence() change 2018-06-25 12:25:13 -07:00
Paul Banks e2938138f6 Sort intention list by precedence 2018-06-25 12:25:13 -07:00
Kyle Havlovitz a242e5b130 agent: update accepted CA config fields and defaults 2018-06-25 12:25:09 -07:00
Mitchell Hashimoto 4ebddd6adb agent/consul: set precedence value on struct itself 2018-06-25 12:24:16 -07:00
Mitchell Hashimoto 52c10d2208 agent/consul: support a Connect option on prepared query request 2018-06-25 12:24:12 -07:00
Mitchell Hashimoto e8c899b1b8 agent/consul: prepared query supports "Connect" field 2018-06-25 12:24:11 -07:00
Mitchell Hashimoto ad382d7351 agent: switch ConnectNative to an embedded struct 2018-06-25 12:24:10 -07:00
Mitchell Hashimoto a3e0ac1ee3 agent/consul/state: support querying by Connect native 2018-06-25 12:24:08 -07:00
Mitchell Hashimoto 8e02bbc897 agent/consul: support catalog registration with Connect native 2018-06-25 12:24:07 -07:00
Matt Keeler 6ccc4f39db
Merge pull request #4216 from hashicorp/rpc-limiting
Make RPC limits reloadable
2018-06-20 09:05:28 -04:00
Mitchell Hashimoto 787ce3b269
agent: address feedback 2018-06-14 09:42:20 -07:00
Mitchell Hashimoto b5b29cd6af
agent: rename test to check 2018-06-14 09:42:18 -07:00
Mitchell Hashimoto a48ff54318
agent/consul: forward request if necessary 2018-06-14 09:42:17 -07:00
Mitchell Hashimoto b02502be73
agent: comments to point to differing logic 2018-06-14 09:42:17 -07:00
Mitchell Hashimoto 526cfc34bd
agent/consul: implement Intention.Test endpoint 2018-06-14 09:42:17 -07:00
Paul Banks bd5eb8b749
Add default CA config back - I didn't add it and causes nil panics 2018-06-14 09:42:17 -07:00
Paul Banks dbcf286d4c
Ooops remove the CA stuff from actual server defaults and make it test server only 2018-06-14 09:42:16 -07:00
Paul Banks 834ed1d25f
Fixed many tests after rebase. Some still failing and seem unrelated to any connect changes. 2018-06-14 09:42:16 -07:00
Paul Banks 30d90b3be4
Generate CSR using real trust-domain 2018-06-14 09:42:16 -07:00
Paul Banks 5a1408f186
Add CSR signing verification of service ACL, trust domain and datacenter. 2018-06-14 09:42:16 -07:00
Paul Banks c808833a78
Return TrustDomain from CARoots RPC 2018-06-14 09:42:15 -07:00
Kyle Havlovitz d1265bc38b
Rename some of the CA structs/files 2018-06-14 09:42:15 -07:00
Kyle Havlovitz 1660f9ebab
Add more metadata to structs.CARoot 2018-06-14 09:42:15 -07:00
Kyle Havlovitz baf4db1c72
Use provider state table for a global serial index 2018-06-14 09:42:15 -07:00
Kyle Havlovitz c90b353eea
Move connect CA provider to separate package 2018-06-14 09:42:15 -07:00
Mitchell Hashimoto 54a1662da8
agent/consul: change provider wait from goto to a loop 2018-06-14 09:42:14 -07:00
Mitchell Hashimoto 749f81373f
agent/consul: check nil on getCAProvider result 2018-06-14 09:42:14 -07:00
Mitchell Hashimoto c57405b323
agent/consul: retry reading provider a few times 2018-06-14 09:42:14 -07:00
Paul Banks dcd277de8a
Wire up agent leaf endpoint to cache framework to support blocking. 2018-06-14 09:42:07 -07:00
Kyle Havlovitz b28e11fdd3
Fill out connect CA rpc endpoint tests 2018-06-14 09:42:06 -07:00
Kyle Havlovitz 7c0976208d
Add tests for the built in CA's state store table 2018-06-14 09:42:06 -07:00
Kyle Havlovitz 19b9399f2f
Add more tests for built-in provider 2018-06-14 09:42:06 -07:00
Kyle Havlovitz a29f3c6b96
Fix some inconsistencies around the CA provider code 2018-06-14 09:42:06 -07:00
Kyle Havlovitz 2167713226
Add CA config to connect section of agent config 2018-06-14 09:42:05 -07:00
Kyle Havlovitz 02fef5f9a2
Move ConsulCAProviderConfig into structs package 2018-06-14 09:42:04 -07:00
Kyle Havlovitz 887cc98d7e
Simplify the CAProvider.Sign method 2018-06-14 09:42:04 -07:00
Kyle Havlovitz 44b30476cb
Simplify the CA provider interface by moving some logic out 2018-06-14 09:42:04 -07:00
Kyle Havlovitz aa10fb2f48
Clarify some comments and names around CA bootstrapping 2018-06-14 09:42:04 -07:00
Kyle Havlovitz 43f13d5a0b
Add cross-signing mechanism to root rotation 2018-06-14 09:42:00 -07:00
Kyle Havlovitz bbfcb278e1
Add the root rotation mechanism to the CA config endpoint 2018-06-14 09:41:59 -07:00
Kyle Havlovitz a585a0ba10
Have the built in CA store its state in raft 2018-06-14 09:41:59 -07:00
Kyle Havlovitz 80eddb0bfb
Fix the testing endpoint's root set op 2018-06-14 09:41:59 -07:00
Kyle Havlovitz fc9ef9741b
Hook the CA RPC endpoint into the provider interface 2018-06-14 09:41:59 -07:00
Kyle Havlovitz a40db26ffe
Add CA bootstrapping on establishing leadership 2018-06-14 09:41:59 -07:00
Kyle Havlovitz e26819ed9c
Add the bootstrap config for the CA 2018-06-14 09:41:59 -07:00
Kyle Havlovitz ebdda17a30
Add CA config set to fsm operations 2018-06-14 09:41:58 -07:00
Kyle Havlovitz f7ff16669f
Add the Connect CA config to the state store 2018-06-14 09:41:58 -07:00
Paul Banks 9d11cd9bf4
Fix various test failures and vet warnings.
Intention de-duplication in previously merged PR actualy failed some tests that were not caught be me or CI. I ran the test files for state changes but they happened not to trigger this case so I made sure they did first and then fixed. That fixed some upstream intention endpoint tests that I'd not run as part of testing the previous fix.
2018-06-14 09:41:58 -07:00
Paul Banks 280382c25f
Add tests all the way up through the endpoints to ensure duplicate src/destination is supported and so ultimately deny/allow nesting works.
Also adds a sanity check test for `api.Agent().ConnectAuthorize()` and a fix for a trivial bug in it.
2018-06-14 09:41:57 -07:00
Paul Banks adc5589329
Allow duplicate source or destination, but enforce uniqueness across all four. 2018-06-14 09:41:57 -07:00
Mitchell Hashimoto 1985655dff
agent/consul/state: ensure exactly one active CA exists when setting 2018-06-14 09:41:54 -07:00