open-consul

Author	SHA1	Message	Date
Kyle Havlovitz	6f40708aca	fsm: add Intention operations to transactions for internal use	2018-10-19 10:02:28 -07:00
Matt Keeler	df507a4a55	A few misc fixes found by go vet	2018-10-19 12:28:36 -04:00
Matt Keeler	99e0a124cb	New ACLs (#4791 ) This PR is almost a complete rewrite of the ACL system within Consul. It brings the features more in line with other HashiCorp products. Obviously there is quite a bit left to do here but most of it is related docs, testing and finishing the last few commands in the CLI. I will update the PR description and check off the todos as I finish them over the next few days/week. Description At a high level this PR is mainly to split ACL tokens from Policies and to split the concepts of Authorization from Identities. A lot of this PR is mostly just to support CRUD operations on ACLTokens and ACLPolicies. These in and of themselves are not particularly interesting. The bigger conceptual changes are in how tokens get resolved, how backwards compatibility is handled and the separation of policy from identity which could lead the way to allowing for alternative identity providers. On the surface and with a new cluster the ACL system will look very similar to that of Nomad. Both have tokens and policies. Both have local tokens. The ACL management APIs for both are very similar. I even ripped off Nomad's ACL bootstrap resetting procedure. There are a few key differences though. Nomad requires token and policy replication where Consul only requires policy replication with token replication being opt-in. In Consul local tokens only work with token replication being enabled though. All policies in Nomad are globally applicable. In Consul all policies are stored and replicated globally but can be scoped to a subset of the datacenters. This allows for more granular access management. Unlike Nomad, Consul has legacy baggage in the form of the original ACL system. The ramifications of this are: A server running the new system must still support other clients using the legacy system. A client running the new system must be able to use the legacy RPCs when the servers in its datacenter are running the legacy system. The primary ACL DC's servers running in legacy mode need to be a gate that keeps everything else in the entire multi-DC cluster running in legacy mode. So not only does this PR implement the new ACL system but has a legacy mode built in for when the cluster isn't ready for new ACLs. Also detecting that new ACLs can be used is automatic and requires no configuration on the part of administrators. This process is detailed more in the "Transitioning from Legacy to New ACL Mode" section below.	2018-10-19 12:04:07 -04:00
Pierre Souchay	a72f92cac6	dns: implements prefix lookups for DNS TTL (#4605 ) This will fix https://github.com/hashicorp/consul/issues/4509 and allow, for instance, lb-* to match services lb-001 or lb-service-007.	2018-10-19 08:41:04 -07:00
Kyle Havlovitz	96a35f8abc	re-add Connect multi-dc config changes This reverts commit 8bcfbaffb6588b024cd1a3cf0952e6bfa7d9e900.	2018-10-19 08:41:03 -07:00
Jack Pearkes	847a0a5266	Revert "Connect multi-dc config" (#4784 )	2018-10-11 17:32:45 +01:00
Aestek	260a9880ae	[Security] Add finer control over script checks (#4715 ) * Add -enable-local-script-checks options These options allow for a finer control over when script checks are enabled by giving the option to only allow them when they are declared from the local file system. * Add documentation for the new option * Nitpick doc wording	2018-10-11 13:22:11 +01:00
Rebecca Zanzig	0ec6d880f5	Support multiple tags for health and catalog http api endpoints (#4717 ) * Support multiple tags for health and catalog api endpoints Fixes #1781. Adds a `ServiceTags` field to the ServiceSpecificRequest to support multiple tags, updates the filter logic in the catalog store, and propagates these change through to the health and catalog endpoints. Note: Leaves `ServiceTag` in the struct, since it is being used as part of the DNS lookup, which in turn uses the health check. * Update the api package to support multiple tags Includes additional tests. * Update new tests to use the `require` library * Update HealthConnect check after a bad merge	2018-10-11 12:50:05 +01:00
Pierre Souchay	b0fc91a1d2	[Performance On Large clusters] Reduce updates on large services (#4720 ) * [Performance On Large clusters] Checks do update services/nodes only when really modified to avoid too many updates on very large clusters In a large cluster, when having a few thousands of nodes, the anti-entropy mechanism performs lots of changes (several per second) while there is no real change. This patch wants to improve this in order to increase Consul scalability when using many blocking requests on health for instance. * [Performance for large clusters] Only updates index of service if service is really modified * [Performance for large clusters] Only updates index of nodes if node is really modified * Added comments / ensure IsSame() has clear semantics * Avoid having modified boolean, return nil directly if structures are Same * Fixed unstable unit tests TestLeader_ChangeServerID * Rewrite TestNode_IsSame() for better readability as suggested by @banks * Rename ServiceNode.IsSame() into IsSameService() + added unit tests * Do not duplicate TestStructs_ServiceNode_Conversions() and increase test coverage of IsSameService * Clearer documentation in IsSameService * Take into account ServiceProxy into ServiceNode.IsSameService() * Fixed IsSameService() with all new structures	2018-10-11 12:42:39 +01:00
Pierre Souchay	42f250fa53	Added SOA configuration for DNS settings. (#4714 ) This allows fine-tuning the SOA settings sent by Consul in DNS responses, for instance to control the negative TTL. Will fix: https://github.com/hashicorp/consul/issues/4713 # Example Override all settings: * min_ttl: 0 => 60s * retry: 600 (10m) => 300s (5 minutes), * expire: 86400 (24h) => 43200 (12h) * refresh: 3600 (1h) => 1800 (30 minutes) ``` consul agent -dev -hcl 'dns_config={soa={min_ttl=60,retry=300,expire=43200,refresh=1800}}' ``` Result: ``` dig +multiline @localhost -p 8600 service.consul ; <<>> DiG 9.12.1 <<>> +multiline @localhost -p 8600 service.consul ; (2 servers found) ;; global options: +cmd ;; Got answer: ;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 36557 ;; flags: qr aa rd; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 1 ;; WARNING: recursion requested but not available ;; OPT PSEUDOSECTION: ; EDNS: version: 0, flags:; udp: 4096 ;; QUESTION SECTION: ;service.consul. IN A ;; AUTHORITY SECTION: consul. 0 IN SOA ns.consul. hostmaster.consul. ( 1537959133 ; serial 1800 ; refresh (30 minutes) 300 ; retry (5 minutes) 43200 ; expire (12 hours) 60 ; minimum (1 minute) ) ;; Query time: 4 msec ;; SERVER: 127.0.0.1#8600(127.0.0.1) ;; WHEN: Wed Sep 26 12:52:13 CEST 2018 ;; MSG SIZE rcvd: 93 ```	2018-10-10 15:50:56 -04:00
Kyle Havlovitz	0cbd176a48	connect/ca: more OSS split for multi-dc	2018-10-10 12:17:59 -07:00
Kyle Havlovitz	6d5160c139	connect/ca: split CA initialization logic between oss/enterprise	2018-10-10 12:17:59 -07:00
Kyle Havlovitz	5b98a602af	agent: add primary_datacenter and connect replication config options	2018-10-10 12:17:59 -07:00
Kyle Havlovitz	304595f7a6	connect: add ExternalTrustDomain to CARoot fields	2018-10-10 12:16:47 -07:00
Kyle Havlovitz	475afd0300	docs: deprecate acl_datacenter and replace it with primary_datacenter	2018-10-10 12:16:47 -07:00
Paul Banks	92fe8c8e89	Add Proxy Upstreams to Service Definition (#4639 ) * Refactor Service Definition ProxyDestination. This includes: - Refactoring all internal structs used - Updated tests for both deprecated and new input for: - Agent Services endpoint response - Agent Service endpoint response - Agent Register endpoint - Unmanaged deprecated field - Unmanaged new fields - Managed deprecated upstreams - Managed new - Catalog Register - Unmanaged deprecated field - Unmanaged new fields - Managed deprecated upstreams - Managed new - Catalog Services endpoint response - Catalog Node endpoint response - Catalog Service endpoint response - Updated API tests for all of the above too (both deprecated and new forms of register) TODO: - config package changes for on-disk service definitions - proxy config endpoint - built-in proxy support for new fields * Agent proxy config endpoint updated with upstreams * Config file changes for upstreams. * Add upstream opaque config and update all tests to ensure it works everywhere. * Built in proxy working with new Upstreams config * Command fixes and deprecations * Fix key translation, upstream type defaults and a spate of other subtle bugs found with end to end test scripts... TODO: tests still failing on one case that needs a fix. I think it's key translation for upstreams nested in Managed proxy struct. * Fix translated keys in API registration. * Fixes from docs - omit some empty undocumented fields in API - Bring back ServiceProxyDestination in Catalog responses to not break backwards compat - this was removed assuming it was only used internally. * Documentation updates for Upstreams in service definition * Fixes for tests broken by many refactors. * Enable travis on f-connect branch in this branch too. * Add consistent Deprecation comments to ProxyDestination uses * Update version number on deprecation notices, and correct upstream datacenter field with explanation in docs	2018-10-10 16:55:34 +01:00
Alex Dadgar	90ed72fd70	do not bootstrap with non voters	2018-09-19 17:41:36 -07:00
Kyle Havlovitz	9b8f8975c6	Merge pull request #4644 from hashicorp/ca-refactor connect/ca: rework initialization/root generation in providers	2018-09-13 13:08:34 -07:00
Paul Banks	09e4c2995b	Fix CA pruning when CA config uses string durations. (#4669 ) * Fix CA pruning when CA config uses string durations. The tl;dr here is: - Configuring LeafCertTTL with a string like "72h" is how we do it by default and should be supported - Most of our tests managed to escape this by defining them as time.Duration directly - Our actual default value is a string - Since this is stored in a map[string]interface{} config, when it is written to Raft it goes through a msgpack encode/decode cycle (even though it's written from server not over RPC). - msgpack decode leaves the string as a `[]uint8` - Some of our parsers required string and failed - So after 1 hour, a default configured server would throw an error about pruning old CAs - If a new CA was configured that set LeafCertTTL as a time.Duration, things might be OK after that, but if a new CA was just configured from config file, initialization would cause same issue but always fail still so would never prune the old CA. - Mostly this is just a janky error that got past tests due to many levels of complicated encoding/decoding. tl;dr of the tl;dr: Yay for type safety. Map[string]interface{} combined with msgpack always goes wrong but we somehow get bitten every time in a new way :D We already fixed this once! The main CA config had the same problem so @kyhavlov already wrote the mapstructure DecodeHook that fixes it. It wasn't used in several places it needed to be and one of those is now in `structs` which caused a dependency cycle so I've moved them. This adds a whole new test that explicitly tests the case that broke here. It also adds tests that would have failed in other places before (Consul and Vault provider parsing functions). I'm not sure if they would ever be affected as it is now as we've not seen things broken with them but it seems better to explicitly test that and support it to not be bitten a third time! * Typo fix * Fix bad Uint8 usage	2018-09-13 15:43:00 +01:00
Pierre Souchay	5ecf9823d2	Fix more unstable tests in agent and command	2018-09-12 14:49:27 +01:00
Kyle Havlovitz	8fc2c77fdf	connect/ca: some cleanup and reorganizing of the new methods	2018-09-11 16:43:04 -07:00
Pierre Souchay	7a42c31330	Fix unstable tests in agent, api, and command/watch	2018-09-10 16:58:53 +01:00
Pierre Souchay	473e589d86	Implementation of Weights Data structures (#4468 ) * Implementation of Weights Data structures Adding this datastructure will allow us to resolve the issues #1088 and #4198 This new structure defaults to values: ``` { Passing: 1, Warning: 0 } ``` Which means, use weight of 0 for a Service in Warning State while use Weight 1 for a Healthy Service. Thus it remains compatible with previous Consul versions. * Implemented weights for DNS SRV Records * DNS properly support agents with weight support while server does not (backwards compatibility) * Use Warning value of Weights of 1 by default When using DNS interface with only_passing = false, all nodes with non-Critical healthcheck used to have a weight value of 1. While having weight.Warning = 0 as default value, this is probably a bad idea as it breaks ascending compatibility. Thus, we put a default value of 1 to be consistent with existing behaviour. * Added documentation for new weight field in service description * Better documentation about weights as suggested by @banks * Return weight = 1 for unknown Check states as suggested by @banks * Fixed typo (of -> or) in error message as requested by @mkeeler * Fixed unstable unit test TestRetryJoin * Fixed unstable tests * Fixed wrong Fatalf format in `testrpc/wait.go` * Added notes regarding DNS SRV lookup limitations regarding number of instances * Documentation fixes and clarification regarding SRV records with weights as requested by @banks * Rephrase docs	2018-09-07 15:30:47 +01:00
Kyle Havlovitz	e184a18e4b	connect/ca: add Configure/GenerateRoot to provider interface	2018-09-06 19:18:59 -07:00
Pierre Souchay	54d8157ee1	Fixed more flaky tests in ./agent/consul (#4617 )	2018-09-04 14:02:47 +01:00
Freddy	10d3048bd6	Bugfix: Use "%#v" when formatting structs (#4600 )	2018-08-28 12:37:34 -04:00
Pierre Souchay	9b5cf0c1d0	[BUGFIX] Avoid returning empty data on startup of a non-leader server (#4554 ) Ensure that DB is properly initialized when performing stale queries Addresses: - https://github.com/hashicorp/consul-replicate/issues/82 - https://github.com/hashicorp/consul/issues/3975 - https://github.com/hashicorp/consul-template/issues/1131	2018-08-23 12:06:39 -04:00
Kyle Havlovitz	26a21df014	Merge branch 'master' into ca-snapshot-fix	2018-08-16 13:00:54 -07:00
Kyle Havlovitz	af4b037c52	fsm: add connect service config to snapshot/restore test	2018-08-16 12:58:54 -07:00
nickmy9729	43a68822e3	Added code to allow snapshot inclusion of NodeMeta (#4527 )	2018-08-16 15:33:35 -04:00
Kyle Havlovitz	880eccb502	fsm: add missing CA config to snapshot/restore logic	2018-08-16 11:58:50 -07:00
Kyle Havlovitz	fd83063686	autopilot: don't follow the normal server removal rules for nonvoters	2018-08-14 14:24:51 -07:00
Kyle Havlovitz	aa19559cc7	Fix stats fetcher healthcheck RPCs not being independent	2018-08-14 14:23:52 -07:00
Pierre Souchay	a16f34058b	Display more information about check being not properly added when it fails (#4405 ) * Display more information about check being not properly added when it fails It follows an incident where we add lots of error messages: [WARN] consul.fsm: EnsureRegistration failed: failed inserting check: Missing service registration That seems related to Consul failing to restart on respective agents. Having Node information as well as service information would help diagnose the issue. * Renamed ensureCheckIfNodeMatches() as requested by @banks	2018-08-14 17:45:33 +01:00
Pierre Souchay	821a91ca31	Allow to rename nodes with IDs, will fix #3974 and #4413 (#4415 ) * Allow to rename nodes with IDs, will fix #3974 and #4413 This change allows any well-behaved recent agent with an ID to be renamed safely, i.e. without taking the name of another one with case insensitive comparison. Deprecated behaviour warning ---------------------------- For backwards compatibility, it is still possible however to "take" the name of another node by not providing any ID. Note that when not providing any ID, it is possible to have 2 nodes having similar names with case differences, i.e. myNode and mynode which might lead to DB corruption on Consul server side and lead to server not properly restarting. See #3983 and #4399 for Context about this change. Disabling registration of nodes without IDs as specified in #4414 should probably be the way to go eventually. * Removed the case-insensitive search when adding a node within the else block since it breaks the test TestAgentAntiEntropy_Services While the else case is probably legit, it will be fixed with #4414 in a later release. * Added again the test in the else to avoid duplicated names, but enforce this test only for nodes having IDs. Thus most tests without any ID will work, and allows us fixing * Added more tests regarding request with/without IDs. `TestStateStore_EnsureNode` now tests registration and renaming with IDs `TestStateStore_EnsureNodeDeprecated` tests registration without IDs and tests removing an ID from a node as well as updating a node without its ID (deprecated behaviour kept for backwards compatibility) * Do not allow renaming in case of conflict, including when other node has no ID * Fixed function GetNodeID that was not working due to wrong type when searching node from its ID Thus, all tests about renaming were not working properly. Added the full test case that allowed me to detect it. * Better error messages, more tests when nodeID is not a valid UUID in GetNodeID() * Added separate TestStateStore_GetNodeID to test GetNodeID. More complete test coverage for GetNodeID * Added new unit test `TestStateStore_ensureNoNodeWithSimilarNameTxn` Also fixed comments to be clearer after remarks from @banks * Fixed error message in unit test to match test case * Use uuid.ParseUUID to parse Node.ID as requested by @mkeeler	2018-08-10 11:30:45 -04:00
Siva Prasad	d98d02777f	PR to fix TestAgent_IndexChurn and TestPreparedQuery_Wrapper. (#4512 ) * Fixes TestAgent_IndexChurn * Fixes TestPreparedQuery_Wrapper * Increased sleep in agent_test for IndexChurn to 500ms * Made the comment about joinWAN operation much less of a cliffhanger	2018-08-09 12:40:07 -04:00
Armon Dadgar	a343392f63	consul: Update buffer sizes	2018-08-08 10:26:58 -07:00
Siva Prasad	cfa436dc16	Revert "CA initialization while boostrapping and TestLeader_ChangeServerID fix." (#4497 ) * Revert "BUGFIX: Unit test relying on WaitForLeader() did not work due to wrong test (#4472)" This reverts commit cec5d7239621e0732b3f70158addb1899442acb3. * Revert "CA initialization while boostrapping and TestLeader_ChangeServerID fix. (#4493)" This reverts commit 589b589b53e56af38de25db9b56967bdf1f2c069.	2018-08-07 08:29:48 -04:00
Pierre Souchay	fd927ea110	BUGFIX: Unit test relying on WaitForLeader() did not work due to wrong test (#4472 ) - Improve resilience of testrpc.WaitForLeader() - Add additional retry to CI - Increase "go test" timeout to 8m - Add wait for cluster leader to several tests in the agent package - Add retry to some tests in the api and command packages	2018-08-06 19:46:09 -04:00
Siva Prasad	29c181f5fa	CA initialization while boostrapping and TestLeader_ChangeServerID fix. (#4493 ) * connect: fix an issue with Consul CA bootstrapping being interrupted * streamline change server id test	2018-08-06 16:15:24 -04:00
Kyle Havlovitz	42ab07b398	fix inconsistency in TestConnectCAConfig_GetSet	2018-07-26 07:46:47 -07:00
Kyle Havlovitz	ecc02c6aee	Merge pull request #4400 from hashicorp/leaf-cert-ttl Add configurable leaf cert TTL to Connect CA	2018-07-25 17:53:25 -07:00
Paul Banks	217137b775	Fixes #4421 : General solution to stop blocking queries with index 0 (#4437 ) * Fix theoretical cache collision bug if/when we use more cache types with same result type * Generalized fix for blocking query handling when state store methods return zero index * Refactor test retry to only affect CI * Undo make file merge * Add hint to error message returned to end-user requests if Connect is not enabled when they try to request cert * Explicit error for Roots endpoint if connect is disabled * Fix tests that were asserting old behaviour	2018-07-25 20:26:27 +01:00
Kyle Havlovitz	a125735d76	connect/ca: check LeafCertTTL when rotating expired roots	2018-07-20 16:04:04 -07:00
Kyle Havlovitz	45ec8849f3	connect/ca: add configurable leaf cert TTL	2018-07-16 13:33:37 -07:00
Matt Keeler	cc46d59269	Merge pull request #4379 from hashicorp/persist-intermediates connect: persist intermediate CAs on leader change	2018-07-12 12:09:13 -04:00
Matt Keeler	965fc9cf62	Revert "Allow changing Node names since Node now have IDs"	2018-07-12 11:19:21 -04:00
Matt Keeler	d8a4d9137b	Fixup formatting	2018-07-12 10:14:26 -04:00
Matt Keeler	d63c5807cf	Revert PR 4294 - Catalog Register: Generate UUID for services registered without one UUID auto-generation here causes trouble in a few cases. The biggest being older nodes reregistering will fail when the UUIDs are different and the names match This reverts commit 0f700340828f464449c2e0d5a82db0bc5456d385. This reverts commit d1a8f9cb3f6f48dd9c8d0bc858031ff6ccff51d0. This reverts commit cf69ec42a418ab6594a6654e9545e12160f30970.	2018-07-12 10:06:50 -04:00
Kyle Havlovitz	2a40f93ac8	connect: use reflect.DeepEqual instead for test	2018-07-11 13:10:58 -07:00
Matt Keeler	42729d5aff	Merge pull request #3983 from pierresouchay/node_renaming Allow changing Node names since Node now have IDs	2018-07-11 16:03:02 -04:00
Kyle Havlovitz	f9a35a9338	connect: add provider state to snapshots	2018-07-11 11:34:49 -07:00
Kyle Havlovitz	9c21cc7ac9	connect: update leader initializeCA comment	2018-07-11 10:00:42 -07:00
Kyle Havlovitz	db254f0991	connect: persist intermediate CAs on leader change	2018-07-11 09:44:30 -07:00
Pierre Souchay	3d0a960470	When renaming a node, ensure the name is not taken by another node. Since DNS is case insensitive and the DB has issues when similar names with different cases are added, check for uniqueness based on case insensitivity. Following another big incident we had in our cluster, we also validate that adding/renaming a node does not conflict with case insensitive matches. We had the following error once: - one node called: mymachine.MYDC.mydomain was shut off - another node (different ID) was added with name: mymachine.mydc.mydomain before 72 hours When restarting the consul server of domain, the restarted consul server failed to start since it detected an issue in RAFT database because mymachine.MYDC.mydomain and mymachine.mydc.mydomain had the same names. Checking at registration time with case insensitivity should definitely fix those issues and avoid Consul DB corruption.	2018-07-11 14:42:54 +02:00
Matt Keeler	22c5951ec4	Merge pull request #4303 from pierresouchay/non_blocking_acl Only send one single ACL cache refresh across network when TTL is over	2018-07-10 08:57:33 -04:00
MagnumOpus21	0b50b84429	Agent/Proxy: Formatting and test cases fix	2018-07-09 12:46:10 -04:00
Kyle Havlovitz	883b2a518a	Store the time CARoot is rotated out instead of when to prune	2018-07-06 16:05:25 -07:00
Kyle Havlovitz	3c520019e9	connect/ca: add logic for pruning old stale RootCA entries	2018-07-02 10:35:05 -07:00
Pierre Souchay	95a0ab9f99	Updated switch case to use same branch for async-cache and extend-cache	2018-07-02 17:39:34 +02:00
Pierre Souchay	6dfbbf1350	Updated documentation and adding more test case for async-cache	2018-07-01 23:50:30 +02:00
Pierre Souchay	382bec0897	Added async-cache with similar behaviour as extend-cache but asynchronously	2018-07-01 23:50:30 +02:00
Pierre Souchay	da9c91fd3d	Only send one single ACL cache refresh across network when TTL is over It will allow the following: * when connectivity is limited (saturated links between DCs), only one single request to refresh an ACL will be sent to ACL master DC instead of stacking ACL refresh queries * when extend-cache is used for ACL, do not wait for result, but refresh the ACL asynchronously, so no delay impacts the slave DC * When extend-cache is not used, keep the existing blocking mechanism, but only send a single refresh request. This will fix https://github.com/hashicorp/consul/issues/3524	2018-07-01 23:50:30 +02:00
Matt Keeler	02719c52ff	Move starting enterprise functionality	2018-06-29 17:38:29 -04:00
Matt Keeler	66af873639	Move default uuid test into the consul package	2018-06-27 09:21:58 -04:00
Matt Keeler	dbc407cec9	go fmt changes	2018-06-27 09:07:22 -04:00
Matt Keeler	95291ec5ed	Make sure to generate UUIDs when services are registered without one This makes the behavior line up with the docs and expected behavior	2018-06-26 17:04:08 -04:00
mkeeler	1da3c42867	Merge remote-tracking branch 'connect/f-connect'	2018-06-25 19:42:51 +00:00
Kyle Havlovitz	859eaea5c4	connect/ca: pull the cluster ID from config during a rotation	2018-06-25 12:25:42 -07:00
Kyle Havlovitz	fcc5dc6110	connect/ca: leave blank root key/cert out of the default config (unnecessary)	2018-06-25 12:25:42 -07:00
Kyle Havlovitz	f3089a6647	connect/ca: undo the interface changes and use sign-self-issued in Vault	2018-06-25 12:25:42 -07:00
Kyle Havlovitz	cea94d0bcf	connect/ca: update Consul provider to use new cross-sign CSR method	2018-06-25 12:25:41 -07:00
Kyle Havlovitz	7b0845ccde	connect/ca: fix vault provider URI SANs and test	2018-06-25 12:25:41 -07:00
Kyle Havlovitz	a98b85b25c	connect/ca: add the Vault CA provider	2018-06-25 12:25:41 -07:00
Paul Banks	b4fbeb0453	Note leadership issues in comments	2018-06-25 12:25:41 -07:00
Paul Banks	824a9b4943	Actually return Intermediate certificates bundled with a leaf!	2018-06-25 12:25:40 -07:00
Paul Banks	81bd1b43a3	Fix hot loop in cache for RPC returning zero index.	2018-06-25 12:25:37 -07:00
Paul Banks	d6b13463ed	Fix misc test failures (some from other PRs)	2018-06-25 12:25:13 -07:00
Paul Banks	1283373a64	Only set precedence on write path	2018-06-25 12:25:13 -07:00
Paul Banks	22b95283e9	Fix some tests failures caused by the sorting change and some caused by previous UpdatePrecedence() change	2018-06-25 12:25:13 -07:00
Paul Banks	e2938138f6	Sort intention list by precedence	2018-06-25 12:25:13 -07:00
Kyle Havlovitz	a242e5b130	agent: update accepted CA config fields and defaults	2018-06-25 12:25:09 -07:00
Mitchell Hashimoto	4ebddd6adb	agent/consul: set precedence value on struct itself	2018-06-25 12:24:16 -07:00
Mitchell Hashimoto	52c10d2208	agent/consul: support a Connect option on prepared query request	2018-06-25 12:24:12 -07:00
Mitchell Hashimoto	e8c899b1b8	agent/consul: prepared query supports "Connect" field	2018-06-25 12:24:11 -07:00
Mitchell Hashimoto	ad382d7351	agent: switch ConnectNative to an embedded struct	2018-06-25 12:24:10 -07:00
Mitchell Hashimoto	a3e0ac1ee3	agent/consul/state: support querying by Connect native	2018-06-25 12:24:08 -07:00
Mitchell Hashimoto	8e02bbc897	agent/consul: support catalog registration with Connect native	2018-06-25 12:24:07 -07:00
Matt Keeler	6ccc4f39db	Merge pull request #4216 from hashicorp/rpc-limiting Make RPC limits reloadable	2018-06-20 09:05:28 -04:00
Mitchell Hashimoto	787ce3b269	agent: address feedback	2018-06-14 09:42:20 -07:00
Mitchell Hashimoto	b5b29cd6af	agent: rename test to check	2018-06-14 09:42:18 -07:00
Mitchell Hashimoto	a48ff54318	agent/consul: forward request if necessary	2018-06-14 09:42:17 -07:00
Mitchell Hashimoto	b02502be73	agent: comments to point to differing logic	2018-06-14 09:42:17 -07:00
Mitchell Hashimoto	526cfc34bd	agent/consul: implement Intention.Test endpoint	2018-06-14 09:42:17 -07:00
Paul Banks	bd5eb8b749	Add default CA config back - I didn't add it and causes nil panics	2018-06-14 09:42:17 -07:00
Paul Banks	dbcf286d4c	Ooops remove the CA stuff from actual server defaults and make it test server only	2018-06-14 09:42:16 -07:00
Paul Banks	834ed1d25f	Fixed many tests after rebase. Some still failing and seem unrelated to any connect changes.	2018-06-14 09:42:16 -07:00
Paul Banks	30d90b3be4	Generate CSR using real trust-domain	2018-06-14 09:42:16 -07:00
Paul Banks	5a1408f186	Add CSR signing verification of service ACL, trust domain and datacenter.	2018-06-14 09:42:16 -07:00
Paul Banks	c808833a78	Return TrustDomain from CARoots RPC	2018-06-14 09:42:15 -07:00
Kyle Havlovitz	d1265bc38b	Rename some of the CA structs/files	2018-06-14 09:42:15 -07:00
Kyle Havlovitz	1660f9ebab	Add more metadata to structs.CARoot	2018-06-14 09:42:15 -07:00
Kyle Havlovitz	baf4db1c72	Use provider state table for a global serial index	2018-06-14 09:42:15 -07:00
Kyle Havlovitz	c90b353eea	Move connect CA provider to separate package	2018-06-14 09:42:15 -07:00
Mitchell Hashimoto	54a1662da8	agent/consul: change provider wait from goto to a loop	2018-06-14 09:42:14 -07:00
Mitchell Hashimoto	749f81373f	agent/consul: check nil on getCAProvider result	2018-06-14 09:42:14 -07:00
Mitchell Hashimoto	c57405b323	agent/consul: retry reading provider a few times	2018-06-14 09:42:14 -07:00
Paul Banks	dcd277de8a	Wire up agent leaf endpoint to cache framework to support blocking.	2018-06-14 09:42:07 -07:00
Kyle Havlovitz	b28e11fdd3	Fill out connect CA rpc endpoint tests	2018-06-14 09:42:06 -07:00
Kyle Havlovitz	7c0976208d	Add tests for the built in CA's state store table	2018-06-14 09:42:06 -07:00
Kyle Havlovitz	19b9399f2f	Add more tests for built-in provider	2018-06-14 09:42:06 -07:00
Kyle Havlovitz	a29f3c6b96	Fix some inconsistencies around the CA provider code	2018-06-14 09:42:06 -07:00
Kyle Havlovitz	2167713226	Add CA config to connect section of agent config	2018-06-14 09:42:05 -07:00
Kyle Havlovitz	02fef5f9a2	Move ConsulCAProviderConfig into structs package	2018-06-14 09:42:04 -07:00
Kyle Havlovitz	887cc98d7e	Simplify the CAProvider.Sign method	2018-06-14 09:42:04 -07:00
Kyle Havlovitz	44b30476cb	Simplify the CA provider interface by moving some logic out	2018-06-14 09:42:04 -07:00
Kyle Havlovitz	aa10fb2f48	Clarify some comments and names around CA bootstrapping	2018-06-14 09:42:04 -07:00
Kyle Havlovitz	43f13d5a0b	Add cross-signing mechanism to root rotation	2018-06-14 09:42:00 -07:00
Kyle Havlovitz	bbfcb278e1	Add the root rotation mechanism to the CA config endpoint	2018-06-14 09:41:59 -07:00
Kyle Havlovitz	a585a0ba10	Have the built in CA store its state in raft	2018-06-14 09:41:59 -07:00
Kyle Havlovitz	80eddb0bfb	Fix the testing endpoint's root set op	2018-06-14 09:41:59 -07:00
Kyle Havlovitz	fc9ef9741b	Hook the CA RPC endpoint into the provider interface	2018-06-14 09:41:59 -07:00
Kyle Havlovitz	a40db26ffe	Add CA bootstrapping on establishing leadership	2018-06-14 09:41:59 -07:00
Kyle Havlovitz	e26819ed9c	Add the bootstrap config for the CA	2018-06-14 09:41:59 -07:00
Kyle Havlovitz	ebdda17a30	Add CA config set to fsm operations	2018-06-14 09:41:58 -07:00
Kyle Havlovitz	f7ff16669f	Add the Connect CA config to the state store	2018-06-14 09:41:58 -07:00
Paul Banks	9d11cd9bf4	Fix various test failures and vet warnings. Intention de-duplication in previously merged PR actually failed some tests that were not caught by me or CI. I ran the test files for state changes but they happened not to trigger this case so I made sure they did first and then fixed. That fixed some upstream intention endpoint tests that I'd not run as part of testing the previous fix.	2018-06-14 09:41:58 -07:00
Paul Banks	280382c25f	Add tests all the way up through the endpoints to ensure duplicate src/destination is supported and so ultimately deny/allow nesting works. Also adds a sanity check test for `api.Agent().ConnectAuthorize()` and a fix for a trivial bug in it.	2018-06-14 09:41:57 -07:00
Paul Banks	adc5589329	Allow duplicate source or destination, but enforce uniqueness across all four.	2018-06-14 09:41:57 -07:00
Mitchell Hashimoto	1985655dff	agent/consul/state: ensure exactly one active CA exists when setting	2018-06-14 09:41:54 -07:00
Mitchell Hashimoto	da1bc48372	agent/connect: rename SpiffeID to CertURI	2018-06-14 09:41:53 -07:00
Mitchell Hashimoto	b0315811b9	agent/connect: use proper keyusage fields for CA and leaf	2018-06-14 09:41:53 -07:00
Mitchell Hashimoto	2026cf3753	agent/consul: encode issued cert serial number as hex encoded	2018-06-14 09:41:53 -07:00
Mitchell Hashimoto	746f80639a	agent: /v1/connect/ca/configuration PUT for setting configuration	2018-06-14 09:41:52 -07:00
Mitchell Hashimoto	2dfca5dbc2	agent/consul/fsm,state: snapshot/restore for CA roots	2018-06-14 09:41:52 -07:00
Mitchell Hashimoto	17d6b437d2	agent/consul/fsm,state: tests for CA root related changes	2018-06-14 09:41:52 -07:00
Mitchell Hashimoto	a8510f8224	agent/consul: set more fields on the issued cert	2018-06-14 09:41:52 -07:00
Mitchell Hashimoto	58b6f476e8	agent: /v1/connect/ca/leaf/:service_id	2018-06-14 09:41:52 -07:00
Mitchell Hashimoto	80a058a573	agent/consul: CAS operations for setting the CA root	2018-06-14 09:41:51 -07:00
Mitchell Hashimoto	712888258b	agent/consul: tests for CA endpoints	2018-06-14 09:41:51 -07:00
Mitchell Hashimoto	1928c07d0c	agent/consul: key the public key of the CSR, verify in test	2018-06-14 09:41:51 -07:00
Mitchell Hashimoto	9a8653f45e	agent/consul: test for ConnectCA.Sign	2018-06-14 09:41:51 -07:00
Mitchell Hashimoto	a360c5cca4	agent/consul: basic sign endpoint not tested yet	2018-06-14 09:41:51 -07:00
Mitchell Hashimoto	24830f4cfa	agent/consul: RPC endpoints to list roots	2018-06-14 09:41:50 -07:00
Mitchell Hashimoto	cfb62677c0	agent/consul/state: CARoot structs and initial state store	2018-06-14 09:41:49 -07:00
Mitchell Hashimoto	7e8d606717	agent: address PR feedback	2018-06-14 09:41:49 -07:00
Mitchell Hashimoto	f9a55aa7e0	agent: clarified a number of comments per PR feedback	2018-06-14 09:41:49 -07:00
Mitchell Hashimoto	62cbb892e3	agent/consul: Health.ServiceNodes ACL check for Connect	2018-06-14 09:41:49 -07:00
Mitchell Hashimoto	641c982480	agent/consul: Catalog endpoint ACL requirements for Connect proxies	2018-06-14 09:41:49 -07:00
Mitchell Hashimoto	566c98b2fc	agent/consul: require name for proxies	2018-06-14 09:41:48 -07:00
Mitchell Hashimoto	daaa6e2403	agent: clean up connect/non-connect duplication by using shared methods	2018-06-14 09:41:48 -07:00
Mitchell Hashimoto	119ffe3ed9	agent/consul: implement Health.ServiceNodes for Connect, DNS works	2018-06-14 09:41:47 -07:00
Mitchell Hashimoto	253256352c	agent/consul: Catalog.ServiceNodes supports Connect filtering	2018-06-14 09:41:47 -07:00
Mitchell Hashimoto	06957f6d7f	agent/consul/state: ConnectServiceNodes	2018-06-14 09:41:47 -07:00
Mitchell Hashimoto	200100d3f4	agent/consul: enforce ACL on ProxyDestination	2018-06-14 09:41:47 -07:00
Mitchell Hashimoto	8a72826483	agent/consul: proxy registration and tests	2018-06-14 09:41:46 -07:00
Mitchell Hashimoto	8777ff139c	agent: test /v1/catalog/node/:node to list connect proxies	2018-06-14 09:41:46 -07:00
Mitchell Hashimoto	761b561946	agent: /v1/catalog/service/:service works with proxies	2018-06-14 09:41:46 -07:00
Mitchell Hashimoto	58bff8dd05	agent/consul/state: convert proxy test to testify/assert	2018-06-14 09:41:46 -07:00
Mitchell Hashimoto	09568ce7b5	agent/consul/state: service registration with proxy works	2018-06-14 09:41:46 -07:00
Mitchell Hashimoto	23ee0888ec	agent/consul: convert intention ACLs to testify/assert	2018-06-14 09:41:46 -07:00
Mitchell Hashimoto	6a8bba7d48	agent/consul,structs: add tests for ACL filter and prefix for intentions	2018-06-14 09:41:45 -07:00
Mitchell Hashimoto	3e10a1ae7a	agent/consul: Intention.Match ACLs	2018-06-14 09:41:45 -07:00
Mitchell Hashimoto	db44a98a2d	agent/consul: Intention.Get ACLs	2018-06-14 09:41:45 -07:00
Mitchell Hashimoto	fd840da97a	agent/consul: Intention.Apply ACL on rename	2018-06-14 09:41:45 -07:00
Mitchell Hashimoto	14ca93e09c	agent/consul: tests for ACLs on Intention.Apply update/delete	2018-06-14 09:41:45 -07:00
Mitchell Hashimoto	c54be9bc09	agent/consul: Basic ACL on Intention.Apply	2018-06-14 09:41:44 -07:00
Mitchell Hashimoto	1d0b4ceedb	agent: convert all intention tests to testify/assert	2018-06-14 09:41:44 -07:00
Mitchell Hashimoto	f07340e94f	agent/consul/fsm,state: snapshot/restore for intentions	2018-06-14 09:41:44 -07:00
Mitchell Hashimoto	6f33b2d070	agent: use UTC time for intention times, move empty list check to agent/consul	2018-06-14 09:41:43 -07:00
Mitchell Hashimoto	67b017c95c	agent/consul/fsm: switch tests to use structs.TestIntention	2018-06-14 09:41:43 -07:00
Mitchell Hashimoto	3a00564411	agent/consul/state: need to set Meta for intentions for tests	2018-06-14 09:41:43 -07:00
Mitchell Hashimoto	027dad8672	agent/consul/state: remove TODO	2018-06-14 09:41:43 -07:00
Mitchell Hashimoto	37f66e47ed	agent: use testing intention to get valid intentions	2018-06-14 09:41:43 -07:00
Mitchell Hashimoto	04bd4af99c	agent/consul: set default intention SourceType, validate it	2018-06-14 09:41:43 -07:00
Mitchell Hashimoto	8e2462e301	agent/structs: Intention validation	2018-06-14 09:41:42 -07:00
Mitchell Hashimoto	d34ee200de	agent/consul: support intention description, meta is non-nil	2018-06-14 09:41:42 -07:00
Mitchell Hashimoto	e81d1c88b7	agent/consul/fsm: add tests for intention requests	2018-06-14 09:41:42 -07:00
Mitchell Hashimoto	2b047fb09b	agent,agent/consul: set default namespaces	2018-06-14 09:41:42 -07:00
Mitchell Hashimoto	e630d65d9d	agent/consul: set CreatedAt, UpdatedAt on intentions	2018-06-14 09:41:42 -07:00
Mitchell Hashimoto	e9d208bcb6	agent/consul: RPC endpoint for Intention.Match	2018-06-14 09:41:42 -07:00
Mitchell Hashimoto	987b7ce0a2	agent/consul/state: IntentionMatch for performing match resolution	2018-06-14 09:41:41 -07:00
Mitchell Hashimoto	bebe6870ff	agent/consul: test that Apply works to delete an intention	2018-06-14 09:41:41 -07:00
Mitchell Hashimoto	95e1c92edf	agent/consul/state,fsm: support for deleting intentions	2018-06-14 09:41:41 -07:00
Mitchell Hashimoto	32ad54369c	agent/consul: creating intention must not have ID set	2018-06-14 09:41:40 -07:00
Mitchell Hashimoto	f219c766cb	agent/consul: support updating intentions	2018-06-14 09:41:40 -07:00
Mitchell Hashimoto	37572829ab	agent: GET /v1/connect/intentions/:id	2018-06-14 09:41:40 -07:00
Mitchell Hashimoto	2a8a2f8167	agent/consul: Intention.Get endpoint	2018-06-14 09:41:40 -07:00
Mitchell Hashimoto	48b9a43f1d	agent/consul: Intention.Apply, FSM methods, very little validation	2018-06-14 09:41:39 -07:00
Mitchell Hashimoto	b19a289596	agent/consul: start Intention RPC endpoints, starting with List	2018-06-14 09:41:39 -07:00
Mitchell Hashimoto	8b0ac7d9c5	agent/consul/state: list intentions	2018-06-14 09:41:39 -07:00
Mitchell Hashimoto	c05bed86e1	agent/consul/state: initial work on intentions memdb table	2018-06-14 09:41:39 -07:00
Guido Iaquinti	3ed73961b3	Attach server.Name label to client.rpc.failed	2018-06-13 14:56:14 +01:00
Guido Iaquinti	bda575074e	Attach server.ID label to client.rpc.failed	2018-06-13 14:53:44 +01:00
Guido Iaquinti	edd6a69541	Client: add metric for failed RPC calls to server	2018-06-13 12:35:45 +01:00
Matt Keeler	c41fa6c010	Add a Client ReloadConfig test	2018-06-11 16:23:51 -04:00
Matt Keeler	c5d9c2362f	Merge branch 'master' of github.com:hashicorp/consul into rpc-limiting # Conflicts: # agent/agent.go # agent/consul/client.go	2018-06-11 16:11:36 -04:00
Matt Keeler	c589991452	Apply the limits to the clients rpcLimiter	2018-06-11 15:51:17 -04:00
Matt Keeler	14661a417b	Allow for easy enterprise/oss coexistence Uses struct/interface embedding with the embedded structs/interfaces being empty for oss. Also methods on the server/client types are defaulted to do nothing for OSS	2018-05-24 10:36:42 -04:00
Wim	88514d6a82	Add support for reverse lookup of services	2018-05-19 19:39:02 +02:00
Preetha Appan	7400a78f8a	Change default raft threshold config values and add a section to upgrade notes	2018-05-11 10:45:41 -05:00
Preetha Appan	e28c5fbb4e	Also make snapshot interval configurable	2018-05-11 10:43:24 -05:00
Preetha Appan	eb4bc79118	Make raft snapshot commit threshold configurable	2018-05-11 10:43:24 -05:00
Jack Pearkes	e611b1728a	Merge pull request #4097 from hashicorp/remove-deprecated Remove deprecated check/service fields and metric names	2018-05-10 15:45:49 -07:00
Kyle Havlovitz	60307ef328	Remove deprecated metric names	2018-05-08 16:23:15 -07:00
Paul Banks	c55885efd8	Merge pull request #3970 from pierresouchay/node_health_should_change_service_index [BUGFIX] When a node level check is removed, ensure all services of node are notified	2018-05-08 16:44:50 +01:00
Pierre Souchay	ee47eb7d7d	Added Missing Service Meta synchronization and field	2018-04-21 17:34:29 +02:00
Pierre Souchay	1b55e3559b	Allow renaming nodes when ID is unchanged	2018-04-18 15:39:38 +02:00
Kyle Havlovitz	be10300d06	Update make static-assets goal and run format	2018-04-13 09:57:25 -07:00
Matt Keeler	ed94d356e0	Merge pull request #4023 from hashicorp/f-near-ip Add near=_ip support for prepared queries	2018-04-12 12:10:48 -04:00
Matt Keeler	aa9151738a	GH-3798: A couple more PR updates Test HTTP/DNS source IP without header/extra EDNS data. Add WARN log for when prepared query with near=_ip is executed without specifying the source ip	2018-04-12 10:10:37 -04:00
Matt Keeler	3a0f7789ec	GH-3798: A few more PR updates	2018-04-11 20:32:35 -04:00
Matt Keeler	de3a9be3d0	GH-3798: Updates for PR Allow DNS peer IP as the source IP. Break early when the right node was found for executing the prepared query. Update docs	2018-04-11 17:02:04 -04:00
Matt Keeler	89cd24aeca	GH-3798: Add near=_ip support for prepared queries	2018-04-10 14:50:50 -04:00
Paul Banks	2ed0d2afcd	Allow ignoring checks by ID when defining a PreparedQuery. Fixes #3727 .	2018-04-10 14:04:16 +01:00
Preetha Appan	d9d9944179	Renames agent API layer for service metadata to "meta" for consistency	2018-03-28 09:04:50 -05:00
Preetha	8dacb12c79	Merge pull request #3881 from pierresouchay/service_metadata Feature Request: Support key-value attributes for services	2018-03-27 16:33:57 -05:00
Pierre Souchay	b9ae4e647f	Added validation of ServiceMeta in Catalog Fixed Error Message when ServiceMeta is not valid Added Unit test for adding a Service with badly formatted ServiceMeta	2018-03-27 22:22:42 +02:00
Preetha Appan	17a011b9bd	fix typo and remove comment	2018-03-27 14:28:05 -05:00
Preetha Appan	6d16afc65c	Remove unnecessary nil checks	2018-03-27 10:59:42 -05:00
Preetha Appan	c21c2da690	Fix test and remove unused method	2018-03-27 09:44:41 -05:00
Preetha Appan	512f9a50fc	Allows disabling WAN federation by setting serf WAN port to -1	2018-03-26 14:21:06 -05:00
Pierre Souchay	eccb56ade0	Added support for renaming nodes when their IP does not change	2018-03-26 16:44:13 +02:00
Pierre Souchay	90d2f7bca1	Merge remote-tracking branch 'origin/master' into node_health_should_change_service_index	2018-03-22 13:07:11 +01:00
Pierre Souchay	9cc9dce848	More test cases	2018-03-22 12:41:06 +01:00
Pierre Souchay	7e8e4e014b	Added new test regarding checks index	2018-03-22 12:20:25 +01:00
Pierre Souchay	a8b66fb7aa	Fixed minor typo in comments Might fix unstable travis build	2018-03-22 10:30:10 +01:00
Josh Soref	1dd8c378b9	Spelling (#3958 ) * spelling: another * spelling: autopilot * spelling: beginning * spelling: circonus * spelling: default * spelling: definition * spelling: distance * spelling: encountered * spelling: enterprise * spelling: expands * spelling: exits * spelling: formatting * spelling: health * spelling: hierarchy * spelling: imposed * spelling: independence * spelling: inspect * spelling: last * spelling: latest * spelling: client * spelling: message * spelling: minimum * spelling: notify * spelling: nonexistent * spelling: operator * spelling: payload * spelling: preceded * spelling: prepared * spelling: programmatically * spelling: required * spelling: reconcile * spelling: responses * spelling: request * spelling: response * spelling: results * spelling: retrieve * spelling: service * spelling: significantly * spelling: specifies * spelling: supported * spelling: synchronization * spelling: synchronous * spelling: themselves * spelling: unexpected * spelling: validations * spelling: value	2018-03-19 16:56:00 +00:00
Pierre Souchay	3eb287f57d	Fixed typo in comments	2018-03-19 17:12:08 +01:00
Pierre Souchay	eb2a4eaea3	Refactoring to have clearer code without weird bool	2018-03-19 16:12:54 +01:00
Pierre Souchay	a5f6ac0df4	[BUGFIX] When a node level check is removed, ensure all services of node are notified Bugfix for https://github.com/hashicorp/consul/pull/3899 When a node level check is removed (example: maintenance), some watchers on services might have to recompute their state. If those nodes are performing blocking queries, they have to be notified. While their state was updated when node-level state did change or was added this was not the case when the check was removed. This fixes it.	2018-03-19 14:14:03 +01:00
Devin Canterberry	881d20c606	🐛 Formatting changes only; add missing trailing commas	2018-03-15 10:19:46 -07:00
Mitchell Hashimoto	fbac58280e	agent/consul/fsm: begin using testify/assert	2018-03-06 09:48:15 -08:00
Paul Banks	628dcc9793	Merge pull request #3899 from pierresouchay/fix_blocking_queries_index Services Indexes modified per service instead of using a global Index	2018-03-02 16:24:43 +00:00
Pierre Souchay	85b73f8163	Simplified error handling for maxIndexForService * added unit tests to ensure service index is properly garbage collected * added Upgrade from Version 1.0.6 to higher section in documentation	2018-03-01 14:09:36 +01:00
Preetha Appan	77d35f1829	Remove extra newline	2018-02-21 13:21:47 -06:00
Preetha Appan	573500dc51	Unit test that calls revokeLeadership twice to make sure it's idempotent	2018-02-21 12:48:53 -06:00
Preetha Appan	bd270b02ba	Make sure revokeLeadership is called if establishLeadership errors	2018-02-21 12:33:22 -06:00
Alex Dadgar	535842004c	Test autopilots start/stop idempotency	2018-02-21 10:19:30 -08:00
Alex Dadgar	4d99696f02	Improve autopilot shutdown to be idempotent	2018-02-20 15:51:59 -08:00
Pierre Souchay	e6d85cb36a	Fixed comments for function maxIndexForService	2018-02-20 23:57:28 +01:00
Pierre Souchay	b26ea3c230	[Revert] Only update services if tags are different This patch did give some better results, but break watches on the services of a node. It is possible to apply the same optimization for nodes than to services (one index per instance), but it would complicate further the patch. Let's do it in another PR.	2018-02-20 23:34:42 +01:00
Pierre Souchay	903e866835	Only update services if tags are different	2018-02-20 23:08:04 +01:00
Pierre Souchay	56d5c0bf22	Enable Raft index optimization per service name on health endpoint Had to fix unit test in order to check properly indexes.	2018-02-20 01:35:50 +01:00
Pierre Souchay	ec1b278595	Get only first service to test whether we have to cleanup index of a service	2018-02-19 22:44:49 +01:00
Pierre Souchay	523feb0be4	Fixed comment about raftIndex + use test.Helper()	2018-02-19 19:30:25 +01:00
Pierre Souchay	4c188c1d08	Services Indexes modified per service instead of using a global Index This patch improves the watches for services on large cluster: each service has now its own index, such watches on a specific service are not modified by changes in the global catalog. It should improve a lot the performance of tools such as consul-template or libraries performing watches on very large clusters with many services/watches.	2018-02-19 18:29:22 +01:00
Veselkov Konstantin	05666113a4	remove golint warnings	2018-01-28 22:40:13 +04:00
Kyle Havlovitz	0e76d62846	Reset clusterHealth when autopilot starts	2018-01-23 12:52:28 -08:00
Kyle Havlovitz	6d1dbe6cc4	Move autopilot health loop into leader operations	2018-01-23 11:17:41 -08:00
James Phillips	62e97a6602	Fixes a `go fmt` cleanup.	2017-12-20 13:43:38 -08:00
Kyle Havlovitz	74b0c58831	Fix vet error	2017-12-18 18:04:42 -08:00
Kyle Havlovitz	dfc165a47b	Move autopilot initializing to oss file	2017-12-18 18:02:44 -08:00
Kyle Havlovitz	044c38aa7b	Move autopilot setup to a separate file	2017-12-18 16:55:51 -08:00
Kyle Havlovitz	9e1ba6fb4e	Make some final tweaks to autopilot package	2017-12-18 12:26:47 -08:00
Kyle Havlovitz	6b58df5898	Merge pull request #3737 from hashicorp/autopilot-refactor Move autopilot to a standalone package	2017-12-15 14:09:40 -08:00
James Phillips	262cbbd9ca	Merge pull request #3728 from weiwei04/fix_globalRPC_goroutine_leak fix globalRPC goroutine leak	2017-12-14 17:54:19 -08:00
Kyle Havlovitz	798aca92c5	Expose IsPotentialVoter for advanced autopilot logic	2017-12-13 17:53:51 -08:00
Kyle Havlovitz	a4ac148077	Merge branch 'master' into autopilot-refactor	2017-12-13 11:54:32 -08:00
Kyle Havlovitz	6c985132de	A few last autopilot adjustments	2017-12-13 11:19:17 -08:00
Kyle Havlovitz	77d92bf15c	More autopilot reorganizing	2017-12-13 10:57:37 -08:00
James Phillips	984de6e2e0	Adds TODOs referencing #3744 .	2017-12-13 10:52:06 -08:00
Kyle Havlovitz	f347c8a531	More refactoring to make autopilot consul-agnostic	2017-12-12 17:46:28 -08:00
Kyle Havlovitz	8546a1d3c6	Move autopilot to a standalone package	2017-12-11 16:45:33 -08:00
James Phillips	32b64575d1	Moves Serf helper into lib to fix import cycle in consul-enterprise.	2017-12-07 16:57:58 -08:00
James Phillips	c16cce80bb	Turns off intent queue warnings and enables dynamic queue sizing.	2017-12-07 16:27:06 -08:00
Wei Wei	04531ff0fb	fix globalRPC goroutine leak Signed-off-by: Wei Wei <weiwei.inf@gmail.com>	2017-12-05 11:53:30 +08:00
James Phillips	c4bc89a187	Creates a registration mechanism for snapshot and restore.	2017-11-29 18:36:53 -08:00
James Phillips	8571555703	Begins split out of snapshots from the main FSM class.	2017-11-29 18:36:53 -08:00
James Phillips	4eaee8e0ba	Creates a registration mechanism for FSM commands.	2017-11-29 18:36:53 -08:00
James Phillips	3e7ea1931c	Moves the FSM into its own package. This will help make it clearer what happens when we add some registration plumbing for the different operations and snapshots.	2017-11-29 18:36:53 -08:00
James Phillips	7f3783f4be	Resolves an FSM snapshot TODO. This adds checks for sink write calls before we continue the refactor, which will resolve the other TODO comment we deleted as part of this change.	2017-11-29 18:36:53 -08:00
James Phillips	5a24d37ac0	Creates a registration mechanism for schemas. This also splits out the registration into the table-specific source files.	2017-11-29 18:36:52 -08:00
James Phillips	36bb30e67a	Creates a registration mechanism for RPC endpoints.	2017-11-29 18:36:52 -08:00
James Phillips	ba56669ea8	Renames stubs to be more consistent.	2017-11-29 18:36:52 -08:00
James Phillips	56552095c9	Sheds monotonic time info so tombstone GC bins work properly.	2017-11-29 10:34:24 -08:00
James Phillips	8656b7a3e9	Gives back the lock before writing to the expire channel. The lock isn't needed after we clean up the expire bin, and as seen in #3700 we can get into a deadlock waiting to place the expire index into the channel while holding this lock. Fixes #3700	2017-11-19 16:24:16 -08:00
James Phillips	8210523b1b	Moves the LAN event handler after the router is created. Fixes #3680	2017-11-10 12:26:48 -08:00
James Phillips	bfbbfb62ca	Revert "Adds a small sleep to make sure we are in the next GC bucket."	2017-11-08 22:18:37 -08:00
James Phillips	d6328a5bf8	Adds a sleep to make sure we are in the next GC bucket, ups time. Fixes #3670	2017-11-08 22:02:40 -08:00
James Phillips	91824375be	Skips the tombstone GC test in Travis for now. Related to #3670	2017-11-08 20:14:20 -08:00
James Phillips	b94ba8aeb4	Removes bogus getPort() in favor of freeport.	2017-11-08 19:55:50 -08:00
James Phillips	444a345a3a	Tightens timing up and reorders GC test to be less flaky.	2017-11-08 15:09:29 -08:00
James Phillips	e00624425b	Doubles the GC timing.	2017-11-08 15:01:11 -08:00
James Phillips	8eb91777d9	Opens up test timing a little more.	2017-11-08 14:01:19 -08:00
James Phillips	d45c2a01f1	Shifts off a gran boundary to help make test less flaky.	2017-11-08 13:57:17 -08:00
James Phillips	757e353334	Opens up the tombstone GC test timing.	2017-11-08 13:43:39 -08:00
Kyle Havlovitz	068ca11eb8	Move check definition to a sub-struct	2017-11-01 14:54:46 -07:00
Kyle Havlovitz	bc3ba5f873	Merge branch 'master' into esm-changes	2017-11-01 11:37:48 -07:00
Kyle Havlovitz	83524f44c4	Merge pull request #3622 from hashicorp/coordinate-node-endpoint agent: add /v1/coordinate/node/:node endpoint	2017-11-01 11:35:50 -07:00
Kyle Havlovitz	9909b661ac	Fill out the tests around coordinate/node functionality	2017-10-31 15:36:44 -07:00
Kyle Havlovitz	fd4d9f1c16	Factor out registerNodes function	2017-10-31 13:34:49 -07:00
James Phillips	c6e0366c02	Relaxes Autopilot promotion logic. (#3623 ) * Relaxes Autopilot promotion logic. When we defaulted the Raft protocol version to 3 in #3477 we made the numPeers() routine more strict to only count voters (this is more conservative and more correct). This had the side effect of breaking rolling updates because it's at odds with the Autopilot non-voter promotion logic. That logic used to wait to only promote to maintain an odd quorum of servers. During a rolling update (add one new server, wait, and then kill an old server) the dead server cleanup would still count the old server as a peer, which is conservative and the right thing to do, and no longer count the non-voter. This would wait to promote, so you could get into a stalemate. It is safer to promote early than remove early, so by promoting as soon as possible we have chosen that as the solution here. Fixes #3611 * Gets rid of unnecessary extra not-a-voter check.	2017-10-31 15:16:56 -05:00
Kyle Havlovitz	496dd7ab5b	Merge branch 'coordinate-node-endpoint' of github.com:hashicorp/consul into esm-changes	2017-10-26 19:20:24 -07:00
Kyle Havlovitz	f80e70271d	Added Coordinate.Node rpc endpoint and client api method	2017-10-26 19:16:40 -07:00
Kyle Havlovitz	84a07ea113	Expose SkipNodeUpdate field and some health check info in the http api	2017-10-25 19:37:30 +02:00
Frank Schroeder	74859ff3c0	test: replace porter tool with freeport lib This patch removes the porter tool which hands out free ports from a given range with a library which does the same thing. The challenge for acquiring free ports in concurrent go test runs is that go packages are tested concurrently and run in separate processes. There has to be some inter-process synchronization in preventing processes allocating the same ports. freeport allocates blocks of ports from a range expected to be not in heavy use and implements a system-wide mutex by binding to the first port of that block for the lifetime of the application. Ports are then provided sequentially from that block and are tested on localhost before being returned as available.	2017-10-21 22:01:09 +02:00
Ryan Slade	6f05ea91a3	Replace time.Now().Sub(x) with time.Since(x)	2017-10-17 20:38:24 +02:00
James Phillips	e9670761f9	Cleans up some drift between the OSS and Enterprise trees.	2017-10-11 15:53:07 -07:00
James Phillips	d1ad538345	Makes RPC handling more robust when rolling servers. (#3561 ) * Adds client-side retry for no leader errors. This paves over the case where the client was connected to the leader when it loses leadership. * Adds a configurable server RPC drain time and a fail-fast path for RPCs. When a server leaves it gets removed from the Raft configuration, so it will never know who the new leader server ends up being. Without this we'd be doomed to wait out the RPC hold timeout and then fail. This makes things fail a little quicker while a server is draining, and since we added a client retry AND since the server doing this has already shut down and left the Serf LAN, clients should retry against some other server. * Makes the RPC hold timeout configurable. * Reorders struct members. * Sets the RPC hold timeout default for test servers. * Bumps the leave drain time up to 5 seconds. * Robustifies retries with a simpler client-side RPC hold. * Reverts unintended delete.	2017-10-10 15:19:50 -07:00
James Phillips	a1db119d02	Fixes handling of stop channel and failed barrier attempts. (#3546 ) * Fixes handling of stop channel and failed barrier attempts. There were two issues here. First, we needed to not exit when there was a timeout trying to write the barrier, because Raft might not step down, so we'd be left as the leader but having run all the step down actions. Second, we didn't close over the stopCh correctly, so it was possible to nil that out and have the leaderLoop never exit. We close over it properly AND sequence the nil-ing of it AFTER the leaderLoop exits for good measure, so the code is more robust. Fixes #3545 * Cleans up based on code review feedback. * Tweaks comments. * Renames variables and removes comments.	2017-10-06 07:54:49 -07:00
Kyle Havlovitz	0063516e5e	Update metric names and add a legacy config flag	2017-10-04 16:43:27 -07:00
Preetha Appan	f38d20eb40	Remove extra newline	2017-10-03 15:19:31 -05:00
Preetha Appan	3c81e2db7c	Only allow 'list' policies within 'key' policy definitions. Consolidated two similar tests into one and fixed alignment.	2017-10-03 15:15:56 -05:00
Preetha Appan	d5acfc3982	Introduces new 'list' permission that applies to KV store recursive reads, and enforced only when opted in.	2017-10-02 17:10:21 -05:00
James Phillips	330ce87851	Gets rid of flaky clause in stats fetcher unit test. Given how the routine is coded we can still get data so this wasn't a reliable thing to check.	2017-09-26 20:53:06 -07:00
preetapan	783e24be64	Issue 3452 (#3500 ) * Make sure that id and address are set in member created during reaping of catalog nodes that have been removed from serf * Get address from node table in the state store rather than from service address * Fix incorrect lookup by checkname instead of node name * Make sure that serverlookup is called with the right address format, added unit test. * Address code review comments * Tweaks style stuff.	2017-09-26 20:49:41 -07:00
James Phillips	4b17c9618f	Cleans up some edge cases in TestSnapshot_Forward_Leader. These could cause the tests to hang.	2017-09-26 14:07:28 -07:00
Preetha Appan	318d0232f7	Move Raft protocol version for list peers end point to server side, fix unit tests. This fixes #3449	2017-09-26 09:35:39 -05:00
James Phillips	fcaa889116	Bumps default Raft protocol to version 3. (#3477 ) * Changes default Raft protocol to 3. * Changes numPeers() to report only voters. This should have been there before, but it's more obvious that this is incorrect now that we default the Raft protocol to 3, which puts new servers in a read-only state while Autopilot waits for them to become healthy. * Fixes TestLeader_RollRaftServer. * Fixes TestOperator_RaftRemovePeerByAddress. * Fixes TestServer_. Relaxed the check for a given number of voter peers and instead do a thorough check that all servers see each other in their Raft configurations. Fixes TestACL_. These now just check for Raft replication to be set up, and don't care about the number of voter peers. Fixes TestOperator_Raft_ListPeers. * Fixes TestAutopilot_CleanupDeadServerPeriodic. * Fixes TestCatalog_ListNodes_ConsistentRead_Fail. * Fixes TestLeader_ChangeServerID and adjusts the conn pool to throw away sockets when it sees io.EOF. * Changes version to 1.0.0 in the options doc. * Makes metrics test more deterministic with autopilot metrics possible.	2017-09-25 15:27:04 -07:00
Preetha Appan	8394ad08db	Introduce Code Policy validation via sentinel, with a noop implementation	2017-09-25 13:44:55 -05:00
Frank Schröder	69a088ca85	New config parser, HCL support, multiple bind addrs (#3480 ) * new config parser for agent This patch implements a new config parser for the consul agent which makes the following changes to the previous implementation: * add HCL support * all configuration fragments in tests and for default config are expressed as HCL fragments * HCL fragments can be provided on the command line so that they can eventually replace the command line flags. * HCL/JSON fragments are parsed into a temporary Config structure which can be merged using reflection (all values are pointers). The existing merge logic of overwrite for values and append for slices has been preserved. * A single builder process generates a typed runtime configuration for the agent. The new implementation is more strict and fails in the builder process if no valid runtime configuration can be generated. Therefore, additional validations in other parts of the code should be removed. The builder also pre-computes all required network addresses so that no address/port magic should be required where the configuration is used and should therefore be removed. * Upgrade github.com/hashicorp/hcl to support int64 * improve error messages * fix directory permission test * Fix rtt test * Fix ForceLeave test * Skip performance test for now until we know what to do * Update github.com/hashicorp/memberlist to update log prefix * Make memberlist use the default logger * improve config error handling * do not fail on non-existing data-dir * experiment with non-uniform timeouts to get a handle on stalled leader elections * Run tests for packages separately to eliminate the spurious port conflicts * refactor private address detection and unify approach for ipv4 and ipv6. 
Fixes #2825 * do not allow unix sockets for DNS * improve bind and advertise addr error handling * go through builder using test coverage * minimal update to the docs * more coverage tests fixed * more tests * fix makefile * cleanup * fix port conflicts with external port server 'porter' * stop test server on error * do not run api test that change global ENV concurrently with the other tests * Run remaining api tests concurrently * no need for retry with the port number service * monkey patch race condition in go-sockaddr until we understand why that fails * monkey patch hcl decoder race condition until we understand why that fails * monkey patch spurious errors in strings.EqualFold from here * add test for hcl decoder race condition. Run with go test -parallel 128 * Increase timeout again * cleanup * don't log port allocations by default * use base command arg parsing to format help output properly * handle -dc deprecation case in Build * switch autopilot.max_trailing_logs to int * remove duplicate test case * remove unused methods * remove comments about flag/config value inconsistencies * switch got and want around since the error message was misleading. * Removes a stray debug log. * Removes a stray newline in imports. * Fixes TestACL_Version8. * Runs go fmt. * Adds a default case for unknown address types. * Reorders and reformats some imports. * Adds some comments and fixes typos. * Reorders imports. 
* add unix socket support for dns later * drop all deprecated flags and arguments * fix wrong field name * remove stray node-id file * drop unnecessary patch section in test * drop duplicate test * add test for LeaveOnTerm and SkipLeaveOnInt in client mode * drop "bla" and add clarifying comment for the test * split up tests to support enterprise/non-enterprise tests * drop raft multiplier and derive values during build phase * sanitize runtime config reflectively and add test * detect invalid config fields * fix tests with invalid config fields * use different values for wan sanitization test * drop recursor in favor of recursors * allow dns_config.udp_answer_limit to be zero * make sure tests run on machines with multiple ips * Fix failing tests in a few more places by providing a bind address in the test * Gets rid of skipped TestAgent_CheckPerformanceSettings and adds case for builder. * Add porter to server_test.go to make tests there less flaky * go fmt	2017-09-25 11:40:42 -07:00
James Phillips	268018c558	Robustifies check in TestCatalog_ListNodes_ConsistentRead_Fail test. Fixes #3469	2017-09-13 21:22:53 -07:00
James Phillips	8be4ee766a	Revert "Manages segments list via a pointer." This reverts commit c277a4250461443cbd63de0259e5e32766f651ea.	2017-09-07 16:37:11 -07:00
James Phillips	5008aabb62	Manages segments list via a pointer.	2017-09-07 16:21:07 -07:00
James Phillips	908f7be97f	Cleans up formatting.	2017-09-07 12:26:58 -07:00
James Phillips	02a3f3f27b	Shows the segment name in the keyring API and command output.	2017-09-07 12:17:39 -07:00
James Phillips	7c616e3768	Moves reconcile loop into segment stub.	2017-09-06 18:01:53 -07:00
James Phillips	4e34c2af06	Takes the skip out of the client check. Without this the merge delegate won't check the segment for non-servers a little below here.	2017-09-06 17:05:40 -07:00
James Phillips	78ac144fff	Merge pull request #3447 from hashicorp/issue-3070 Skips unique node ID check for old versions of Consul.	2017-09-06 13:24:15 -07:00
James Phillips	62d9299646	Fixes incorrect comment.	2017-09-06 13:23:19 -07:00
James Phillips	031f1874d0	Pulls down some code for the check loop.	2017-09-06 13:07:42 -07:00
James Phillips	2fd9328b21	Uses the Raft configuration for the self-add skip check.	2017-09-06 13:05:51 -07:00
Preetha Appan	1eae9f1e2f	Change member join reconcile step to process joining itself, to handle node IP address changes correctly when number of servers < 3	2017-09-06 13:53:01 -05:00
James Phillips	353e037c9b	Skips unique node ID check for old versions of Consul. Fixes #3070.	2017-09-05 22:57:29 -07:00
James Phillips	c629773b40	Makes the all segments query explicit, and the default for `consul members`.	2017-09-05 12:22:20 -07:00
James Phillips	bc9780baad	Adds simple rate limiting for client agent RPC calls to Consul servers. (#3440) * Added rate limiting for agent RPC calls. * Initializes the rate limiter based on the config. * Adds the rate limiter into the snapshot RPC path. * Adds unit tests for the RPC rate limiter. * Groups the RPC limit parameters under "limits" in the config. * Adds some documentation about the RPC limiter. * Sends a 429 response when the rate limiter kicks in. * Adds docs for new telemetry. * Makes snapshot telemetry look like RPC telemetry and cleans up comments.	2017-09-01 15:02:50 -07:00
Kyle Havlovitz	334e082848	Merge pull request #3431 from hashicorp/network-segments-oss	2017-09-01 10:24:58 -07:00
Kyle Havlovitz	ff994e9ade	Pass listeners into setupSegments	2017-08-31 17:56:43 -07:00
Kyle Havlovitz	5cc4b32a5d	Organize segments for a cleaner split between enterprise and OSS	2017-08-31 17:39:46 -07:00
Kyle Havlovitz	b77a0aa932	Fix some inconsistencies with segment logic and comments	2017-08-30 17:43:46 -07:00
Preetha Appan	0728a04dbb	Wire server provider for raft layer only on protocol version 3 and above, and update changelog	2017-08-30 14:36:47 -05:00
Kyle Havlovitz	6ded43131a	Add segment addr field to tags for LAN flood joiner	2017-08-30 11:58:29 -07:00
Kyle Havlovitz	1c04f1537a	Add agent.segment interpolation to prepared queries	2017-08-30 11:58:29 -07:00
Kyle Havlovitz	107d7f6c5a	Add rpc_listener option to segment config	2017-08-30 11:58:29 -07:00
James Phillips	6a6eadd8c7	Adds open source side of network segments (feature is Enterprise-only).	2017-08-30 11:58:29 -07:00
Preetha Appan	e944370cde	More cleanup from code review	2017-08-30 12:31:36 -05:00
Preetha Appan	a215c764cd	Remove copy pasted duplicate line, update documentation.	2017-08-30 10:02:10 -05:00
Preetha Appan	5a29eb7486	Consolidate server lookup into one place and replace usages of localConsuls.	2017-08-30 09:30:33 -05:00
Preetha Appan	d8fe01db4c	Remove stray commented line	2017-08-30 09:30:33 -05:00
Preetha Appan	ca48e7e4c2	Remove server address tracking logic from manager/router and maintain it as part of lan event listener instead. Used sync.Map to track this, and added unit tests	2017-08-30 09:30:33 -05:00
Preetha Appan	b4a9d77d49	ServerAddressProvider interface also returns an error now	2017-08-30 09:30:33 -05:00
Preetha Appan	edb408bc22	Use config struct to create NetworkTransport layer when setting up raft	2017-08-30 09:30:33 -05:00
Preetha Appan	01f8e469aa	Implement AddressProvider and wire that up to raft transport layer to support server nodes changing their IP addresses in containerized environments	2017-08-30 09:30:33 -05:00
Frank Schroeder	62c77d70f0	build: make tests independent of build tags When the metadata server is scanning the agents for potential servers it is parsing the version number which the agent provided when it joined. This version number has to conform to a certain format, i.e. 'n.n.n'. Without this version number properly set some tests fail with error messages that disguise the root cause. The default version number is currently set to 'unknown' in version/version.go which does not parse and triggers the tests to fail. The work around is to use a build tag 'consul' which will use the version number set in version_base.go instead which has the correct format and is set to the current release version. In addition, some parts of the code also require the version number to be of a certain value. Setting it to '0.0.0' for example makes some tests pass and others fail since they don't pass the semantic check. When using go build/install/test one has to remember to use '-tags consul' or tests will fail with non-obvious error messages. Using build tags makes the build process more complex and error prone since it prevents the use of the plain go toolchain and - at least in its current form - introduces subtle build and test issues. We should try to eliminate build tags for anything else but platform specific code. This patch removes all references to specific version numbers in the code and tests and sets the default version to '9.9.9' which is syntactically correct and passes the semantic check. This solves the issue of running go build/install/test without tags for the OSS build.	2017-08-30 13:40:18 +02:00
Frank Schröder	44e6b8122d	acl: consolidate error handling (#3401) The error handling of the ACL code relies on the presence of certain magic error messages. Since the error values are sent via RPC between older and newer consul agents we cannot just replace the magic values with typed errors and switch to type checks since this would break compatibility with older clients. Therefore, this patch moves all magic ACL error messages into the acl package and provides default error values and helper functions which determine the type of error.	2017-08-23 16:52:48 +02:00
Frank Schroeder	d9e2a51887	agent: drop unused code This code from http://github.com/hashicorp/consul/pull/3353 is no longer required.	2017-08-22 00:02:46 +02:00
James Phillips	3518e27a76	Revert "Return 403 rather than a 404 when acls cause all results to be filter…"	2017-08-09 15:06:57 -07:00
James Phillips	91205b2cd6	Revert "Ensure that we return a permission denied only if the list of keys/en…"	2017-08-09 15:06:20 -07:00
Preetha Appan	121326161e	Added unit test case to kvs_endpointtest	2017-08-09 15:50:22 -05:00

717 commits