open-consul

Commit Graph

Author	SHA1	Message	Date
Matt Keeler	a34f8c751e	Pass a testing.T into NewTestAgent and TestAgent.Start (#5342) This way we can avoid unnecessary panics which cause other tests not to run. This doesn't remove all the possibilities for panics causing other tests not to run, it just fixes the TestAgent	2019-02-14 10:59:14 -05:00
Aestek	5647ca2bbb	[Fix] Services sometimes not being synced with acl_enforce_version_8 = false (#4771) Fixes: https://github.com/hashicorp/consul/issues/3676 This fixes a bug where registering an agent with a non-existent ACL token can prevent other services registered with a good token from being synced to the server when using `acl_enforce_version_8 = false`. ## Background When `acl_enforce_version_8` is off the agent does not check the ACL token validity before storing the service in its state. When syncing a service registered with a missing ACL token we fall into the default error handling case (https://github.com/hashicorp/consul/blob/master/agent/local/state.go#L1255) and stop the sync (https://github.com/hashicorp/consul/blob/master/agent/local/state.go#L1082) without setting its Synced property to true like in the permission denied case. This means that the sync will always stop at the faulty service(s). The order in which the services are synced is random since we iterate on a map. So eventually all services with good ACL tokens will be synced, this can however take some time and is influenced by the cluster size, the bigger the slower because retries are less frequent. Having a service in this state also prevents all further sync of checks as they are done after the services. ## Changes This change modifies the sync process to continue even if there is an error. This fixes the issue described above as well as making the sync more error tolerant: if the server repeatedly refuses a service (the ACL token could have been deleted by the time the service is synced, the servers were upgraded to a newer version that has more strict checks on the service definition...). Then all services and checks that can be synced will, and those that don't will be marked as errors in the logs instead of blocking the whole process.	2019-01-04 10:01:50 -05:00
Matt Keeler	99e0a124cb	New ACLs (#4791) This PR is almost a complete rewrite of the ACL system within Consul. It brings the features more in line with other HashiCorp products. Obviously there is quite a bit left to do here but most of it is related docs, testing and finishing the last few commands in the CLI. I will update the PR description and check off the todos as I finish them over the next few days/week. Description At a high level this PR is mainly to split ACL tokens from Policies and to split the concepts of Authorization from Identities. A lot of this PR is mostly just to support CRUD operations on ACLTokens and ACLPolicies. These in and of themselves are not particularly interesting. The bigger conceptual changes are in how tokens get resolved, how backwards compatibility is handled and the separation of policy from identity which could lead the way to allowing for alternative identity providers. On the surface and with a new cluster the ACL system will look very similar to that of Nomad's. Both have tokens and policies. Both have local tokens. The ACL management APIs for both are very similar. I even ripped off Nomad's ACL bootstrap resetting procedure. There are a few key differences though. Nomad requires token and policy replication where Consul only requires policy replication with token replication being opt-in. In Consul local tokens only work with token replication being enabled though. All policies in Nomad are globally applicable. In Consul all policies are stored and replicated globally but can be scoped to a subset of the datacenters. This allows for more granular access management. Unlike Nomad, Consul has legacy baggage in the form of the original ACL system. The ramifications of this are: A server running the new system must still support other clients using the legacy system. A client running the new system must be able to use the legacy RPCs when the servers in its datacenter are running the legacy system. The primary ACL DC's servers running in legacy mode need to be a gate that keeps everything else in the entire multi-DC cluster running in legacy mode. So not only does this PR implement the new ACL system but has a legacy mode built in for when the cluster isn't ready for new ACLs. Also detecting that new ACLs can be used is automatic and requires no configuration on the part of administrators. This process is detailed more in the "Transitioning from Legacy to New ACL Mode" section below.	2018-10-19 12:04:07 -04:00
Paul Banks	10af44006a	Proxy Config Manager (#4729) * Proxy Config Manager This component watches for local state changes on the agent and ensures that each service registered locally with Kind == connect-proxy has its state being actively populated in the cache. This serves two purposes: 1. For the built-in proxy, it ensures that the state needed to accept connections is available in RAM shortly after registration and likely before the proxy actually starts accepting traffic. 2. For (future - next PR) xDS server and other possible future proxies that require _push_ based config discovery, this provides a mechanism to subscribe and be notified about updates to a proxy instance's config including upstream service discovery results. * Address review comments * Better comments; Better delivery of latest snapshot for slow watchers; Embed Config * Comment typos * Add upstream Stringer for funsies	2018-10-10 16:55:34 +01:00
Paul Banks	979e1c9c94	Add -sidecar-for and new /agent/service/:service_id endpoint (#4691) - A new endpoint `/v1/agent/service/:service_id` which is a generic way to look up the service for a single instance. The primary value here is that it: - supports hash-based blocking and so; - replaces `/agent/connect/proxy/:proxy_id` as the mechanism the built-in proxy uses to read its config. - It's not proxy specific and so works for any service. - It has a temporary shim to call through to the existing endpoint to preserve current managed proxy config defaulting behaviour until that is removed entirely (tested). - The built-in proxy now uses the new endpoint exclusively for its config - The built-in proxy now has a `-sidecar-for` flag that allows the service ID of the _target_ service to be specified, on the condition that there is exactly one "sidecar" proxy (that is one that has `Proxy.DestinationServiceID` set) for the service registered. - Several fixes for edge cases for SidecarService - A fix for `Alias` checks - when running locally they didn't update their state until some external thing updated the target. If the target service has no checks registered as below, then the alias never made it past critical.	2018-10-10 16:55:34 +01:00
Paul Banks	92fe8c8e89	Add Proxy Upstreams to Service Definition (#4639) * Refactor Service Definition ProxyDestination. This includes: - Refactoring all internal structs used - Updated tests for both deprecated and new input for: - Agent Services endpoint response - Agent Service endpoint response - Agent Register endpoint - Unmanaged deprecated field - Unmanaged new fields - Managed deprecated upstreams - Managed new - Catalog Register - Unmanaged deprecated field - Unmanaged new fields - Managed deprecated upstreams - Managed new - Catalog Services endpoint response - Catalog Node endpoint response - Catalog Service endpoint response - Updated API tests for all of the above too (both deprecated and new forms of register) TODO: - config package changes for on-disk service definitions - proxy config endpoint - built-in proxy support for new fields * Agent proxy config endpoint updated with upstreams * Config file changes for upstreams. * Add upstream opaque config and update all tests to ensure it works everywhere. * Built in proxy working with new Upstreams config * Command fixes and deprecations * Fix key translation, upstream type defaults and a spate of other subtle bugs found with end to end test scripts... TODO: tests still failing on one case that needs a fix. I think it's key translation for upstreams nested in Managed proxy struct. * Fix translated keys in API registration. * Fixes from docs - omit some empty undocumented fields in API - Bring back ServiceProxyDestination in Catalog responses to not break backwards compat - this was removed assuming it was only used internally. * Documentation updates for Upstreams in service definition * Fixes for tests broken by many refactors. * Enable travis on f-connect branch in this branch too. * Add consistent Deprecation comments to ProxyDestination uses * Update version number on deprecation notices, and correct upstream datacenter field with explanation in docs	2018-10-10 16:55:34 +01:00
Pierre Souchay	473e589d86	Implementation of Weights Data structures (#4468) * Implementation of Weights Data structures Adding this datastructure will allow us to resolve the issues #1088 and #4198 This new structure defaults to values: ``` { Passing: 1, Warning: 0 } ``` Which means, use weight of 0 for a Service in Warning State while use Weight 1 for a Healthy Service. Thus it remains compatible with previous Consul versions. * Implemented weights for DNS SRV Records * DNS properly support agents with weight support while server does not (backwards compatibility) * Use Warning value of Weights of 1 by default When using DNS interface with only_passing = false, all nodes with non-Critical healthcheck used to have a weight value of 1. While having weight.Warning = 0 as default value, this is probably a bad idea as it breaks ascending compatibility. Thus, we put a default value of 1 to be consistent with existing behaviour. * Added documentation for new weight field in service description * Better documentation about weights as suggested by @banks * Return weight = 1 for unknown Check states as suggested by @banks * Fixed typo (of -> or) in error message as requested by @mkeeler * Fixed unstable unit test TestRetryJoin * Fixed unstable tests * Fixed wrong Fatalf format in `testrpc/wait.go` * Added notes regarding DNS SRV lookup limitations regarding number of instances * Documentation fixes and clarification regarding SRV records with weights as requested by @banks * Rephrase docs	2018-09-07 15:30:47 +01:00
Martin	6af4501a68	Use target service name instead of ID as connect proxy service name (#4620)	2018-09-05 20:33:17 +01:00
Siva Prasad	5fe9053416	TestAgentAntiEntropy: Wait until Consul service is up on the agent. (#4591) * Anti-Entropy test wait for Consul service added * Reverted some tests back to using WaitForLeader	2018-08-28 09:52:11 -04:00
Pierre Souchay	fd927ea110	BUGFIX: Unit test relying on WaitForLeader() did not work due to wrong test (#4472) - Improve resilience of testrpc.WaitForLeader() - Add additional retry to CI - Increase "go test" timeout to 8m - Add wait for cluster leader to several tests in the agent package - Add retry to some tests in the api and command packages	2018-08-06 19:46:09 -04:00
Mitchell Hashimoto	dedc5ad69f	agent/local: silly spacing on select statements	2018-07-19 14:21:30 -05:00
Mitchell Hashimoto	e42ca78c5d	agent/local: address remaining test feedback	2018-07-19 14:20:50 -05:00
Mitchell Hashimoto	81f6486fb5	agent/local: don't use time.After in test since notify is instant	2018-07-18 16:16:28 -05:00
Mitchell Hashimoto	5889a3b6ff	agent: address some basic feedback	2018-07-12 09:36:11 -07:00
Mitchell Hashimoto	3177d1719d	agent/local: support local alias checks	2018-07-12 09:36:10 -07:00
Pierre Souchay	9128de5b11	Merge remote-tracking branch 'origin/master' into ACL_additional_info	2018-07-07 14:09:18 +02:00
Paul Banks	1e5a2561b6	Make tests pass and clean proxy persistence. No detached child changes yet. This is a good state for persistence stuff to re-start the detached child work that got mixed up last time.	2018-06-25 12:24:10 -07:00
Paul Banks	3bac52480e	Abandon daemonize for simpler solution (preserving history): Reverts: - bdb274852ae469c89092d6050697c0ff97178465 - 2c689179c4f61c11f0016214c0fc127a0b813bfe - d62e25c4a7ab753914b6baccd66f88ffd10949a3 - c727ffbcc98e3e0bf41e1a7bdd40169bd2d22191 - 31b4d18933fd0acbe157e28d03ad59c2abf9a1fb - 85c3f8df3eabc00f490cd392213c3b928a85aa44	2018-06-25 12:24:10 -07:00
Paul Banks	e1aca748c4	Make daemonize an option on test binary without hacks. Misc fixes for racy or broken tests. Still failing on several though.	2018-06-25 12:24:09 -07:00
Paul Banks	3a00574a13	Persist proxy state through agent restart	2018-06-25 12:24:08 -07:00
Mitchell Hashimoto	ed14e9edf8	agent: resolve some conflicts and fix tests	2018-06-14 09:42:10 -07:00
Mitchell Hashimoto	657c09133a	agent/local: clarify the non-risk of a full buffer	2018-06-14 09:42:10 -07:00
Mitchell Hashimoto	31b09c0674	agent/local: remove outdated comment	2018-06-14 09:42:10 -07:00
Mitchell Hashimoto	a2167a7fd1	agent/proxy: manager and basic tests, not great coverage yet coming soon	2018-06-14 09:42:08 -07:00
Mitchell Hashimoto	fae8dc8951	agent/local: add Notify mechanism for proxy changes	2018-06-14 09:42:08 -07:00
Mitchell Hashimoto	f64a002f68	agent: start/stop proxies	2018-06-14 09:42:08 -07:00
Mitchell Hashimoto	76c6849ffe	agent/local: store proxy on local state, wip, not working yet	2018-06-14 09:42:08 -07:00
Paul Banks	02ab461dae	TLS watching integrated into Service with some basic tests. There are also a lot of small bug fixes found when testing lots of things end-to-end for the first time and some cleanup now it's integrated with real CA code.	2018-06-14 09:42:07 -07:00
Paul Banks	9d11cd9bf4	Fix various test failures and vet warnings. Intention de-duplication in previously merged PR actually failed some tests that were not caught by me or CI. I ran the test files for state changes but they happened not to trigger this case so I made sure they did first and then fixed. That fixed some upstream intention endpoint tests that I'd not run as part of testing the previous fix.	2018-06-14 09:41:58 -07:00
Paul Banks	44afb5c699	Agent Connect Proxy config endpoint with hash-based blocking	2018-06-14 09:41:57 -07:00
Paul Banks	78e48fd547	Added connect proxy config and local agent state setup on boot.	2018-06-14 09:41:57 -07:00
Mitchell Hashimoto	c43ccd024a	agent/local: anti-entropy for connect proxy services	2018-06-14 09:41:48 -07:00
Pierre Souchay	6c7f01ae73	Removed labels from new ACL denied metrics	2018-06-08 11:56:46 +02:00
Pierre Souchay	2113071ae7	Removed consul prefix from metrics as requested by @kyhavlov	2018-06-08 11:51:50 +02:00
Pierre Souchay	bebf03e26e	Fixed import	2018-04-18 17:09:25 +02:00
Pierre Souchay	4739b05d12	Added labels to improve new metric	2018-04-18 16:51:22 +02:00
Pierre Souchay	12f81c60ac	Track calls blocked by ACLs using metrics	2018-04-17 10:17:16 +02:00
Guido Iaquinti	244fc72b05	Add package name to log output	2018-03-21 15:56:14 +00:00
Josh Soref	1dd8c378b9	Spelling (#3958) * spelling: another * spelling: autopilot * spelling: beginning * spelling: circonus * spelling: default * spelling: definition * spelling: distance * spelling: encountered * spelling: enterprise * spelling: expands * spelling: exits * spelling: formatting * spelling: health * spelling: hierarchy * spelling: imposed * spelling: independence * spelling: inspect * spelling: last * spelling: latest * spelling: client * spelling: message * spelling: minimum * spelling: notify * spelling: nonexistent * spelling: operator * spelling: payload * spelling: preceded * spelling: prepared * spelling: programmatically * spelling: required * spelling: reconcile * spelling: responses * spelling: request * spelling: response * spelling: results * spelling: retrieve * spelling: service * spelling: significantly * spelling: specifies * spelling: supported * spelling: synchronization * spelling: synchronous * spelling: themselves * spelling: unexpected * spelling: validations * spelling: value	2018-03-19 16:56:00 +00:00
James Phillips	c52824bab7	Adds a longer retry period for the AE deferred output test. There's some justification in the comments about this and a TODO to improve this later. Fixes #3668	2017-11-08 18:10:13 -08:00
Frank Schroeder	1d2ae14719	local state: fix go vet issue	2017-10-23 10:56:05 +02:00
Frank Schroeder	a818414bb6	local state: remove stale comment	2017-10-23 10:56:05 +02:00
Frank Schroeder	329fdc40a8	local state: make test more robust	2017-10-23 10:56:05 +02:00
Frank Schroeder	f5a3d73b27	local state: clone check to avoid side effect	2017-10-23 10:56:05 +02:00
Frank Schroeder	b36613e7ff	local state: use synchronized access to internal maps	2017-10-23 10:56:05 +02:00
Frank Schroeder	f187c37c27	local state: rename Add{Check,Service}State to Set{Check,Service}State	2017-10-23 10:56:04 +02:00
Frank Schroeder	209e67b2f9	local state: move Metadata methods together	2017-10-23 10:56:04 +02:00
Frank Schroeder	9513a042be	local state: update documentation of updateSyncState	2017-10-23 10:56:04 +02:00
Frank Schroeder	2e3b72d2c3	local state: update comments	2017-10-23 10:56:04 +02:00
Frank Schroeder	da604495a0	local state: address review comments * move non-blocking notification mechanism into ae.Trigger * move Pause/Resume into separate type	2017-10-23 10:56:04 +02:00


97 Commits