open-consul

Commit Graph

Author	SHA1	Message	Date
Matt Keeler	442924c35a	Sync of OSS changes to support namespaces (#6909)	2019-12-09 21:26:41 -05:00
Christian Muehlhaeuser	26f9368567	Fixed typos in comments (#6175) Just a few nitpicky typo fixes.	2019-07-19 07:54:53 -04:00
Matt Keeler	4c03f99a85	Fix CAS operations on Services (#5971) * Fix CAS operations on services * Update agent/consul/state/catalog_test.go Co-Authored-By: R.B. Boyer <public@richardboyer.net>	2019-06-17 10:41:04 -04:00
Kyle Havlovitz	dcbffdb956	Merge branch 'master' into change-node-id	2019-05-15 10:51:04 -07:00
Paul Banks	68e8933ba5	Connect: Make Connect health queries unblock correctly (#5508) * Make Connect health queries unblock correctly in all cases and use optimal number of watch chans. Fixes #5506. * Node check test cases and clearer bug test doc * Comment update	2019-03-21 16:01:56 +00:00
Paul Banks	dd08426b04	Optimize health watching to single chan/goroutine. (#5449) Refs #4984. Watching chans for every node we touch in a health query is wasteful. In #4984 it shows that if there are more than 682 service instances we always fall back to watching all services, which kills performance. We already have a record in MemDB that is reliably updated whenever the service health result should change thanks to per-service watch indexes. So in general, provided there is at least one service instance and we actually have a service index for it (we always do now) we only ever need to watch a single channel. This saves us from ever falling back to the general index and causing the performance cliff in #4984, but it also means fewer goroutines and work done for every blocking health query. It also saves some allocations made during the query because we no longer have to populate a WatchSet with 3 chans per service instance which saves the internal map allocation. This passes all state store tests except the one that explicitly checked for the fallback behaviour we've now optimized away and in general seems safe.	2019-03-15 20:18:48 +00:00
Kyle Havlovitz	3aec844fd2	Update state store test for changing node ID	2019-03-13 17:05:31 -07:00
Aestek	071fcb28ba	[catalog] Update the node's services indexes on update (#5458) Node updates were not updating the service indexes, which are used for service related queries. This caused the X-Consul-Index to stay the same after a node update as seen from a service query even though the node data is returned in health queries. If that happened in between queries the client would miss this change. We now update the indexes of the services on the node when it is updated. Fixes: #5450	2019-03-11 14:48:19 +00:00
Aestek	ff13518961	Improve blocking queries on services that do not exist (#4810) ## Background When making a blocking query on a missing service (was never registered, or is not registered anymore) the query returns as soon as any service is updated. On clusters with frequent updates (5~10 updates/s in our DCs) these queries virtually do not block, and clients with no protections against this waste resources on the agent and server side. Clients that do protect against this get updates later than they should because of the backoff time they implement between requests. ## Implementation While reducing the number of unnecessary updates we still want: * Clients to be notified as soon as the last instance of a service disappears. * Clients to be notified whenever there is an update for the service. * Clients to be notified as soon as the first instance of the requested service is added. To reduce the number of unnecessary updates we need to block when a request to a missing service is made. However, in the following case: 1. Client `client1` makes a query for service `foo`, gets back a node and X-Consul-Index 42 2. `foo` is unregistered 3. `client1` makes a query for `foo` with `index=42` -> `foo` does not exist, the query blocks and `client1` is not notified of the change on `foo` We could store the last raft index when each service was last alive to know whether we should block on the incoming query or not, but that list could grow indefinitely. We instead store the last raft index when a service was unregistered and use it when a query targets a service that does not exist. When a service `srv` is unregistered this "missing service index" is always greater than any X-Consul-Index held by the clients while `srv` was up, allowing us to immediately notify them. 1. Client `client1` makes a query for service `foo`, gets back a node and `X-Consul-Index: 42` 2. `foo` is unregistered, we set the "missing service index" to 43 3. `client1` makes a blocking query for `foo` with `index=42` -> `foo` does not exist, we check against the "missing service index" and return immediately with `X-Consul-Index: 43` 4. `client1` makes a blocking query for `foo` with `index=43` -> we block 5. Other changes happen in the cluster, but foo still doesn't exist and "missing service index" hasn't changed, the query is still blocked 6. `foo` is registered again on index 62 -> `foo` exists and its index is greater than 43, we unblock the query	2019-01-11 09:26:14 -05:00
Rebecca Zanzig	0ec6d880f5	Support multiple tags for health and catalog http api endpoints (#4717) * Support multiple tags for health and catalog api endpoints Fixes #1781. Adds a `ServiceTags` field to the ServiceSpecificRequest to support multiple tags, updates the filter logic in the catalog store, and propagates these changes through to the health and catalog endpoints. Note: Leaves `ServiceTag` in the struct, since it is being used as part of the DNS lookup, which in turn uses the health check. * Update the api package to support multiple tags Includes additional tests. * Update new tests to use the `require` library * Update HealthConnect check after a bad merge	2018-10-11 12:50:05 +01:00
Pierre Souchay	b0fc91a1d2	[Performance On Large clusters] Reduce updates on large services (#4720) * [Performance On Large clusters] Checks do update services/nodes only when really modified to avoid too many updates on very large clusters In a large cluster, when having a few thousands of nodes, the anti-entropy mechanism performs lots of changes (several per second) while there is no real change. This patch wants to improve this in order to increase Consul scalability when using many blocking requests on health for instance. * [Performance for large clusters] Only updates index of service if service is really modified * [Performance for large clusters] Only updates index of nodes if node is really modified * Added comments / ensure IsSame() has clear semantics * Avoid having modified boolean, return nil directly if structures are Same * Fixed unstable unit tests TestLeader_ChangeServerID * Rewrite TestNode_IsSame() for better readability as suggested by @banks * Rename ServiceNode.IsSame() into IsSameService() + added unit tests * Do not duplicate TestStructs_ServiceNode_Conversions() and increase test coverage of IsSameService * Clearer documentation in IsSameService * Take into account ServiceProxy into ServiceNode.IsSameService() * Fixed IsSameService() with all new structures	2018-10-11 12:42:39 +01:00
Paul Banks	92fe8c8e89	Add Proxy Upstreams to Service Definition (#4639) * Refactor Service Definition ProxyDestination. This includes: - Refactoring all internal structs used - Updated tests for both deprecated and new input for: - Agent Services endpoint response - Agent Service endpoint response - Agent Register endpoint - Unmanaged deprecated field - Unmanaged new fields - Managed deprecated upstreams - Managed new - Catalog Register - Unmanaged deprecated field - Unmanaged new fields - Managed deprecated upstreams - Managed new - Catalog Services endpoint response - Catalog Node endpoint response - Catalog Service endpoint response - Updated API tests for all of the above too (both deprecated and new forms of register) TODO: - config package changes for on-disk service definitions - proxy config endpoint - built-in proxy support for new fields * Agent proxy config endpoint updated with upstreams * Config file changes for upstreams. * Add upstream opaque config and update all tests to ensure it works everywhere. * Built in proxy working with new Upstreams config * Command fixes and deprecations * Fix key translation, upstream type defaults and a spate of other subtle bugs found with end to end test scripts... TODO: tests still failing on one case that needs a fix. I think it's key translation for upstreams nested in Managed proxy struct. * Fix translated keys in API registration. * Fixes from docs - omit some empty undocumented fields in API - Bring back ServiceProxyDestination in Catalog responses to not break backwards compat - this was removed assuming it was only used internally. * Documentation updates for Upstreams in service definition * Fixes for tests broken by many refactors. * Enable travis on f-connect branch in this branch too. * Add consistent Deprecation comments to ProxyDestination uses * Update version number on deprecation notices, and correct upstream datacenter field with explanation in docs	2018-10-10 16:55:34 +01:00
Pierre Souchay	473e589d86	Implementation of Weights Data structures (#4468 ) * Implementation of Weights Data structures Adding this datastructure will allow us to resolve the issues #1088 and #4198 This new structure defaults to values: ``` { Passing: 1, Warning: 0 } ``` Which means, use weight of 0 for a Service in Warning State while use Weight 1 for a Healthy Service. Thus it remains compatible with previous Consul versions. * Implemented weights for DNS SRV Records * DNS properly support agents with weight support while server does not (backwards compatibility) * Use Warning value of Weights of 1 by default When using DNS interface with only_passing = false, all nodes with non-Critical healthcheck used to have a weight value of 1. While having weight.Warning = 0 as default value, this is probably a bad idea as it breaks ascending compatibility. Thus, we put a default value of 1 to be consistent with existing behaviour. * Added documentation for new weight field in service description * Better documentation about weights as suggested by @banks * Return weight = 1 for unknown Check states as suggested by @banks * Fixed typo (of -> or) in error message as requested by @mkeeler * Fixed unstable unit test TestRetryJoin * Fixed unstable tests * Fixed wrong Fatalf format in `testrpc/wait.go` * Added notes regarding DNS SRV lookup limitations regarding number of instances * Documentation fixes and clarification regarding SRV records with weights as requested by @banks * Rephrase docs	2018-09-07 15:30:47 +01:00
Freddy	10d3048bd6	Bugfix: Use "%#v" when formatting structs (#4600)	2018-08-28 12:37:34 -04:00
Pierre Souchay	821a91ca31	Allow to rename nodes with IDs, will fix #3974 and #4413 (#4415) * Allow to rename nodes with IDs, will fix #3974 and #4413 This change allows any well-behaved recent agent with an ID to be renamed safely, ie: without taking the name of another one with case insensitive comparison. Deprecated behaviour warning ---------------------------- Due to ascending compatibility, it is still possible however to "take" the name of another node by not providing any ID. Note that when not providing any ID, it is possible to have 2 nodes having similar names with case differences, ie: myNode and mynode which might lead to DB corruption on Consul server side and lead to server not properly restarting. See #3983 and #4399 for Context about this change. Disabling registration of nodes without IDs as specified in #4414 should probably be the way to go eventually. * Removed the case-insensitive search when adding a node within the else block since it breaks the test TestAgentAntiEntropy_Services While the else case is probably legit, it will be fixed with #4414 in a later release. * Added again the test in the else to avoid duplicated names, but enforce this test only for nodes having IDs. Thus most tests without any ID will work, and allows us fixing * Added more tests regarding request with/without IDs. `TestStateStore_EnsureNode` now tests registration and renaming with IDs `TestStateStore_EnsureNodeDeprecated` tests registration without IDs and tests removing an ID from a node as well as updating a node without its ID (deprecated behaviour kept for backwards compatibility) * Do not allow renaming in case of conflict, including when other node has no ID * Fixed function GetNodeID that was not working due to wrong type when searching node from its ID Thus, all tests about renaming were not working properly. Added the full test case that allowed me to detect it. * Better error messages, more tests when nodeID is not a valid UUID in GetNodeID() * Added separate TestStateStore_GetNodeID to test GetNodeID. More complete test coverage for GetNodeID * Added new unit test `TestStateStore_ensureNoNodeWithSimilarNameTxn` Also fixed comments to be clearer after remarks from @banks * Fixed error message in unit test to match test case * Use uuid.ParseUUID to parse Node.ID as requested by @mkeeler	2018-08-10 11:30:45 -04:00
Matt Keeler	965fc9cf62	Revert "Allow changing Node names since Node now have IDs"	2018-07-12 11:19:21 -04:00
Matt Keeler	42729d5aff	Merge pull request #3983 from pierresouchay/node_renaming Allow changing Node names since Node now have IDs	2018-07-11 16:03:02 -04:00
Pierre Souchay	3d0a960470	When renaming a node, ensure the name is not taken by another node. Since DNS is case insensitive and the DB has issues when similar names with different cases are added, check for uniqueness based on case insensitivity. Following another big incident we had in our cluster, we also validate that adding/renaming a node does not conflict with case-insensitive matches. We had the following error once: - one node called: mymachine.MYDC.mydomain was shut off - another node (different ID) was added with name: mymachine.mydc.mydomain before 72 hours When restarting the consul server of domain, the consul server failed to start since it detected an issue in RAFT database because mymachine.MYDC.mydomain and mymachine.mydc.mydomain had the same names. Checking at registration time with case insensitivity should definitely fix those issues and avoid Consul DB corruption.	2018-07-11 14:42:54 +02:00
Mitchell Hashimoto	ad382d7351	agent: switch ConnectNative to an embedded struct	2018-06-25 12:24:10 -07:00
Mitchell Hashimoto	a3e0ac1ee3	agent/consul/state: support querying by Connect native	2018-06-25 12:24:08 -07:00
Mitchell Hashimoto	119ffe3ed9	agent/consul: implement Health.ServiceNodes for Connect, DNS works	2018-06-14 09:41:47 -07:00
Mitchell Hashimoto	06957f6d7f	agent/consul/state: ConnectServiceNodes	2018-06-14 09:41:47 -07:00
Mitchell Hashimoto	58bff8dd05	agent/consul/state: convert proxy test to testify/assert	2018-06-14 09:41:46 -07:00
Mitchell Hashimoto	09568ce7b5	agent/consul/state: service registration with proxy works	2018-06-14 09:41:46 -07:00
Paul Banks	c55885efd8	Merge pull request #3970 from pierresouchay/node_health_should_change_service_index [BUGFIX] When a node level check is removed, ensure all services of node are notified	2018-05-08 16:44:50 +01:00
Pierre Souchay	1b55e3559b	Allow renaming nodes when ID is unchanged	2018-04-18 15:39:38 +02:00
Preetha Appan	d9d9944179	Renames agent API layer for service metadata to "meta" for consistency	2018-03-28 09:04:50 -05:00
Preetha	8dacb12c79	Merge pull request #3881 from pierresouchay/service_metadata Feature Request: Support key-value attributes for services	2018-03-27 16:33:57 -05:00
Pierre Souchay	b9ae4e647f	Added validation of ServiceMeta in Catalog Fixed Error Message when ServiceMeta is not valid Added Unit test for adding a Service with badly formatted ServiceMeta	2018-03-27 22:22:42 +02:00
Pierre Souchay	eccb56ade0	Added support for renaming nodes when their IP does not change	2018-03-26 16:44:13 +02:00
Pierre Souchay	90d2f7bca1	Merge remote-tracking branch 'origin/master' into node_health_should_change_service_index	2018-03-22 13:07:11 +01:00
Pierre Souchay	9cc9dce848	More test cases	2018-03-22 12:41:06 +01:00
Pierre Souchay	7e8e4e014b	Added new test regarding checks index	2018-03-22 12:20:25 +01:00
Josh Soref	1dd8c378b9	Spelling (#3958 ) * spelling: another * spelling: autopilot * spelling: beginning * spelling: circonus * spelling: default * spelling: definition * spelling: distance * spelling: encountered * spelling: enterprise * spelling: expands * spelling: exits * spelling: formatting * spelling: health * spelling: hierarchy * spelling: imposed * spelling: independence * spelling: inspect * spelling: last * spelling: latest * spelling: client * spelling: message * spelling: minimum * spelling: notify * spelling: nonexistent * spelling: operator * spelling: payload * spelling: preceded * spelling: prepared * spelling: programmatically * spelling: required * spelling: reconcile * spelling: responses * spelling: request * spelling: response * spelling: results * spelling: retrieve * spelling: service * spelling: significantly * spelling: specifies * spelling: supported * spelling: synchronization * spelling: synchronous * spelling: themselves * spelling: unexpected * spelling: validations * spelling: value	2018-03-19 16:56:00 +00:00
Pierre Souchay	a5f6ac0df4	[BUGFIX] When a node level check is removed, ensure all services of node are notified Bugfix for https://github.com/hashicorp/consul/pull/3899 When a node level check is removed (example: maintenance), some watchers on services might have to recompute their state. If those nodes are performing blocking queries, they have to be notified. While their state was updated when node-level state did change or was added this was not the case when the check was removed. This fixes it.	2018-03-19 14:14:03 +01:00
Pierre Souchay	85b73f8163	Simplified error handling for maxIndexForService * added unit tests to ensure service index is properly garbage collected * added Upgrade from Version 1.0.6 to higher section in documentation	2018-03-01 14:09:36 +01:00
Pierre Souchay	56d5c0bf22	Enable Raft index optimization per service name on health endpoint Had to fix unit test in order to check properly indexes.	2018-02-20 01:35:50 +01:00
Pierre Souchay	523feb0be4	Fixed comment about raftIndex + use test.Helper()	2018-02-19 19:30:25 +01:00
Pierre Souchay	4c188c1d08	Services Indexes modified per service instead of using a global Index This patch improves the watches for services on large cluster: each service has now its own index, such watches on a specific service are not modified by changes in the global catalog. It should improve a lot the performance of tools such as consul-template or libraries performing watches on very large clusters with many services/watches.	2018-02-19 18:29:22 +01:00
Frank Schroeder	1d0bbfed9c	agent: move agent/consul/structs to agent/structs	2017-08-09 14:32:12 +02:00
Frank Schroeder	cd837b0b18	pkg refactor command/agent/* -> agent/* command/consul/* -> agent/consul/* command/agent/command{,_test}.go -> command/agent{,_test}.go command/base/command.go -> command/base.go command/base/* -> command/* commands.go -> command/commands.go The script which did the refactor is: ( cd $GOPATH/src/github.com/hashicorp/consul git mv command/agent/command.go command/agent.go git mv command/agent/command_test.go command/agent_test.go git mv command/agent/flag_slice_value{,_test}.go command/ git mv command/agent . git mv command/base/command.go command/base.go git mv command/base/config_util{,_test}.go command/ git mv commands.go command/ git mv consul agent rmdir command/base/ gsed -i -e 's\|package agent\|package command\|' command/agent{,_test}.go gsed -i -e 's\|package agent\|package command\|' command/flag_slice_value{,_test}.go gsed -i -e 's\|package base\|package command\|' command/base.go command/config_util{,_test}.go gsed -i -e 's\|package main\|package command\|' command/commands.go gsed -i -e 's\|base.Command\|BaseCommand\|' command/commands.go gsed -i -e 's\|agent.Command\|AgentCommand\|' command/commands.go gsed -i -e 's\|\tCommand:\|\tBaseCommand:\|' command/commands.go gsed -i -e 's\|base\.\|\|' command/commands.go gsed -i -e 's\|command\.\|\|' command/commands.go gsed -i -e 's\|command\|c\|' main.go gsed -i -e 's\|range Commands\|range command.Commands\|' main.go gsed -i -e 's\|Commands: Commands\|Commands: command.Commands\|' main.go gsed -i -e 's\|base\.BoolValue\|BoolValue\|' command/operator_autopilot_set.go gsed -i -e 's\|base\.DurationValue\|DurationValue\|' command/operator_autopilot_set.go gsed -i -e 's\|base\.StringValue\|StringValue\|' command/operator_autopilot_set.go gsed -i -e 's\|base\.UintValue\|UintValue\|' command/operator_autopilot_set.go gsed -i -e 's\|\bCommand\b\|BaseCommand\|' command/base.go gsed -i -e 's\|BaseCommand Options\|Command Options\|' command/base.go gsed -i -e 's\|base.Command\|BaseCommand\|' 
command/.go gsed -i -e 's\|c\.Command\|c.BaseCommand\|g' command/.go gsed -i -e 's\|\tCommand:\|\tBaseCommand:\|' command/_test.go gsed -i -e 's\|base\.\|\|' command/_test.go gsed -i -e 's\|\bCommand\b\|AgentCommand\|' command/agent{,_test}.go gsed -i -e 's\|cmd.AgentCommand\|cmd.BaseCommand\|' command/agent.go gsed -i -e 's\|cli.AgentCommand = new(Command)\|cli.Command = new(AgentCommand)\|' command/agent_test.go gsed -i -e 's\|exec.AgentCommand\|exec.Command\|' command/agent_test.go gsed -i -e 's\|exec.BaseCommand\|exec.Command\|' command/agent_test.go gsed -i -e 's\|NewTestAgent\|agent.NewTestAgent\|' command/agent_test.go gsed -i -e 's\|= TestConfig\|= agent.TestConfig\|' command/agent_test.go gsed -i -e 's\|: RetryJoin\|: agent.RetryJoin\|' command/agent_test.go gsed -i -e 's\|\.\./\.\./\|../\|' command/config_util_test.go gsed -i -e 's\|\bverifyUniqueListeners\|VerifyUniqueListeners\|' agent/config{,_test}.go command/agent.go gsed -i -e 's\|\bserfLANKeyring\b\|SerfLANKeyring\|g' agent/{agent,keyring,testagent}.go command/agent.go gsed -i -e 's\|\bserfWANKeyring\b\|SerfWANKeyring\|g' agent/{agent,keyring,testagent}.go command/agent.go gsed -i -e 's\|\bNewAgent\b\|agent.New\|g' command/agent{,_test}.go gsed -i -e 's\|\bNewAgent\|New\|' agent/{acl_test,agent,testagent}.go gsed -i -e 's\|\bAgent\b\|agent.&\|g' command/agent{,_test}.go gsed -i -e 's\|\bBool\b\|agent.&\|g' command/agent{,_test}.go gsed -i -e 's\|\bConfig\b\|agent.&\|g' command/agent{,_test}.go gsed -i -e 's\|\bDefaultConfig\b\|agent.&\|g' command/agent{,_test}.go gsed -i -e 's\|\bDevConfig\b\|agent.&\|g' command/agent{,_test}.go gsed -i -e 's\|\bMergeConfig\b\|agent.&\|g' command/agent{,_test}.go gsed -i -e 's\|\bReadConfigPaths\b\|agent.&\|g' command/agent{,_test}.go gsed -i -e 's\|\bParseMetaPair\b\|agent.&\|g' command/agent{,_test}.go gsed -i -e 's\|\bSerfLANKeyring\b\|agent.&\|g' command/agent{,_test}.go gsed -i -e 's\|\bSerfWANKeyring\b\|agent.&\|g' command/agent{,_test}.go gsed -i -e 
's\|circonus\.agent\|circonus\|g' command/agent{,_test}.go gsed -i -e 's\|logger\.agent\|logger\|g' command/agent{,_test}.go gsed -i -e 's\|metrics\.agent\|metrics\|g' command/agent{,_test}.go gsed -i -e 's\|// agent.Agent\|// agent\|' command/agent{,_test}.go gsed -i -e 's\|a\.agent\.Config\|a.Config\|' command/agent{,_test}.go gsed -i -e 's\|agent\.AppendSliceValue\|AppendSliceValue\|' command/{configtest,validate}.go gsed -i -e 's\|consul/consul\|agent/consul\|' GNUmakefile gsed -i -e 's\|\.\./test\|../../test\|' agent/consul/server_test.go # fix imports f=$(grep -rl 'github.com/hashicorp/consul/command/agent' * \| grep '\.go') gsed -i -e 's\|github.com/hashicorp/consul/command/agent\|github.com/hashicorp/consul/agent\|' $f goimports -w $f f=$(grep -rl 'github.com/hashicorp/consul/consul' * \| grep '\.go') gsed -i -e 's\|github.com/hashicorp/consul/consul\|github.com/hashicorp/consul/agent/consul\|' $f goimports -w $f goimports -w command/*.go main.go )	2017-06-10 18:52:45 +02:00
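
The "missing service index" walkthrough in commit ff13518961 above can be illustrated with a small sketch. This is not the code from that commit: the `catalog` type and its `register`, `deregister`, `queryIndex`, and `shouldBlock` methods are hypothetical names invented here, and the real state store is considerably more involved. The sketch only assumes what the message states: a per-service index exists while a service is registered, and a single index records the last deregistration.

```go
package main

import "fmt"

// catalog is a toy stand-in for the state store (hypothetical, for illustration only).
type catalog struct {
	serviceIndex map[string]uint64 // last-change index per registered service
	missingIndex uint64            // index of the most recent deregistration
}

func newCatalog() *catalog {
	return &catalog{serviceIndex: make(map[string]uint64)}
}

// register records that service name exists as of index idx.
func (c *catalog) register(name string, idx uint64) {
	c.serviceIndex[name] = idx
}

// deregister removes the service and bumps the "missing service index".
func (c *catalog) deregister(name string, idx uint64) {
	delete(c.serviceIndex, name)
	c.missingIndex = idx
}

// queryIndex is the index a query for name would report: the service's own
// index if it exists, otherwise the "missing service index".
func (c *catalog) queryIndex(name string) uint64 {
	if idx, ok := c.serviceIndex[name]; ok {
		return idx
	}
	return c.missingIndex
}

// shouldBlock reports whether a blocking query carrying clientIndex must wait.
func (c *catalog) shouldBlock(name string, clientIndex uint64) bool {
	return c.queryIndex(name) <= clientIndex
}

func main() {
	c := newCatalog()
	c.register("foo", 42)
	c.deregister("foo", 43)
	fmt.Println(c.shouldBlock("foo", 42)) // false: client is told about index 43
	fmt.Println(c.shouldBlock("foo", 43)) // true: nothing new for foo
	c.register("bar", 50)                 // unrelated change elsewhere
	fmt.Println(c.shouldBlock("foo", 43)) // true: still blocked
	c.register("foo", 62)
	fmt.Println(c.shouldBlock("foo", 43)) // false: foo is back
}
```

The four printed values follow steps 3 to 6 of the commit message: return immediately, block, stay blocked, unblock.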

91 Commits
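
As a rough illustration of the weights behaviour described in commit 473e589d86 above, the sketch below maps a health status to a DNS SRV weight. The `Weights` struct mirrors the `Passing` and `Warning` fields named in that message, but `defaultWeights`, `weightFor`, and the plain-string status values are hypothetical names chosen here, and returning 0 for a critical status is an assumption rather than something the message states. The defaults of 1 and 1 and the fallback of 1 for unknown states do come from the message.

```go
package main

import "fmt"

// Weights mirrors the two fields named in the commit message.
type Weights struct {
	Passing int
	Warning int
}

// defaultWeights returns the defaults the message settles on: Warning
// defaults to 1 to stay consistent with the earlier DNS behaviour.
func defaultWeights() Weights {
	return Weights{Passing: 1, Warning: 1}
}

// weightFor picks the SRV weight for a status string (hypothetical helper).
func weightFor(w Weights, status string) int {
	switch status {
	case "passing":
		return w.Passing
	case "warning":
		return w.Warning
	case "critical":
		return 0 // assumption: critical instances get no weight
	default:
		return 1 // unknown check states get weight 1, per the message
	}
}

func main() {
	w := Weights{Passing: 10, Warning: 2}
	fmt.Println(weightFor(w, "passing"), weightFor(w, "warning"))
	fmt.Println(weightFor(defaultWeights(), "warning"))
}
```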