open-consul

Commit Graph

Author	SHA1	Message	Date
Todd Radel	8ece11a24a	connect: store signingKeyId instead of authorityKeyId (#6005 )	2019-06-27 16:47:22 +02:00
Aestek	04a52a967b	acl: allow service deregistration with node write permission (#5217 ) With ACLs enabled if an agent is wiped and restarted without a leave it can no longer deregister the services it had previously registered because it no longer has the tokens the services were registered with. To remedy that we allow service deregistration from tokens with node write permission.	2019-06-27 14:24:34 +02:00
hashicorp-ci	3224bea082	Merge Consul OSS branch 'master' at commit 4eb73973b6e53336fd505dc727ac84c1f7e78872	2019-06-27 02:00:41 +00:00
Pierre Souchay	ca7c7faac8	agent: added metadata information about servers into consul service description (#5455 ) This allows retrieving information about servers from HTTP APIs without using the command line.	2019-06-26 23:46:47 +02:00
Pierre Souchay	e394a9469b	Support for maximum size for Output of checks (#5233 ) * Support for maximum size for Output of checks This PR allows users to limit the size of output produced by checks at the agent and check level. When set at the agent level, it will limit the output for all checks monitored by the agent. When set at the check level, it can override the agent max for a specific check but only if it is lower than the agent max. Default value is 4k, and input must be at least 1.	2019-06-26 09:43:25 -06:00
hashicorp-ci	d237e86d83	Merge Consul OSS branch 'master' at commit 88b15d84f9fdb58ceed3dc971eb0390be85e3c15 skip-checks: true	2019-06-25 02:00:26 +00:00
Matt Keeler	f0f28707bc	New Cache Types (#5995 ) * Add a cache type for the Catalog.ListServices endpoint * Add a cache type for the Catalog.ListDatacenters endpoint	2019-06-24 14:11:34 -04:00
Matt Keeler	93debd2610	Ensure that looking for services by addresses works with Tagged Addresses (#5984 )	2019-06-21 13:16:17 -04:00
Hans Hasselberg	0d8d7ae052	agent: transfer leadership when establishLeadership fails (#5247 )	2019-06-19 14:50:48 +02:00
Aestek	24c29e195b	kv: do not trigger watches when setting the same value (#5885 ) If a KVSet is performed but does not update the entry, do not trigger watches for this key. This avoids releasing blocking queries for KV values that did not actually change.	2019-06-18 15:06:29 +02:00
Matt Keeler	4c03f99a85	Fix CAS operations on Services (#5971 ) * Fix CAS operations on services * Update agent/consul/state/catalog_test.go Co-Authored-By: R.B. Boyer <public@richardboyer.net>	2019-06-17 10:41:04 -04:00
Paul Banks	e90fab0aec	Add rate limiting to RPCs sent within a server instance too (#5927 )	2019-06-13 04:26:27 -05:00
Freddy	8f5fe058ea	Increase reliability of TestResetSessionTimerLocked_Renew	2019-05-24 13:54:51 -04:00
Freddy	f7f0207f78	Run TestServer_Expect on its own (#5890 )	2019-05-23 19:52:33 -04:00
Freddy	e9bdb3a4f9	Flaky test: ACLReplication_Tokens (#5891 ) * Exclude non-go workflows while testing * Wait for s2 global-management policy * Revert "Exclude non-go workflows while testing" This reverts commit 47a83cbe9f19d0e1e475eabaa223d61fb4c56019.	2019-05-23 19:52:02 -04:00
Freddy	c9e6640337	Add retries to StatsFetcherTest (#5892 )	2019-05-23 19:51:31 -04:00
freddygv	d133d565a5	Wait for s2 global-management policy	2019-05-21 17:58:37 -06:00
Freddy	7ce28bbfee	Stop running TestLeader_ChangeServerID in parallel	2019-05-21 15:28:08 -06:00
Kyle Havlovitz	ad24456f49	Set the dead node reclaim timer at 30s	2019-05-15 11:59:33 -07:00
Kyle Havlovitz	dcbffdb956	Merge branch 'master' into change-node-id	2019-05-15 10:51:04 -07:00
Matt Keeler	46956ed769	Copy the proxy config instead of direct assignment (#5786 ) This prevents modifying the data in the state store which is supposed to be immutable.	2019-05-06 12:09:59 -04:00
R.B. Boyer	372bb06c83	acl: a role binding rule for a role that does not exist should be ignored (#5778 ) I wrote the docs under this assumption but completely forgot to actually enforce it.	2019-05-03 14:22:44 -05:00
R.B. Boyer	7d0f729f77	acl: enforce that you cannot persist tokens and roles with missing links except during replication (#5779 )	2019-05-02 15:02:21 -05:00
Matt Keeler	26708570c5	Fix ConfigEntryResponse binary marshaller and ensure we watch the chan in ConfigEntry.Get even when no entry exists. (#5773 )	2019-05-02 15:25:29 -04:00
Paul Banks	df0c61fd31	Fix previous accidental master push 🤦 (#5771 ) * Fix previous accidental master push 🤦 * Fix ACL test	2019-05-02 15:49:37 +01:00
Paul Banks	95bb1e368f	Fix panic in Resolving service config when proxy-defaults isn't defined yet (#5769 )	2019-05-02 14:12:21 +01:00
Paul Banks	cf24e7d1ed	Fix uint8 conversion issues for service config response maps.	2019-05-02 14:11:33 +01:00
Paul Banks	078f4cf5bb	Add integration test for central config; fix central config WIP (#5752 ) * Add integration test for central config; fix central config WIP * Add integration test for central config; fix central config WIP * Set proxy protocol correctly and begin adding upstream support * Add upstreams to service config cache key and start new notify watcher if they change. This doesn't update the tests to pass though. * Fix some merging logic get things working manually with a hack (TODO fix properly) * Simplification to not allow enabling sidecars centrally - it makes no sense without upstreams anyway * Test compile again and obvious ones pass. Lots of failures locally not debugged yet but may be flakes. Pushing up to see what CI does * Fix up service manager and API test failures * Remove the enable command since it no longer makes much sense without being able to turn on sidecar proxies centrally * Remove version.go hack - will make integration test fail until release * Remove unused code from commands and upstream merge * Re-bump version to 1.5.0	2019-05-01 16:39:31 -07:00
Matt Keeler	9c77f2c52a	Update to use a consulent build tag instead of just ent (#5759 )	2019-05-01 11:11:27 -04:00
Matt Keeler	ea6cbf01a5	Centralized Config CLI (#5731 ) * Add HTTP endpoints for config entry management * Finish implementing decoding in the HTTP Config entry apply endpoint * Add CAS operation to the config entry apply endpoint Also use this for the bootstrapping and move the config entry decoding function into the structs package. * First pass at the API client for the config entries * Fixup some of the ConfigEntry APIs Return a singular response object instead of a list for the ConfigEntry.Get RPC. This gets plumbed through the HTTP API as well. Dont return QueryMeta in the JSON response for the config entry listing HTTP API. Instead just return a list of config entries. * Minor API client fixes * Attempt at some ConfigEntry api client tests These don’t currently work due to weak typing in JSON * Get some of the api client tests passing * Implement reflectwalk magic to correct JSON encoding a ProxyConfigEntry Also added a test for the HTTP endpoint that exposes the problem. However, since the test doesn’t actually do the JSON encode/decode its still failing. * Move MapWalk magic into a binary marshaller instead of JSON. * Add a MapWalk test * Get rid of unused func * Get rid of unused imports * Fixup some tests now that the decoding from msgpack coerces things into json compat types * Stub out most of the central config cli Fully implement the config read command. * Basic config delete command implementation * Implement config write command * Implement config list subcommand Not entirely sure about the output here. Its basically the read output indented with a line specifying the kind/name of each type which is also duplicated in the indented output. * Update command usage * Update some help usage formatting * Add the connect enable helper cli command * Update list command output * Rename the config entry API client methods. * Use renamed apis * Implement config write tests Stub the others with the noTabs tests. 
* Change list output format Now just simply output 1 line per named config * Add config read tests * Add invalid args write test. * Add config delete tests * Add config list tests * Add connect enable tests * Update some CLI commands to use CAS ops This also modifies the HTTP API for a write op to return a boolean indicating whether the value was written or not. * Fix up the HTTP API CAS tests as I realized they weren’t testing what they should. * Update config entry rpc tests to properly test CAS * Fix up a few more tests * Fix some tests that using ConfigEntries.Apply * Update config_write_test.go * Get rid of unused import	2019-04-30 16:27:16 -07:00
Matt Keeler	697efb588c	Make a few config entry endpoints return 404s and allow for snake_case and lowercase key names. (#5748 )	2019-04-30 18:19:19 -04:00
Matt Keeler	8beb5c6082	ACL Token ID Initialization (#5307 )	2019-04-30 11:45:36 -04:00
Kyle Havlovitz	64174f13d6	Add HTTP endpoints for config entry management (#5718 )	2019-04-29 18:08:09 -04:00
R.B. Boyer	ea2740fd32	Merge pull request #5617 from hashicorp/f-acl-ux Secure ACL Introduction for Kubernetes	2019-04-26 15:34:26 -05:00
Aestek	9813abcb09	Fix: fail messages after a node rename replace the new node definition (#5520 ) When receiving a serf failed message for a node which is not in the catalog, do not perform a register request to set its serf health to critical, as it could overwrite the node information and services if it was renamed. Fixes : #5518	2019-04-26 21:33:41 +01:00
R.B. Boyer	5a505c5b3a	acl: adding support for kubernetes auth provider login (#5600 ) * auth providers * binding rules * auth provider for kubernetes * login/logout	2019-04-26 14:49:25 -05:00
R.B. Boyer	9542fdc9bc	acl: adding Roles to Tokens (#5514 ) Roles are named and can express the same bundle of permissions that can currently be assigned to a Token (lists of Policies and Service Identities). The difference with a Role is that it is not itself a bearer token, but just another entity that can be tied to a Token. This lets an operator potentially curate a set of smaller reusable Policies and compose them together into reusable Roles, rather than always exploding that same list of Policies on any Token that needs similar permissions. This also refactors the acl replication code to be semi-generic to avoid 3x copypasta.	2019-04-26 14:49:12 -05:00
R.B. Boyer	f43bc981e9	making ACLToken.ExpirationTime a *time.Time value instead of time.Time (#5663 ) This is mainly to avoid having the API return "0001-01-01T00:00:00Z" as a value for the ExpirationTime field when it is not set. Unfortunately time.Time doesn't respect the json marshalling "omitempty" directive.	2019-04-26 14:48:16 -05:00
R.B. Boyer	b3956e511c	acl: ACL Tokens can now be assigned an optional set of service identities (#5390 ) These act like a special cased version of a Policy Template for granting a token the privileges necessary to register a service and its connect proxy, and read upstreams from the catalog.	2019-04-26 14:48:04 -05:00
R.B. Boyer	76321aa952	acl: tokens can be created with an optional expiration time (#5353 )	2019-04-26 14:47:51 -05:00
Matt Keeler	3ea9fe3bff	Implement bootstrapping proxy defaults from the config file (#5714 )	2019-04-26 14:25:03 -04:00
Matt Keeler	3b5d38fb49	Implement config entry replication (#5706 )	2019-04-26 13:38:39 -04:00
Alvin Huang	96c2c79908	Add fmt and vet (#5671 ) * add go fmt and vet * go fmt fixes	2019-04-25 12:26:33 -04:00
Kyle Havlovitz	6faa8ba451	Fill out the service manager functionality and fix tests	2019-04-23 00:17:28 -07:00
Kyle Havlovitz	d51fd740bf	Merge pull request #5615 from hashicorp/config-entry-rpc Add RPC endpoints for config entry operations	2019-04-23 00:16:54 -07:00
Kyle Havlovitz	e64d1b8016	Rename config entry ACL methods	2019-04-22 23:55:11 -07:00
kaitlincarter-hc	7859d8c409	[docs] Server Performance (#5627 ) * Moving server performance guide to docs. * fixing broken links * updating broken link * fixing broken links	2019-04-17 13:17:12 -05:00
Matt Keeler	ac78c23021	Implement data filtering of some endpoints (#5579 ) Fixes: #4222 # Data Filtering This PR will implement filtering for the following endpoints: ## Supported HTTP Endpoints - `/agent/checks` - `/agent/services` - `/catalog/nodes` - `/catalog/service/:service` - `/catalog/connect/:service` - `/catalog/node/:node` - `/health/node/:node` - `/health/checks/:service` - `/health/service/:service` - `/health/connect/:service` - `/health/state/:state` - `/internal/ui/nodes` - `/internal/ui/services` More can be added going forward and any endpoint which is used to list some data is a good candidate. ## Usage When using the HTTP API a `filter` query parameter can be used to pass a filter expression to Consul. Filter Expressions take the general form of: ``` <selector> == <value> <selector> != <value> <value> in <selector> <value> not in <selector> <selector> contains <value> <selector> not contains <value> <selector> is empty <selector> is not empty not <other expression> <expression 1> and <expression 2> <expression 1> or <expression 2> ``` Normal boolean logic and precedence is supported. All of the actual filtering and evaluation logic is coming from the [go-bexpr](https://github.com/hashicorp/go-bexpr) library ## Other changes Adding the `Internal.ServiceDump` RPC endpoint. This will allow the UI to filter services better.	2019-04-16 12:00:15 -04:00
Kyle Havlovitz	2cffe4894f	Move the ACL logic into the ConfigEntry interface	2019-04-10 14:27:28 -07:00
Kyle Havlovitz	81254deb59	Add RPC endpoints for config entry operations	2019-04-06 23:38:08 -07:00
Alvin Huang	aacb81a566	Merge pull request #5376 from hashicorp/fix-tests Fix tests in prep for CircleCI Migration	2019-04-04 17:09:32 -04:00
Kyle Havlovitz	d6c25a13a5	Merge pull request #5539 from hashicorp/service-config Service config state model	2019-04-02 16:34:58 -07:00
Kyle Havlovitz	63c9434779	Cleaned up some error handling/comments around config entries	2019-04-02 15:42:12 -07:00
Kyle Havlovitz	ace5c7a1cb	Encode config entry FSM messages in a generic type	2019-03-28 00:06:56 -07:00
Kyle Havlovitz	96a460c0cf	Clean up service config state store methods	2019-03-27 16:52:38 -07:00
R.B. Boyer	ab57b02ff8	acl: memdb filter of tokens-by-policy was inverted (#5575 ) The inversion wasn't noticed because the parallel execution of TokenList tests was operating incorrectly due to variable shadowing.	2019-03-27 15:24:44 -05:00
Jeff Mitchell	d3c7d57209	Move internal/ to sdk/ (#5568 ) * Move internal/ to sdk/ * Add a readme to the SDK folder	2019-03-27 08:54:56 -04:00
Jeff Mitchell	a41c865059	Convert to Go Modules (#5517 ) * First conversion * Use serf 0.8.2 tag and associated updated deps * * Move freeport and testutil into internal/ * Make internal/ its own module * Update imports * Add replace statements so API and normal Consul code are self-referencing for ease of development * Adapt to newer goe/values * Bump to new cleanhttp * Fix ban nonprintable chars test * Update lock bad args test The error message when the duration cannot be parsed changed in Go 1.12 (ae0c435877d3aacb9af5e706c40f9dddde5d3e67). This updates that test. * Update another test as well * Bump travis * Bump circleci * Bump go-discover and godo to get rid of launchpad dep * Bump dockerfile go version * fix tar command * Bump go-cleanhttp	2019-03-26 17:04:58 -04:00
Paul Banks	68e8933ba5	Connect: Make Connect health queries unblock correctly (#5508 ) * Make Connect health queries unblock correctly in all cases and use optimal number of watch chans. Fixes #5506. * Node check test cases and clearer bug test doc * Comment update	2019-03-21 16:01:56 +00:00
Kyle Havlovitz	c2cba68042	Fix fsm serialization and add snapshot/restore	2019-03-20 16:13:13 -07:00
Kyle Havlovitz	9df597b257	Fill out state store/FSM functions and add tests	2019-03-19 15:56:17 -07:00
Kyle Havlovitz	53913461db	Add config types and state store table	2019-03-19 10:06:46 -07:00
Kyle Havlovitz	bb0839ea5b	Condense some test logic and add a comment about renaming	2019-03-18 16:15:36 -07:00
Paul Banks	dd08426b04	Optimize health watching to single chan/goroutine. (#5449 ) Refs #4984. Watching chans for every node we touch in a health query is wasteful. In #4984 it shows that if there are more than 682 service instances we always fall back to watching all services which kills performance. We already have a record in MemDB that is reliably updated whenever the service health result should change thanks to per-service watch indexes. So in general, provided there is at least one service instance and we actually have a service index for it (we always do now) we only ever need to watch a single channel. This saves us from ever falling back to the general index and causing the performance cliff in #4984, but it also means fewer goroutines and work done for every blocking health query. It also saves some allocations made during the query because we no longer have to populate a WatchSet with 3 chans per service instance which saves the internal map allocation. This passes all state store tests except the one that explicitly checked for the fallback behaviour we've now optimized away and in general seems safe.	2019-03-15 20:18:48 +00:00
R.B. Boyer	d65008700a	acl: reduce complexity of token resolution process with alternative singleflighting (#5480 ) acl: reduce complexity of token resolution process with alternative singleflighting Switches acl resolution to use golang.org/x/sync/singleflight. For the identity/legacy lookups this is a drop-in replacement with the same overall approach to request coalescing. For policies this is technically a change in behavior, but when considered holistically is approximately performance neutral (with the benefit of less code). There are two goals with this blob of code (speaking specifically of policy resolution here): 1) Minimize cross-DC requests. 2) Minimize client-to-server LAN requests. The previous iteration of this code was optimizing for the case of many possibly different tokens being resolved concurrently that have a significant overlap in linked policies such that deduplication would be worth the complexity. While this is laudable there are some things to consider that can help to adjust expectations: 1) For v1.4+ policies are always replicated, and once a single policy shows up in a secondary DC the replicated data is considered authoritative for requests made in that DC. This means that our earlier concerns about minimizing cross-DC requests are irrelevant because there will be no cross-DC policy reads that occur. 2) For Server nodes the in-memory ACL policy cache is capped at zero, meaning it has no caching. Only Client nodes run with a cache. This means that instead of having an entire DC's worth of tokens (what a Server might see) that can have policy resolutions coalesced these nodes will only ever be seeing node-local token resolutions. In a reasonable worst-case scenario where a scheduler like Kubernetes has "filled" a node with Connect services, even that will only schedule ~100 connect services per node. 
If every service has a unique token there will only be 100 tokens to coalesce and even then those requests have to occur concurrently AND be hitting an empty consul cache. Instead of seeing a great coalescing opportunity for cutting down on redundant Policy resolutions, in practice it's far more likely given node densities that you'd see requests for the same token concurrently than you would for two tokens sharing a policy concurrently (to a degree that would warrant the overhead of the current variation of singleflighting). Given that, this patch switches the Policy resolution process to only singleflight by requesting token (but keeps the cache as by-policy).	2019-03-14 09:35:34 -05:00
Kyle Havlovitz	3aec844fd2	Update state store test for changing node ID	2019-03-13 17:05:31 -07:00
Kyle Havlovitz	df4ec913f0	Add a test for changing a failed node's ID	2019-03-13 15:39:07 -07:00
Hans Hasselberg	d511e86491	agent: enable reloading of tls config (#5419 ) This PR introduces reloading tls configuration. Consul will now be able to reload the TLS configuration which previously required a restart. It is not yet possible to turn TLS ON or OFF with these changes. Only when TLS is already turned on, the configuration can be reloaded. Most importantly the certificates and CAs.	2019-03-13 10:29:06 +01:00
R.B. Boyer	e9614ee92f	acl: correctly extend the cache for acl identities during resolution (#5475 )	2019-03-12 10:23:43 -05:00
Aestek	071fcb28ba	[catalog] Update the node's services indexes on update (#5458 ) Node updates were not updating the service indexes, which are used for service related queries. This caused the X-Consul-Index to stay the same after a node update as seen from a service query even though the node data is returned in health queries. If that happened in between queries the client would miss this change. We now update the indexes of the services on the node when it is updated. Fixes: #5450	2019-03-11 14:48:19 +00:00
Kyle Havlovitz	bf09061e86	Add logic to allow changing a failed node's ID	2019-03-07 22:42:54 -08:00
Alvin Huang	ece3b5907d	fix typos	2019-03-06 14:47:33 -05:00
R.B. Boyer	91e78e00c7	fix typos reported by golangci-lint:misspell (#5434 )	2019-03-06 11:13:28 -06:00
R.B. Boyer	c24e3584be	improve flaky LANReap tests by explicitly configuring the tombstone timeout In TestServer_LANReap autopilot is running, so the alternate flow through the serf reaping function is possible. In that situation the ReconnectTimeout is not relevant so for parity also override the TombstoneTimeout value as well. For additional parity update the TestServer_WANReap and TestClient_LANReap versions of this test in the same way even though autopilot is irrelevant here.	2019-03-05 14:34:03 -06:00
Matt Keeler	87f9365eee	Fixes for CVE-2019-8336 Fix error in detecting raft replication errors. Detect redacted token secrets and prevent attempting to insert. Add a Redacted field to the TokenBatchRead and TokenRead RPC endpoints This will indicate whether token secrets have been redacted. Ensure any token with a redacted secret in secondary datacenters is removed. Test that redacted tokens cannot be replicated.	2019-03-04 19:13:24 +00:00
Matt Keeler	612aba7ced	Don't modify memdb owned token data for get/list requests of tokens (#5412 ) Previously we were fixing up the token links directly on the *ACLToken returned by memdb. This invalidated some assumptions that a snapshot is immutable as well as potentially being able to cause a crash. The fix here is to give the policy link fixing function copy on write semantics. When no fixes are necessary we can return the memdb object directly, otherwise we copy it and create a new list of links. Eventually we might find a better way to keep those policy links in sync but for now this fixes the issue.	2019-03-04 09:28:46 -05:00
Matt Keeler	416a6543a6	Call RemoveServer for reap events (#5317 ) This ensures that servers are removed from RPC routing when they are reaped.	2019-03-04 09:19:35 -05:00
R.B. Boyer	d3be5c1d3a	fix ignored errors in state store internals as reported by errcheck	2019-03-01 14:18:00 -06:00
Matt Keeler	0c76a4389f	ACL Token Persistence and Reloading (#5328 ) This PR adds two features which will be useful for operators when ACLs are in use. 1. Tokens set in configuration files are now reloadable. 2. If `acl.enable_token_persistence` is set to `true` in the configuration, tokens set via the `v1/agent/token` endpoint are now persisted to disk and loaded when the agent starts (or during configuration reload) Note that token persistence is opt-in so our users who do not want tokens on the local disk will see no change. Some other secondary changes: * Refactored a bunch of places where the replication token is retrieved from the token store. This token isn't just for replicating ACLs and now it is named accordingly. * Allowed better paths in the `v1/agent/token/` API. Instead of paths like: `v1/agent/token/acl_replication_token` the path can now be just `v1/agent/token/replication`. The old paths remain valid. * Added a couple new API functions to set tokens via the new paths. Deprecated the old ones and pointed to the new names. The names are also generally better and don't imply that what you are setting is for ACLs but rather are setting ACL tokens. There is a minor semantic difference there especially for the replication token as again, it's no longer used only for ACL token/policy replication. The new functions will detect 404s and fall back to using the older token paths when talking to pre-1.4.3 agents. * Docs updated to reflect the API additions and to show using the new endpoints. * Updated the ACL CLI set-agent-tokens command to use the non-deprecated APIs.	2019-02-27 14:28:31 -05:00
Hans Hasselberg	75ababb54f	Centralise tls configuration part 1 (#5366 ) In order to be able to reload the TLS configuration, we need one way to generate the different configurations. This PR introduces a `tlsutil.Configurator` which holds a `tlsutil.Config`. Afterwards it is responsible for rendering every `tls.Config`. In this particular PR I moved `IncomingHTTPSConfig`, `IncomingTLSConfig`, and `OutgoingTLSWrapper` into `tlsutil.Configurator`. This PR is a pure refactoring - not a single feature added. And not a single test added. I only slightly modified existing tests as necessary.	2019-02-26 16:52:07 +01:00
Alvin Huang	c4168e6dfc	add wait to TestClient_JoinLAN	2019-02-22 17:34:45 -05:00
Alvin Huang	2e961d6539	add retry to TestResetSessionTimerLocked	2019-02-22 17:34:45 -05:00
R.B. Boyer	ae1cb27126	fix incorrect body of TestACLEndpoint_PolicyBatchRead Lifted from PR #5307 as it was an unrelated drive-by fix on that PR anyway. s/token/policy/	2019-02-22 09:32:51 -06:00
R.B. Boyer	8e344c0218	test: switch test file from assert -> require for consistency Also in acl_endpoint_test.go: * convert logical blocks in some token tests to subtests * remove use of require.New This removes a lot of noise in a later PR.	2019-02-14 14:21:19 -06:00
R.B. Boyer	57be6ca215	correct some typos	2019-02-13 13:02:12 -06:00
R.B. Boyer	a3e0fb8370	ensure that we plumb our configured logger into all parts of the raft library	2019-02-13 13:02:09 -06:00
R.B. Boyer	3b60891bf8	reduce the local scope of variable	2019-02-13 11:54:28 -06:00
R.B. Boyer	77d28fe9ce	clarify the ACL.PolicyDelete endpoint (#5337 ) There was an errant early-return in PolicyDelete() that bypassed the rest of the function. This was ok because the only caller of this function ignores the results. This removes the early-return making it structurally behave like TokenDelete() and for both PolicyDelete and TokenDelete clarify the lone callers to indicate that the return values are ignored. We may wish to avoid the entire return value as well, but this patch doesn't go that far.	2019-02-13 09:16:30 -06:00
R.B. Boyer	106d87a4a8	update TestStateStore_ACLBootstrap to not rely upon request mutation (#5335 )	2019-02-12 16:09:26 -06:00
Matt Keeler	fa2c7059a2	Move autopilot initialization to prevent race (#5322 ) `establishLeadership` invoked during leadership monitoring may use autopilot to do promotions etc. There was a race with doing that and having autopilot initialized and this fixes it.	2019-02-11 11:12:24 -05:00
Matt Keeler	210c3a56b0	Improve Connect with Prepared Queries (#5291 ) Given a query like: ``` { "Name": "tagged-connect-query", "Service": { "Service": "foo", "Tags": ["tag"], "Connect": true } } ``` And a Consul configuration like: ``` { "services": [ { "name": "foo", "port": 8080, "connect": { "sidecar_service": {} }, "tags": ["tag"] } ] } ``` If you executed the query it would always turn up with 0 results. This was because the sidecar service was being created without any tags. You could instead make your config look like: ``` { "services": [ { "name": "foo", "port": 8080, "connect": { "sidecar_service": { "tags": ["tag"] } }, "tags": ["tag"] } ] } ``` However that is a bit redundant for most cases. This PR ensures that the tags and service meta of the parent service get copied to the sidecar service. If there are any tags or service meta set in the sidecar service definition then this copying does not take place. After the changes, the query will now return the expected results. A second change was made to prepared queries in this PR which is to allow filtering on ServiceMeta just like we allow for filtering on NodeMeta.	2019-02-04 09:36:51 -05:00
R.B. Boyer	b5d71ea779	testutil: redirect some test agent logs to testing.T.Logf (#5304 ) When tests fail, only the logs for the failing run are dumped to the console which helps in diagnosis. This is easily added to other test scenarios as they come up.	2019-02-01 09:21:54 -06:00
Kyle Havlovitz	b30b541007	connect: Forward intention RPCs if this isn't the primary	2019-01-22 11:29:21 -08:00
Kyle Havlovitz	a731173661	Merge pull request #5249 from hashicorp/ca-fixes-oss Minor CA fixes	2019-01-22 11:25:09 -08:00
Kyle Havlovitz	b0f07d9b5e	Merge pull request #4869 from hashicorp/txn-checks Add node/service/check operations to transaction api	2019-01-22 11:16:09 -08:00
Matt Keeler	cc2cd75f5c	Fix several ACL token/policy resolution issues. (#5246 ) * Fix 2 remote ACL policy resolution issues 1. Use the right method to fire async not found errors when the ACL.PolicyResolve RPC returns that error. This was previously accidentally firing a token result instead of a policy result which would have effectively done nothing (unless there happened to be a token with a secret id == the policy id being resolved). 2. When concurrent policy resolution is being done we single flight the requests. The bug before was that for the policy resolution that was going to piggy back on another's RPC results it wasn’t waiting long enough for the results to come back due to looping with the wrong variable. * Fix a handful of other edge case ACL scenarios The main issue was that token specific issues (not able to access a particular policy or the token being deleted after initial fetching) were poisoning the policy cache. A second issue was that for concurrent token resolutions, the first resolution to get started would go fetch all the policies. If before the policies were retrieved a second resolution request came in, the new request would register watchers for those policies but then never block waiting for them to complete. This resulted in using the default policy when it shouldn't have.	2019-01-22 13:14:43 -05:00
Paul Banks	1c4dfbcd2e	connect: tame thundering herd of CSRs on CA rotation (#5228 ) * Support rate limiting and concurrency limiting CSR requests on servers; handle CA rotations gracefully with jitter and backoff-on-rate-limit in client * Add CSR rate limiting docs * Fix config naming and add tests for new CA configs	2019-01-22 17:19:36 +00:00
Kyle Havlovitz	4f53fe897a	oss: add the enterprise server stub for intention replication check	2019-01-18 17:32:10 -08:00
Matt Keeler	2f6a9edfac	Store leaf cert indexes in raft and use for the ModifyIndex on the returned certs (#5211 ) * Store leaf cert indexes in raft and use for the ModifyIndex on the returned certs This ensures that future certificate signings will have a strictly greater ModifyIndex than any previous certs signed.	2019-01-11 16:04:57 -05:00
Aestek	ff13518961	Improve blocking queries on services that do not exist (#4810 ) ## Background When making a blocking query on a missing service (was never registered, or is not registered anymore) the query returns as soon as any service is updated. On clusters with frequent updates (5~10 updates/s in our DCs) these queries virtually do not block, and clients with no protections against this waste resources on the agent and server side. Clients that do protect against this get updates later than they should because of the backoff time they implement between requests. ## Implementation While reducing the number of unnecessary updates we still want: * Clients to be notified as soon as the last instance of a service disappears. * Clients to be notified whenever there is an update for the service. * Clients to be notified as soon as the first instance of the requested service is added. To reduce the number of unnecessary updates we need to block when a request to a missing service is made. However in the following case: 1. Client `client1` makes a query for service `foo`, gets back a node and X-Consul-Index 42 2. `foo` is unregistered 3. `client1` makes a query for `foo` with `index=42` -> `foo` does not exist, the query blocks and `client1` is not notified of the change on `foo` We could store the last raft index when each service was last alive to know whether we should block on the incoming query or not, but that list could grow indefinitely. We instead store the last raft index when a service was unregistered and use it when a query targets a service that does not exist. When a service `srv` is unregistered this "missing service index" is always greater than any X-Consul-Index held by the clients while `srv` was up, allowing us to immediately notify them. 1. Client `client1` makes a query for service `foo`, gets back a node and `X-Consul-Index: 42` 2. `foo` is unregistered, we set the "missing service index" to 43 3. `client1` makes a blocking query for `foo` with `index=42` -> `foo` does not exist, we check against the "missing service index" and return immediately with `X-Consul-Index: 43` 4. `client1` makes a blocking query for `foo` with `index=43` -> we block 5. Other changes happen in the cluster, but foo still doesn't exist and "missing service index" hasn't changed, the query is still blocked 6. `foo` is registered again on index 62 -> `foo` exists and its index is greater than 43, we unblock the query	2019-01-11 09:26:14 -05:00
Matt Keeler	29b4512120	acl: Prevent tokens from deleting themselves (#5210 ) Fixes #4897 Also apparently token deletion could segfault in secondary DCs when attempting to delete non-existent tokens. For that reason both checks are wrapped within the non-nil check.	2019-01-10 09:22:51 -05:00
Kyle Havlovitz	c266277a49	txn: clean up some state store/acl code	2019-01-09 11:59:23 -08:00
Pierre Souchay	5b8a7d7127	Avoid to have infinite recursion in DNS lookups when resolving CNAMEs (#4918 ) * Avoid to have infinite recursion in DNS lookups when resolving CNAMEs This will avoid killing Consul when a Service.Address is using CNAME to a Consul CNAME that creates an infinite recursion. This will fix https://github.com/hashicorp/consul/issues/4907 * Use maxRecursionLevel = 3 to allow several recursions	2019-01-07 16:53:54 -05:00
Paul Banks	0962e95e85	bugfix: use ServiceTags to generate cache key hash (#4987 ) * bugfix: use ServiceTags to generate cache key hash * update unit test * update * remote print log * Update .gitignore * Completely deprecate ServiceTag field internally for clarity * Add explicit test for CacheInfo cases	2019-01-07 21:30:47 +00:00
Kyle Havlovitz	8b1dc6a22c	txn: fix an issue with querying nodes by name instead of ID	2018-12-12 12:46:33 -08:00
Pierre Souchay	61870be137	[Travis][UnstableTests] Fixed unstable tests in travis (#5013 ) * [Travis][UnstableTests] Fixed unstable tests in travis as seen in https://travis-ci.org/hashicorp/consul/jobs/460824602 * Fixed unstable tests in https://travis-ci.org/hashicorp/consul/jobs/460857687	2018-12-12 12:09:42 -08:00
Kyle Havlovitz	efcdc85e1a	api: add support for new txn operations	2018-12-12 10:54:09 -08:00
Kyle Havlovitz	2408f99cca	txn: add tests for RPC endpoint	2018-12-12 10:04:10 -08:00
Kyle Havlovitz	9f4f673c4d	txn: add ACL enforcement/validation to new txn ops	2018-12-12 10:04:10 -08:00
Kyle Havlovitz	41e8120d3d	state: add tests for new txn ops	2018-12-12 10:04:10 -08:00
Kyle Havlovitz	a40a346be8	txn: add service operations	2018-12-12 10:04:10 -08:00
Kyle Havlovitz	b1aeb3b943	txn: add node operations	2018-12-12 10:04:10 -08:00
Kyle Havlovitz	bd6b7ad162	txn: add pre-check operations to txn endpoint	2018-12-12 10:04:10 -08:00
Kyle Havlovitz	8a0d7b65d6	Add check operations to transaction api	2018-12-12 10:04:10 -08:00
Kyle Havlovitz	e7946197b8	connect/ca: prevent blank CA config in snapshot This PR both prevents a blank CA config from being written out to a snapshot and allows Consul to gracefully recover from a snapshot with an invalid CA config. Fixes #4954.	2018-12-06 17:40:53 -08:00
R.B. Boyer	c86eff8859	agent: remove some stray fmt.Print* calls (#5015 )	2018-11-29 09:45:51 -06:00
Pierre Souchay	d0ca1bade9	Fixed another list of unstable unit tests in travis (#4915 ) * Fixed another list of unstable unit tests in travis Fixed failing tests in https://travis-ci.org/hashicorp/consul/jobs/451357061 * Fixed another list of unstable unit tests in travis. Fixed failing tests in https://travis-ci.org/hashicorp/consul/jobs/451357061	2018-11-20 11:27:26 +00:00
Kyle Havlovitz	3cc7d6ebb5	Merge pull request #4952 from hashicorp/test-version tests: Bump test server version to 1.4.0	2018-11-13 13:37:10 -08:00
R.B. Boyer	8662a6d260	acl: add stub hooks to support some plumbing in enterprise (#4951 )	2018-11-13 15:35:54 -06:00
Kyle Havlovitz	19f9cad3fe	oss: bump test server version to 1.4.0	2018-11-13 13:13:26 -08:00
Aestek	4fb564abbc	Fix catalog tag filter backward compat (#4944 ) Fix catalog service node filtering (ex /v1/catalog/service/srv?tag=tag1) between agent version <=v1.2.3 and server >=v1.3.0. The new server version did not account for the old field when filtering, hence requests made from old agents were not tag-filtered.	2018-11-13 14:44:36 +00:00
Kyle Havlovitz	b0dcf54e50	Merge pull request #4917 from hashicorp/replication-token-cleanup Use acl replication_token for connect	2018-11-12 09:12:54 -08:00
Kyle Havlovitz	038aefa0bc	update non-voting server test to fix enterprise diff	2018-11-09 12:50:24 -08:00
Kyle Havlovitz	70accbb2e0	oss: do a proper check-and-set on the CA roots/config fsm operation	2018-11-09 12:36:23 -08:00
R.B. Boyer	2e29f234b1	acl: fixes ACL replication for legacy tokens without AccessorIDs (#4885 )	2018-11-07 07:59:44 -08:00
Kyle Havlovitz	1a4204f363	agent: fix formatting	2018-11-07 02:16:03 -08:00
R.B. Boyer	a5d57f5326	fix comment typos (#4890 )	2018-11-02 12:00:39 -05:00
Kyle Havlovitz	5b7b8bf842	Merge pull request #4872 from hashicorp/node-snapshot-fix Node ID/datacenter snapshot fix	2018-10-31 15:51:07 -07:00
Matt Keeler	26b1873b3b	Adds documentation for the new ACL APIs (#4851 ) * Update the ACL API docs * Add a CreateTime to the anon token Also require acl:read permissions at least to perform rule translation. Don’t want someone DoSing the system with an open endpoint that actually does a bit of work. * Fix one place where I was referring to id instead of AccessorID * Add godocs for the API package additions. * Minor updates: removed some extra commas and updated the acl intro paragraph * minor tweaks * Updated the language to be clearer * Updated the language to be clearer for policy page * I was also confused by that! Your updates are much clearer. Co-Authored-By: kaitlincarter-hc <43049322+kaitlincarter-hc@users.noreply.github.com> * Sounds much better. Co-Authored-By: kaitlincarter-hc <43049322+kaitlincarter-hc@users.noreply.github.com> * Updated sidebar layout and deprecated warning	2018-10-31 15:11:51 -07:00
Matt Keeler	ec9934b6f8	Remaining ACL Unit Tests (#4852 ) * Add leader token upgrade test and fix various ACL enablement bugs * Update the leader ACL initialization tests. * Add a StateStore ACL tests for ACLTokenSet and ACLTokenGetBy* functions * Advertise the agents acl support status with the agent/self endpoint. * Make batch token upsert CAS’able to prevent consistency issues with token auto-upgrade * Finish up the ACL state store token tests * Finish the ACL state store unit tests Also rename some things to make them more consistent. * Do as much ACL replication testing as I can.	2018-10-31 13:00:46 -07:00
Kyle Havlovitz	cf2210b5c5	fsm: update snapshot/restore test to include ID and datacenter	2018-10-30 15:53:14 -07:00
Kyle Havlovitz	58ff5e46cb	fsm: add missing ID/datacenter to persistNodes	2018-10-30 15:52:54 -07:00
Matt Keeler	0dd537e506	Fix the NonVoter Bootstrap test (#4786 )	2018-10-24 10:23:50 -04:00
Kyle Havlovitz	6f40708aca	fsm: add Intention operations to transactions for internal use	2018-10-19 10:02:28 -07:00
Matt Keeler	df507a4a55	A few misc fixes found by go vet	2018-10-19 12:28:36 -04:00
Matt Keeler	99e0a124cb	New ACLs (#4791 ) This PR is almost a complete rewrite of the ACL system within Consul. It brings the features more in line with other HashiCorp products. Obviously there is quite a bit left to do here but most of it is related to docs, testing and finishing the last few commands in the CLI. I will update the PR description and check off the todos as I finish them over the next few days/week. Description At a high level this PR is mainly to split ACL tokens from Policies and to split the concepts of Authorization from Identities. A lot of this PR is mostly just to support CRUD operations on ACLTokens and ACLPolicies. These in and of themselves are not particularly interesting. The bigger conceptual changes are in how tokens get resolved, how backwards compatibility is handled and the separation of policy from identity which could lead the way to allowing for alternative identity providers. On the surface and with a new cluster the ACL system will look very similar to that of Nomad. Both have tokens and policies. Both have local tokens. The ACL management APIs for both are very similar. I even ripped off Nomad's ACL bootstrap resetting procedure. There are a few key differences though. Nomad requires token and policy replication where Consul only requires policy replication with token replication being opt-in. In Consul local tokens only work with token replication being enabled though. All policies in Nomad are globally applicable. In Consul all policies are stored and replicated globally but can be scoped to a subset of the datacenters. This allows for more granular access management. Unlike Nomad, Consul has legacy baggage in the form of the original ACL system. The ramifications of this are: A server running the new system must still support other clients using the legacy system. A client running the new system must be able to use the legacy RPCs when the servers in its datacenter are running the legacy system. 
The primary ACL DC's servers running in legacy mode need to be a gate that keeps everything else in the entire multi-DC cluster running in legacy mode. So not only does this PR implement the new ACL system but it also has a legacy mode built in for when the cluster isn't ready for new ACLs. Also detecting that new ACLs can be used is automatic and requires no configuration on the part of administrators. This process is detailed more in the "Transitioning from Legacy to New ACL Mode" section below.	2018-10-19 12:04:07 -04:00
Pierre Souchay	a72f92cac6	dns: implements prefix lookups for DNS TTL (#4605 ) This will fix https://github.com/hashicorp/consul/issues/4509 and allow for instance lb-* to match services lb-001 or lb-service-007.	2018-10-19 08:41:04 -07:00
Kyle Havlovitz	96a35f8abc	re-add Connect multi-dc config changes This reverts commit 8bcfbaffb6588b024cd1a3cf0952e6bfa7d9e900.	2018-10-19 08:41:03 -07:00
Jack Pearkes	847a0a5266	Revert "Connect multi-dc config" (#4784 )	2018-10-11 17:32:45 +01:00
Aestek	260a9880ae	[Security] Add finer control over script checks (#4715 ) * Add -enable-local-script-checks options These options allow for a finer control over when script checks are enabled by giving the option to only allow them when they are declared from the local file system. * Add documentation for the new option * Nitpick doc wording	2018-10-11 13:22:11 +01:00
Rebecca Zanzig	0ec6d880f5	Support multiple tags for health and catalog http api endpoints (#4717 ) * Support multiple tags for health and catalog api endpoints Fixes #1781. Adds a `ServiceTags` field to the ServiceSpecificRequest to support multiple tags, updates the filter logic in the catalog store, and propagates these change through to the health and catalog endpoints. Note: Leaves `ServiceTag` in the struct, since it is being used as part of the DNS lookup, which in turn uses the health check. * Update the api package to support multiple tags Includes additional tests. * Update new tests to use the `require` library * Update HealthConnect check after a bad merge	2018-10-11 12:50:05 +01:00
Pierre Souchay	b0fc91a1d2	[Performance On Large clusters] Reduce updates on large services (#4720 ) * [Performance On Large clusters] Checks do update services/nodes only when really modified to avoid too many updates on very large clusters In a large cluster, when having a few thousand nodes, the anti-entropy mechanism performs lots of changes (several per second) while there is no real change. This patch aims to improve this in order to increase Consul scalability when using many blocking requests on health for instance. * [Performance for large clusters] Only updates index of service if service is really modified * [Performance for large clusters] Only updates index of nodes if node is really modified * Added comments / ensure IsSame() has clear semantics * Avoid having modified boolean, return nil directly if structures are Same * Fixed unstable unit tests TestLeader_ChangeServerID * Rewrite TestNode_IsSame() for better readability as suggested by @banks * Rename ServiceNode.IsSame() into IsSameService() + added unit tests * Do not duplicate TestStructs_ServiceNode_Conversions() and increase test coverage of IsSameService * Clearer documentation in IsSameService * Take into account ServiceProxy into ServiceNode.IsSameService() * Fixed IsSameService() with all new structures	2018-10-11 12:42:39 +01:00
Pierre Souchay	42f250fa53	Added SOA configuration for DNS settings. (#4714 ) This allows fine-tuning the SOA settings sent by Consul in DNS responses, for instance to be able to control the negative TTL. Will fix: https://github.com/hashicorp/consul/issues/4713 # Example Override all settings: * min_ttl: 0 => 60s * retry: 600 (10m) => 300s (5 minutes), * expire: 86400 (24h) => 43200 (12h) * refresh: 3600 (1h) => 1800 (30 minutes) ``` consul agent -dev -hcl 'dns_config={soa={min_ttl=60,retry=300,expire=43200,refresh=1800}}' ``` Result: ``` dig +multiline @localhost -p 8600 service.consul ; <<>> DiG 9.12.1 <<>> +multiline @localhost -p 8600 service.consul ; (2 servers found) ;; global options: +cmd ;; Got answer: ;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 36557 ;; flags: qr aa rd; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 1 ;; WARNING: recursion requested but not available ;; OPT PSEUDOSECTION: ; EDNS: version: 0, flags:; udp: 4096 ;; QUESTION SECTION: ;service.consul. IN A ;; AUTHORITY SECTION: consul. 0 IN SOA ns.consul. hostmaster.consul. ( 1537959133 ; serial 1800 ; refresh (30 minutes) 300 ; retry (5 minutes) 43200 ; expire (12 hours) 60 ; minimum (1 minute) ) ;; Query time: 4 msec ;; SERVER: 127.0.0.1#8600(127.0.0.1) ;; WHEN: Wed Sep 26 12:52:13 CEST 2018 ;; MSG SIZE rcvd: 93 ```	2018-10-10 15:50:56 -04:00
Kyle Havlovitz	0cbd176a48	connect/ca: more OSS split for multi-dc	2018-10-10 12:17:59 -07:00
Kyle Havlovitz	6d5160c139	connect/ca: split CA initialization logic between oss/enterprise	2018-10-10 12:17:59 -07:00
Kyle Havlovitz	5b98a602af	agent: add primary_datacenter and connect replication config options	2018-10-10 12:17:59 -07:00
Kyle Havlovitz	304595f7a6	connect: add ExternalTrustDomain to CARoot fields	2018-10-10 12:16:47 -07:00
Kyle Havlovitz	475afd0300	docs: deprecate acl_datacenter and replace it with primary_datacenter	2018-10-10 12:16:47 -07:00
Paul Banks	92fe8c8e89	Add Proxy Upstreams to Service Definition (#4639 ) * Refactor Service Definition ProxyDestination. This includes: - Refactoring all internal structs used - Updated tests for both deprecated and new input for: - Agent Services endpoint response - Agent Service endpoint response - Agent Register endpoint - Unmanaged deprecated field - Unmanaged new fields - Managed deprecated upstreams - Managed new - Catalog Register - Unmanaged deprecated field - Unmanaged new fields - Managed deprecated upstreams - Managed new - Catalog Services endpoint response - Catalog Node endpoint response - Catalog Service endpoint response - Updated API tests for all of the above too (both deprecated and new forms of register) TODO: - config package changes for on-disk service definitions - proxy config endpoint - built-in proxy support for new fields * Agent proxy config endpoint updated with upstreams * Config file changes for upstreams. * Add upstream opaque config and update all tests to ensure it works everywhere. * Built in proxy working with new Upstreams config * Command fixes and deprecations * Fix key translation, upstream type defaults and a spate of other subtle bugs found with end-to-end test scripts... TODO: tests still failing on one case that needs a fix. I think it's key translation for upstreams nested in Managed proxy struct. * Fix translated keys in API registration. * Fixes from docs - omit some empty undocumented fields in API - Bring back ServiceProxyDestination in Catalog responses to not break backwards compat - this was removed assuming it was only used internally. * Documentation updates for Upstreams in service definition * Fixes for tests broken by many refactors. * Enable travis on f-connect branch in this branch too. * Add consistent Deprecation comments to ProxyDestination uses * Update version number on deprecation notices, and correct upstream datacenter field with explanation in docs	2018-10-10 16:55:34 +01:00
Alex Dadgar	90ed72fd70	do not bootstrap with non voters	2018-09-19 17:41:36 -07:00
Kyle Havlovitz	9b8f8975c6	Merge pull request #4644 from hashicorp/ca-refactor connect/ca: rework initialization/root generation in providers	2018-09-13 13:08:34 -07:00
Paul Banks	09e4c2995b	Fix CA pruning when CA config uses string durations. (#4669 ) * Fix CA pruning when CA config uses string durations. The tl;dr here is: - Configuring LeafCertTTL with a string like "72h" is how we do it by default and should be supported - Most of our tests managed to escape this by defining them as time.Duration directly - Our actual default value is a string - Since this is stored in a map[string]interface{} config, when it is written to Raft it goes through a msgpack encode/decode cycle (even though it's written from server not over RPC). - msgpack decode leaves the string as a `[]uint8` - Some of our parsers required string and failed - So after 1 hour, a default configured server would throw an error about pruning old CAs - If a new CA was configured that set LeafCertTTL as a time.Duration, things might be OK after that, but if a new CA was just configured from config file, initialization would cause the same issue but always fail still so would never prune the old CA. - Mostly this is just a janky error that got past tests due to many levels of complicated encoding/decoding. tl;dr of the tl;dr: Yay for type safety. Map[string]interface{} combined with msgpack always goes wrong but we somehow get bitten every time in a new way :D We already fixed this once! The main CA config had the same problem so @kyhavlov already wrote the mapstructure DecodeHook that fixes it. It wasn't used in several places it needed to be and one of those is now in `structs` which caused a dependency cycle so I've moved them. This adds a whole new test that explicitly tests the case that broke here. It also adds tests that would have failed in other places before (Consul and Vault provider parsing functions). I'm not sure if they would ever be affected as it is now as we've not seen things broken with them but it seems better to explicitly test that and support it to not be bitten a third time! * Typo fix * Fix bad Uint8 usage	2018-09-13 15:43:00 +01:00
Pierre Souchay	5ecf9823d2	Fix more unstable tests in agent and command	2018-09-12 14:49:27 +01:00
Kyle Havlovitz	8fc2c77fdf	connect/ca: some cleanup and reorganizing of the new methods	2018-09-11 16:43:04 -07:00
Pierre Souchay	7a42c31330	Fix unstable tests in agent, api, and command/watch	2018-09-10 16:58:53 +01:00
Pierre Souchay	473e589d86	Implementation of Weights Data structures (#4468 ) * Implementation of Weights Data structures Adding this data structure will allow us to resolve the issues #1088 and #4198 This new structure defaults to values: ``` { Passing: 1, Warning: 0 } ``` Which means, use weight of 0 for a Service in Warning State while use Weight 1 for a Healthy Service. Thus it remains compatible with previous Consul versions. * Implemented weights for DNS SRV Records * DNS properly support agents with weight support while server does not (backwards compatibility) * Use Warning value of Weights of 1 by default When using DNS interface with only_passing = false, all nodes with non-Critical healthcheck used to have a weight value of 1. While having weight.Warning = 0 as default value, this is probably a bad idea as it breaks backward compatibility. Thus, we put a default value of 1 to be consistent with existing behaviour. * Added documentation for new weight field in service description * Better documentation about weights as suggested by @banks * Return weight = 1 for unknown Check states as suggested by @banks * Fixed typo (of -> or) in error message as requested by @mkeeler * Fixed unstable unit test TestRetryJoin * Fixed unstable tests * Fixed wrong Fatalf format in `testrpc/wait.go` * Added notes regarding DNS SRV lookup limitations regarding number of instances * Documentation fixes and clarification regarding SRV records with weights as requested by @banks * Rephrase docs	2018-09-07 15:30:47 +01:00
Kyle Havlovitz	e184a18e4b	connect/ca: add Configure/GenerateRoot to provider interface	2018-09-06 19:18:59 -07:00
Pierre Souchay	54d8157ee1	Fixed more flaky tests in ./agent/consul (#4617 )	2018-09-04 14:02:47 +01:00
Freddy	10d3048bd6	Bugfix: Use "%#v" when formatting structs (#4600 )	2018-08-28 12:37:34 -04:00
Pierre Souchay	9b5cf0c1d0	[BUGFIX] Avoid returning empty data on startup of a non-leader server (#4554 ) Ensure that DB is properly initialized when performing stale queries Addresses: - https://github.com/hashicorp/consul-replicate/issues/82 - https://github.com/hashicorp/consul/issues/3975 - https://github.com/hashicorp/consul-template/issues/1131	2018-08-23 12:06:39 -04:00
Kyle Havlovitz	26a21df014	Merge branch 'master' into ca-snapshot-fix	2018-08-16 13:00:54 -07:00
Kyle Havlovitz	af4b037c52	fsm: add connect service config to snapshot/restore test	2018-08-16 12:58:54 -07:00
nickmy9729	43a68822e3	Added code to allow snapshot inclusion of NodeMeta (#4527 )	2018-08-16 15:33:35 -04:00
Kyle Havlovitz	880eccb502	fsm: add missing CA config to snapshot/restore logic	2018-08-16 11:58:50 -07:00
Kyle Havlovitz	fd83063686	autopilot: don't follow the normal server removal rules for nonvoters	2018-08-14 14:24:51 -07:00
Kyle Havlovitz	aa19559cc7	Fix stats fetcher healthcheck RPCs not being independent	2018-08-14 14:23:52 -07:00
Pierre Souchay	a16f34058b	Display more information about check being not properly added when it fails (#4405 ) * Display more information about check being not properly added when it fails It follows an incident where we had lots of error messages: [WARN] consul.fsm: EnsureRegistration failed: failed inserting check: Missing service registration That seems related to Consul failing to restart on respective agents. Having Node information as well as service information would help diagnose the issue. * Renamed ensureCheckIfNodeMatches() as requested by @banks	2018-08-14 17:45:33 +01:00
Pierre Souchay	821a91ca31	Allow to rename nodes with IDs, will fix #3974 and #4413 (#4415 ) * Allow to rename nodes with IDs, will fix #3974 and #4413 This change allows any well-behaved recent agent with an ID to be renamed safely, ie: without taking the name of another one with case insensitive comparison. Deprecated behaviour warning ---------------------------- Due to backward compatibility, it is still possible however to "take" the name of another node by not providing any ID. Note that when not providing any ID, it is possible to have 2 nodes having similar names with case differences, ie: myNode and mynode which might lead to DB corruption on Consul server side and lead to server not properly restarting. See #3983 and #4399 for Context about this change. Disabling registration of nodes without IDs as specified in #4414 should probably be the way to go eventually. * Removed the case-insensitive search when adding a node within the else block since it breaks the test TestAgentAntiEntropy_Services While the else case is probably legit, it will be fixed with #4414 in a later release. * Added again the test in the else to avoid duplicated names, but enforce this test only for nodes having IDs. Thus most tests without any ID will work, and allows us fixing * Added more tests regarding request with/without IDs. `TestStateStore_EnsureNode` now tests registration and renaming with IDs `TestStateStore_EnsureNodeDeprecated` tests registration without IDs and tests removing an ID from a node as well as updating a node without its ID (deprecated behaviour kept for backwards compatibility) * Do not allow renaming in case of conflict, including when other node has no ID * Fixed function GetNodeID that was not working due to wrong type when searching node from its ID Thus, all tests about renaming were not working properly. Added the full test case that allowed me to detect it. 
* Better error messages, more tests when nodeID is not a valid UUID in GetNodeID() * Added separate TestStateStore_GetNodeID to test GetNodeID. More complete test coverage for GetNodeID * Added new unit test `TestStateStore_ensureNoNodeWithSimilarNameTxn` Also fixed comments to be clearer after remarks from @banks * Fixed error message in unit test to match test case * Use uuid.ParseUUID to parse Node.ID as requested by @mkeeler	2018-08-10 11:30:45 -04:00
Siva Prasad	d98d02777f	PR to fix TestAgent_IndexChurn and TestPreparedQuery_Wrapper. (#4512 ) * Fixes TestAgent_IndexChurn * Fixes TestPreparedQuery_Wrapper * Increased sleep in agent_test for IndexChurn to 500ms * Made the comment about joinWAN operation much less of a cliffhanger	2018-08-09 12:40:07 -04:00
Armon Dadgar	a343392f63	consul: Update buffer sizes	2018-08-08 10:26:58 -07:00
Siva Prasad	cfa436dc16	Revert "CA initialization while boostrapping and TestLeader_ChangeServerID fix." (#4497 ) * Revert "BUGFIX: Unit test relying on WaitForLeader() did not work due to wrong test (#4472)" This reverts commit cec5d7239621e0732b3f70158addb1899442acb3. * Revert "CA initialization while boostrapping and TestLeader_ChangeServerID fix. (#4493)" This reverts commit 589b589b53e56af38de25db9b56967bdf1f2c069.	2018-08-07 08:29:48 -04:00
Pierre Souchay	fd927ea110	BUGFIX: Unit test relying on WaitForLeader() did not work due to wrong test (#4472 ) - Improve resilience of testrpc.WaitForLeader() - Add additional retry to CI - Increase "go test" timeout to 8m - Add wait for cluster leader to several tests in the agent package - Add retry to some tests in the api and command packages	2018-08-06 19:46:09 -04:00
Siva Prasad	29c181f5fa	CA initialization while boostrapping and TestLeader_ChangeServerID fix. (#4493 ) * connect: fix an issue with Consul CA bootstrapping being interrupted * streamline change server id test	2018-08-06 16:15:24 -04:00
Kyle Havlovitz	42ab07b398	fix inconsistency in TestConnectCAConfig_GetSet	2018-07-26 07:46:47 -07:00
Kyle Havlovitz	ecc02c6aee	Merge pull request #4400 from hashicorp/leaf-cert-ttl Add configurable leaf cert TTL to Connect CA	2018-07-25 17:53:25 -07:00
Paul Banks	217137b775	Fixes #4421 : General solution to stop blocking queries with index 0 (#4437 ) * Fix theoretical cache collision bug if/when we use more cache types with same result type * Generalized fix for blocking query handling when state store methods return zero index * Refactor test retry to only affect CI * Undo make file merge * Add hint to error message returned to end-user requests if Connect is not enabled when they try to request cert * Explicit error for Roots endpoint if connect is disabled * Fix tests that were asserting old behaviour	2018-07-25 20:26:27 +01:00
Kyle Havlovitz	a125735d76	connect/ca: check LeafCertTTL when rotating expired roots	2018-07-20 16:04:04 -07:00
Kyle Havlovitz	45ec8849f3	connect/ca: add configurable leaf cert TTL	2018-07-16 13:33:37 -07:00
Matt Keeler	cc46d59269	Merge pull request #4379 from hashicorp/persist-intermediates connect: persist intermediate CAs on leader change	2018-07-12 12:09:13 -04:00
Matt Keeler	965fc9cf62	Revert "Allow changing Node names since Node now have IDs"	2018-07-12 11:19:21 -04:00
Matt Keeler	d8a4d9137b	Fixup formatting	2018-07-12 10:14:26 -04:00
Matt Keeler	d63c5807cf	Revert PR 4294 - Catalog Register: Generate UUID for services registered without one UUID auto-generation here causes trouble in a few cases. The biggest being older nodes reregistering will fail when the UUIDs are different and the names match This reverts commit 0f700340828f464449c2e0d5a82db0bc5456d385. This reverts commit d1a8f9cb3f6f48dd9c8d0bc858031ff6ccff51d0. This reverts commit cf69ec42a418ab6594a6654e9545e12160f30970.	2018-07-12 10:06:50 -04:00
Kyle Havlovitz	2a40f93ac8	connect: use reflect.DeepEqual instead for test	2018-07-11 13:10:58 -07:00
Matt Keeler	42729d5aff	Merge pull request #3983 from pierresouchay/node_renaming Allow changing Node names since Node now have IDs	2018-07-11 16:03:02 -04:00
Kyle Havlovitz	f9a35a9338	connect: add provider state to snapshots	2018-07-11 11:34:49 -07:00
Kyle Havlovitz	9c21cc7ac9	connect: update leader initializeCA comment	2018-07-11 10:00:42 -07:00
Kyle Havlovitz	db254f0991	connect: persist intermediate CAs on leader change	2018-07-11 09:44:30 -07:00
Pierre Souchay	3d0a960470	When renaming a node, ensure the name is not taken by another node. Since DNS is case insensitive and the DB has issues when similar names with different cases are added, check for uniqueness based on case insensitivity. Following another big incident we had in our cluster, we also validate that adding/renaming a node does not conflict with case insensitive matches. We had the following error once: - one node called: mymachine.MYDC.mydomain was shut off - another node (different ID) was added with name: mymachine.mydc.mydomain before 72 hours When restarting the consul server of the domain, the restarted consul server failed to start since it detected an issue in the RAFT database because mymachine.MYDC.mydomain and mymachine.mydc.mydomain had the same names. Checking at registration time with case insensitivity should definitely fix those issues and avoid Consul DB corruption.	2018-07-11 14:42:54 +02:00
Matt Keeler	22c5951ec4	Merge pull request #4303 from pierresouchay/non_blocking_acl Only send one single ACL cache refresh across network when TTL is over	2018-07-10 08:57:33 -04:00
MagnumOpus21	0b50b84429	Agent/Proxy: Formatting and test cases fix	2018-07-09 12:46:10 -04:00
Kyle Havlovitz	883b2a518a	Store the time CARoot is rotated out instead of when to prune	2018-07-06 16:05:25 -07:00
Kyle Havlovitz	3c520019e9	connect/ca: add logic for pruning old stale RootCA entries	2018-07-02 10:35:05 -07:00
Pierre Souchay	95a0ab9f99	Updated switch case to use same branch for async-cache and extend-cache	2018-07-02 17:39:34 +02:00
Pierre Souchay	6dfbbf1350	Updated documentation and adding more test case for async-cache	2018-07-01 23:50:30 +02:00
Pierre Souchay	382bec0897	Added async-cache with similar behaviour as extend-cache but asynchronously	2018-07-01 23:50:30 +02:00
Pierre Souchay	da9c91fd3d	Only send one single ACL cache refresh across network when TTL is over It will allow the following: * when connectivity is limited (saturated links between DCs), only one single request to refresh an ACL will be sent to ACL master DC instead of stacking ACL refresh queries * when extend-cache is used for ACL, do not wait for result, but refresh the ACL asynchronously, so no delay impacts the slave DC * When extend-cache is not used, keep the existing blocking mechanism, but only send a single refresh request. This will fix https://github.com/hashicorp/consul/issues/3524	2018-07-01 23:50:30 +02:00
Matt Keeler	02719c52ff	Move starting enterprise functionality	2018-06-29 17:38:29 -04:00
Matt Keeler	66af873639	Move default uuid test into the consul package	2018-06-27 09:21:58 -04:00
Matt Keeler	dbc407cec9	go fmt changes	2018-06-27 09:07:22 -04:00
Matt Keeler	95291ec5ed	Make sure to generate UUIDs when services are registered without one This makes the behavior line up with the docs and expected behavior	2018-06-26 17:04:08 -04:00
mkeeler	1da3c42867	Merge remote-tracking branch 'connect/f-connect'	2018-06-25 19:42:51 +00:00
Kyle Havlovitz	859eaea5c4	connect/ca: pull the cluster ID from config during a rotation	2018-06-25 12:25:42 -07:00
Kyle Havlovitz	fcc5dc6110	connect/ca: leave blank root key/cert out of the default config (unnecessary)	2018-06-25 12:25:42 -07:00
Kyle Havlovitz	f3089a6647	connect/ca: undo the interface changes and use sign-self-issued in Vault	2018-06-25 12:25:42 -07:00
Kyle Havlovitz	cea94d0bcf	connect/ca: update Consul provider to use new cross-sign CSR method	2018-06-25 12:25:41 -07:00
Kyle Havlovitz	7b0845ccde	connect/ca: fix vault provider URI SANs and test	2018-06-25 12:25:41 -07:00
Kyle Havlovitz	a98b85b25c	connect/ca: add the Vault CA provider	2018-06-25 12:25:41 -07:00
Paul Banks	b4fbeb0453	Note leadership issues in comments	2018-06-25 12:25:41 -07:00
Paul Banks	824a9b4943	Actually return Intermediate certificates bundled with a leaf!	2018-06-25 12:25:40 -07:00
Paul Banks	81bd1b43a3	Fix hot loop in cache for RPC returning zero index.	2018-06-25 12:25:37 -07:00
Paul Banks	d6b13463ed	Fix misc test failures (some from other PRs)	2018-06-25 12:25:13 -07:00
Paul Banks	1283373a64	Only set precedence on write path	2018-06-25 12:25:13 -07:00
Paul Banks	22b95283e9	Fix some tests failures caused by the sorting change and some caused by previous UpdatePrecedence() change	2018-06-25 12:25:13 -07:00
Paul Banks	e2938138f6	Sort intention list by precedence	2018-06-25 12:25:13 -07:00
Kyle Havlovitz	a242e5b130	agent: update accepted CA config fields and defaults	2018-06-25 12:25:09 -07:00
Mitchell Hashimoto	4ebddd6adb	agent/consul: set precedence value on struct itself	2018-06-25 12:24:16 -07:00
Mitchell Hashimoto	52c10d2208	agent/consul: support a Connect option on prepared query request	2018-06-25 12:24:12 -07:00
Mitchell Hashimoto	e8c899b1b8	agent/consul: prepared query supports "Connect" field	2018-06-25 12:24:11 -07:00
Mitchell Hashimoto	ad382d7351	agent: switch ConnectNative to an embedded struct	2018-06-25 12:24:10 -07:00
Mitchell Hashimoto	a3e0ac1ee3	agent/consul/state: support querying by Connect native	2018-06-25 12:24:08 -07:00
Mitchell Hashimoto	8e02bbc897	agent/consul: support catalog registration with Connect native	2018-06-25 12:24:07 -07:00
Matt Keeler	6ccc4f39db	Merge pull request #4216 from hashicorp/rpc-limiting Make RPC limits reloadable	2018-06-20 09:05:28 -04:00
Mitchell Hashimoto	787ce3b269	agent: address feedback	2018-06-14 09:42:20 -07:00
Mitchell Hashimoto	b5b29cd6af	agent: rename test to check	2018-06-14 09:42:18 -07:00
Mitchell Hashimoto	a48ff54318	agent/consul: forward request if necessary	2018-06-14 09:42:17 -07:00
Mitchell Hashimoto	b02502be73	agent: comments to point to differing logic	2018-06-14 09:42:17 -07:00
Mitchell Hashimoto	526cfc34bd	agent/consul: implement Intention.Test endpoint	2018-06-14 09:42:17 -07:00
Paul Banks	bd5eb8b749	Add default CA config back - I didn't add it and causes nil panics	2018-06-14 09:42:17 -07:00
Paul Banks	dbcf286d4c	Ooops remove the CA stuff from actual server defaults and make it test server only	2018-06-14 09:42:16 -07:00
Paul Banks	834ed1d25f	Fixed many tests after rebase. Some still failing and seem unrelated to any connect changes.	2018-06-14 09:42:16 -07:00
Paul Banks	30d90b3be4	Generate CSR using real trust-domain	2018-06-14 09:42:16 -07:00
Paul Banks	5a1408f186	Add CSR signing verification of service ACL, trust domain and datacenter.	2018-06-14 09:42:16 -07:00
Paul Banks	c808833a78	Return TrustDomain from CARoots RPC	2018-06-14 09:42:15 -07:00
Kyle Havlovitz	d1265bc38b	Rename some of the CA structs/files	2018-06-14 09:42:15 -07:00
Kyle Havlovitz	1660f9ebab	Add more metadata to structs.CARoot	2018-06-14 09:42:15 -07:00
Kyle Havlovitz	baf4db1c72	Use provider state table for a global serial index	2018-06-14 09:42:15 -07:00
Kyle Havlovitz	c90b353eea	Move connect CA provider to separate package	2018-06-14 09:42:15 -07:00
Mitchell Hashimoto	54a1662da8	agent/consul: change provider wait from goto to a loop	2018-06-14 09:42:14 -07:00
Mitchell Hashimoto	749f81373f	agent/consul: check nil on getCAProvider result	2018-06-14 09:42:14 -07:00
Mitchell Hashimoto	c57405b323	agent/consul: retry reading provider a few times	2018-06-14 09:42:14 -07:00
Paul Banks	dcd277de8a	Wire up agent leaf endpoint to cache framework to support blocking.	2018-06-14 09:42:07 -07:00
Kyle Havlovitz	b28e11fdd3	Fill out connect CA rpc endpoint tests	2018-06-14 09:42:06 -07:00
Kyle Havlovitz	7c0976208d	Add tests for the built in CA's state store table	2018-06-14 09:42:06 -07:00
Kyle Havlovitz	19b9399f2f	Add more tests for built-in provider	2018-06-14 09:42:06 -07:00
Kyle Havlovitz	a29f3c6b96	Fix some inconsistencies around the CA provider code	2018-06-14 09:42:06 -07:00
Kyle Havlovitz	2167713226	Add CA config to connect section of agent config	2018-06-14 09:42:05 -07:00
Kyle Havlovitz	02fef5f9a2	Move ConsulCAProviderConfig into structs package	2018-06-14 09:42:04 -07:00
Kyle Havlovitz	887cc98d7e	Simplify the CAProvider.Sign method	2018-06-14 09:42:04 -07:00
Kyle Havlovitz	44b30476cb	Simplify the CA provider interface by moving some logic out	2018-06-14 09:42:04 -07:00
Kyle Havlovitz	aa10fb2f48	Clarify some comments and names around CA bootstrapping	2018-06-14 09:42:04 -07:00
Kyle Havlovitz	43f13d5a0b	Add cross-signing mechanism to root rotation	2018-06-14 09:42:00 -07:00
Kyle Havlovitz	bbfcb278e1	Add the root rotation mechanism to the CA config endpoint	2018-06-14 09:41:59 -07:00
Kyle Havlovitz	a585a0ba10	Have the built in CA store its state in raft	2018-06-14 09:41:59 -07:00
Kyle Havlovitz	80eddb0bfb	Fix the testing endpoint's root set op	2018-06-14 09:41:59 -07:00
Kyle Havlovitz	fc9ef9741b	Hook the CA RPC endpoint into the provider interface	2018-06-14 09:41:59 -07:00
Kyle Havlovitz	a40db26ffe	Add CA bootstrapping on establishing leadership	2018-06-14 09:41:59 -07:00
Kyle Havlovitz	e26819ed9c	Add the bootstrap config for the CA	2018-06-14 09:41:59 -07:00
Kyle Havlovitz	ebdda17a30	Add CA config set to fsm operations	2018-06-14 09:41:58 -07:00
Kyle Havlovitz	f7ff16669f	Add the Connect CA config to the state store	2018-06-14 09:41:58 -07:00
Paul Banks	9d11cd9bf4	Fix various test failures and vet warnings. Intention de-duplication in previously merged PR actually failed some tests that were not caught by me or CI. I ran the test files for state changes but they happened not to trigger this case so I made sure they did first and then fixed. That fixed some upstream intention endpoint tests that I'd not run as part of testing the previous fix.	2018-06-14 09:41:58 -07:00
Paul Banks	280382c25f	Add tests all the way up through the endpoints to ensure duplicate src/destination is supported and so ultimately deny/allow nesting works. Also adds a sanity check test for `api.Agent().ConnectAuthorize()` and a fix for a trivial bug in it.	2018-06-14 09:41:57 -07:00
Paul Banks	adc5589329	Allow duplicate source or destination, but enforce uniqueness across all four.	2018-06-14 09:41:57 -07:00
Mitchell Hashimoto	1985655dff	agent/consul/state: ensure exactly one active CA exists when setting	2018-06-14 09:41:54 -07:00
Mitchell Hashimoto	da1bc48372	agent/connect: rename SpiffeID to CertURI	2018-06-14 09:41:53 -07:00
Mitchell Hashimoto	b0315811b9	agent/connect: use proper keyusage fields for CA and leaf	2018-06-14 09:41:53 -07:00
Mitchell Hashimoto	2026cf3753	agent/consul: encode issued cert serial number as hex encoded	2018-06-14 09:41:53 -07:00
Mitchell Hashimoto	746f80639a	agent: /v1/connect/ca/configuration PUT for setting configuration	2018-06-14 09:41:52 -07:00
Mitchell Hashimoto	2dfca5dbc2	agent/consul/fsm,state: snapshot/restore for CA roots	2018-06-14 09:41:52 -07:00
Mitchell Hashimoto	17d6b437d2	agent/consul/fsm,state: tests for CA root related changes	2018-06-14 09:41:52 -07:00
Mitchell Hashimoto	a8510f8224	agent/consul: set more fields on the issued cert	2018-06-14 09:41:52 -07:00
Mitchell Hashimoto	58b6f476e8	agent: /v1/connect/ca/leaf/:service_id	2018-06-14 09:41:52 -07:00
Mitchell Hashimoto	80a058a573	agent/consul: CAS operations for setting the CA root	2018-06-14 09:41:51 -07:00
Mitchell Hashimoto	712888258b	agent/consul: tests for CA endpoints	2018-06-14 09:41:51 -07:00
Mitchell Hashimoto	1928c07d0c	agent/consul: key the public key of the CSR, verify in test	2018-06-14 09:41:51 -07:00
Mitchell Hashimoto	9a8653f45e	agent/consul: test for ConnectCA.Sign	2018-06-14 09:41:51 -07:00
Mitchell Hashimoto	a360c5cca4	agent/consul: basic sign endpoint not tested yet	2018-06-14 09:41:51 -07:00
Mitchell Hashimoto	24830f4cfa	agent/consul: RPC endpoints to list roots	2018-06-14 09:41:50 -07:00
Mitchell Hashimoto	cfb62677c0	agent/consul/state: CARoot structs and initial state store	2018-06-14 09:41:49 -07:00
Mitchell Hashimoto	7e8d606717	agent: address PR feedback	2018-06-14 09:41:49 -07:00
Mitchell Hashimoto	f9a55aa7e0	agent: clarified a number of comments per PR feedback	2018-06-14 09:41:49 -07:00
Mitchell Hashimoto	62cbb892e3	agent/consul: Health.ServiceNodes ACL check for Connect	2018-06-14 09:41:49 -07:00
Mitchell Hashimoto	641c982480	agent/consul: Catalog endpoint ACL requirements for Connect proxies	2018-06-14 09:41:49 -07:00
Mitchell Hashimoto	566c98b2fc	agent/consul: require name for proxies	2018-06-14 09:41:48 -07:00
Mitchell Hashimoto	daaa6e2403	agent: clean up connect/non-connect duplication by using shared methods	2018-06-14 09:41:48 -07:00
Mitchell Hashimoto	119ffe3ed9	agent/consul: implement Health.ServiceNodes for Connect, DNS works	2018-06-14 09:41:47 -07:00
Mitchell Hashimoto	253256352c	agent/consul: Catalog.ServiceNodes supports Connect filtering	2018-06-14 09:41:47 -07:00
Mitchell Hashimoto	06957f6d7f	agent/consul/state: ConnectServiceNodes	2018-06-14 09:41:47 -07:00
Mitchell Hashimoto	200100d3f4	agent/consul: enforce ACL on ProxyDestination	2018-06-14 09:41:47 -07:00
Mitchell Hashimoto	8a72826483	agent/consul: proxy registration and tests	2018-06-14 09:41:46 -07:00
Mitchell Hashimoto	8777ff139c	agent: test /v1/catalog/node/:node to list connect proxies	2018-06-14 09:41:46 -07:00
Mitchell Hashimoto	761b561946	agent: /v1/catalog/service/:service works with proxies	2018-06-14 09:41:46 -07:00
Mitchell Hashimoto	58bff8dd05	agent/consul/state: convert proxy test to testify/assert	2018-06-14 09:41:46 -07:00
Mitchell Hashimoto	09568ce7b5	agent/consul/state: service registration with proxy works	2018-06-14 09:41:46 -07:00
Mitchell Hashimoto	23ee0888ec	agent/consul: convert intention ACLs to testify/assert	2018-06-14 09:41:46 -07:00
Mitchell Hashimoto	6a8bba7d48	agent/consul,structs: add tests for ACL filter and prefix for intentions	2018-06-14 09:41:45 -07:00
Mitchell Hashimoto	3e10a1ae7a	agent/consul: Intention.Match ACLs	2018-06-14 09:41:45 -07:00
Mitchell Hashimoto	db44a98a2d	agent/consul: Intention.Get ACLs	2018-06-14 09:41:45 -07:00
Mitchell Hashimoto	fd840da97a	agent/consul: Intention.Apply ACL on rename	2018-06-14 09:41:45 -07:00
Mitchell Hashimoto	14ca93e09c	agent/consul: tests for ACLs on Intention.Apply update/delete	2018-06-14 09:41:45 -07:00
Mitchell Hashimoto	c54be9bc09	agent/consul: Basic ACL on Intention.Apply	2018-06-14 09:41:44 -07:00
Mitchell Hashimoto	1d0b4ceedb	agent: convert all intention tests to testify/assert	2018-06-14 09:41:44 -07:00
Mitchell Hashimoto	f07340e94f	agent/consul/fsm,state: snapshot/restore for intentions	2018-06-14 09:41:44 -07:00
Mitchell Hashimoto	6f33b2d070	agent: use UTC time for intention times, move empty list check to agent/consul	2018-06-14 09:41:43 -07:00
Mitchell Hashimoto	67b017c95c	agent/consul/fsm: switch tests to use structs.TestIntention	2018-06-14 09:41:43 -07:00
Mitchell Hashimoto	3a00564411	agent/consul/state: need to set Meta for intentions for tests	2018-06-14 09:41:43 -07:00
Mitchell Hashimoto	027dad8672	agent/consul/state: remove TODO	2018-06-14 09:41:43 -07:00
Mitchell Hashimoto	37f66e47ed	agent: use testing intention to get valid intentions	2018-06-14 09:41:43 -07:00
Mitchell Hashimoto	04bd4af99c	agent/consul: set default intention SourceType, validate it	2018-06-14 09:41:43 -07:00
Mitchell Hashimoto	8e2462e301	agent/structs: Intention validation	2018-06-14 09:41:42 -07:00
Mitchell Hashimoto	d34ee200de	agent/consul: support intention description, meta is non-nil	2018-06-14 09:41:42 -07:00
Mitchell Hashimoto	e81d1c88b7	agent/consul/fsm: add tests for intention requests	2018-06-14 09:41:42 -07:00
Mitchell Hashimoto	2b047fb09b	agent,agent/consul: set default namespaces	2018-06-14 09:41:42 -07:00
Mitchell Hashimoto	e630d65d9d	agent/consul: set CreatedAt, UpdatedAt on intentions	2018-06-14 09:41:42 -07:00
Mitchell Hashimoto	e9d208bcb6	agent/consul: RPC endpoint for Intention.Match	2018-06-14 09:41:42 -07:00
Mitchell Hashimoto	987b7ce0a2	agent/consul/state: IntentionMatch for performing match resolution	2018-06-14 09:41:41 -07:00
Mitchell Hashimoto	bebe6870ff	agent/consul: test that Apply works to delete an intention	2018-06-14 09:41:41 -07:00
Mitchell Hashimoto	95e1c92edf	agent/consul/state,fsm: support for deleting intentions	2018-06-14 09:41:41 -07:00
Mitchell Hashimoto	32ad54369c	agent/consul: creating intention must not have ID set	2018-06-14 09:41:40 -07:00
Mitchell Hashimoto	f219c766cb	agent/consul: support updating intentions	2018-06-14 09:41:40 -07:00
Mitchell Hashimoto	37572829ab	agent: GET /v1/connect/intentions/:id	2018-06-14 09:41:40 -07:00
Mitchell Hashimoto	2a8a2f8167	agent/consul: Intention.Get endpoint	2018-06-14 09:41:40 -07:00
Mitchell Hashimoto	48b9a43f1d	agent/consul: Intention.Apply, FSM methods, very little validation	2018-06-14 09:41:39 -07:00
Mitchell Hashimoto	b19a289596	agent/consul: start Intention RPC endpoints, starting with List	2018-06-14 09:41:39 -07:00
Mitchell Hashimoto	8b0ac7d9c5	agent/consul/state: list intentions	2018-06-14 09:41:39 -07:00
Mitchell Hashimoto	c05bed86e1	agent/consul/state: initial work on intentions memdb table	2018-06-14 09:41:39 -07:00
Guido Iaquinti	3ed73961b3	Attach server.Name label to client.rpc.failed	2018-06-13 14:56:14 +01:00
Guido Iaquinti	bda575074e	Attach server.ID label to client.rpc.failed	2018-06-13 14:53:44 +01:00
Guido Iaquinti	edd6a69541	Client: add metric for failed RPC calls to server	2018-06-13 12:35:45 +01:00
Matt Keeler	c41fa6c010	Add a Client ReloadConfig test	2018-06-11 16:23:51 -04:00
Matt Keeler	c5d9c2362f	Merge branch 'master' of github.com:hashicorp/consul into rpc-limiting # Conflicts: # agent/agent.go # agent/consul/client.go	2018-06-11 16:11:36 -04:00
Matt Keeler	c589991452	Apply the limits to the clients rpcLimiter	2018-06-11 15:51:17 -04:00
Matt Keeler	14661a417b	Allow for easy enterprise/oss coexistence Uses struct/interface embedding with the embedded structs/interfaces being empty for oss. Also methods on the server/client types are defaulted to do nothing for OSS	2018-05-24 10:36:42 -04:00
Wim	88514d6a82	Add support for reverse lookup of services	2018-05-19 19:39:02 +02:00
Preetha Appan	7400a78f8a	Change default raft threshold config values and add a section to upgrade notes	2018-05-11 10:45:41 -05:00
Preetha Appan	e28c5fbb4e	Also make snapshot interval configurable	2018-05-11 10:43:24 -05:00
Preetha Appan	eb4bc79118	Make raft snapshot commit threshold configurable	2018-05-11 10:43:24 -05:00
Jack Pearkes	e611b1728a	Merge pull request #4097 from hashicorp/remove-deprecated Remove deprecated check/service fields and metric names	2018-05-10 15:45:49 -07:00
Kyle Havlovitz	60307ef328	Remove deprecated metric names	2018-05-08 16:23:15 -07:00
Paul Banks	c55885efd8	Merge pull request #3970 from pierresouchay/node_health_should_change_service_index [BUGFIX] When a node level check is removed, ensure all services of node are notified	2018-05-08 16:44:50 +01:00
Pierre Souchay	ee47eb7d7d	Added Missing Service Meta synchronization and field	2018-04-21 17:34:29 +02:00
Pierre Souchay	1b55e3559b	Allow renaming nodes when ID is unchanged	2018-04-18 15:39:38 +02:00
Kyle Havlovitz	be10300d06	Update make static-assets goal and run format	2018-04-13 09:57:25 -07:00
Matt Keeler	ed94d356e0	Merge pull request #4023 from hashicorp/f-near-ip Add near=_ip support for prepared queries	2018-04-12 12:10:48 -04:00
Matt Keeler	aa9151738a	GH-3798: A couple more PR updates Test HTTP/DNS source IP without header/extra EDNS data. Add WARN log for when prepared query with near=_ip is executed without specifying the source ip	2018-04-12 10:10:37 -04:00
Matt Keeler	3a0f7789ec	GH-3798: A few more PR updates	2018-04-11 20:32:35 -04:00
Matt Keeler	de3a9be3d0	GH-3798: Updates for PR Allow DNS peer IP as the source IP. Break early when the right node was found for executing the prepared query. Update docs	2018-04-11 17:02:04 -04:00
Matt Keeler	89cd24aeca	GH-3798: Add near=_ip support for prepared queries	2018-04-10 14:50:50 -04:00
Paul Banks	2ed0d2afcd	Allow ignoring checks by ID when defining a PreparedQuery. Fixes #3727.	2018-04-10 14:04:16 +01:00
Preetha Appan	d9d9944179	Renames agent API layer for service metadata to "meta" for consistency	2018-03-28 09:04:50 -05:00
Preetha	8dacb12c79	Merge pull request #3881 from pierresouchay/service_metadata Feature Request: Support key-value attributes for services	2018-03-27 16:33:57 -05:00
Pierre Souchay	b9ae4e647f	Added validation of ServiceMeta in Catalog Fixed Error Message when ServiceMeta is not valid Added Unit test for adding a Service with badly formatted ServiceMeta	2018-03-27 22:22:42 +02:00
Preetha Appan	17a011b9bd	fix typo and remove comment	2018-03-27 14:28:05 -05:00
Preetha Appan	6d16afc65c	Remove unnecessary nil checks	2018-03-27 10:59:42 -05:00
Preetha Appan	c21c2da690	Fix test and remove unused method	2018-03-27 09:44:41 -05:00
Preetha Appan	512f9a50fc	Allows disabling WAN federation by setting serf WAN port to -1	2018-03-26 14:21:06 -05:00
Pierre Souchay	eccb56ade0	Added support for renaming nodes when their IP does not change	2018-03-26 16:44:13 +02:00
Pierre Souchay	90d2f7bca1	Merge remote-tracking branch 'origin/master' into node_health_should_change_service_index	2018-03-22 13:07:11 +01:00
Pierre Souchay	9cc9dce848	More test cases	2018-03-22 12:41:06 +01:00
Pierre Souchay	7e8e4e014b	Added new test regarding checks index	2018-03-22 12:20:25 +01:00
Pierre Souchay	a8b66fb7aa	Fixed minor typo in comments Might fix unstable travis build	2018-03-22 10:30:10 +01:00
Josh Soref	1dd8c378b9	Spelling (#3958) * spelling: another * spelling: autopilot * spelling: beginning * spelling: circonus * spelling: default * spelling: definition * spelling: distance * spelling: encountered * spelling: enterprise * spelling: expands * spelling: exits * spelling: formatting * spelling: health * spelling: hierarchy * spelling: imposed * spelling: independence * spelling: inspect * spelling: last * spelling: latest * spelling: client * spelling: message * spelling: minimum * spelling: notify * spelling: nonexistent * spelling: operator * spelling: payload * spelling: preceded * spelling: prepared * spelling: programmatically * spelling: required * spelling: reconcile * spelling: responses * spelling: request * spelling: response * spelling: results * spelling: retrieve * spelling: service * spelling: significantly * spelling: specifies * spelling: supported * spelling: synchronization * spelling: synchronous * spelling: themselves * spelling: unexpected * spelling: validations * spelling: value	2018-03-19 16:56:00 +00:00
Pierre Souchay	3eb287f57d	Fixed typo in comments	2018-03-19 17:12:08 +01:00
Pierre Souchay	eb2a4eaea3	Refactoring to have clearer code without weird bool	2018-03-19 16:12:54 +01:00
Pierre Souchay	a5f6ac0df4	[BUGFIX] When a node level check is removed, ensure all services of node are notified Bugfix for https://github.com/hashicorp/consul/pull/3899 When a node level check is removed (example: maintenance), some watchers on services might have to recompute their state. If those nodes are performing blocking queries, they have to be notified. While their state was updated when node-level state did change or was added this was not the case when the check was removed. This fixes it.	2018-03-19 14:14:03 +01:00
Devin Canterberry	881d20c606	🐛 Formatting changes only; add missing trailing commas	2018-03-15 10:19:46 -07:00
Mitchell Hashimoto	fbac58280e	agent/consul/fsm: begin using testify/assert	2018-03-06 09:48:15 -08:00
Paul Banks	628dcc9793	Merge pull request #3899 from pierresouchay/fix_blocking_queries_index Services Indexes modified per service instead of using a global Index	2018-03-02 16:24:43 +00:00
Pierre Souchay	85b73f8163	Simplified error handling for maxIndexForService * added unit tests to ensure service index is properly garbage collected * added Upgrade from Version 1.0.6 to higher section in documentation	2018-03-01 14:09:36 +01:00
Preetha Appan	77d35f1829	Remove extra newline	2018-02-21 13:21:47 -06:00
Preetha Appan	573500dc51	Unit test that calls revokeLeadership twice to make sure it's idempotent	2018-02-21 12:48:53 -06:00
Preetha Appan	bd270b02ba	Make sure revokeLeadership is called if establishLeadership errors	2018-02-21 12:33:22 -06:00
Alex Dadgar	535842004c	Test autopilots start/stop idempotency	2018-02-21 10:19:30 -08:00
Alex Dadgar	4d99696f02	Improve autopilot shutdown to be idempotent	2018-02-20 15:51:59 -08:00
Pierre Souchay	e6d85cb36a	Fixed comments for function maxIndexForService	2018-02-20 23:57:28 +01:00
Pierre Souchay	b26ea3c230	[Revert] Only update services if tags are different This patch did give some better results, but breaks watches on the services of a node. It is possible to apply the same optimization to nodes as to services (one index per instance), but it would further complicate the patch. Let's do it in another PR.	2018-02-20 23:34:42 +01:00
Pierre Souchay	903e866835	Only update services if tags are different	2018-02-20 23:08:04 +01:00
Pierre Souchay	56d5c0bf22	Enable Raft index optimization per service name on health endpoint Had to fix unit test in order to check properly indexes.	2018-02-20 01:35:50 +01:00
Pierre Souchay	ec1b278595	Get only first service to test whether we have to cleanup index of a service	2018-02-19 22:44:49 +01:00
Pierre Souchay	523feb0be4	Fixed comment about raftIndex + use test.Helper()	2018-02-19 19:30:25 +01:00
Pierre Souchay	4c188c1d08	Services Indexes modified per service instead of using a global Index This patch improves the watches for services on large clusters: each service now has its own index, so watches on a specific service are not affected by changes in the global catalog. It should greatly improve the performance of tools such as consul-template or libraries performing watches on very large clusters with many services/watches.	2018-02-19 18:29:22 +01:00
Veselkov Konstantin	05666113a4	remove golint warnings	2018-01-28 22:40:13 +04:00
Kyle Havlovitz	0e76d62846	Reset clusterHealth when autopilot starts	2018-01-23 12:52:28 -08:00
Kyle Havlovitz	6d1dbe6cc4	Move autopilot health loop into leader operations	2018-01-23 11:17:41 -08:00
James Phillips	62e97a6602	Fixes a `go fmt` cleanup.	2017-12-20 13:43:38 -08:00
Kyle Havlovitz	74b0c58831	Fix vet error	2017-12-18 18:04:42 -08:00
Kyle Havlovitz	dfc165a47b	Move autopilot initializing to oss file	2017-12-18 18:02:44 -08:00
Kyle Havlovitz	044c38aa7b	Move autopilot setup to a separate file	2017-12-18 16:55:51 -08:00
Kyle Havlovitz	9e1ba6fb4e	Make some final tweaks to autopilot package	2017-12-18 12:26:47 -08:00
Kyle Havlovitz	6b58df5898	Merge pull request #3737 from hashicorp/autopilot-refactor Move autopilot to a standalone package	2017-12-15 14:09:40 -08:00
James Phillips	262cbbd9ca	Merge pull request #3728 from weiwei04/fix_globalRPC_goroutine_leak fix globalRPC goroutine leak	2017-12-14 17:54:19 -08:00
Kyle Havlovitz	798aca92c5	Expose IsPotentialVoter for advanced autopilot logic	2017-12-13 17:53:51 -08:00
Kyle Havlovitz	a4ac148077	Merge branch 'master' into autopilot-refactor	2017-12-13 11:54:32 -08:00
Kyle Havlovitz	6c985132de	A few last autopilot adjustments	2017-12-13 11:19:17 -08:00
Kyle Havlovitz	77d92bf15c	More autopilot reorganizing	2017-12-13 10:57:37 -08:00
James Phillips	984de6e2e0	Adds TODOs referencing #3744.	2017-12-13 10:52:06 -08:00
Kyle Havlovitz	f347c8a531	More refactoring to make autopilot consul-agnostic	2017-12-12 17:46:28 -08:00
Kyle Havlovitz	8546a1d3c6	Move autopilot to a standalone package	2017-12-11 16:45:33 -08:00
James Phillips	32b64575d1	Moves Serf helper into lib to fix import cycle in consul-enterprise.	2017-12-07 16:57:58 -08:00
James Phillips	c16cce80bb	Turns off intent queue warnings and enables dynamic queue sizing.	2017-12-07 16:27:06 -08:00
Wei Wei	04531ff0fb	fix globalRPC goroutine leak Signed-off-by: Wei Wei <weiwei.inf@gmail.com>	2017-12-05 11:53:30 +08:00
James Phillips	c4bc89a187	Creates a registration mechanism for snapshot and restore.	2017-11-29 18:36:53 -08:00
James Phillips	8571555703	Begins split out of snapshots from the main FSM class.	2017-11-29 18:36:53 -08:00
James Phillips	4eaee8e0ba	Creates a registration mechanism for FSM commands.	2017-11-29 18:36:53 -08:00
James Phillips	3e7ea1931c	Moves the FSM into its own package. This will help make it clearer what happens when we add some registration plumbing for the different operations and snapshots.	2017-11-29 18:36:53 -08:00
James Phillips	7f3783f4be	Resolves an FSM snapshot TODO. This adds checks for sink write calls before we continue the refactor, which will resolve the other TODO comment we deleted as part of this change.	2017-11-29 18:36:53 -08:00
James Phillips	5a24d37ac0	Creates a registration mechanism for schemas. This also splits out the registration into the table-specific source files.	2017-11-29 18:36:52 -08:00
James Phillips	36bb30e67a	Creates a registration mechanism for RPC endpoints.	2017-11-29 18:36:52 -08:00
James Phillips	ba56669ea8	Renames stubs to be more consistent.	2017-11-29 18:36:52 -08:00
James Phillips	56552095c9	Sheds monotonic time info so tombstone GC bins work properly.	2017-11-29 10:34:24 -08:00
James Phillips	8656b7a3e9	Gives back the lock before writing to the expire channel. The lock isn't needed after we clean up the expire bin, and as seen in #3700 we can get into a deadlock waiting to place the expire index into the channel while holding this lock. Fixes #3700	2017-11-19 16:24:16 -08:00
James Phillips	8210523b1b	Moves the LAN event handler after the router is created. Fixes #3680	2017-11-10 12:26:48 -08:00
James Phillips	bfbbfb62ca	Revert "Adds a small sleep to make sure we are in the next GC bucket."	2017-11-08 22:18:37 -08:00
James Phillips	d6328a5bf8	Adds a sleep to make sure we are in the next GC bucket, ups time. Fixes #3670	2017-11-08 22:02:40 -08:00
James Phillips	91824375be	Skips the tombstone GC test in Travis for now. Related to #3670	2017-11-08 20:14:20 -08:00
James Phillips	b94ba8aeb4	Removes bogus getPort() in favor of freeport.	2017-11-08 19:55:50 -08:00
James Phillips	444a345a3a	Tightens timing up and reorders GC test to be less flaky.	2017-11-08 15:09:29 -08:00
James Phillips	e00624425b	Doubles the GC timing.	2017-11-08 15:01:11 -08:00
James Phillips	8eb91777d9	Opens up test timing a little more.	2017-11-08 14:01:19 -08:00
James Phillips	d45c2a01f1	Shifts off a gran boundary to help make test less flaky.	2017-11-08 13:57:17 -08:00
James Phillips	757e353334	Opens up the tombstone GC test timing.	2017-11-08 13:43:39 -08:00
Kyle Havlovitz	068ca11eb8	Move check definition to a sub-struct	2017-11-01 14:54:46 -07:00
Kyle Havlovitz	bc3ba5f873	Merge branch 'master' into esm-changes	2017-11-01 11:37:48 -07:00
Kyle Havlovitz	83524f44c4	Merge pull request #3622 from hashicorp/coordinate-node-endpoint agent: add /v1/coordinate/node/:node endpoint	2017-11-01 11:35:50 -07:00
Kyle Havlovitz	9909b661ac	Fill out the tests around coordinate/node functionality	2017-10-31 15:36:44 -07:00
Kyle Havlovitz	fd4d9f1c16	Factor out registerNodes function	2017-10-31 13:34:49 -07:00
James Phillips	c6e0366c02	Relaxes Autopilot promotion logic. (#3623) * Relaxes Autopilot promotion logic. When we defaulted the Raft protocol version to 3 in #3477 we made the numPeers() routine more strict to only count voters (this is more conservative and more correct). This had the side effect of breaking rolling updates because it's at odds with the Autopilot non-voter promotion logic. That logic used to wait to only promote to maintain an odd quorum of servers. During a rolling update (add one new server, wait, and then kill an old server) the dead server cleanup would still count the old server as a peer, which is conservative and the right thing to do, and no longer count the non-voter. This would wait to promote, so you could get into a stalemate. It is safer to promote early than remove early, so by promoting as soon as possible we have chosen that as the solution here. Fixes #3611 * Gets rid of unnecessary extra not-a-voter check.	2017-10-31 15:16:56 -05:00
Kyle Havlovitz	496dd7ab5b	Merge branch 'coordinate-node-endpoint' of github.com:hashicorp/consul into esm-changes	2017-10-26 19:20:24 -07:00
Kyle Havlovitz	f80e70271d	Added Coordinate.Node rpc endpoint and client api method	2017-10-26 19:16:40 -07:00
Kyle Havlovitz	84a07ea113	Expose SkipNodeUpdate field and some health check info in the http api	2017-10-25 19:37:30 +02:00
Frank Schroeder	74859ff3c0	test: replace porter tool with freeport lib This patch removes the porter tool which hands out free ports from a given range with a library which does the same thing. The challenge for acquiring free ports in concurrent go test runs is that go packages are tested concurrently and run in separate processes. There has to be some inter-process synchronization in preventing processes allocating the same ports. freeport allocates blocks of ports from a range expected to be not in heavy use and implements a system-wide mutex by binding to the first port of that block for the lifetime of the application. Ports are then provided sequentially from that block and are tested on localhost before being returned as available.	2017-10-21 22:01:09 +02:00
Ryan Slade	6f05ea91a3	Replace time.Now().Sub(x) with time.Since(x)	2017-10-17 20:38:24 +02:00
James Phillips	e9670761f9	Cleans up some drift between the OSS and Enterprise trees.	2017-10-11 15:53:07 -07:00
James Phillips	d1ad538345	Makes RPC handling more robust when rolling servers. (#3561) * Adds client-side retry for no leader errors. This paves over the case where the client was connected to the leader when it loses leadership. * Adds a configurable server RPC drain time and a fail-fast path for RPCs. When a server leaves it gets removed from the Raft configuration, so it will never know who the new leader server ends up being. Without this we'd be doomed to wait out the RPC hold timeout and then fail. This makes things fail a little quicker while a server is draining, and since we added a client retry AND since the server doing this has already shut down and left the Serf LAN, clients should retry against some other server. * Makes the RPC hold timeout configurable. * Reorders struct members. * Sets the RPC hold timeout default for test servers. * Bumps the leave drain time up to 5 seconds. * Robustifies retries with a simpler client-side RPC hold. * Reverts unintended delete.	2017-10-10 15:19:50 -07:00
James Phillips	a1db119d02	Fixes handling of stop channel and failed barrier attempts. (#3546) * Fixes handling of stop channel and failed barrier attempts. There were two issues here. First, we needed to not exit when there was a timeout trying to write the barrier, because Raft might not step down, so we'd be left as the leader but having run all the step down actions. Second, we didn't close over the stopCh correctly, so it was possible to nil that out and have the leaderLoop never exit. We close over it properly AND sequence the nil-ing of it AFTER the leaderLoop exits for good measure, so the code is more robust. Fixes #3545 * Cleans up based on code review feedback. * Tweaks comments. * Renames variables and removes comments.	2017-10-06 07:54:49 -07:00
Kyle Havlovitz	0063516e5e	Update metric names and add a legacy config flag	2017-10-04 16:43:27 -07:00
Preetha Appan	f38d20eb40	Remove extra newline	2017-10-03 15:19:31 -05:00
Preetha Appan	3c81e2db7c	Only allow 'list' policies within 'key' policy definitions. Consolidated two similar tests into one and fixed alignment.	2017-10-03 15:15:56 -05:00
Preetha Appan	d5acfc3982	Introduces new 'list' permission that applies to KV store recursive reads, and enforced only when opted in.	2017-10-02 17:10:21 -05:00
James Phillips	330ce87851	Gets rid of flaky clause in stats fetcher unit test. Given how the routine is coded we can still get data so this wasn't a reliable thing to check.	2017-09-26 20:53:06 -07:00
preetapan	783e24be64	Issue 3452 (#3500 ) * Make sure that id and address are set in member created during reaping of catalog nodes that have been removed from serf * Get address from node table in the state store rather than from service address * Fix incorrect lookup by checkname instead of node name * Make sure that serverlookup is called with the right address format, added unit test. * Address code review comments * Tweaks style stuff.	2017-09-26 20:49:41 -07:00
James Phillips	4b17c9618f	Cleans up some edge cases in TestSnapshot_Forward_Leader. These could cause the tests to hang.	2017-09-26 14:07:28 -07:00
Preetha Appan	318d0232f7	Move Raft protocol version for list peers end point to server side, fix unit tests. This fixes #3449	2017-09-26 09:35:39 -05:00
James Phillips	fcaa889116	Bumps default Raft protocol to version 3. (#3477 ) * Changes default Raft protocol to 3. * Changes numPeers() to report only voters. This should have been there before, but it's more obvious that this is incorrect now that we default the Raft protocol to 3, which puts new servers in a read-only state while Autopilot waits for them to become healthy. * Fixes TestLeader_RollRaftServer. * Fixes TestOperator_RaftRemovePeerByAddress. * Fixes TestServer_. Relaxed the check for a given number of voter peers and instead do a thorough check that all servers see each other in their Raft configurations. Fixes TestACL_. These now just check for Raft replication to be set up, and don't care about the number of voter peers. Fixes TestOperator_Raft_ListPeers. * Fixes TestAutopilot_CleanupDeadServerPeriodic. * Fixes TestCatalog_ListNodes_ConsistentRead_Fail. * Fixes TestLeader_ChangeServerID and adjusts the conn pool to throw away sockets when it sees io.EOF. * Changes version to 1.0.0 in the options doc. * Makes metrics test more deterministic with autopilot metrics possible.	2017-09-25 15:27:04 -07:00
Preetha Appan	8394ad08db	Introduce Code Policy validation via sentinel, with a noop implementation	2017-09-25 13:44:55 -05:00
Frank Schröder	69a088ca85	New config parser, HCL support, multiple bind addrs (#3480 ) * new config parser for agent This patch implements a new config parser for the consul agent which makes the following changes to the previous implementation: * add HCL support * all configuration fragments in tests and for default config are expressed as HCL fragments * HCL fragments can be provided on the command line so that they can eventually replace the command line flags. * HCL/JSON fragments are parsed into a temporary Config structure which can be merged using reflection (all values are pointers). The existing merge logic of overwrite for values and append for slices has been preserved. * A single builder process generates a typed runtime configuration for the agent. The new implementation is more strict and fails in the builder process if no valid runtime configuration can be generated. Therefore, additional validations in other parts of the code should be removed. The builder also pre-computes all required network addresses so that no address/port magic should be required where the configuration is used and should therefore be removed. * Upgrade github.com/hashicorp/hcl to support int64 * improve error messages * fix directory permission test * Fix rtt test * Fix ForceLeave test * Skip performance test for now until we know what to do * Update github.com/hashicorp/memberlist to update log prefix * Make memberlist use the default logger * improve config error handling * do not fail on non-existing data-dir * experiment with non-uniform timeouts to get a handle on stalled leader elections * Run tests for packages separately to eliminate the spurious port conflicts * refactor private address detection and unify approach for ipv4 and ipv6. Fixes #2825 * do not allow unix sockets for DNS * improve bind and advertise addr error handling * go through builder using test coverage * minimal update to the docs * more coverage tests fixed * more tests * fix makefile * cleanup * fix port conflicts with external port server 'porter' * stop test server on error * do not run api test that change global ENV concurrently with the other tests * Run remaining api tests concurrently * no need for retry with the port number service * monkey patch race condition in go-sockaddr until we understand why that fails * monkey patch hcl decoder race condition until we understand why that fails * monkey patch spurious errors in strings.EqualFold from here * add test for hcl decoder race condition. Run with go test -parallel 128 * Increase timeout again * cleanup * don't log port allocations by default * use base command arg parsing to format help output properly * handle -dc deprecation case in Build * switch autopilot.max_trailing_logs to int * remove duplicate test case * remove unused methods * remove comments about flag/config value inconsistencies * switch got and want around since the error message was misleading. * Removes a stray debug log. * Removes a stray newline in imports. * Fixes TestACL_Version8. * Runs go fmt. * Adds a default case for unknown address types. * Reorders and reformats some imports. * Adds some comments and fixes typos. * Reorders imports. * add unix socket support for dns later * drop all deprecated flags and arguments * fix wrong field name * remove stray node-id file * drop unnecessary patch section in test * drop duplicate test * add test for LeaveOnTerm and SkipLeaveOnInt in client mode * drop "bla" and add clarifying comment for the test * split up tests to support enterprise/non-enterprise tests * drop raft multiplier and derive values during build phase * sanitize runtime config reflectively and add test * detect invalid config fields * fix tests with invalid config fields * use different values for wan sanitization test * drop recursor in favor of recursors * allow dns_config.udp_answer_limit to be zero * make sure tests run on machines with multiple ips * Fix failing tests in a few more places by providing a bind address in the test * Gets rid of skipped TestAgent_CheckPerformanceSettings and adds case for builder. * Add porter to server_test.go to make tests there less flaky * go fmt	2017-09-25 11:40:42 -07:00
James Phillips	268018c558	Robustifies check in TestCatalog_ListNodes_ConsistentRead_Fail test. Fixes #3469	2017-09-13 21:22:53 -07:00
James Phillips	8be4ee766a	Revert "Manages segments list via a pointer." This reverts commit c277a4250461443cbd63de0259e5e32766f651ea.	2017-09-07 16:37:11 -07:00
James Phillips	5008aabb62	Manages segments list via a pointer.	2017-09-07 16:21:07 -07:00
James Phillips	908f7be97f	Cleans up formatting.	2017-09-07 12:26:58 -07:00
James Phillips	02a3f3f27b	Shows the segment name in the keyring API and command output.	2017-09-07 12:17:39 -07:00
James Phillips	7c616e3768	Moves reconcile loop into segment stub.	2017-09-06 18:01:53 -07:00
James Phillips	4e34c2af06	Takes the skip out of the client check. Without this the merge delegate won't check the segment for non-servers a little below here.	2017-09-06 17:05:40 -07:00
James Phillips	78ac144fff	Merge pull request #3447 from hashicorp/issue-3070 Skips unique node ID check for old versions of Consul.	2017-09-06 13:24:15 -07:00
James Phillips	62d9299646	Fixes incorrect comment.	2017-09-06 13:23:19 -07:00
James Phillips	031f1874d0	Pulls down some code for the check loop.	2017-09-06 13:07:42 -07:00
James Phillips	2fd9328b21	Uses the Raft configuration for the self-add skip check.	2017-09-06 13:05:51 -07:00
Preetha Appan	1eae9f1e2f	Change member join reconcile step to process joining itself, to handle node IP address changes correctly when number of servers < 3	2017-09-06 13:53:01 -05:00
James Phillips	353e037c9b	Skips unique node ID check for old versions of Consul. Fixes #3070.	2017-09-05 22:57:29 -07:00
James Phillips	c629773b40	Makes the all segments query explicit, and the default for `consul members`.	2017-09-05 12:22:20 -07:00
James Phillips	bc9780baad	Adds simple rate limiting for client agent RPC calls to Consul servers. (#3440 ) * Added rate limiting for agent RPC calls. * Initializes the rate limiter based on the config. * Adds the rate limiter into the snapshot RPC path. * Adds unit tests for the RPC rate limiter. * Groups the RPC limit parameters under "limits" in the config. * Adds some documentation about the RPC limiter. * Sends a 429 response when the rate limiter kicks in. * Adds docs for new telemetry. * Makes snapshot telemetry look like RPC telemetry and cleans up comments.	2017-09-01 15:02:50 -07:00
Kyle Havlovitz	334e082848	Merge pull request #3431 from hashicorp/network-segments-oss	2017-09-01 10:24:58 -07:00
Kyle Havlovitz	ff994e9ade	Pass listeners into setupSegments	2017-08-31 17:56:43 -07:00
Kyle Havlovitz	5cc4b32a5d	Organize segments for a cleaner split between enterprise and OSS	2017-08-31 17:39:46 -07:00
Kyle Havlovitz	b77a0aa932	Fix some inconsistencies with segment logic and comments	2017-08-30 17:43:46 -07:00
Preetha Appan	0728a04dbb	Wire server provider for raft layer only on protocol version 3 and above, and update changelog	2017-08-30 14:36:47 -05:00
Kyle Havlovitz	6ded43131a	Add segment addr field to tags for LAN flood joiner	2017-08-30 11:58:29 -07:00
Kyle Havlovitz	1c04f1537a	Add agent.segment interpolation to prepared queries	2017-08-30 11:58:29 -07:00
Kyle Havlovitz	107d7f6c5a	Add rpc_listener option to segment config	2017-08-30 11:58:29 -07:00
James Phillips	6a6eadd8c7	Adds open source side of network segments (feature is Enterprise-only).	2017-08-30 11:58:29 -07:00
Preetha Appan	e944370cde	More cleanup from code review	2017-08-30 12:31:36 -05:00
Preetha Appan	a215c764cd	Remove copy pasted duplicate line, update documentation.	2017-08-30 10:02:10 -05:00
Preetha Appan	5a29eb7486	Consolidate server lookup into one place and replace usages of localConsuls.	2017-08-30 09:30:33 -05:00
Preetha Appan	d8fe01db4c	Remove stray commented line	2017-08-30 09:30:33 -05:00
Preetha Appan	ca48e7e4c2	Remove server address tracking logic from manager/router and maintain it as part of lan event listener instead. Used sync.Map to track this, and added unit tests	2017-08-30 09:30:33 -05:00
Preetha Appan	b4a9d77d49	ServerAddressProvider interface also returns an error now	2017-08-30 09:30:33 -05:00
Preetha Appan	edb408bc22	Use config struct to create NetworkTransport layer when setting up raft	2017-08-30 09:30:33 -05:00
Preetha Appan	01f8e469aa	Implement AddressProvider and wire that up to raft transport layer to support server nodes changing their IP addresses in containerized environments	2017-08-30 09:30:33 -05:00
Frank Schroeder	62c77d70f0	build: make tests independent of build tags When the metadata server is scanning the agents for potential servers it is parsing the version number which the agent provided when it joined. This version number has to conform to a certain format, i.e. 'n.n.n'. Without this version number properly set some tests fail with error messages that disguise the root cause. The default version number is currently set to 'unknown' in version/version.go which does not parse and triggers the tests to fail. The work around is to use a build tag 'consul' which will use the version number set in version_base.go instead which has the correct format and is set to the current release version. In addition, some parts of the code also require the version number to be of a certain value. Setting it to '0.0.0' for example makes some tests pass and others fail since they don't pass the semantic check. When using go build/install/test one has to remember to use '-tags consul' or tests will fail with non-obvious error messages. Using build tags makes the build process more complex and error prone since it prevents the use of the plain go toolchain and - at least in its current form - introduces subtle build and test issues. We should try to eliminate build tags for anything else but platform specific code. This patch removes all references to specific version numbers in the code and tests and sets the default version to '9.9.9' which is syntactically correct and passes the semantic check. This solves the issue of running go build/install/test without tags for the OSS build.	2017-08-30 13:40:18 +02:00
Frank Schröder	44e6b8122d	acl: consolidate error handling (#3401 ) The error handling of the ACL code relies on the presence of certain magic error messages. Since the error values are sent via RPC between older and newer consul agents we cannot just replace the magic values with typed errors and switch to type checks since this would break compatibility with older clients. Therefore, this patch moves all magic ACL error messages into the acl package and provides default error values and helper functions which determine the type of error.	2017-08-23 16:52:48 +02:00
Frank Schroeder	d9e2a51887	agent: drop unused code This code from http://github.com/hashicorp/consul/pull/3353 is no longer required.	2017-08-22 00:02:46 +02:00
James Phillips	3518e27a76	Revert "Return 403 rather than a 404 when acls cause all results to be filter…"	2017-08-09 15:06:57 -07:00
James Phillips	91205b2cd6	Revert "Ensure that we return a permission denied only if the list of keys/en…"	2017-08-09 15:06:20 -07:00
Preetha Appan	121326161e	Added unit test case to kvs_endpointtest	2017-08-09 15:50:22 -05:00
Preetha Appan	d06002dc62	Ensure that we return a permission denied only if the list of keys/entries prior to filtering by ACL is non empty	2017-08-09 15:32:18 -05:00
Frank Schroeder	c38dcf2d17	agent: move agent/consul/agent to agent/metadata	2017-08-09 14:36:52 +02:00
Frank Schroeder	85bdb77d90	agent: move agent/consul/servers to agent/router	2017-08-09 14:36:37 +02:00
Frank Schroeder	1d0bbfed9c	agent: move agent/consul/structs to agent/structs	2017-08-09 14:32:12 +02:00
Kyle Havlovitz	8c2e422074	Merge pull request #3369 from hashicorp/metrics-enhancements Add support for labels/filters from go-metrics	2017-08-08 13:55:30 -07:00
Kyle Havlovitz	975ded2714	Add support for labels/filters from go-metrics	2017-08-08 01:45:10 -07:00
Preetha Appan	6bac9355fd	Use sanitized version of node name of server in NS record, and start with "server" rather than "ns"	2017-08-07 11:11:55 +02:00
Preetha Appan	7e9d683ab1	Removed a copy pasted irrelevant comment, and other code review feedback	2017-08-07 11:11:54 +02:00
Preetha Appan	c38906daad	Add NS records and A records for each server. Constructs ns host names using the advertise address of the server.	2017-08-07 11:11:54 +02:00
James Phillips	803ed9a245	Adds secure introduction for the ACL replication token. (#3357 ) Adds secure introduction for the ACL replication token, as well as a separate enable config for ACL replication.	2017-08-03 15:39:31 -07:00
James Phillips	c31b56a03e	Adds a new /v1/acl/bootstrap API (#3349 )	2017-08-02 17:05:18 -07:00
Preetha Appan	307049e17f	Return nil instead of empty list when returning a PermissionDenied error, updated unit test	2017-07-31 17:23:20 -05:00
Preetha Appan	da29b74d03	Return 403 rather than a 404 when acls cause all results to be filtered out. This fixes #2637	2017-07-31 13:50:29 -05:00
James Phillips	8f1f762ddd	Adds missing autopilot snapshot test and avoids snapshotting nil. (#3333 )	2017-07-28 15:48:42 -07:00
James Phillips	6b51744ddf	Adds option to prepared queries to remove empty tags. (#3330 )	2017-07-26 22:46:43 -07:00
James Phillips	6e794ea1b3	Adds support for agent-side ACL token management via API instead of config files. (#3324 ) * Adds token store and removes all runtime use of config for ACL tokens. * Adds a new API for changing agent tokens on the fly.	2017-07-26 11:03:43 -07:00
Preetha Appan	4692b1478e	Add extra test case for deleting entire tree with empty prefix	2017-07-26 09:42:07 -05:00

1000 Commits