open-consul

Commit Graph

Author	SHA1	Message	Date
Kyle Havlovitz	f31582624d	state: Add terminating gateway events on updating a config entry Co-Authored-By: Daniel Nephin <dnephin@hashicorp.com>	2021-03-09 14:00:34 -05:00
Daniel Nephin	f42a2ca8a3	state: add first terminating gateway catalog event Health of a terminating gateway instance changes - Generate an event for creating/destroying this instance of the terminating gateway, duplicate it for each affected service Co-Authored-By: Kyle Havlovitz <kylehav@gmail.com>	2021-03-09 14:00:33 -05:00
Daniel Nephin	1184ceff9e	state: convert nodes.ID to new functional pattern In preparation for adding other identifiers to the index.	2021-03-05 12:30:40 -05:00
Daniel Nephin	4a44cfd676	Merge pull request #9188 from hashicorp/dnephin/more-streaming-tests Add more streaming tests	2021-02-26 12:36:55 -05:00
Daniel Nephin	4ef9578a07	Merge pull request #9703 from pierresouchay/streaming_tags_and_case_insensitive Streaming filter tags + case insensitive lookups for Service Names	2021-02-26 12:06:26 -05:00
Daniel Nephin	2cc3282d5d	catalog_events: set the right key for connect snapshots Add a test for catalog_event snapshot on connect topic	2021-02-25 14:30:39 -05:00
Daniel Nephin	85da1af04c	consul: Add integration tests of streaming. Restored from streaming-rpc-final branch. Co-authored-by: Paul Banks <banks@banksco.de>	2021-02-25 14:30:39 -05:00
Daniel Nephin	e8beda4685	state: Add a test for ServiceHealthSnapshot	2021-02-25 14:08:10 -05:00
Daniel Nephin	dd45c4cfe4	state: add a test case for memdb indexers	2021-02-19 17:14:46 -05:00
Daniel Nephin	7e4d693aaa	state: support for functional indexers These new functional indexers provide a few advantages: 1. enterprise differences can be isolated to a single function (the indexer function), making code easier to change 2. as a consequence of (1) we no longer need to wrap all the calls to Txn operations, making code easier to read. 3. by removing reflection we should increase the performance of all operations. One important change is in making all the function signatures the same. https://blog.golang.org/errors-are-values An extra boolean return value for SingleIndexer.FromObject is superfluous. The error value can indicate when the index value could not be created. By removing this extra return value we can use the same signature for both indexer functions. This has the nice property of a function being usable for both indexing operations.	2021-02-19 17:14:46 -05:00
Daniel Nephin	88a9bd6d3c	state: remove duplicate index on the checks table By using a new pattern for more specific indexes. This allows us to use the same index for both service checks and node checks. It removes the abstraction around memdb.Txn operations, and isolates all of the enterprise differences in a single place (the indexer).	2021-02-19 17:14:46 -05:00
Daniel Nephin	b781fec664	state: remove duplicate function catalogChecksForNodeService was a duplicate of catalogListServiceChecks	2021-02-19 17:14:46 -05:00
Daniel Nephin	d33bc493af	Merge pull request #9720 from hashicorp/dnephin/ent-meta-ergo-1 structs: rename EnterpriseMeta constructor	2021-02-16 15:31:58 -05:00
Daniel Nephin	53c82cee86	Merge pull request #9772 from hashicorp/streamin-fix-bad-cached-snapshot streaming: fix snapshot cache bug	2021-02-16 15:28:00 -05:00
Daniel Nephin	b17967827d	Merge pull request #9728 from hashicorp/dnephin/state-index-table state: document how index table is used	2021-02-16 15:27:27 -05:00
Daniel Nephin	c40d063a0e	structs: rename EnterpriseMeta constructor To match the Go convention.	2021-02-16 14:45:43 -05:00
Daniel Nephin	a29b848e3b	stream: fix a snapshot cache bug Previously a snapshot created as part of a resume-stream request could have incorrectly cached the newSnapshotToFollow event. This would cause clients to error because they received an unexpected framing event.	2021-02-16 12:52:23 -05:00
Daniel Nephin	2726c65fbe	stream: test the snapshot cache is saved correctly when the cache entry is created from resuming a stream.	2021-02-16 12:08:43 -05:00
R.B. Boyer	91d9544803	connect: connect CA Roots in the primary datacenter should use a SigningKeyID derived from their local intermediate (#9428 ) This fixes an issue where leaf certificates issued in primary datacenters using Vault as a Connect CA would be reissued very frequently (every ~20 seconds) because the logic meant to detect root rotation was errantly triggering. The hash of the rootCA was being compared against a hash of the intermediateCA and always failing. This doesn't apply to the Consul built-in CA provider because there is no intermediate in use in the primary DC. This is reminiscent of #6513	2021-02-08 13:18:51 -06:00
Daniel Nephin	cdda3b9321	state: Use the tableIndex constant	2021-02-05 18:37:45 -05:00
Daniel Nephin	de841bd459	state: Document index table And move the IndexEntry (which is stored in the table) next to the table schema definition.	2021-02-05 18:37:45 -05:00
Daniel Nephin	23cfbc8f8d	Merge pull request #9719 from hashicorp/oss/state-store-4 state: remove registerSchema	2021-02-05 14:02:38 -05:00
Daniel Nephin	dc70f583d4	Merge pull request #9718 from hashicorp/oss/dnephin/ent-meta-in-state-store-3 state: convert all table name constants to the new prefix pattern	2021-02-05 14:02:07 -05:00
Daniel Nephin	eb5d71fd19	Merge pull request #9665 from hashicorp/dnephin/state-store-indexes-2 state: move config-entries table definition to config_entries_schema.go	2021-02-05 14:01:08 -05:00
Daniel Nephin	9beadc578b	Merge pull request #9664 from hashicorp/dnephin/state-store-indexes state: move ACL schema and index definitions to acl_schema.go	2021-02-05 13:38:31 -05:00
Daniel Nephin	b747b27afd	state: remove the need for registerSchema registerSchema creates some indirection which is not necessary in this case. newDBSchema can call each of the tables. Enterprise tables can be added from the existing withEnterpriseSchema shim.	2021-02-05 12:19:56 -05:00
Daniel Nephin	33621706ac	state: rename table name constants to use pattern the 'table' prefix is shorter, and also reads better in queries.	2021-02-05 12:12:19 -05:00
Daniel Nephin	8569295116	state: rename connect constants	2021-02-05 12:12:19 -05:00
Daniel Nephin	afdbf2a8ef	state: rename table name constants to new pattern Using Apps Hungarian Notation for these constants makes the memdb queries more readable.	2021-02-05 12:12:18 -05:00
Pierre Souchay	c466b08481	Streaming filter tags + case insensitive lookups for Service Names Will fix: * https://github.com/hashicorp/consul/issues/9695 * https://github.com/hashicorp/consul/issues/9702	2021-02-04 11:00:51 +01:00
Daniel Nephin	f929a7117e	state: Remove unnecessary entMeta arg to EnsureConfigEntry	2021-02-03 18:10:38 -05:00
Kyle Havlovitz	1dee4173c1	connect/ca: Allow ForceWithoutCrossSigning for all providers This allows setting ForceWithoutCrossSigning when reconfiguring the CA for any provider, in order to forcibly move to a new root in cases where the old provider isn't reachable or able to cross-sign for whatever reason.	2021-01-29 13:38:11 -08:00
Daniel Nephin	09425b22a1	state: rename config-entries table const to match new pattern	2021-01-28 20:34:34 -05:00
Daniel Nephin	7d17e20270	state: move config-entries table to new pattern	2021-01-28 20:34:15 -05:00
Daniel Nephin	825b8ade39	state: use indexID this change was already made to enterprise, so backporting it.	2021-01-28 20:30:08 -05:00
Daniel Nephin	2a262f07fc	state: Move ACL schema indexes to match Ent and use constants for table and index names.	2021-01-28 20:05:09 -05:00
Matt Keeler	1379b5f7d6	Upgrade raft-autopilot and wait for autopilot to stop when revoking leadership (#9644 ) Fixes: 9626	2021-01-27 11:14:52 -05:00
Hans Hasselberg	623aab5880	Add flags to support CA generation for Connect (#9585 )	2021-01-27 08:52:15 +01:00
R.B. Boyer	5777fa1f59	server: initialize mgw-wanfed to use local gateways more on startup (#9528 ) Fixes #9342	2021-01-25 17:30:38 -06:00
Daniel Nephin	d7d081f402	Merge pull request #9420 from hashicorp/dnephin/reduce-duplicate-in-catalog-schema state: reduce interface for Enterprise schema	2021-01-25 17:04:25 -05:00
R.B. Boyer	6622185d64	server: use the presence of stored federation state data as a sign that we already activated the federation state feature flag (#9519 ) This way we only have to wait for the serf barrier to pass once before we can make use of federation state APIs Without this patch every restart needs to re-compute the change.	2021-01-25 13:24:32 -06:00
R.B. Boyer	0247f409a0	server: when wan federating via mesh gateways only do heuristic primary DC bypass on the leader (#9366 ) Fixes #9341	2021-01-22 10:03:24 -06:00
Freddy	5519051c84	Update topology mapping Refs on all proxy instance deletions (#9589 ) * Insert new upstream/downstream mapping to persist new Refs * Avoid upserting mapping copy if it's a no-op * Add test with panic repro * Avoid deleting up/downstreams from inside memdb iterator * Avoid deleting gateway mappings from inside memdb iterator * Add CHANGELOG entry * Tweak changelog entry Co-authored-by: Paul Banks <banks@banksco.de>	2021-01-20 15:17:26 +00:00
Daniel Nephin	979749d86e	state: do not delete from inside an iteration Deleting from memdb inside an iteration can cause a panic from Iterator.Next. This case is technically safe (for now) because the iterator is using the root radix tree not a modified one. However this could break at any time if someone adds an insert or delete to the coordinates table before this place in the function. It also sets a bad example, because generally deletes in an iterator are not safe. So this commit uses the pattern we have in other places to move the deletes out of the iteration.	2021-01-19 17:00:07 -05:00
Matt Keeler	2d2ce1fb0c	Ensure that CA initialization does not block leader election. After fixing that bug I uncovered a couple more: Fix an issue where we might try to cross sign a cert when we never had a valid root. Fix a potential issue where reconfiguring the CA could cause either the Vault or AWS PCA CA providers to delete resources that are still required by the new incarnation of the CA.	2021-01-19 15:27:48 -05:00
Daniel Nephin	52a1d78e39	state: add a regression test for state store schema To allow the index to be refactored without accidental changes. To update the expected value run: 'go test ./agent/consul/state -update'	2021-01-15 18:49:55 -05:00
Daniel Nephin	aa21c1ea04	state: reduce interface for Enterprise schema Using withEnterpriseSchema() we can apply any enterprise schema changes with a single shim, removing the need to duplicate all of the table definitions. Also move all the catalog schemas to a new file to shrink catalog.go a bit.	2021-01-15 18:49:55 -05:00
Daniel Nephin	e8427a48ab	agent/consul: Rename RPCRate -> RPCRateLimit so that the field name is consistent across config structs.	2021-01-14 17:26:00 -05:00
Daniel Nephin	e5320c2db6	agent/consul: make Client/Server config reloading more obvious I believe this commit also fixes a bug. Previously RPCMaxConnsPerClient was not being re-read from the RuntimeConfig, so passing it to Server.ReloadConfig was never changing the value. Also improve the test runtime by not doing a lot of unnecessary work.	2021-01-14 17:21:10 -05:00
Daniel Nephin	f2b504873a	Merge pull request #9460 from hashicorp/dnephin/fix-data-races Fix a couple data races in tests	2021-01-14 17:07:01 -05:00
Chris Piraino	baad708929	Fix bug in usage metrics when multiple service instances are changed in a single transaction (#9440 ) * Fix bug in usage metrics that caused a negative count to occur There were a couple of instances where usage metrics would do the wrong thing and result in incorrect counts, causing the count to attempt to decrement below zero and return an error. The usage metrics did not account for various places where a single transaction could delete/update/add multiple service instances at once. We also remove the error when attempting to decrement below zero, and instead just make sure we do not accidentally underflow the unsigned integer. This is a more graceful failure than returning an error and not allowing a transaction to commit. * Add changelog	2021-01-12 15:31:47 -06:00
Chris Piraino	2eac571276	Log replication warnings when no error suppression is defined (#9320 ) * Log replication warnings when no error suppression is defined * Add changelog file	2021-01-08 14:03:06 -06:00
Daniel Nephin	45f0afcbf4	structs: Fix printing of IDs These types are used as values (not pointers) in other structs. Using a pointer receiver causes problems when the value is printed. fmt will not call the String method if it is passed a value and the String method has a pointer receiver. By using a value receiver the correct string is printed. Also remove some unused methods.	2021-01-07 18:47:38 -05:00
Daniel Nephin	27c38bfebb	Merge pull request #9213 from hashicorp/dnephin/resolve-tokens-take-2 acl: Remove some unused things and document delegate method	2021-01-06 18:51:51 -05:00
R.B. Boyer	db62541676	acl: use the presence of a management policy in the state store as a sign that we already migrated to v2 acls (#9505 ) This way we only have to wait for the serf barrier to pass once before we can upgrade to v2 acls. Without this patch every restart needs to re-compute the change, and potentially if a stray older node joins after a migration it might regress back to v1 mode which would be problematic.	2021-01-05 17:04:27 -06:00
Matt Keeler	3a79b559f9	Special case the error returned when we have a Raft leader but are not tracking it in the ServerLookup (#9487 ) This can happen when one other node in the cluster such as a client is unable to communicate with the leader server and sees it as failed. When that happens its failing status eventually gets propagated to the other servers in the cluster and eventually this can result in RPCs returning “No cluster leader” error. That error is misleading and unhelpful for determining the root cause of the issue as it's not Raft stability but rather a client -> server networking issue. Therefore this commit will add a new error that will be returned in that case to differentiate between the two cases.	2021-01-04 14:05:23 -05:00
R.B. Boyer	42dea6f01e	server: deletions of intentions by name using the intention API is now idempotent (#9278 ) Restoring a behavior inadvertently changed while fixing #9254	2021-01-04 11:27:00 -06:00
Daniel Nephin	088831c91e	Maybe fix another data race in a test	2020-12-22 18:53:54 -05:00
Daniel Nephin	d0f2eca8de	Fix one race caused by t.Parallel	2020-12-22 18:27:18 -05:00
Daniel Nephin	c66a63275f	Merge pull request #9340 from hashicorp/dnephin/skip-slow-tests-with-short testing: skip slow tests with -short	2020-12-11 13:33:44 -05:00
R.B. Boyer	f9dcaf7f6b	acl: global tokens created by auth methods now correctly replicate to secondary datacenters (#9351 ) Previously the tokens would fail to insert into the secondary's state store because the AuthMethod field of the ACLToken did not point to a known auth method from the primary.	2020-12-09 15:22:29 -06:00
Daniel Nephin	ef0999547a	testing: skip slow tests with -short Add a skip condition to all tests slower than 100ms. This change was made using `gotestsum tool slowest` with data from the last 3 CI runs of master. See https://github.com/gotestyourself/gotestsum#finding-and-skipping-slow-tests With this change: ``` $ time go test -count=1 -short ./agent ok github.com/hashicorp/consul/agent 0.743s real 0m4.791s $ time go test -count=1 -short ./agent/consul ok github.com/hashicorp/consul/agent/consul 4.229s real 0m8.769s ```	2020-12-07 13:42:55 -05:00
Kyle Havlovitz	57210a59c3	connect: Fix a case where the active root would get unset even when there wasn't a new one	2020-12-02 11:42:23 -08:00
Kyle Havlovitz	91d5d6c586	Merge pull request #9009 from hashicorp/update-secondary-ca connect: Fix an issue with updating CA config in a secondary datacenter	2020-11-30 14:49:28 -08:00
Kyle Havlovitz	c5167cf9c4	Use a buffered channel for CA intermediate renew func	2020-11-30 14:37:24 -08:00
R.B. Boyer	6d6b6c15c6	server: fix panic when deleting a non existent intention (#9254 ) * server: fix panic when deleting a non existent intention * add changelog * Always return an error when deleting non-existent ixn Co-authored-by: freddygv <gh@freddygv.xyz>	2020-11-24 13:44:20 -05:00
Hans Hasselberg	25f9e232af	add missing descriptions for metrics	2020-11-23 22:06:30 +01:00
Kit Patella	7a8844ccce	add entries for missing fsm operations and mark duplicated metrics prefixes as deprecated	2020-11-23 12:42:51 -08:00
Kyle Havlovitz	a01f853aa5	Clean up the logic in persistNewRootAndConfig	2020-11-20 15:54:44 -08:00
Kyle Havlovitz	26a9c985c5	Add CA server delegate interface for testing	2020-11-19 20:08:06 -08:00
Kit Patella	4ad076207e	add telemetry and definition help entries for missing catalog and acl metrics	2020-11-19 13:29:44 -08:00
Kit Patella	46205bbf27	remove stale entries and rename/define acl.resolveToken	2020-11-19 13:06:28 -08:00
Freddy	e4e306210a	Require operator:write to get Connect CA config (#9240 ) A vulnerability was identified in Consul and Consul Enterprise (“Consul”) such that operators with `operator:read` ACL permissions are able to read the Consul Connect CA configuration when explicitly configured with the `/v1/connect/ca/configuration` endpoint, including the private key. This allows the user to effectively privilege escalate by enabling the ability to mint certificates for any Consul Connect services. This would potentially allow them to masquerade (receive/send traffic) as any service in the mesh. -- This PR increases the permissions required to read the Connect CA's private key when it was configured via the `/connect/ca/configuration` endpoint. They are now `operator:write`.	2020-11-19 10:14:48 -07:00
Kyle Havlovitz	c8d4a40a87	connect: update some function comments in CA manager	2020-11-17 16:00:19 -08:00
Daniel Nephin	b9306d8827	acl: remove a test-only method	2020-11-17 18:16:34 -05:00
Daniel Nephin	9e7c8dd19d	Remove two unused delegate methods	2020-11-17 18:16:26 -05:00
Matt Keeler	4bca029be9	Refactor to call non-voting servers read replicas (#9191 ) Co-authored-by: Kit Patella <kit@jepsen.io>	2020-11-17 10:53:57 -05:00
Kit Patella	4dfcdbab26	Merge pull request #9198 from hashicorp/mkcp/telemetry/add-all-metric-definitions Add metric definitions for all metrics known at Consul start	2020-11-16 15:54:50 -08:00
Matt Keeler	197a37a860	Prevent panic if autopilot health is requested prior to leader establishment finishing. (#9204 )	2020-11-16 17:08:17 -05:00
Daniel Nephin	de88ceed1c	Merge pull request #9114 from hashicorp/dnephin/filtering-in-stream stream: improve naming of Payload methods	2020-11-16 14:20:07 -05:00
Kit Patella	0b18f5612e	trim help strings to save a few bytes	2020-11-16 11:02:11 -08:00
Kit Patella	374748dafc	merge master	2020-11-16 10:46:53 -08:00
Kit Patella	af719981f3	finish adding static server metrics	2020-11-13 16:26:08 -08:00
Kyle Havlovitz	0a86533e20	Reorganize some CA manager code for correctness/readability	2020-11-13 14:46:01 -08:00
Kyle Havlovitz	5de81c1375	connect: Add CAManager for synchronizing CA operations	2020-11-13 14:33:44 -08:00
Kyle Havlovitz	0b4876f906	connect: Add logic for updating secondary DC intermediate on config set	2020-11-13 14:33:44 -08:00
R.B. Boyer	db1184c094	server: intentions CRUD requires connect to be enabled (#9194 ) Fixes #9123	2020-11-13 16:19:12 -06:00
Kit Patella	b486c1bce8	add the service name in the agent rather than in the definitions themselves	2020-11-13 13:18:04 -08:00
R.B. Boyer	e323014faf	server: remove config entry CAS in legacy intention API bridge code (#9151 ) Change so line-item intention edits via the API are handled via the state store instead of via CAS operations. Fixes #9143	2020-11-13 14:42:21 -06:00
R.B. Boyer	6300abed18	server: skip deleted and deleting namespaces when migrating intentions to config entries (#9186 )	2020-11-13 13:56:41 -06:00
Mike Morris	a343365da7	ci: update to Go 1.15.4 and alpine:3.12 (#9036 ) * ci: stop building darwin/386 binaries Go 1.15 drops support for 32-bit binaries on Darwin https://golang.org/doc/go1.15#darwin * tls: ConnectionState::NegotiatedProtocolIsMutual is deprecated in Go 1.15, this value is always true * correct error messages that changed slightly * Completely regenerate some TLS test data Co-authored-by: R.B. Boyer <rb@hashicorp.com>	2020-11-13 13:02:59 -05:00
R.B. Boyer	758384893d	server: break up Intention.Apply monolithic method (#9007 ) The Intention.Apply RPC is quite large, so this PR attempts to break it down into smaller functions and dissolves the pre-config-entry approach to the breakdown as it only confused things.	2020-11-13 09:15:39 -06:00
Kit Patella	9533372ded	first pass on agent-configured prometheusDefs and adding defs for every consul metric	2020-11-12 18:12:12 -08:00
R.B. Boyer	a5bd1ba323	agent: return the default ACL policy to callers as a header (#9101 ) Header is: X-Consul-Default-ACL-Policy=<allow\|deny> This is of particular utility when fetching matching intentions, as the fallthrough for a request that doesn't match any intentions is to enforce using the default acl policy.	2020-11-12 10:38:32 -06:00
Matt Keeler	2badb01d30	Add a parameter in state store methods to indicate whether a resource insertion is from a snapshot restoration (#9156 ) The Catalog, Config Entry, KV and Session resources potentially re-validate the input as it's coming in. We need to prevent snapshot restoration failures due to missing namespaces or namespaces that are being deleted in enterprise.	2020-11-11 11:21:42 -05:00
Matt Keeler	1f40f51a58	Fix a bunch of linter warnings	2020-11-09 09:22:12 -05:00
Matt Keeler	755fb72994	Switch to using the external autopilot module	2020-11-09 09:22:11 -05:00
Daniel Nephin	e4a78c977d	stream: document that Payload must be immutable If they are sent to EventPublisher.Publish. Also document that PayloadEvents is expected to come from a subscription and that it is not immutable.	2020-11-06 13:00:33 -05:00
Daniel Nephin	4fc073b1f4	stream: rename FilterByKey	2020-11-05 19:21:16 -05:00
Daniel Nephin	d4cd2fa6a8	stream: Add HasReadPermission to Payload Required now that filter is a method on PayloadEvents instead of Event	2020-11-05 19:17:18 -05:00
Daniel Nephin	8a26bca020	stream: move event filtering to PayloadEvents Removes the weirdness around PayloadEvents.FilterByKey	2020-11-05 17:50:17 -05:00
Daniel Nephin	dcacfd3548	stream: Remove unused method	2020-11-05 16:49:59 -05:00
Daniel Nephin	621f1db766	Merge pull request #9073 from hashicorp/dnephin/backport-streaming-namespaces streaming: backport namespace changes	2020-11-05 14:19:10 -05:00
Daniel Nephin	cd220e5d6c	Merge pull request #9061 from hashicorp/dnephin/event-fields stream: support filtering by namespace	2020-11-05 14:18:35 -05:00
Daniel Nephin	f6b629852f	state: test EventPayloadCheckServiceNode.FilterByKey Also fix a bug in that function when only one of key or namespace were the empty string.	2020-10-30 14:35:57 -04:00
Daniel Nephin	60df44df4f	stream: Add tests for filterByKey with namespace And fix a bug where a request with a Namespace but no Key would not be properly filtered	2020-10-30 14:35:42 -04:00
Daniel Nephin	318dfbe6e4	stream: Move FilterByKey events to a table In preparation for adding new tests.	2020-10-30 14:35:28 -04:00
Daniel Nephin	2d0030da39	state: use enterprise meta for creating events	2020-10-30 14:34:04 -04:00
Daniel Nephin	b57c7afcbb	stream: include the namespace in the snap cache key Otherwise the wrong snapshot could be returned when the same key is used in different namespaces	2020-10-30 14:34:04 -04:00
Daniel Nephin	8da30fcb9a	subscribe: set the request namespace	2020-10-30 14:34:04 -04:00
R.B. Boyer	67a0d0c426	state: ensure we unblock intentions queries upon the upgrade to config entries (#9062 ) 1. do a state store query to list intentions as the agent would do over in `agent/proxycfg` backing `agent/xds` 2. upgrade the database and do a fresh `service-intentions` config entry write 3. the blocking query inside of the agent cache in (1) doesn't notice (2)	2020-10-29 15:28:31 -05:00
R.B. Boyer	78014653b3	restore prior signature of test helper so enterprise compiles	2020-10-29 13:52:15 -05:00
Daniel Nephin	61ce0964a4	stream: remove Event.Key Makes Payload a type with FilterByKey so that Payloads can implement filtering by key. With this approach we don't need to expose a Namespace field on Event, and we don't need to invent micro formats or require a bunch of code to be aware of exactly how the key field is encoded.	2020-10-28 16:48:04 -04:00
Daniel Nephin	8ef4c0fcc5	state: use go-cmp for comparison The output of the previous assertions made it impossible to debug the tests without code changes. With go-cmp comparing the entire slice we can see the full diffs making it easier to debug failures.	2020-10-28 16:33:00 -04:00
Daniel Nephin	44da869ed4	stream: Use a no-op event publisher if streaming is disabled	2020-10-28 13:54:19 -04:00
Daniel Nephin	eea87e1acf	store: use a ReadDB for snapshots to remove the cyclic dependency between the snapshot handlers and the state.Store	2020-10-28 13:07:42 -04:00
Daniel Nephin	cfe0ffde15	Merge pull request #9026 from hashicorp/dnephin/streaming-without-cache-query-param streaming: rename config and remove requirement for cache=1	2020-10-28 12:33:25 -04:00
Daniel Nephin	03d2be03e7	Merge pull request #8618 from hashicorp/dnephin/remove-txn-readtxn state: Use ReadTxn everywhere	2020-10-28 12:32:47 -04:00
Daniel Nephin	abd8cfcfe9	state: disable streaming connect topic	2020-10-26 11:49:47 -04:00
R.B. Boyer	0a80e82f21	server: config entry replication now correctly uses namespaces in comparisons (#9024 ) Previously config entries sharing a kind & name but in different namespaces could occasionally cause "stuck states" in replication because the namespace fields were ignored during the differential comparison phase. Example: Two config entries written to the primary: kind=A,name=web,namespace=bar kind=A,name=web,namespace=foo Under the covers these both get saved to memdb, so they are sorted by all 3 components (kind,name,namespace) during natural iteration. This means that before the replication code does its own incomplete sort, the underlying data IS sorted by namespace ascending (bar comes before foo). After one pass of replication the primary and secondary datacenters have the same set of config entries present. If "kind=A,name=web,namespace=bar" were to be deleted, then things get weird. Before replication the two sides look like: primary: [ kind=A,name=web,namespace=foo ] secondary: [ kind=A,name=web,namespace=bar kind=A,name=web,namespace=foo ] The differential comparison phase walks these two lists in sorted order and first compares "kind=A,name=web,namespace=foo" vs "kind=A,name=web,namespace=bar" and falsely determines they are the SAME and thus cause an update of "kind=A,name=web,namespace=foo". Then it compares "<nothing>" with "kind=A,name=web,namespace=foo" and falsely determines that the latter should be DELETED. During reconciliation the deletes are processed before updates, and so for a brief moment in the secondary "kind=A,name=web,namespace=foo" is erroneously deleted and then immediately restored. Unfortunately after this replication phase the final state is identical to the initial state, so when it loops around again (rate limited) it repeats the same set of operations indefinitely.	2020-10-23 13:41:54 -05:00
Daniel Nephin	f9b2834171	state: convert the remaining functions to ReadTxn Required also converting some of the transaction functions to WriteTxn because TxnRO() called the same helper as TxnRW. This change allows us to return a memdb.Txn for read-only txn instead of wrapping them with state.txn.	2020-10-23 14:29:22 -04:00
Daniel Nephin	26387cdc0e	Merge pull request #8975 from hashicorp/dnephin/stream-close-on-unsub stream: close the subscription on Unsubscribe	2020-10-23 12:58:12 -04:00
Freddy	d23038f94f	Add HasExact to topology endpoint (#9010 )	2020-10-23 10:45:41 -06:00
Daniel Nephin	fb8b68a6ec	stream: close the subscription on Unsubscribe	2020-10-22 13:39:27 -04:00
Pierre Souchay	54f9f247f8	Consul Service meta wrongly computes and exposes non_voter meta (#8731 ) * Consul Service meta wrongly computes and exposes non_voter meta In Serf Tags, enterprise members being non-voters use the tag `nonvoter=1`, not `non_voter = false`, so non-voters in members were wrongly displayed as voter. Demonstration: ``` consul members -detailed\|grep voter consul20-hk5 10.200.100.110:8301 alive acls=1,build=1.8.4+ent,dc=hk5,expect=3,ft_fs=1,ft_ns=1,id=xxxxxxxx-5629-08f2-3a79-10a1ab3849d5,nonvoter=1,port=8300,raft_vsn=3,role=consul,segment=<all>,use_tls=1,vsn=2,vsn_max=3,vsn_min=2,wan_join_port=8302 ``` * Added changelog * Added changelog entry	2020-10-09 17:18:24 -04:00
s-christoff	a62705101f	Enhance the output of consul snapshot inspect (#8787 )	2020-10-09 14:57:29 -05:00
Kyle Havlovitz	707f4a8d26	Stop intermediate renew routine on leader stop	2020-10-09 12:30:57 -07:00
Kyle Havlovitz	926a393a5c	Merge pull request #8784 from hashicorp/renew-intermediate-primary connect: Enable renewing the intermediate cert in the primary DC	2020-10-09 12:18:59 -07:00
Daniel Nephin	dd0e8d42c4	Merge pull request #8825 from hashicorp/streaming/add-config streaming: add config and docs	2020-10-09 14:33:58 -04:00
Chris Piraino	4f77f87065	Emit service usage metrics with correct labeling strategy (#8856 ) Previously, we would emit service usage metrics both with and without a namespace label attached. This is problematic in the case when you want to aggregate metrics together, i.e. "sum(consul.state.services)". This would cause services to be counted twice in that aggregate, once via the metric emitted with a namespace label, and once in the metric emitted without any namespace label.	2020-10-09 11:01:45 -05:00
Kyle Havlovitz	50543d678e	Fix intermediate refresh test comments	2020-10-09 08:53:33 -07:00
R.B. Boyer	d2f09ca306	upstream some differences from enterprise (#8902 )	2020-10-09 09:42:53 -05:00
Kyle Havlovitz	968fd8660d	Update CI for leader renew CA test using Vault	2020-10-09 05:48:15 -07:00
Kyle Havlovitz	62270c3f9a	Merge branch 'master' into renew-intermediate-primary	2020-10-09 04:40:34 -07:00
Kyle Havlovitz	b78f618beb	connect: Check for expired root cert when cross-signing	2020-10-09 04:35:56 -07:00
Freddy	89d52f41c4	Add protocol to the topology endpoint response (#8868 )	2020-10-08 17:31:54 -06:00
Matt Keeler	141eb60f06	Add per-agent reconnect timeouts (#8781 ) This allows for client agents to be run in a more stateless manner where they may be abruptly terminated and not expected to come back. If advertising a per-agent reconnect timeout using the advertise_reconnect_timeout configuration when that agent leaves, other agents will wait only that amount of time for the agent to come back before reaping it. This has the advantageous side effect of causing servers to deregister the node/services/checks for that agent sooner than if the global reconnect_timeout was used.	2020-10-08 15:02:19 -04:00
Daniel Nephin	05df7b18a9	config: add field for enabling streaming RPC endpoint	2020-10-08 12:11:20 -04:00
Freddy	de4af766f3	Support ingress gateways in mesh viz endpoint (#8864 ) Co-authored-by: R.B. Boyer <rb@hashicorp.com>	2020-10-08 09:47:09 -06:00
Daniel Nephin	a94fe054f0	Merge pull request #8809 from hashicorp/streaming/materialize-view Add StreamingHealthServices cache-type	2020-10-07 21:26:38 -04:00
Daniel Nephin	e0236b5a9f	Merge pull request #8818 from hashicorp/streaming/add-subscribe-service-batch-events stream: handle batch events as a special case of Event	2020-10-07 21:25:32 -04:00
Daniel Nephin	783627aeef	Merge pull request #8768 from hashicorp/streaming/add-subscribe-service subscribe: add subscribe service for streaming change events	2020-10-07 21:24:03 -04:00
Freddy	7d1f50d2e6	Return intention info in svc topology endpoint (#8853 )	2020-10-07 18:35:34 -06:00
R.B. Boyer	35c4efd220	connect: support defining intentions using layer 7 criteria (#8839 ) Extend Consul’s intentions model to allow for request-based access control enforcement for HTTP-like protocols in addition to the existing connection-based enforcement for unspecified protocols (e.g. tcp).	2020-10-06 17:09:13 -05:00
R.B. Boyer	d6dce2332a	connect: intentions are now managed as a new config entry kind "service-intentions" (#8834 ) - Upgrade the ConfigEntry.ListAll RPC to be kind-aware so that older copies of consul will not see new config entries they don't understand replicate down. - Add shim conversion code so that the old API/CLI method of interacting with intentions will continue to work so long as none of these are edited via config entry endpoints. Almost all of the read-only APIs will continue to function indefinitely. - Add new APIs that operate on individual intentions without IDs so that the UI doesn't need to implement CAS operations. - Add a new serf feature flag indicating support for intentions-as-config-entries. - The old line-item intentions way of interacting with the state store will transparently flip between the legacy memdb table and the config entry representations so that readers will never see a hiccup during migration where the results are incomplete. It uses a piece of system metadata to control the flip. - The primary datacenter will begin migrating intentions into config entries on startup once all servers in the datacenter are on a version of Consul with the intentions-as-config-entries feature flag. When it is complete the old state store representations will be cleared. We also record a piece of system metadata indicating this has occurred. We use this metadata to skip ALL of this code the next time the leader starts up. - The secondary datacenters continue to run the old intentions replicator until all servers in the secondary DC and primary DC support intentions-as-config-entries (via serf flag). Once this condition is met the old intentions replicator ceases. - The secondary datacenters replicate the new config entries as they are migrated in the primary.
When they detect that the primary has zeroed its old state store table, they wait until all config entries up to that point are replicated and then zero their own copy of the old state store table. We also record a piece of system metadata indicating this has occurred. We use this metadata to skip ALL of this code the next time the leader starts up.	2020-10-06 13:24:05 -05:00
Daniel Nephin	83401194ab	streaming: improve godoc for cache-type And fix a bug where any error that implemented the temporary interface was considered a temporary error, even when the method would return false.	2020-10-06 13:52:02 -04:00
Daniel Nephin	f857aef4a8	submatview: add a test for handling of NewSnapshotToFollow Also add some godoc Rename some vars and functions Fix a data race in the new cache test for entry closing.	2020-10-06 13:22:02 -04:00
Daniel Nephin	ad29cf4f94	stream: Return a single event from a subscription.Next Handle batch events as a single event	2020-10-06 13:18:20 -04:00
Daniel Nephin	fa115c6249	Move agent/subscribe -> agent/rpc/subscribe	2020-10-06 12:49:35 -04:00
Daniel Nephin	011109a6f6	subscribe: extract streamID and logging from Subscribe By extracting all of the tracing logic the core logic of the Subscribe endpoint is much easier to read.	2020-10-06 12:49:35 -04:00
Daniel Nephin	4c4441997a	subscribe: add integration test for acl token updates	2020-10-06 12:49:35 -04:00
Daniel Nephin	371ec2d70a	subscribe: add a stateless subscribe service for the gRPC server With a Backend that provides access to the necessary dependencies.	2020-10-06 12:49:35 -04:00
Daniel Nephin	ae433947a4	Merge pull request #8799 from hashicorp/streaming/rename-framing-events stream: remove EndOfEmptySnapshot, add NewSnapshotToFollow	2020-10-06 12:42:58 -04:00
R.B. Boyer	a77b518542	server: create new memdb table for storing system metadata (#8703 ) This adds a new very tiny memdb table and corresponding raft operation for updating a very small effective map[string]string collection of "system metadata". This can persistently record a fact about the Consul state machine itself. The first use of this feature will come in a later PR.	2020-10-06 10:08:37 -05:00
Daniel Nephin	2706cf9b2a	Merge pull request #8802 from hashicorp/dnephin/extract-lib-retry lib/retry - extract a new package from lib/retry.go	2020-10-05 14:22:37 -04:00
freddygv	82a17ccee6	Do not evaluate discovery chain for topology upstreams	2020-10-05 10:24:50 -06:00
freddygv	63c50e15bc	Single DB txn for ServiceTopology and other PR comments	2020-10-05 10:24:50 -06:00
freddygv	263bd9dd92	Add topology HTTP endpoint	2020-10-05 10:24:50 -06:00
freddygv	7c11580e93	Add topology RPC endpoint	2020-10-05 10:24:50 -06:00
freddygv	21c4708fe9	Add topology ACL filter	2020-10-05 10:24:50 -06:00
freddygv	ac54bf99b3	Add func to combine up+downstream queries	2020-10-05 10:24:50 -06:00
freddygv	160a6539d1	factor in discovery chain when querying up/downstreams	2020-10-05 10:24:50 -06:00
freddygv	214b25919f	support querying upstreams/downstreams from registrations	2020-10-05 10:24:50 -06:00
freddygv	3653045cb0	Add method for downstreams from disco chain	2020-10-05 10:24:50 -06:00
Daniel Nephin	40aac46cf4	lib/retry: Refactor to reduce the interface surface Reduce Jitter to one function Rename NewRetryWaiter Fix a bug in calculateWait where maxWait was applied before jitter, which would make it possible to wait longer than maxWait.	2020-10-04 18:12:42 -04:00
Daniel Nephin	0c7f9c72d7	lib/retry: extract a new package from lib	2020-10-04 17:43:01 -04:00
Daniel Nephin	9c5181c897	stream: full test coverage for EventPublisher.Subscribe	2020-10-02 13:46:24 -04:00
Daniel Nephin	0769f54fe1	stream: refactor to support change in framing events Removing EndOfEmptySnapshot, add NewSnapshotToFollow	2020-10-02 13:41:31 -04:00
Daniel Nephin	5ef630f664	Merge pull request #8769 from hashicorp/streaming/prep-for-subscribe-service state: use protobuf Topic and export payload type	2020-10-02 13:30:06 -04:00
R.B. Boyer	e84d52ba3a	ensure these tests work fine with namespaces in enterprise (#8794 )	2020-10-01 09:54:46 -05:00
R.B. Boyer	ccd0200bd9	server: ensure that we also shutdown network segment serf instances on server shutdown (#8786 ) This really only matters for unit tests, since typically if an agent shuts down its server, it follows that up by exiting the process, which would also clean up all of the networking anyway.	2020-09-30 16:23:43 -05:00
Kyle Havlovitz	2956313f2d	connect: Enable renewing the intermediate cert in the primary DC	2020-09-30 12:31:21 -07:00
freddygv	ec6e8021c0	Resolve conflicts	2020-09-29 08:59:18 -06:00
Daniel Nephin	d192b0a080	stream: move goroutine out of New This change will make it easier to manage goroutine lifecycle from the caller. Also expose EventPublisher from state.Store	2020-09-28 18:40:10 -04:00
Daniel Nephin	e345c8d8a6	state: use pbsubscribe.Topic for topic values	2020-09-28 18:40:10 -04:00
Daniel Nephin	6e592ec485	state: rename and export EventPayload The subscribe endpoint needs to be able to inspect the payload to filter events, and convert them into the protobuf types. Use the protobuf CatalogOp type for the operation field, for now. In the future if we end up with multiple interfaces we should be able to remove the protobuf dependency by changing this to an int32 and adding a test for the mapping between the values. Make the value of the payload a concrete type instead of interface{}. We can create other payloads for other event types.	2020-09-28 18:34:30 -04:00
R.B. Boyer	45609fccdf	server: make sure that the various replication loggers use consistent logging (#8745 )	2020-09-24 15:49:38 -05:00
Daniel Nephin	4b041a018d	grpc: reduce dependencies, unexport, and add godoc Rename GRPCClient to ClientConnPool. This type appears to be more of a conn pool than a client. The clients receive the connections from this pool. Reduce some dependencies by adjusting the interface boundaries. Remove the need to create a second slice of Servers, just to pick one and throw the rest away. Unexport serverResolver, it is not used outside the package. Use a RWMutex for ServerResolverBuilder, some locking is read-only. Add more godoc.	2020-09-24 12:53:10 -04:00
Daniel Nephin	4b24470887	grpc: move client conn pool to grpc package	2020-09-24 12:48:12 -04:00
Daniel Nephin	fad15171ec	grpc: client conn pool and resolver Extracted from 936522a13c07e8b732b6fde61bba23d05f7b9a70 Co-authored-by: Paul Banks <banks@banksco.de>	2020-09-24 12:46:22 -04:00
Daniel Nephin	e0119a6e92	Merge pull request #8680 from hashicorp/dnephin/replace-consul-opts-with-base-deps agent: Replace ConsulOptions with a new struct from agent.BaseDeps	2020-09-24 12:45:54 -04:00
Paul Banks	0594667c3a	Fix bad int -> string conversions caught by go vet changes in 1.15 (#8739 )	2020-09-24 11:14:07 +01:00
Hans Hasselberg	c6fa758d6f	fix TestLeader_SecondaryCA_IntermediateRenew (#8702 ) * fix lessThanHalfTime * get lock for CAProvider() * make a var to relate both vars * rename to getCAProviderWithLock * move CertificateTimeDriftBuffer to agent/connect/ca	2020-09-18 10:13:29 +02:00
Mike Morris	fe984b3ee3	test: update tags for database service registrations and queries (#8693 )	2020-09-16 14:05:01 -04:00
Kyle Havlovitz	c8fd61abc7	Merge branch 'master' into vault-ca-renew-token	2020-09-15 14:39:04 -07:00
Daniel Nephin	c621b4a420	agent/consul: pass dependencies directly from agent In an upcoming change we will need to pass a grpc.ClientConnPool from BaseDeps into Server. While looking at that change I noticed all of the existing consulOption fields are already on BaseDeps. Instead of duplicating the fields, we can create a struct used by agent/consul, and use that struct in BaseDeps. This allows us to pass along dependencies without translating them into different representations. I also looked at moving all of BaseDeps in agent/consul, however that created some circular imports. Resolving those cycles wouldn't be too bad (it was only an error in agent/consul being imported from cache-types), however this change seems a little better by starting to introduce some structure to BaseDeps. This change is also a small step in reducing the scope of Agent. Also remove some constants that were only used by tests, and move the relevant comment to where the live configuration is set. Removed some validation from NewServer and NewClient, as these are not really runtime errors. They would be code errors, which will cause a panic anyway, so no reason to handle them specially here.	2020-09-15 17:29:32 -04:00
Daniel Nephin	0536b2047e	agent/consul: make router required	2020-09-15 17:26:26 -04:00
Kyle Havlovitz	63d3a5fc1f	Clean up CA shutdown logic and error	2020-09-15 12:28:58 -07:00
freddygv	43efb4809c	Merge master	2020-09-14 16:17:43 -06:00
Daniel Nephin	75515f3431	Merge pull request #8587 from hashicorp/streaming/add-grpc-server streaming: add gRPC server for handling connections	2020-09-14 15:24:54 -04:00
freddygv	33af8dab9a	Resolve conflicts against master	2020-09-11 18:41:58 -06:00
Kyle Havlovitz	1595add842	Clean up Vault renew tests and shutdown	2020-09-11 08:41:05 -07:00
freddygv	5871b667a5	Revert EnvoyConfig nesting	2020-09-11 09:21:43 -06:00
Kyle Havlovitz	7588e22739	Add a stop function to make sure the renewer is shut down on leader change	2020-09-10 06:12:48 -07:00
Kyle Havlovitz	1c57b72a9f	Add a test for token renewal	2020-09-09 16:36:37 -07:00
Daniel Nephin	863a9df951	server: add gRPC server for streaming events Includes a stats handler and stream interceptor for grpc metrics. Co-authored-by: Paul Banks <banks@banksco.de>	2020-09-08 12:10:41 -04:00
Hans Hasselberg	51f079dcdd	secondaryIntermediateCertRenewalWatch abort on success (#8588 ) secondaryIntermediateCertRenewalWatch was using `retryLoopBackoff` to renew the intermediate certificate. Once it entered the inner loop and started `retryLoopBackoff` it would never leave that. `retryLoopBackoffAbortOnSuccess` will return when renewing is successful, like it was intended originally.	2020-09-04 11:47:16 +02:00
Daniel Nephin	c17a5b0628	state: handle terminating gateways in service health events	2020-09-03 16:58:05 -04:00
Daniel Nephin	b241debee7	state: improve comments in catalog_events.go Co-authored-by: Paul Banks <banks@banksco.de>	2020-09-03 16:58:05 -04:00
Daniel Nephin	870823e8ed	state: use changeType in serviceChanges To be a little more explicit, instead of nil implying an indirect change	2020-09-03 16:58:05 -04:00
Daniel Nephin	68682e7e83	don't over allocate slice	2020-09-03 16:58:04 -04:00
Daniel Nephin	5f52220f53	state: fix a bug in building service health events The nodeCheck slice was being used as the first arg in append, which in some cases will modify the array backing the slice. This would lead to service checks for other services in the wrong event. Also refactor some things to reduce the arguments to functions.	2020-09-03 16:58:04 -04:00
Daniel Nephin	c61313b78a	state: Remove unused args and return values Also rename some functions to identify them as constructors for events	2020-09-03 16:58:04 -04:00
Daniel Nephin	668b98bcce	state: use an enum for tracking node changes	2020-09-03 16:58:04 -04:00
Daniel Nephin	7c3c627028	state: serviceHealthSnapshot refactored to remove unused return value and remove duplication	2020-09-03 16:58:04 -04:00
Daniel Nephin	fdfe176deb	state: Add Change processor and snapshotter for service health Co-authored-by: Paul Banks <banks@banksco.de>	2020-09-03 16:58:04 -04:00
Daniel Nephin	6a1a43721d	state: fix bug in changeTrackerDB.publish Creating a new readTxn does not work because it will not see the newly created objects that are about to be committed. Instead use the active write Txn.	2020-09-03 16:58:01 -04:00
Daniel Nephin	81cc3daf69	stream: have SnapshotFunc accept a non-pointer SubscribeRequest The value is not expected to be modified. Passing a value makes that explicit.	2020-09-03 16:54:02 -04:00
freddygv	56fdae9ace	Update resolver defaulting	2020-09-03 13:08:44 -06:00
freddygv	02d6acd8fc	Ensure resolver node with LB isn't considered default	2020-09-03 08:55:57 -06:00
freddygv	daad3b9210	Remove LB infix and move injection to xds	2020-09-02 15:13:50 -06:00
Chris Piraino	df1381f77f	Merge pull request #8603 from hashicorp/feature/usage-metrics Track node and service counts in the state store and emit them periodically as metrics	2020-09-02 13:23:39 -05:00
R.B. Boyer	4197bed23b	connect: fix bug in preventing some namespaced config entry modifications (#8601 ) Whenever an upsert/deletion of a config entry happens, within the open state store transaction we speculatively test compile all discovery chains that may be affected by the pending modification to verify that the write would not create an erroneous scenario (such as splitting traffic to a subset that did not exist). If a single discovery chain evaluation references two config entries with the same kind and name in different namespaces then sometimes the upsert/deletion would be falsely rejected. It does not appear as though this bug would've let invalid writes through to the state store so the correction does not require a cleanup phase.	2020-09-02 10:47:19 -05:00
Chris Piraino	b245d60200	Set metrics reporting interval to 9 seconds This is below the 10 second interval that lib/telemetry.go implements as its aggregation interval, ensuring that we always report these metrics.	2020-09-02 10:24:23 -05:00
Chris Piraino	e9b397005c	Update godoc string for memdb wrapper functions/structs	2020-09-02 10:24:22 -05:00
Chris Piraino	80f923a47a	Refactor state store usage to track unique service names This commit refactors the state store usage code to track unique service name changes on transaction commit. This means we only need to lookup usage entries when reading the information, as opposed to iterating over a large number of service indices. - Take into account a service instance's name being changed - Do not iterate through entire list of service instances, we only care about whether there is 0, 1, or more than 1.	2020-09-02 10:24:21 -05:00
Chris Piraino	79e6534345	Use ReadTxn interface in state store helper functions	2020-09-02 10:24:20 -05:00
Chris Piraino	d90d95421d	Add WriteTxn interface and convert more functions to ReadTxn We add a WriteTxn interface for use in updating the usage memdb table, with the forward-looking prospect of incrementally converting other functions to accept interfaces. As well, we use the ReadTxn in new usage code, and as a side effect convert a couple of existing functions to use that interface as well.	2020-09-02 10:24:19 -05:00
Chris Piraino	45a4057f60	Report node/service usage metrics from every server Using the newly provided state store methods, we periodically emit usage metrics from the servers. We decided to emit these metrics from all servers, not just the leader, because that means we do not have to care about leader election flapping causing metrics turbulence, and it seems reasonable for each server to emit its own view of the state, even if they should always converge rapidly.	2020-09-02 10:24:17 -05:00
Chris Piraino	3af96930eb	Add new usage memdb table that tracks usage counts of various elements We update the usage table on Commit() by using the TrackedChanges() API of memdb. Track memdb changes on restore so that usage data can be compiled	2020-09-02 10:24:16 -05:00
freddygv	d7bda050e0	Restructure structs and other PR comments	2020-09-02 09:10:50 -06:00
Matt Keeler	335c604ced	Merge of auto-config and auto-encrypt code (#8523 ) auto-encrypt is now handled as a special case of auto-config. This also is moving all the cert-monitor code into the auto-config package.	2020-08-31 13:12:17 -04:00
freddygv	afb14b6705	Compile down LB policy to disco chain nodes	2020-08-28 13:11:04 -06:00
Daniel Nephin	845661c8af	Merge pull request #8548 from edevil/fix_flake Fix flaky TestACLResolver_Client/Concurrent-Token-Resolve	2020-08-28 15:10:55 -04:00
R.B. Boyer	f2b8bf109c	xds: use envoy's rbac filter to handle intentions entirely within envoy (#8569 )	2020-08-27 12:20:58 -05:00
Matt Keeler	106e1d50bd	Move RPC router from Client/Server and into BaseDeps (#8559 ) This will allow it to be a shared component which is needed for AutoConfig	2020-08-27 11:23:52 -04:00
André Cruz	673bd69f36	Decrease test flakiness Fix flaky TestACLResolver_Client/Concurrent-Token-Resolve and TestCacheNotifyPolling	2020-08-24 20:30:02 +01:00
André Cruz	a64686fab6	testing: Fix govet errors	2020-08-21 18:01:55 +01:00
Hans Hasselberg	02de4c8b76	add primary keys to list keyring (#8522 ) During gossip encryption key rotation it would be nice to be able to see if all nodes are using the same key. This PR adds another field to the json response from `GET v1/operator/keyring` which lists the primary keys in use per dc. That way an operator can tell when a key was successfully setup as primary key. Based on https://github.com/hashicorp/serf/pull/611 to add primary key to list keyring output: ```json [ { "WAN": true, "Datacenter": "dc2", "Segment": "", "Keys": { "0OuM4oC3Os18OblWiBbZUaHA7Hk+tNs/6nhNYtaNduM=": 6, "SINm887hKTzmMWeBNKTJReaTLX3mBEJKriDyt88Ad+g=": 6 }, "PrimaryKeys": { "SINm887hKTzmMWeBNKTJReaTLX3mBEJKriDyt88Ad+g=": 6 }, "NumNodes": 6 }, { "WAN": false, "Datacenter": "dc2", "Segment": "", "Keys": { "0OuM4oC3Os18OblWiBbZUaHA7Hk+tNs/6nhNYtaNduM=": 8, "SINm887hKTzmMWeBNKTJReaTLX3mBEJKriDyt88Ad+g=": 8 }, "PrimaryKeys": { "SINm887hKTzmMWeBNKTJReaTLX3mBEJKriDyt88Ad+g=": 8 }, "NumNodes": 8 }, { "WAN": false, "Datacenter": "dc1", "Segment": "", "Keys": { "0OuM4oC3Os18OblWiBbZUaHA7Hk+tNs/6nhNYtaNduM=": 3, "SINm887hKTzmMWeBNKTJReaTLX3mBEJKriDyt88Ad+g=": 8 }, "PrimaryKeys": { "SINm887hKTzmMWeBNKTJReaTLX3mBEJKriDyt88Ad+g=": 8 }, "NumNodes": 8 } ] ``` I intentionally did not change the CLI output because I didn't find a good way of displaying this information. There are a couple of options that we could implement later: * add a flag to show the primary keys * add a flag to show json output Fixes #3393.	2020-08-18 09:50:24 +02:00
Daniel Nephin	8d35e37b3c	testing: Remove all the defer os.Removeall Now that testutil uses t.Cleanup to remove the directory the caller no longer has to manage the removal	2020-08-14 19:58:53 -04:00
Daniel Nephin	629c34085d	state: remove unused Store method receiver And use ReadTxn interface where appropriate.	2020-08-13 11:25:22 -04:00
Daniel Nephin	fc797a279a	Merge pull request #8461 from hashicorp/dnephin/remove-notify-shutdown agent/consul: Remove NotifyShutdown	2020-08-13 11:16:48 -04:00
Daniel Nephin	d8ffcd5686	Merge pull request #8365 from hashicorp/dnephin/fix-service-by-node-meta-flake state: speed up tests that use watchLimit	2020-08-13 11:16:12 -04:00
R.B. Boyer	63422ca9c5	connect: use stronger validation that ingress gateways have compatible protocols defined for their upstreams (#8470 ) Fixes #8466 Since Consul 1.8.0 there was a bug in how ingress gateway protocol compatibility was enforced. At the point in time that an ingress-gateway config entry was modified the discovery chain for each upstream was checked to ensure the ingress gateway protocol matched. Unfortunately future modifications of other config entries were not validated against existing ingress-gateway definitions, such as: 1. create tcp ingress-gateway pointing to 'api' (ok) 2. create service-defaults for 'api' setting protocol=http (worked, but not ok) 3. create service-splitter or service-router for 'api' (worked, but caused an agent panic) If you were to do these in a different order, it would fail without a crash: 1. create service-defaults for 'api' setting protocol=http (ok) 2. create service-splitter or service-router for 'api' (ok) 3. create tcp ingress-gateway pointing to 'api' (fail with message about protocol mismatch) This PR introduces the missing validation. The two new behaviors are: 1. create tcp ingress-gateway pointing to 'api' (ok) 2. (NEW) create service-defaults for 'api' setting protocol=http ("ok" for back compat) 3. (NEW) create service-splitter or service-router for 'api' (fail with message about protocol mismatch) In consideration of any existing users that may inadvertently be falling into item (2) above, that is now officially a valid configuration to be in. For anyone falling into item (3) above, while you cannot use the API to manufacture that scenario anymore, anyone that has old (now bad) data will still be able to have the agent use them just enough to generate a new agent/proxycfg error message rather than a panic. Unfortunately we just don't have enough information to properly fix the config entries.	2020-08-12 11:19:20 -05:00
Hans Hasselberg	7a6d916ddc	Merge pull request #8471 from hashicorp/local_only thread local-only through the layers	2020-08-12 08:54:51 +02:00
Freddy	50fee12d62	Internal endpoint to query intentions associated with a gateway (#8400 )	2020-08-11 17:20:41 -06:00
Kyle Havlovitz	8118e3db40	Fix a state store comment about version	2020-08-11 13:46:12 -07:00
Kyle Havlovitz	2601585017	fsm: Fix snapshot bug with restoring node/service/check indexes	2020-08-11 11:49:52 -07:00
Hans Hasselberg	e0297b6e99	Refactor keyring ops: * changes some functions to return data instead of modifying pointer arguments * renames globalRPC() to keyringRPCs() to make its purpose more clear * restructures KeyringOperation() to make it more understandable	2020-08-11 13:42:03 +02:00
freddygv	6dcfa11c21	Update error handling	2020-08-10 17:48:22 -06:00
Daniel Nephin	bef9348ca8	testing: remove unnecessary defers in tests The data directory is now removed by the test helper that created it.	2020-08-07 17:28:16 -04:00
Daniel Nephin	f3b63514d5	testing: Remove NotifyShutdown NotifyShutdown was only used for testing. Now that t.Cleanup exists, we can use that instead of attaching cleanup to the Server shutdown. The Autopilot test which used NotifyShutdown doesn't need this notification because Shutdown is synchronous. Waiting for the function to return is equivalent.	2020-08-07 17:14:44 -04:00
Hans Hasselberg	fdceb24323	auto_config implies connect (#8433 )	2020-08-07 12:02:02 +02:00
freddygv	83f4e32376	PR comments and addtl tests	2020-08-05 16:07:11 -06:00
Daniel Nephin	061ae94c63	Rename NewClient/NewServer Now that duplicate constructors have been removed we can use the shorter names for the single constructor.	2020-08-05 14:00:55 -04:00
Daniel Nephin	e6c94c1411	Remove LogOutput from Server	2020-08-05 14:00:44 -04:00
Daniel Nephin	fdf966896f	Remove LogOutput from Client	2020-08-05 14:00:42 -04:00
Daniel Nephin	73493ca01b	Pass a logger to ConnPool and yamux, instead of an io.Writer Allowing us to remove the LogOutput field from config.	2020-08-05 13:25:08 -04:00
Daniel Nephin	c7c941811d	config: Remove unused field	2020-08-05 13:25:08 -04:00
freddygv	c87af29506	collect GatewayServices from iter in a function	2020-07-31 13:30:40 -06:00
Freddy	7c2c8815d7	Avoid panics during shutdown routine (#8412 )	2020-07-30 11:11:10 -06:00
freddygv	94d1f0a310	end to end changes to pass gatewayservices to /ui/services/	2020-07-30 10:21:11 -06:00
Matt Keeler	76add4f24c	Allow setting verify_incoming* when using auto_encrypt or auto_config (#8394 ) Ensure that enabling AutoConfig sets the tls configurator properly This also refactors the TLS configurator a bit so the naming doesn’t imply only AutoEncrypt as the source of the automatically setup TLS cert info.	2020-07-30 10:15:12 -04:00
Matt Keeler	dad0f189a2	Agent Auto Config: Implement Certificate Generation (#8360 ) Most of the groundwork was laid in previous PRs between adding the cert-monitor package to extracting the logic of signing certificates out of the connect_ca_endpoint.go code and into a method on the server. This also refactors the auto-config package a bit to split things out into multiple files.	2020-07-28 15:31:48 -04:00
Matt Keeler	3a1058a06b	Move connect root retrieval and cert signing logic out of the RPC endpoints (#8364 ) The code now lives on the Server type itself. This was done so that all of this could be shared with auto config certificate signing.	2020-07-24 10:00:51 -04:00
Matt Keeler	e7d8a02ae8	Move generation of the CA Configuration from the agent code into a method on the RuntimeConfig (#8363 ) This allows this to be reused elsewhere.	2020-07-23 16:05:28 -04:00
Daniel Nephin	597dcf2bfb	Merge pull request #8323 from hashicorp/dnephin/add-event-publisher-2 stream: close subscriptions on shutdown	2020-07-23 13:12:50 -04:00
Matt Keeler	c3e7d689b7	Refactor the agentpb package (#8362 ) First move the whole thing to the top-level proto package name. Secondly change some things around internally to have sub-packages.	2020-07-23 11:24:20 -04:00
Daniel Nephin	decba06b7d	stream: close all subs when EventProcessor is shutdown.	2020-07-22 19:04:10 -04:00
Daniel Nephin	e802689bbe	stream: fix overallocation in filter And add tests	2020-07-22 19:04:10 -04:00
Daniel Nephin	f64725f7aa	state: speed up TestStateStore_ServicesByNodeMeta Make watchLimit a var so that we can patch it in tests and reduce the time spent creating state.	2020-07-22 16:57:06 -04:00
Daniel Nephin	a44ddea9ba	state: Use subtests in TestStateStore_ServicesByNodeMeta These subtests make it much easier to identify the slow part of the test, but they also help enumerate all the different cases which are being tested.	2020-07-22 16:39:09 -04:00
Daniel Nephin	6d3b042872	Merge pull request #7948 from hashicorp/dnephin/buffer-test-logs testutil: NewLogBuffer - buffer logs until a test fails	2020-07-21 15:21:52 -04:00
Matt Keeler	8ea8a939f0	Merge pull request #8311 from hashicorp/bugfix/auto-encrypt-token-update	2020-07-21 13:15:27 -04:00
Daniel Nephin	80ff174880	testutil: NewLogBuffer - buffer logs until a test fails Replaces #7559 Running tests in parallel, with background goroutines, results in test output not being associated with the correct test. `go test` does not make any guarantees about output from goroutines being attributed to the correct test case. Attaching log output from background goroutines also causes data races. If the goroutine outlives the test, it will race with the test being marked done. Previously this was noticed as a panic when logging, but with the race detector enabled it is shown as a data race. The previous solution did not address the problem of correct test attribution because test output could still be hidden when it was associated with a test that did not fail. You would have to look at all of the log output to find the relevant lines. It also made debugging test failures more difficult because each log line was very long. This commit attempts a new approach. Instead of printing all the logs, only print when a test fails. This should work well when there are a small number of failures, but may not work well when there are many test failures at the same time. In those cases the failures are unlikely to be a result of a specific test, and the log output is likely less useful. All of the logs are printed from the test goroutine, so they should be associated with the correct test. Also removes some test helpers that were not used, or only had a single caller. Packages which expose many functions with similar names can be difficult to use correctly. Related: https://github.com/golang/go/issues/38458 (may be fixed in go1.15) https://github.com/golang/go/issues/38382#issuecomment-612940030	2020-07-21 12:50:40 -04:00
Matt Keeler	133a6d99f2	Fix issue with changing the agent token causing failure to renew the auto-encrypt certificate The fallback method would still work but it would get into a state where it would let the certificate expire for 10s before getting a new one. And the new one used the less secure RPC endpoint. This is also a pretty large refactoring of the auto encrypt code. I was going to write some tests around the certificate monitoring but it was going to be impossible to get a TestAgent configured in such a way that I could write a test that ran in less than an hour or two to exercise the functionality. Moving the certificate monitoring into its own package will allow for dependency injection and in particular mocking the cache types to control how it hands back certificates and how long those certificates should live. This will allow for exercising the main loop more than would be possible with it coupled so tightly with the Agent.	2020-07-21 12:19:25 -04:00
Daniel Nephin	7599e280de	stream: handle empty event in TestEventSnapshot When the race detector is enabled we see this test fail occasionally. The reordering of execution seems to make it possible for the snapshot splice to happen before any events are published to the topicBuffers. We can handle this case in the test the same way it is handled by a subscription, by proceeding to the next event.	2020-07-20 18:20:02 -04:00
Daniel Nephin	75f10fb191	state: update calls that are no longer state methods In a previous commit these methods were changed to functions, so remove the Store parameter.	2020-07-16 15:46:10 -04:00
Daniel Nephin	3fcb2e16f4	state: un-method funcs that don't use their receiver This change was mostly automated with the following. First generate a list of functions with: git grep -o 'Store) \([^(]\+\)(tx \*txn' ./agent/consul/state | awk '{print $2}' | grep -o '^[^(]\+' Then the list was curated a bit with trial/error to remove and add funcs as necessary. Finally the replacement was done with: dir=agent/consul/state; file=${1-funcnames}; while read fn; do echo "$fn"; sed -i -e "s/(s \*Store) $fn(/$fn(/" $dir/*.go; sed -i -e "s/s\.$fn(/$fn(/" $dir/*.go; sed -i -e "s/s\.store\.$fn(/$fn(/" $dir/*.go; done < $file	2020-07-16 15:30:39 -04:00
Daniel Nephin	edb0a4f1f8	store: convert methods that don't use their receiver to functions Making these functions allows them to be used without introducing an artificial dependency on the struct. Many of these will be called from streaming Event processors, which do not have a store. This change is being made ahead of the streaming work to reduce the size of the streaming diff.	2020-07-16 15:30:10 -04:00
Daniel Nephin	a2f8605c66	stream: Add forceClose and refactor subscription filtering Move the subscription context to Next. context.Context should generally never be stored in a struct because it makes that struct only valid while the context is valid. This is rarely obvious from the caller. Adds a forceClosed channel in place of the old context, and uses the new context as a way for the caller to stop the Subscription blocking. Remove some recursion out of bufferItem.Next. The caller is already looping so we can continue in that loop instead of recursing. This ensures currentItem is updated immediately (which probably does not matter in practice), and also removes the chance that we overflow the stack. NextNoBlock and FollowAfter do not need to handle bufferItem.Err, the caller already handles it. Moves filter to a method to simplify Next, and more explicitly separate filtering from looping. Also improve some godoc. Only unwrap itemBuffer.Err when necessary.	2020-07-14 15:57:47 -04:00
Daniel Nephin	2595436f62	stream: Improve docstrings Also rename ResumeStream to EndOfEmptySnapshot to be more consistent with other framing events Co-authored-by: Paul Banks <banks@banksco.de>	2020-07-14 15:57:47 -04:00
Daniel Nephin	16a2b3fafc	stream: change Topic to an interface Consumers of the package can decide on which type to use for the Topic. In the future we may use a gRPC type for the topic.	2020-07-14 15:57:47 -04:00
Daniel Nephin	aa571bd0ce	state: Move change processing out of EventPublisher EventPublisher was receiving TopicHandlers, which had a couple of problems: - ChangeProcessors were being grouped by Topic, but they completely ignored the topic and were performed on every change - ChangeProcessors required EventPublisher to be aware of database changes By moving ChangeProcessors out of EventPublisher, and having Publish accept events instead of changes, EventPublisher no longer needs to be aware of these things. Handlers is now only SnapshotHandlers, which are still mapped by Topic. Also allows us to remove the small 'db' package that had only two types. They can now be unexported types in state.	2020-07-14 15:57:47 -04:00
Daniel Nephin	23a940daad	server: Abandon state store to shutdown EventPublisher So that we don't leak goroutines	2020-07-14 15:57:47 -04:00
Daniel Nephin	e1305fe80c	stream: unexport identifiers Now that EventPublisher is part of stream a lot of the internals can be hidden	2020-07-14 15:57:47 -04:00
Daniel Nephin	9e37894778	stream: Move EventPublisher to stream package The EventPublisher is the central hub of the PubSub system. It is tightly coupled with much of stream. Some stream internals were exported exclusively for EventPublisher. The two Subscribe cases (with or without index) were also awkwardly split between two packages. By moving EventPublisher into stream they are now both in the same package (although still in different files).	2020-07-14 15:57:47 -04:00
Daniel Nephin	6e87e83d77	state: Make handleACLUpdate async once again So that we keep as much as possible out of the FSM commit hot path.	2020-07-14 15:57:47 -04:00
Daniel Nephin	a92dab724d	state: Use interface for Txn Also store the index in Changes instead of the Txn. This change is in preparation for moving EventPublisher to the stream package, and making handleACLUpdates async once again.	2020-07-14 15:57:46 -04:00
Daniel Nephin	c778d61b6a	stream.Subscription unexport fields and additional docstrings	2020-07-14 15:57:46 -04:00
Daniel Nephin	37a38629d7	Add a context for stopping EventPublisher goroutine	2020-07-14 15:57:46 -04:00
Daniel Nephin	02bc5a26e4	EventPublisher: Make Unsubscribe a function on Subscription It is critical that Unsubscribe be called with the same pointer to a SubscriptionRequest that was used to create the Subscription. The docstring made that clear, but it still allowed a caller to get it wrong by creating a new SubscriptionRequest. By hiding this detail from the caller, and only exposing an Unsubscribe method, it should be impossible to fail to Unsubscribe. Also update some godoc strings.	2020-07-14 15:57:46 -04:00
Daniel Nephin	86976cf23c	EventPublisher: handleACL changes synchronously Use a separate lock for subscriptions.ByToken to allow it to happen synchronously in the commit flow. This removes the need to create a new txn for the goroutine, and removes the need for EventPublisher to contain a reference to DB.	2020-07-14 15:57:46 -04:00
Daniel Nephin	606121fae6	stream.EventSnapshot: reduce the fields on the struct Many of the fields are only needed in one place, and by using a closure they can be removed from the struct. This reduces the scope of the variables making it easier to see how they are used.	2020-07-14 15:57:45 -04:00
Daniel Nephin	7196917051	stream.EventBuffer: Seed the fuzz test with time.Now() Otherwise the test will run with exactly the same values each time. By printing the seed we can attempt to reproduce the test by adding an env var to override the seed	2020-07-14 15:57:45 -04:00
Daniel Nephin	525b275a52	state: memdb_wrapper.go -> memdb.go Renaming in a separate commit so that git can merge changes to the file.	2020-07-14 15:57:45 -04:00
Daniel Nephin	b5d2bea770	state: publish changes from Commit Make topicRegistry use functions instead of unbound methods Use a regular memDB in EventPublisher to remove a reference cycle Removes the need for EventPublisher to use a store	2020-07-14 15:57:45 -04:00
Daniel Nephin	f626c3d6c5	EventPublisher: docstrings and getTopicBuffer also rename commitCh -> publishCh	2020-07-14 15:57:45 -04:00
Daniel Nephin	2020e9c7c7	ProcessChanges: use stream.Event Also remove secretHash, which was used to hash tokens. We don't expose these tokens anywhere, so we can use the string itself instead of a Hash. Fix acl_events_test.go for storing a structs type.	2020-07-14 15:57:45 -04:00
Daniel Nephin	2e45bbbb3e	stream: Use local types for Event Topic SubscriptionRequest	2020-07-14 15:57:45 -04:00
Daniel Nephin	3d62013062	Rename stream_publisher.go -> event_publisher.go	2020-07-14 15:57:44 -04:00
Daniel Nephin	526fb53f85	Add streaming package with Subscription and Snapshot components. The remaining files from 7965767de0bd62ab07669b85d6879bd5f815d157 Co-authored-by: Paul Banks <banks@banksco.de>	2020-07-14 15:57:44 -04:00
Chris Piraino	b80cbb499f	Set enterprise metadata after resolving the token (#8302) The token can encode enterprise metadata information, and we must make sure we set that on the reply so that we can correctly filter ACLs.	2020-07-13 13:39:57 -05:00
Daniel Nephin	13e0d258b5	Merge pull request #8237 from hashicorp/dnephin/remove-acls-enabled-from-delegate Remove ACLsEnabled from delegate interface	2020-07-09 16:35:43 -04:00
Matt Keeler	39d9babab3	Pass the Config and TLS Configurator into the AutoConfig constructor This is instead of having the AutoConfigBackend interface provide functions for retrieving them. NOTE: the config is not reloadable. For now this is fine as we don’t look at any reloadable fields. If that changes then we should provide a way to make it reloadable.	2020-07-08 12:36:11 -04:00
Matt Keeler	a77ed471c8	Rename (Server).forward to (Server).ForwardRPC Also get rid of the preexisting shim in server.go that existed before to have this name just call the unexported one.	2020-07-08 11:05:44 -04:00
Matt Keeler	386ec3a2a2	Refactor AutoConfig RPC to not have a direct dependency on the Server type Instead it has an interface which can be mocked for better unit testing that is deterministic and not prone to flakiness.	2020-07-08 11:05:44 -04:00
Daniel Nephin	8b6036c077	Remove ACLsEnabled from delegate interface In all cases (oss/ent, client/server) this method was returning a value from config. Since the value is consistent, it doesn't need to be part of the delegate interface.	2020-07-03 17:00:20 -04:00
Daniel Nephin	dfa8856e5f	agent/consul: Add support for NotModified to two endpoints A query made with AllowNotModifiedResponse and a MinIndex, where the result has the same Index as MinIndex, will return an empty response with QueryMeta.NotModified set to true. Co-authored-by: Pierre Souchay <pierresouchay@users.noreply.github.com>	2020-07-02 17:05:46 -04:00
Matt Keeler	87764e5bfb	Merge pull request #8211 from hashicorp/bugfix/auto-encrypt-various	2020-07-02 09:49:49 -04:00
Yury Evtikhov	dbf3c05fa5	DNS: add IsErrQueryNotFound function for easier error evaluation	2020-07-01 03:41:44 +01:00
Matt Keeler	a97f9ff386	Overwrite agent leaf cert trust domain on the servers	2020-06-30 09:59:08 -04:00
Matt Keeler	5600069d69	Store the Connect CA rate limiter on the server This fixes a bug where auto_encrypt was operating without utilizing a common rate limiter.	2020-06-30 09:59:07 -04:00
Matt Keeler	fa42d9b34f	Fix auto_encrypt IP/DNS SANs The initial auto encrypt CSR wasn’t containing the user supplied IP and DNS SANs. This fixes that. Also, we were configuring a default :: IP SAN. This should be ::1 instead and was fixed.	2020-06-30 09:59:07 -04:00
R.B. Boyer	72a515f5ec	connect: various changes to make namespaces for intentions work more like for other subsystems (#8194) Highlights: - add new endpoint to query for intentions by exact match - using this endpoint from the CLI instead of the dump+filter approach - enforcing that OSS can only read/write intentions with a SourceNS or DestinationNS field of "default". - preexisting OSS intentions with now-invalid namespace fields will delete those intentions on initial election or for wildcard namespaces an attempt will be made to downgrade them to "default" unless one exists. - also allow the '-namespace' CLI arg on all of the intention subcommands - update lots of docs	2020-06-26 16:59:15 -05:00
Daniel Nephin	7d5f1ba6bd	Merge pull request #8176 from hashicorp/dnephin/add-linter-unparam-1 lint: add unparam linter and fix some of the issues	2020-06-25 15:34:48 -04:00
Matt Keeler	d471977f62	Fix go routine leak in auto encrypt ca roots tracking	2020-06-24 17:09:50 -04:00
Matt Keeler	90e741c6d2	Allow cancelling blocking queries in response to shutting down.	2020-06-24 17:09:50 -04:00
Daniel Nephin	07c1081d39	Fix a bunch of unparam lint issues	2020-06-24 13:00:14 -04:00
Matt Keeler	341aedbce9	Ensure that retryLoopBackoff can be cancelled We needed to pass a cancellable context into the limiter.Wait instead of context.Background. So I made the func take a context instead of a chan as most places were just passing through a Done chan from a context anyways. Fix go routine leak in the gateway locator	2020-06-24 12:41:08 -04:00
Matt Keeler	9dc9f7df15	Allow cancelling startup when performing auto-config (#8157) Co-authored-by: Daniel Nephin <dnephin@hashicorp.com>	2020-06-19 15:16:00 -04:00
Daniel Nephin	b5ef9b7ea9	Remove bytesToUint64 from agent/consul	2020-06-18 12:45:43 -04:00
Daniel Nephin	81bc082b63	Remove unused private IP code from agent/consul	2020-06-18 12:40:38 -04:00
Matt Keeler	2c7844d220	Implement Client Agent Auto Config There are a couple of things in here. First, just like auto encrypt, any Cluster.AutoConfig RPC will implicitly use the less secure RPC mechanism. This drastically modifies how the Consul Agent starts up and moves most of the responsibilities (other than signal handling) from the cli command and into the Agent.	2020-06-17 16:49:46 -04:00
Matt Keeler	f5d57ccd48	Allow the Agent and its child Client/Server to share a connection pool This is needed so that we can make an AutoConfig RPC at the Agent level prior to creating the Client/Server.	2020-06-17 16:19:33 -04:00
Matt Keeler	8c601ad8db	Merge pull request #8035 from hashicorp/feature/auto-config/server-rpc	2020-06-17 16:07:25 -04:00
Chris Piraino	79d003d395	Remove ACLEnforceVersion8 from tests (#8138) The field had been deprecated for a while and was recently removed, however a PR which added these tests prior to removal was merged.	2020-06-17 14:58:01 -05:00
Daniel Nephin	1ef8279ac9	Merge pull request #8034 from hashicorp/dnephin/add-linter-staticcheck-4 ci: enable SA4006 staticcheck check and add ineffassign	2020-06-17 12:16:02 -04:00
Matt Keeler	eda8cb39fd	Implement the insecure version of the Cluster.AutoConfig RPC endpoint Right now this is only hooked into the insecure RPC server and requires JWT authorization. If no JWT authorizer is setup in the configuration then we inject a disabled “authorizer” to always report that JWT authorization is disabled.	2020-06-17 11:25:29 -04:00
Pierre Souchay	f7a1189dba	gossip: Ensure that metadata of Consul Service is updated (#7903) While upgrading servers to a new version, I saw that metadata of existing servers is not updated, so the version and raft meta are not up to date in the catalog. The only way to do it was to: * update Consul server * make it leave the cluster, then metadata is accurate. That's because the optimization to avoid updating catalog does not take into account metadata, so no update on catalog is performed.	2020-06-17 12:16:13 +02:00
Daniel Nephin	8753d1f1ba	ci: Add ineffassign linter And fix an additional ineffective assignment that was not caught by staticcheck	2020-06-16 17:32:50 -04:00
Daniel Nephin	97342de262	Merge pull request #8070 from hashicorp/dnephin/add-gofmt-simplify ci: Enable gofmt simplify	2020-06-16 17:18:38 -04:00
Matt Keeler	d994dc7b35	Agent Auto Configuration: Configuration Syntax Updates (#8003)	2020-06-16 15:03:22 -04:00
Daniel Nephin	89d95561df	Enable gofmt simplify Code changes done automatically with 'gofmt -s -w'	2020-06-16 13:21:11 -04:00
Daniel Nephin	5f24171f13	ci: enable SA4006 staticcheck check And fix the 'value not used' issues. Many of these are not bugs, but a few are tests not checking errors, and one appears to be a missed error in non-test code.	2020-06-16 13:10:11 -04:00
Daniel Nephin	71e6534061	Rename txnWrapper to txn	2020-06-16 13:06:02 -04:00
Daniel Nephin	537ae1fd46	Rename db	2020-06-16 13:04:31 -04:00
Daniel Nephin	78c76f0773	Handle return value from txn.Commit	2020-06-16 13:04:31 -04:00
Daniel Nephin	50db8f409a	state: Update docstrings for changeTrackerDB and txn And un-embed memdb.DB to prevent accidental access to underlying methods.	2020-06-16 13:04:31 -04:00
Paul Banks	f9a6386c4a	state: track changes so that they may be used to produce change events	2020-06-16 13:04:29 -04:00
Matt Keeler	cdc4b20afa	ACL Node Identities (#7970) A Node Identity is very similar to a service identity. Its main targeted use is to allow creating tokens for use by Consul agents that will grant the necessary permissions for all the typical agent operations (node registration, coordinate updates, anti-entropy). Half of this commit is for golden file based tests of the acl token and role cli output. Another big update was to refactor many of the tests in agent/consul/acl_endpoint_test.go to use the same style of tests and the same helpers. Besides being less boilerplate in the tests it also uses a common way of starting a test server with ACLs that should operate without any warnings regarding deprecated non-uuid master tokens etc.	2020-06-16 12:54:27 -04:00
freddygv	cc4ff3ae02	Fixup stray sid references	2020-06-12 13:47:43 -06:00
freddygv	1e7e716742	Move compound service names to use ServiceName type	2020-06-12 13:47:43 -06:00
freddygv	806b1fb608	Move GatewayServices out of Internal	2020-06-12 13:46:47 -06:00
Daniel Nephin	6719f1a6fa	Merge pull request #7900 from hashicorp/dnephin/add-linter-staticcheck-2 intentions: fix a bug in Intention.SetHash	2020-06-09 15:40:20 -04:00
Daniel Nephin	5f14eb124c	Merge pull request #8037 from hashicorp/dnephin/add-linter-staticcheck-5 ci: Enabled SA2002 staticcheck check	2020-06-09 15:31:24 -04:00
Hans Hasselberg	7404712854	acl: do not resolve local tokens from remote dcs (#8068)	2020-06-09 21:13:09 +02:00
Hans Hasselberg	bec21c849d	Tokens converted from legacy ACLs get their Hash computed (#8047) * Fixes #5606: Tokens converted from legacy ACLs get their Hash computed This allows new style token replication to work for legacy tokens as well when they change. * tests: fix timestamp comparison Co-authored-by: Matt Keeler <mjkeeler7@gmail.com>	2020-06-08 21:44:06 +02:00
Daniel Nephin	1cdfc4f290	ci: Enabled SA2002 staticcheck check And handle errors in the main test goroutine	2020-06-05 17:50:11 -04:00
Daniel Nephin	b9e4544ec3	intentions: fix a bug in Intention.SetHash Found using staticcheck. binary.Write does not accept int types without a size. The error from binary.Write was ignored, so we never saw this error. Casting the data to uint64 produces a correct hash. Also deprecate the Default{Addr,Port} fields, and prevent them from being encoded. These fields will always be empty and are not used. Removing these would break backwards compatibility, so they are left in place for now. Co-authored-by: Hans Hasselberg <me@hans.io>	2020-06-05 14:51:43 -04:00
R.B. Boyer	3ad570ba99	server: don't activate federation state replication or anti-entropy until all servers are running 1.8.0+ (#8014)	2020-06-04 16:05:27 -05:00
Hans Hasselberg	dd8cd9bc24	Merge pull request #7966 from hashicorp/pool_improvements Agent connection pool cleanup	2020-06-04 08:56:26 +02:00
Matt Keeler	2c615807af	Fix legacy management tokens in unupgraded secondary dcs (#7908) The ACL.GetPolicy RPC endpoint was supposed to return the “parent” policy and not always the default policy. In the case of legacy management tokens the parent policy was supposed to be “manage”. The result of us not sending this properly was that operations that required specifically a management token such as saving a snapshot would not work in secondary DCs until they were upgraded.	2020-06-03 11:22:22 -04:00
Matt Keeler	9fa9ec4ba0	Fix segfault due to race condition for checking server versions (#7957) The ACL monitoring routine uses c.routers to check for server version updates. Therefore it needs to be started after initializing the routers.	2020-06-03 10:36:32 -04:00
Daniel Nephin	e8a883e829	Replace goe/verify.Values with testify/require.Equal (#7993) * testing: replace most goe/verify.Values with require.Equal One difference between these two comparisons is that goe/verify considers nil slices/maps to be equal to empty slices/maps, whereas testify/require does not, and does not appear to provide any way to enable that behaviour. Because of this difference some expected values were changed from empty slices to nil slices, and some calls to verify.Values were left. * Remove github.com/pascaldekloe/goe/verify Reduce the number of assertion packages we use from 2 to 1	2020-06-02 12:41:25 -04:00
R.B. Boyer	7bd7895047	acl: allow auth methods created in the primary datacenter to optionally create global tokens (#7899)	2020-06-01 11:44:47 -05:00
R.B. Boyer	16db20b1f3	acl: remove the deprecated `acl_enforce_version_8` option (#7991) Fixes #7292	2020-05-29 16:16:03 -05:00
Jono Sosulska	7a13c96a2a	Replace whitelist/blacklist terminology with allowlist/denylist (#7971) * Replace whitelist/blacklist terminology with allowlist/denylist	2020-05-29 14:19:16 -04:00
Hans Hasselberg	1ed91cbdf6	pool: remove timeout parameter Timeout was never used in a meaningful way by callers, which is why it is now entirely internal to the pool.	2020-05-29 08:21:28 +02:00
Hans Hasselberg	5cda505495	pool: remove useTLS and ForceTLS In the past TLS usage was enforced with these variables, but these days this decision is made by TLSConfigurator and there is no reason to keep using the variables.	2020-05-29 08:21:24 +02:00

1451 Commits