open-consul

Commit Graph

Author	SHA1	Message	Date
Daniel Nephin	a44ddea9ba	state: Use subtests in TestStateStore_ServicesByNodeMeta These subtests make it much easier to identify the slow part of the test, but they also help enumerate all the different cases which are being tested.	2020-07-22 16:39:09 -04:00
Daniel Nephin	6d3b042872	Merge pull request #7948 from hashicorp/dnephin/buffer-test-logs testutil: NewLogBuffer - buffer logs until a test fails	2020-07-21 15:21:52 -04:00
Matt Keeler	8ea8a939f0	Merge pull request #8311 from hashicorp/bugfix/auto-encrypt-token-update	2020-07-21 13:15:27 -04:00
Daniel Nephin	80ff174880	testutil: NewLogBuffer - buffer logs until a test fails Replaces #7559 Running tests in parallel, with background goroutines, results in test output not being associated with the correct test. `go test` does not make any guarantees about output from goroutines being attributed to the correct test case. Attaching log output from background goroutines also causes data races. If the goroutine outlives the test, it will race with the test being marked done. Previously this was noticed as a panic when logging, but with the race detector enabled it is shown as a data race. The previous solution did not address the problem of correct test attribution because test output could still be hidden when it was associated with a test that did not fail. You would have to look at all of the log output to find the relevant lines. It also made debugging test failures more difficult because each log line was very long. This commit attempts a new approach. Instead of printing all the logs, only print when a test fails. This should work well when there are a small number of failures, but may not work well when there are many test failures at the same time. In those cases the failures are unlikely to be a result of a specific test, and the log output is likely less useful. All of the logs are printed from the test goroutine, so they should be associated with the correct test. Also removes some test helpers that were not used, or only had a single caller. Packages which expose many functions with similar names can be difficult to use correctly. Related: https://github.com/golang/go/issues/38458 (may be fixed in go1.15) https://github.com/golang/go/issues/38382#issuecomment-612940030	2020-07-21 12:50:40 -04:00
Matt Keeler	133a6d99f2	Fix issue with changing the agent token causing failure to renew the auto-encrypt certificate The fallback method would still work but it would get into a state where it would let the certificate expire for 10s before getting a new one. And the new one used the less secure RPC endpoint. This is also a pretty large refactoring of the auto encrypt code. I was going to write some tests around the certificate monitoring but it was going to be impossible to get a TestAgent configured in such a way that I could write a test that ran in less than an hour or two to exercise the functionality. Moving the certificate monitoring into its own package will allow for dependency injection and in particular mocking the cache types to control how it hands back certificates and how long those certificates should live. This will allow for exercising the main loop more than would be possible with it coupled so tightly with the Agent.	2020-07-21 12:19:25 -04:00
Daniel Nephin	7599e280de	stream: handle empty event in TestEventSnapshot When the race detector is enabled we see this test fail occasionally. The reordering of execution seems to make it possible for the snapshot splice to happen before any events are published to the topicBuffers. We can handle this case in the test the same way it is handled by a subscription, by proceeding to the next event.	2020-07-20 18:20:02 -04:00
Daniel Nephin	75f10fb191	state: update calls that are no longer state methods In a previous commit these methods were changed to functions, so remove the Store parameter.	2020-07-16 15:46:10 -04:00
Daniel Nephin	3fcb2e16f4	state: un-method funcs that don't use their receiver This change was mostly automated with the following. First generate a list of functions with: git grep -o 'Store) \([^(]\+\)(tx \*txn' ./agent/consul/state | awk '{print $2}' | grep -o '^[^(]\+' Then the list was curated a bit with trial/error to remove and add funcs as necessary. Finally the replacement was done with: dir=agent/consul/state file=${1-funcnames} while read fn; do echo "$fn" sed -i -e "s/(s \*Store) $fn(/$fn(/" $dir/*.go sed -i -e "s/s\.$fn(/$fn(/" $dir/*.go sed -i -e "s/s\.store\.$fn(/$fn(/" $dir/*.go done < $file	2020-07-16 15:30:39 -04:00
Daniel Nephin	edb0a4f1f8	store: convert methods that don't use their receiver to functions Making these functions allows them to be used without introducing an artificial dependency on the struct. Many of these will be called from streaming Event processors, which do not have a store. This change is being made ahead of the streaming work to reduce the size of the streaming diff.	2020-07-16 15:30:10 -04:00
Daniel Nephin	a2f8605c66	stream: Add forceClose and refactor subscription filtering Move the subscription context to Next. context.Context should generally never be stored in a struct because it makes that struct only valid while the context is valid. This is rarely obvious from the caller. Adds a forceClosed channel in place of the old context, and uses the new context as a way for the caller to stop the Subscription blocking. Remove some recursion out of bufferItem.Next. The caller is already looping so we can continue in that loop instead of recursing. This ensures currentItem is updated immediately (which probably does not matter in practice), and also removes the chance that we overflow the stack. NextNoBlock and FollowAfter do not need to handle bufferItem.Err, the caller already handles it. Moves filter to a method to simplify Next, and more explicitly separate filtering from looping. Also improve some godoc Only unwrap itemBuffer.Err when necessary	2020-07-14 15:57:47 -04:00
Daniel Nephin	2595436f62	stream: Improve docstrings Also rename ResumeStream to EndOfEmptySnapshot to be more consistent with other framing events Co-authored-by: Paul Banks <banks@banksco.de>	2020-07-14 15:57:47 -04:00
Daniel Nephin	16a2b3fafc	stream: change Topic to an interface Consumers of the package can decide on which type to use for the Topic. In the future we may use a gRPC type for the topic.	2020-07-14 15:57:47 -04:00
Daniel Nephin	aa571bd0ce	state: Move change processing out of EventPublisher EventPublisher was receiving TopicHandlers, which had a couple of problems: - ChangeProcessors were being grouped by Topic, but they completely ignored the topic and were performed on every change - ChangeProcessors required EventPublisher to be aware of database changes By moving ChangeProcessors out of EventPublisher, and having Publish accept events instead of changes, EventPublisher no longer needs to be aware of these things. Handlers is now only SnapshotHandlers, which are still mapped by Topic. Also allows us to remove the small 'db' package that had only two types. They can now be unexported types in state.	2020-07-14 15:57:47 -04:00
Daniel Nephin	23a940daad	server: Abandon state store to shutdown EventPublisher So that we don't leak goroutines	2020-07-14 15:57:47 -04:00
Daniel Nephin	e1305fe80c	stream: unexport identifiers Now that EventPublisher is part of stream a lot of the internals can be hidden	2020-07-14 15:57:47 -04:00
Daniel Nephin	9e37894778	stream: Move EventPublisher to stream package The EventPublisher is the central hub of the PubSub system. It is tightly coupled with much of stream. Some stream internals were exported exclusively for EventPublisher. The two Subscribe cases (with or without index) were also awkwardly split between two packages. By moving EventPublisher into stream they are now both in the same package (although still in different files).	2020-07-14 15:57:47 -04:00
Daniel Nephin	6e87e83d77	state: Make handleACLUpdate async once again So that we keep as much as possible out of the FSM commit hot path.	2020-07-14 15:57:47 -04:00
Daniel Nephin	a92dab724d	state: Use interface for Txn Also store the index in Changes instead of the Txn. This change is in preparation for moving EventPublisher to the stream package, and making handleACLUpdates async once again.	2020-07-14 15:57:46 -04:00
Daniel Nephin	c778d61b6a	stream.Subscription unexport fields and additional docstrings	2020-07-14 15:57:46 -04:00
Daniel Nephin	37a38629d7	Add a context for stopping EventPublisher goroutine	2020-07-14 15:57:46 -04:00
Daniel Nephin	02bc5a26e4	EventPublisher: Make Unsubscribe a function on Subscription It is critical that Unsubscribe be called with the same pointer to a SubscriptionRequest that was used to create the Subscription. The docstring made that clear, but it still allowed a caller to get it wrong by creating a new SubscriptionRequest. By hiding this detail from the caller, and only exposing an Unsubscribe method, it should be impossible to fail to Unsubscribe. Also update some godoc strings.	2020-07-14 15:57:46 -04:00
Daniel Nephin	86976cf23c	EventPublisher: handleACL changes synchronously Use a separate lock for subscriptions.ByToken to allow it to happen synchronously in the commit flow. This removes the need to create a new txn for the goroutine, and removes the need for EventPublisher to contain a reference to DB.	2020-07-14 15:57:46 -04:00
Daniel Nephin	606121fae6	stream.EventSnapshot: reduce the fields on the struct Many of the fields are only needed in one place, and by using a closure they can be removed from the struct. This reduces the scope of the variables making it easier to see how they are used.	2020-07-14 15:57:45 -04:00
Daniel Nephin	7196917051	stream.EventBuffer: Seed the fuzz test with time.Now() Otherwise the test will run with exactly the same values each time. By printing the seed we can attempt to reproduce the test by adding an env var to override the seed	2020-07-14 15:57:45 -04:00
Daniel Nephin	525b275a52	state: memdb_wrapper.go -> memdb.go Renaming in a separate commit so that git can merge changes to the file.	2020-07-14 15:57:45 -04:00
Daniel Nephin	b5d2bea770	state: publish changes from Commit Make topicRegistry use functions instead of unbound methods Use a regular memDB in EventPublisher to remove a reference cycle Removes the need for EventPublisher to use a store	2020-07-14 15:57:45 -04:00
Daniel Nephin	f626c3d6c5	EventPublisher: docstrings and getTopicBuffer also rename commitCh -> publishCh	2020-07-14 15:57:45 -04:00
Daniel Nephin	2020e9c7c7	ProcessChanges: use stream.Event Also remove secretHash, which was used to hash tokens. We don't expose these tokens anywhere, so we can use the string itself instead of a Hash. Fix acl_events_test.go for storing a structs type.	2020-07-14 15:57:45 -04:00
Daniel Nephin	2e45bbbb3e	stream: Use local types for Event Topic SubscriptionRequest	2020-07-14 15:57:45 -04:00
Daniel Nephin	3d62013062	Rename stream_publisher.go -> event_publisher.go	2020-07-14 15:57:44 -04:00
Daniel Nephin	526fb53f85	Add streaming package with Subscription and Snapshot components. The remaining files from 7965767de0bd62ab07669b85d6879bd5f815d157 Co-authored-by: Paul Banks <banks@banksco.de>	2020-07-14 15:57:44 -04:00
Chris Piraino	b80cbb499f	Set enterprise metadata after resolving the token (#8302) The token can encode enterprise metadata information, and we must make sure we set that on the reply so that we can correctly filter ACLs.	2020-07-13 13:39:57 -05:00
Daniel Nephin	13e0d258b5	Merge pull request #8237 from hashicorp/dnephin/remove-acls-enabled-from-delegate Remove ACLsEnabled from delegate interface	2020-07-09 16:35:43 -04:00
Matt Keeler	39d9babab3	Pass the Config and TLS Configurator into the AutoConfig constructor This is instead of having the AutoConfigBackend interface provide functions for retrieving them. NOTE: the config is not reloadable. For now this is fine as we don’t look at any reloadable fields. If that changes then we should provide a way to make it reloadable.	2020-07-08 12:36:11 -04:00
Matt Keeler	a77ed471c8	Rename (Server).forward to (Server).ForwardRPC Also get rid of the preexisting shim in server.go that existed before to have this name just call the unexported one.	2020-07-08 11:05:44 -04:00
Matt Keeler	386ec3a2a2	Refactor AutoConfig RPC to not have a direct dependency on the Server type Instead it has an interface which can be mocked for better unit testing that is deterministic and not prone to flakiness.	2020-07-08 11:05:44 -04:00
Daniel Nephin	8b6036c077	Remove ACLsEnabled from delegate interface In all cases (oss/ent, client/server) this method was returning a value from config. Since the value is consistent, it doesn't need to be part of the delegate interface.	2020-07-03 17:00:20 -04:00
Daniel Nephin	dfa8856e5f	agent/consul: Add support for NotModified to two endpoints A query made with AllowNotModifiedResponse and a MinIndex, where the result has the same Index as MinIndex, will return an empty response with QueryMeta.NotModified set to true. Co-authored-by: Pierre Souchay <pierresouchay@users.noreply.github.com>	2020-07-02 17:05:46 -04:00
Matt Keeler	87764e5bfb	Merge pull request #8211 from hashicorp/bugfix/auto-encrypt-various	2020-07-02 09:49:49 -04:00
Yury Evtikhov	dbf3c05fa5	DNS: add IsErrQueryNotFound function for easier error evaluation	2020-07-01 03:41:44 +01:00
Matt Keeler	a97f9ff386	Overwrite agent leaf cert trust domain on the servers	2020-06-30 09:59:08 -04:00
Matt Keeler	5600069d69	Store the Connect CA rate limiter on the server This fixes a bug where auto_encrypt was operating without utilizing a common rate limiter.	2020-06-30 09:59:07 -04:00
Matt Keeler	fa42d9b34f	Fix auto_encrypt IP/DNS SANs The initial auto encrypt CSR wasn’t containing the user supplied IP and DNS SANs. This fixes that. Also, we were configuring a default :: IP SAN. This should be ::1 instead and was fixed.	2020-06-30 09:59:07 -04:00
R.B. Boyer	72a515f5ec	connect: various changes to make namespaces for intentions work more like for other subsystems (#8194 ) Highlights: - add new endpoint to query for intentions by exact match - using this endpoint from the CLI instead of the dump+filter approach - enforcing that OSS can only read/write intentions with a SourceNS or DestinationNS field of "default". - preexisting OSS intentions with now-invalid namespace fields will delete those intentions on initial election or for wildcard namespaces an attempt will be made to downgrade them to "default" unless one exists. - also allow the '-namespace' CLI arg on all of the intention subcommands - update lots of docs	2020-06-26 16:59:15 -05:00
Daniel Nephin	7d5f1ba6bd	Merge pull request #8176 from hashicorp/dnephin/add-linter-unparam-1 lint: add unparam linter and fix some of the issues	2020-06-25 15:34:48 -04:00
Matt Keeler	d471977f62	Fix go routine leak in auto encrypt ca roots tracking	2020-06-24 17:09:50 -04:00
Matt Keeler	90e741c6d2	Allow cancelling blocking queries in response to shutting down.	2020-06-24 17:09:50 -04:00
Daniel Nephin	07c1081d39	Fix a bunch of unparam lint issues	2020-06-24 13:00:14 -04:00
Matt Keeler	341aedbce9	Ensure that retryLoopBackoff can be cancelled We needed to pass a cancellable context into the limiter.Wait instead of context.Background. So I made the func take a context instead of a chan as most places were just passing through a Done chan from a context anyways. Fix go routine leak in the gateway locator	2020-06-24 12:41:08 -04:00
Matt Keeler	9dc9f7df15	Allow cancelling startup when performing auto-config (#8157 ) Co-authored-by: Daniel Nephin <dnephin@hashicorp.com>	2020-06-19 15:16:00 -04:00
Daniel Nephin	b5ef9b7ea9	Remove bytesToUint64 from agent/consul	2020-06-18 12:45:43 -04:00
Daniel Nephin	81bc082b63	Remove unused private IP code from agent/consul	2020-06-18 12:40:38 -04:00
Matt Keeler	2c7844d220	Implement Client Agent Auto Config There are a couple of things in here. First, just like auto encrypt, any Cluster.AutoConfig RPC will implicitly use the less secure RPC mechanism. This drastically modifies how the Consul Agent starts up and moves most of the responsibilities (other than signal handling) from the cli command and into the Agent.	2020-06-17 16:49:46 -04:00
Matt Keeler	f5d57ccd48	Allow the Agent and its child Client/Server to share a connection pool This is needed so that we can make an AutoConfig RPC at the Agent level prior to creating the Client/Server.	2020-06-17 16:19:33 -04:00
Matt Keeler	8c601ad8db	Merge pull request #8035 from hashicorp/feature/auto-config/server-rpc	2020-06-17 16:07:25 -04:00
Chris Piraino	79d003d395	Remove ACLEnforceVersion8 from tests (#8138 ) The field had been deprecated for a while and was recently removed, however a PR which added these tests prior to removal was merged.	2020-06-17 14:58:01 -05:00
Daniel Nephin	1ef8279ac9	Merge pull request #8034 from hashicorp/dnephin/add-linter-staticcheck-4 ci: enable SA4006 staticcheck check and add ineffassign	2020-06-17 12:16:02 -04:00
Matt Keeler	eda8cb39fd	Implement the insecure version of the Cluster.AutoConfig RPC endpoint Right now this is only hooked into the insecure RPC server and requires JWT authorization. If no JWT authorizer is setup in the configuration then we inject a disabled “authorizer” to always report that JWT authorization is disabled.	2020-06-17 11:25:29 -04:00
Pierre Souchay	f7a1189dba	gossip: Ensure that metadata of Consul Service is updated (#7903 ) While upgrading servers to a new version, I saw that metadata of existing servers are not upgraded, so the version and raft meta is not up to date in catalog. The only way to do it was to: * update Consul server * make it leave the cluster, then metadata is accurate That's because the optimization to avoid updating catalog does not take into account metadata, so no update on catalog is performed.	2020-06-17 12:16:13 +02:00
Daniel Nephin	8753d1f1ba	ci: Add ineffassign linter And fix an additional ineffective assignment that was not caught by staticcheck	2020-06-16 17:32:50 -04:00
Daniel Nephin	97342de262	Merge pull request #8070 from hashicorp/dnephin/add-gofmt-simplify ci: Enable gofmt simplify	2020-06-16 17:18:38 -04:00
Matt Keeler	d994dc7b35	Agent Auto Configuration: Configuration Syntax Updates (#8003 )	2020-06-16 15:03:22 -04:00
Daniel Nephin	89d95561df	Enable gofmt simplify Code changes done automatically with 'gofmt -s -w'	2020-06-16 13:21:11 -04:00
Daniel Nephin	5f24171f13	ci: enable SA4006 staticcheck check And fix the 'value not used' issues. Many of these are not bugs, but a few are tests not checking errors, and one appears to be a missed error in non-test code.	2020-06-16 13:10:11 -04:00
Daniel Nephin	71e6534061	Rename txnWrapper to txn	2020-06-16 13:06:02 -04:00
Daniel Nephin	537ae1fd46	Rename db	2020-06-16 13:04:31 -04:00
Daniel Nephin	78c76f0773	Handle return value from txn.Commit	2020-06-16 13:04:31 -04:00
Daniel Nephin	50db8f409a	state: Update docstrings for changeTrackerDB and txn And un-embed memdb.DB to prevent accidental access to underlying methods.	2020-06-16 13:04:31 -04:00
Paul Banks	f9a6386c4a	state: track changes so that they may be used to produce change events	2020-06-16 13:04:29 -04:00
Matt Keeler	cdc4b20afa	ACL Node Identities (#7970) A Node Identity is very similar to a service identity. Its main targeted use is to allow creating tokens for use by Consul agents that will grant the necessary permissions for all the typical agent operations (node registration, coordinate updates, anti-entropy). Half of this commit is for golden file based tests of the acl token and role cli output. Another big update was to refactor many of the tests in agent/consul/acl_endpoint_test.go to use the same style of tests and the same helpers. Besides being less boilerplate in the tests it also uses a common way of starting a test server with ACLs that should operate without any warnings regarding deprecated non-uuid master tokens etc.	2020-06-16 12:54:27 -04:00
freddygv	cc4ff3ae02	Fixup stray sid references	2020-06-12 13:47:43 -06:00
freddygv	1e7e716742	Move compound service names to use ServiceName type	2020-06-12 13:47:43 -06:00
freddygv	806b1fb608	Move GatewayServices out of Internal	2020-06-12 13:46:47 -06:00
Daniel Nephin	6719f1a6fa	Merge pull request #7900 from hashicorp/dnephin/add-linter-staticcheck-2 intentions: fix a bug in Intention.SetHash	2020-06-09 15:40:20 -04:00
Daniel Nephin	5f14eb124c	Merge pull request #8037 from hashicorp/dnephin/add-linter-staticcheck-5 ci: Enabled SA2002 staticcheck check	2020-06-09 15:31:24 -04:00
Hans Hasselberg	7404712854	acl: do not resolve local tokens from remote dcs (#8068 )	2020-06-09 21:13:09 +02:00
Hans Hasselberg	bec21c849d	Tokens converted from legacy ACLs get their Hash computed (#8047 ) * Fixes #5606: Tokens converted from legacy ACLs get their Hash computed This allows new style token replication to work for legacy tokens as well when they change. * tests: fix timestamp comparison Co-authored-by: Matt Keeler <mjkeeler7@gmail.com>	2020-06-08 21:44:06 +02:00
Daniel Nephin	1cdfc4f290	ci: Enabled SA2002 staticcheck check And handle errors in the main test goroutine	2020-06-05 17:50:11 -04:00
Daniel Nephin	b9e4544ec3	intentions: fix a bug in Intention.SetHash Found using staticcheck. binary.Write does not accept int types without a size. The error from binary.Write was ignored, so we never saw this error. Casting the data to uint64 produces a correct hash. Also deprecate the Default{Addr,Port} fields, and prevent them from being encoded. These fields will always be empty and are not used. Removing these would break backwards compatibility, so they are left in place for now. Co-authored-by: Hans Hasselberg <me@hans.io>	2020-06-05 14:51:43 -04:00
R.B. Boyer	3ad570ba99	server: don't activate federation state replication or anti-entropy until all servers are running 1.8.0+ (#8014 )	2020-06-04 16:05:27 -05:00
Hans Hasselberg	dd8cd9bc24	Merge pull request #7966 from hashicorp/pool_improvements Agent connection pool cleanup	2020-06-04 08:56:26 +02:00
Matt Keeler	2c615807af	Fix legacy management tokens in unupgraded secondary dcs (#7908 ) The ACL.GetPolicy RPC endpoint was supposed to return the “parent” policy and not always the default policy. In the case of legacy management tokens the parent policy was supposed to be “manage”. The result of us not sending this properly was that operations that required specifically a management token such as saving a snapshot would not work in secondary DCs until they were upgraded.	2020-06-03 11:22:22 -04:00
Matt Keeler	9fa9ec4ba0	Fix segfault due to race condition for checking server versions (#7957 ) The ACL monitoring routine uses c.routers to check for server version updates. Therefore it needs to be started after initializing the routers.	2020-06-03 10:36:32 -04:00
Daniel Nephin	e8a883e829	Replace goe/verify.Values with testify/require.Equal (#7993) * testing: replace most goe/verify.Values with require.Equal One difference between these two comparisons is that goe/verify considers nil slices/maps to be equal to empty slices/maps, whereas testify/require does not, and does not appear to provide any way to enable that behaviour. Because of this difference some expected values were changed from empty slices to nil slices, and some calls to verify.Values were left. * Remove github.com/pascaldekloe/goe/verify Reduce the number of assertion packages we use from 2 to 1	2020-06-02 12:41:25 -04:00
R.B. Boyer	7bd7895047	acl: allow auth methods created in the primary datacenter to optionally create global tokens (#7899 )	2020-06-01 11:44:47 -05:00
R.B. Boyer	16db20b1f3	acl: remove the deprecated `acl_enforce_version_8` option (#7991 ) Fixes #7292	2020-05-29 16:16:03 -05:00
Jono Sosulska	7a13c96a2a	Replace whitelist/blacklist terminology with allowlist/denylist (#7971 ) * Replace whitelist/blacklist terminology with allowlist/denylist	2020-05-29 14:19:16 -04:00
Hans Hasselberg	1ed91cbdf6	pool: remove timeout parameter Timeout was never used in a meaningful way by callers, which is why it is now entirely internal to the pool.	2020-05-29 08:21:28 +02:00
Hans Hasselberg	5cda505495	pool: remove useTLS and ForceTLS In the past TLS usage was enforced with these variables, but these days this decision is made by TLSConfigurator and there is no reason to keep using the variables.	2020-05-29 08:21:24 +02:00
Hans Hasselberg	9ef44ec3da	pool: remove version The version field has been used to decide which multiplexing to use. It was introduced in 2457293dceec95ecd12ef4f01442e13710ea131a. But this is 6y ago and there is no need for this differentiation anymore.	2020-05-28 23:06:01 +02:00
Daniel Nephin	ea6c2b2adc	ci: Add staticcheck and fix most errors Three of the checks are temporarily disabled to limit the size of the diff, and allow us to enable all the other checks in CI. In a follow up we can fix the issues reported by the other checks one at a time, and enable them.	2020-05-28 11:59:58 -04:00
R.B. Boyer	54c7f825d6	create lib/stringslice package (#7934 )	2020-05-27 11:47:32 -05:00
R.B. Boyer	813d69622e	agent: handle re-bootstrapping in a secondary datacenter when WAN federation via mesh gateways is configured (#7931 ) The main fix here is to always union the `primary-gateways` list with the list of mesh gateways in the primary returned from the replicated federation states list. This will allow any replicated (incorrect) state to be supplemented with user-configured (correct) state in the config file. Eventually the game of random selection whack-a-mole will pick a winning entry and re-replicate the latest federation states from the primary. If the user-configured state is actually the incorrect one, then the same eventual correct selection process will work in that case, too. The secondary fix is actually to finish making wanfed-via-mgws actually work as originally designed. Once a secondary datacenter has replicated federation states for the primary AND managed to stand up its own local mesh gateways then all of the RPCs from a secondary to the primary SHOULD go through two sets of mesh gateways to arrive in the consul servers in the primary (one hop for the secondary datacenter's mesh gateway, and one hop through the primary datacenter's mesh gateway). This was neglected in the initial implementation. While everything works, ideally we should treat communications that go around the mesh gateways as just provided for bootstrapping purposes. Now we heuristically use the success/failure history of the federation state replicator goroutine loop to determine if our current mesh gateway route is working as intended. If it is, we try using the local gateways, and if those don't work we fall back on trying the primary via the union of the replicated state and the go-discover configuration flags. This can be improved slightly in the future by possibly initializing the gateway choice to local on startup if we already have replicated state. This PR does not address that improvement. Fixes #7339	2020-05-27 11:31:10 -05:00
R.B. Boyer	7e42819a71	connect: ensure proxy-defaults protocol is used for upstreams (#7938 )	2020-05-21 16:08:39 -05:00
Daniel Nephin	f9a89db86e	Update agent/consul/state/catalog.go Co-authored-by: Hans Hasselberg <me@hans.io>	2020-05-20 16:34:14 -04:00
Daniel Nephin	e1e1c13b35	state: use an error to indicate compare failed Errors are values. We can use the error value to identify the 'comparison failed' case which makes the function easier to use and should make it harder to mishandle the error case	2020-05-20 12:43:33 -04:00
Pierre Souchay	3b548f0d77	Allow to restrict servers that can join a given Serf Consul cluster. (#7628) Based on work done in https://github.com/hashicorp/memberlist/pull/196 this allows to restrict the IP ranges that can join a given Serf cluster and be a member of the cluster. Restrictions on IPs can be done separately using 2 new different flags and config options to restrict IPs for LAN and WAN Serf.	2020-05-20 11:31:19 +02:00
Daniel Nephin	6e3a7b0aa8	consul/state: refactor tnxService to avoid missed cases Handling errors at the end of a long switch/case block is somewhat brittle. This block included a couple cases where errors were ignored, but it was not obvious the way it was written. This change moves all error handling into each case block. There is still potentially one case where err is ignored, which will be handled in a follow up.	2020-05-19 16:50:14 -04:00
Daniel Nephin	545bd766e7	Fix a number of problems found by staticcheck Some of these problems are minor (unused vars), but others are real bugs (ignored errors). Co-authored-by: Matt Keeler <mkeeler@users.noreply.github.com>	2020-05-19 16:50:14 -04:00
Daniel Nephin	aa8009ee45	Remove unused var The usage of this var was removed in b92f895c233bf8a99ae35361117a90416fea29b5. Found by using staticcheck	2020-05-19 16:50:14 -04:00
Chris Piraino	ce099c9aca	Do not return an error if requested service is not a gateway This commit converts the previous error into just a Warn-level log message. By returning an error when the requested service was not a gateway, we did not appropriately update envoy because the cache Fetch returned an error and thus did not propagate the update through proxycfg and xds packages.	2020-05-18 09:08:04 -05:00
Aleksandr Zagaevskiy	75f0607d3b	Preserve ModifyIndex for unchanged entry in KVS TXN (#7832 )	2020-05-14 13:25:04 -06:00
Matt Keeler	849eedd142	Fix identity resolution on clients and in secondary dcs (#7862 ) Previously this happened to be using the method on the Server/Client that was meant to allow the ACLResolver to locally resolve tokens. On Servers that had tokens (primary or secondary dc + token replication) this function would lookup the token from raft and return the ACLIdentity. On clients this was always a noop. We inadvertently used this function instead of creating a new one when we added logging accessor ids for permission denied RPC requests. With this commit, a new method is used for resolving the identity properly via the ACLResolver which may still resolve locally in the case of being on a server with tokens but also supports remote token resolution.	2020-05-13 13:00:08 -04:00
Chris Piraino	3eb4e4012a	Make new gateway tests compatible with enterprise (#7856 )	2020-05-12 13:48:20 -05:00
Daniel Nephin	2e0f750f1a	Add unconvert linter To find unnecessary type conversions	2020-05-12 13:47:25 -04:00
Drew Bailey	7e1734b1c6	Value is already an int, remove type cast	2020-05-12 13:13:09 -04:00
R.B. Boyer	940e5ad160	acl: add auth method for JWTs (#7846 )	2020-05-11 20:59:29 -05:00
Chris Piraino	1173b31949	Return early from updateGatewayServices if nothing to update (#7838 ) * Return early from updateGatewayServices if nothing to update Previously, we returned an empty slice of gatewayServices, which caused us to accidentally delete everything in the memdb table * PR comment and better formatting	2020-05-11 14:46:48 -05:00
Chris Piraino	1a7c99cf31	Fix TestInternal_GatewayServiceDump_Ingress (#7840 ) Protocol was added as a field on GatewayServices after GatewayServiceDump PR branch was created.	2020-05-11 14:46:31 -05:00
R.B. Boyer	c54211ad52	cli: ensure 'acl auth-method update' doesn't deep merge the Config field (#7839 )	2020-05-11 14:21:17 -05:00
Chris Piraino	107c7a9ca7	PR comment and better formatting	2020-05-11 14:04:59 -05:00
Chris Piraino	9f924400e0	Return early from updateGatewayServices if nothing to update Previously, we returned an empty slice of gatewayServices, which caused us to accidentally delete everything in the memdb table	2020-05-11 12:38:04 -05:00
Freddy	ebbb234ecb	Gateway Services Nodes UI Endpoint (#7685 ) The endpoint supports queries for both Ingress Gateways and Terminating Gateways. Used to display a gateway's linked services in the UI.	2020-05-11 11:35:17 -06:00
Chris Piraino	a635e23f86	Restoring config entries updates the gateway-services table (#7811 ) - Adds a new validateConfigEntryEnterprise function - Also fixes some state store tests that were failing in enterprise	2020-05-08 13:24:33 -05:00
Freddy	a37d7a42c9	Fix up enterprise compatibility for gateways (#7813 )	2020-05-08 09:44:34 -06:00
Jono Sosulska	44011c81f2	Fix spelling of deregister (#7804 )	2020-05-08 10:03:45 -04:00
Chris Piraino	ad8a0544f2	Require individual services in ingress entry to match protocols (#7774 ) We require any non-wildcard services to match the protocol defined in the listener on write, so that we can maintain a consistent experience through ingress gateways. This also helps guard against accidental misconfiguration by a user. - Update tests that require an updated protocol for ingress gateways	2020-05-06 16:09:24 -05:00
Chris Piraino	ac115e39b2	A proxy-default config entry only exists in the default namespace	2020-05-06 15:06:14 -05:00
Chris Piraino	9a130f2ccc	Remove outdated comment	2020-05-06 15:06:14 -05:00
Kyle Havlovitz	04b6bd637a	Filter wildcard gateway services to match listener protocol This now requires some type of protocol setting in ingress gateway tests to ensure the services are not filtered out. - small refactor to add a max(x, y) function - Use internal configEntryTxn function and add MaxUint64 to lib	2020-05-06 15:06:13 -05:00
Chris Piraino	210dda5682	Allow Hosts field to be set on an ingress config entry - Validate that this cannot be set on a 'tcp' listener nor on a wildcard service. - Add Hosts field to api and test in consul config write CLI - xds: Configure envoy with user-provided hosts from ingress gateways	2020-05-06 15:06:13 -05:00
Kyle Havlovitz	e4268c8b7f	Support multiple listeners referencing the same service in gateway definitions	2020-05-06 15:06:13 -05:00
Kyle Havlovitz	b21cd112e5	Allow ingress gateways to route traffic based on Host header This commit adds the necessary changes to allow an ingress gateway to route traffic from a single defined port to multiple different upstream services in the Consul mesh. To do this, we now require all HTTP requests coming into the ingress gateway to specify a Host header that matches "<service-name>.*" in order to correctly route traffic to the correct service. - Differentiate multiple listener's route names by port - Adds a case in xds for allowing default discovery chains to create a route configuration when on an ingress gateway. This allows default services to easily use host header routing - ingress-gateways have a single route config for each listener that utilizes domain matching to route to different services.	2020-05-06 15:06:13 -05:00
R.B. Boyer	1187d7288e	acl: oss plumbing to support auth method namespace rules in enterprise (#7794 ) This includes website docs updates.	2020-05-06 13:48:04 -05:00
R.B. Boyer	b6cc92020d	test: make the kube auth method test helper use freeport (#7788 )	2020-05-05 16:55:21 -05:00
Hans Hasselberg	e3e2b82a00	network_segments: stop advertising segment tags	2020-05-05 21:32:05 +02:00
Hans Hasselberg	854aac510f	agent: refactor to use a single addrFn	2020-05-05 21:08:10 +02:00
Hans Hasselberg	0f2e189012	agent: rename local/global to src/dst	2020-05-05 21:07:34 +02:00
Chris Piraino	837bd6f558	Construct a default destination if one does not exist for service-router (#7783 )	2020-05-05 10:49:50 -05:00
R.B. Boyer	c9c557477b	acl: add MaxTokenTTL field to auth methods (#7779 ) When set to a non zero value it will limit the ExpirationTime of all tokens created via the auth method.	2020-05-04 17:02:57 -05:00
R.B. Boyer	265d2ea9e1	acl: add DisplayName field to auth methods (#7769 ) Also add a few missing acl fields in the api.	2020-05-04 15:18:25 -05:00
Hans Hasselberg	1be90e0fa1	agent: don't let left nodes hold onto their node-id (#7747 )	2020-05-04 18:39:08 +02:00
Matt Keeler	669d22933e	Merge pull request #7714 from hashicorp/oss-sync/msp-agent-token	2020-05-04 11:33:50 -04:00
R.B. Boyer	3ac5a841ec	acl: refactor the authmethod.Validator interface (#7760 ) This is a collection of refactors that make upcoming PRs easier to digest. The main change is the introduction of the authmethod.Identity struct. In the one and only current auth method (type=kubernetes) all of the trusted identity attributes are both selectable and projectable, so they were just passed around as a map[string]string. When namespaces were added, this was slightly changed so that the enterprise metadata can also come back from the login operation, so login now returned two fields. Now with some upcoming auth methods it won't be true that all identity attributes will be both selectable and projectable, so rather than update the login function to return 3 pieces of data it seemed worth it to wrap those fields up and give them a proper name.	2020-05-01 17:35:28 -05:00
R.B. Boyer	4cd1d62e40	acl: change authmethod.Validator to take a logger (#7758 )	2020-05-01 15:55:26 -05:00
R.B. Boyer	9faf8c42d1	sdk: extracting testutil.RequireErrorContains from various places it was duplicated (#7753 )	2020-05-01 11:56:34 -05:00
Hans Hasselberg	6626cb69d6	rpc: oss changes for network area connection pooling (#7735 )	2020-04-30 22:12:17 +02:00
Freddy	c34ee5d339	Watch fallback channel for gateways that do not exist (#7715 ) Also ensure that WatchSets in tests are reset between calls to watchFired. Any time a watch fires, subsequent calls to watchFired on the same WatchSet will also return true even if there were no changes.	2020-04-29 16:52:27 -06:00
Matt Keeler	901d6739ad	Some boilerplate to allow for ACL Bootstrap disabling configurability	2020-04-28 09:42:46 -04:00
Freddy	f5c1e5268b	TLS Origination for Terminating Gateways (#7671 )	2020-04-27 16:25:37 -06:00
freddygv	e751b83a3f	Clean up dead code, issue addressed by passing ws to serviceGatewayNodes	2020-04-27 11:08:41 -06:00
freddygv	75e737b0f2	Fix internal endpoint test	2020-04-27 11:08:41 -06:00
freddygv	7667567688	Avoid deleting mappings for services linked to other gateways on dereg	2020-04-27 11:08:41 -06:00
freddygv	28fe6920fe	Re-fix bug in CheckConnectServiceNodes	2020-04-27 11:08:41 -06:00
freddygv	bab101107c	Fix ConnectQueryBlocking test	2020-04-27 11:08:40 -06:00
freddygv	65e60d02f1	Fix bug in CheckConnectServiceNodes Previously, if a blocking query called CheckConnectServiceNodes before the gateway-services memdb table had any entries, a nil watchCh would be returned when calling serviceTerminatingGatewayNodes. This means that the blocking query would not fire if a gateway config entry was added after the watch started. In cases where the blocking query started on proxy registration, the proxy could potentially never become aware of an upstream endpoint if that upstream was going to be represented by a gateway.	2020-04-27 11:08:40 -06:00
Matt Keeler	4b1b42cef5	A couple testing helper updates (#7694 )	2020-04-27 12:17:38 -04:00
Kit Patella	2b95bd7ca9	Merge pull request #7656 from hashicorp/feature/audit/oss-merge agent: stub out auditing functionality in OSS	2020-04-17 13:33:06 -07:00
Chris Piraino	c5ab43ebbc	Fix bug where non-typical services are associated with gateways (#7662 ) On every service registration, we check to see if a service should be associated to a wildcard gateway-service. This fixes an issue where we did not correctly check to see if the service being registered was a "typical" service or not.	2020-04-17 11:24:34 -05:00
Kit Patella	c3d24d7c3e	agent: stub out auditing functionality in OSS	2020-04-16 15:07:52 -07:00
Kyle Havlovitz	6a5eba63ab	Ingress Gateways for TCP services (#7509 ) * Implements a simple, tcp ingress gateway workflow This adds a new type of gateway for allowing Ingress traffic into Connect from external services. Co-authored-by: Chris Piraino <cpiraino@hashicorp.com>	2020-04-16 14:00:48 -07:00
Matt Keeler	1e70ffee76	Update the Client code to use the common version checking infra… (#7558 ) Also reduce the log level of some version checking messages on the server as they can be pretty noisy during upgrades and really are more for debugging purposes.	2020-04-14 11:54:27 -04:00
Matt Keeler	1332628b67	Allow the bootstrap endpoint to be disabled in enterprise. (#7614 )	2020-04-14 11:45:39 -04:00
Pierre Souchay	2e6cd9e11a	fix flaky TestReplication_FederationStates test due to race conditions (#7612 ) The test had two racy bugs related to memdb references. The first was when we initially populated data and retained the FederationState objects in a slice. Due to how the `inmemCodec` works these were actually the identical objects passed into memdb. The second was that the `checkSame` assertion function was reading from memdb and setting the RaftIndexes to zeros to aid in equality checks. This was mutating the contents of memdb which is a no-no. With this fix, the command: ``` i=0; while /usr/local/bin/go test -count=1 -timeout 30s github.com/hashicorp/consul/agent/consul -run '^(TestReplication_FederationStates)$'; do i=$((i + 1)); printf "$i "; done ``` That used to break on my machine in less than 20 runs is now running 150+ times without any issue. Might also fix #7575	2020-04-09 15:42:41 -05:00
Freddy	c1f79c6b3c	Terminating gateway discovery (#7571 ) * Enable discovering terminating gateways * Add TerminatingGatewayServices to state store * Use GatewayServices RPC endpoint for ingress/terminating	2020-04-08 12:37:24 -06:00
Matt Keeler	42f02e80c3	Enable filtering language support for the v1/connect/intentions… (#7593 ) * Enable filtering language support for the v1/connect/intentions listing API * Update website for filtering of Intentions * Update website/source/api/connect/intentions.html.md	2020-04-07 11:48:44 -04:00
Matt Keeler	5d0e661203	Ensure that token clone copies the roles (#7577 )	2020-04-02 12:09:35 -04:00
Emre Savcı	7a99f29adc	agent: add len, cap while initializing arrays	2020-04-01 10:54:51 +02:00
Freddy	8a1e53754e	Add config entry for terminating gateways (#7545 ) This config entry will be used to configure terminating gateways. It accepts the name of the gateway and a list of services the gateway will represent. For each service users will be able to specify: its name, namespace, and additional options for TLS origination. Co-authored-by: Kyle Havlovitz <kylehav@gmail.com> Co-authored-by: Chris Piraino <cpiraino@hashicorp.com>	2020-03-31 13:27:32 -06:00
Kyle Havlovitz	01a23b8eb4	Add config entry/state for Ingress Gateways (#7483 ) * Add Ingress gateway config entry and other relevant structs * Add api package tests for ingress gateways * Embed EnterpriseMeta into ingress service struct * Add namespace fields to api module and test consul config write decoding * Don't require a port for ingress gateways * Add snakeJSON and camelJSON cases in command test * Run Normalize on service's ent metadata Sadly cannot think of a way to test this in OSS. * Every protocol requires at least 1 service * Validate ingress protocols * Update agent/structs/config_entry_gateways.go Co-authored-by: Chris Piraino <cpiraino@hashicorp.com> Co-authored-by: Freddy <freddygv@users.noreply.github.com>	2020-03-31 11:59:10 -05:00
Matt Keeler	35c8e996c3	Ensure server requirements checks are done against ALL known se… (#7491 ) Co-authored-by: Paul Banks <banks@banksco.de>	2020-03-27 12:31:43 -04:00
Daniel Nephin	a2eb66963c	Merge pull request #7516 from hashicorp/dnephin/remove-unused-method agent: Remove unused method Encrypted from delegate interface	2020-03-26 14:17:58 -04:00
Daniel Nephin	ebb851f32d	agent: Remove unused Encrypted from interface It appears to be unused. It looks like it has been around a while, I guess at some point we stopped using this method.	2020-03-26 12:34:31 -04:00
Freddy	cb55fa3742	Enable CLI to register terminating gateways (#7500 ) * Enable CLI to register terminating gateways * Centralize gateway proxy configuration	2020-03-26 10:20:56 -06:00
Alejandro Baez	7d68d7eaa6	Add PolicyReadByName for API (#6615 )	2020-03-25 10:34:24 -04:00
Matt Keeler	58e2969fc1	Fix ACL mode advertisement and detection (#7451 ) These changes are necessary to ensure advertisement happens correctly even when datacenters are connected via network areas in Consul enterprise. This also changes how we check if ACLs can be upgraded within the local datacenter. Previously we would iterate through all LAN members. Now we just use the ServerLookup type to iterate through all known servers in the DC.	2020-03-16 12:54:45 -04:00
Freddy	8a7ff69b19	Update MSP token and filtering (#7431 )	2020-03-11 12:08:49 -06:00
R.B. Boyer	10d3ff9a4f	server: strip local ACL tokens from RPCs during forwarding if crossing datacenters (#7419 ) Fixes #7414	2020-03-10 11:15:22 -05:00
Kyle Havlovitz	520d464c85	Merge pull request #7373 from hashicorp/acl-segments-fix Add stub methods for ACL/segment bug fix from enterprise	2020-03-09 14:25:49 -07:00
R.B. Boyer	a7fb26f50f	wan federation via mesh gateways (#6884 ) This is like a Möbius strip of code due to the fact that low-level components (serf/memberlist) are connected to high-level components (the catalog and mesh-gateways) in a twisty maze of references which make it hard to dive into. With that in mind here's a high level summary of what you'll find in the patch: There are several distinct chunks of code that are affected: * new flags and config options for the server * retry join WAN is slightly different * retry join code is shared to discover primary mesh gateways from secondary datacenters * because retry join logic runs in the agent and the results of that operation for primary mesh gateways are needed in the server there are some methods like `RefreshPrimaryGatewayFallbackAddresses` that must occur at multiple layers of abstraction just to pass the data down to the right layer. * new cache type `FederationStateListMeshGatewaysName` for use in `proxycfg/xds` layers * the function signature for RPC dialing picked up a new required field (the node name of the destination) * several new RPCs for manipulating a FederationState object: `FederationState:{Apply,Get,List,ListMeshGateways}` * 3 read-only internal APIs for debugging use to invoke those RPCs from curl * raft and fsm changes to persist these FederationStates * replication for FederationStates as they are canonically stored in the Primary and replicated to the Secondaries. * a special derivative of anti-entropy that runs in secondaries to snapshot their local mesh gateway `CheckServiceNodes` and sync them into their upstream FederationState in the primary (this works in conjunction with the replication to distribute addresses for all mesh gateways in all DCs to all other DCs) * a "gateway locator" convenience object to make use of this data to choose the addresses of gateways to use for any given RPC or gossip operation to a remote DC. 
This gets data from the "retry join" logic in the agent and also directly calls into the FSM. * RPC (`:8300`) on the server sniffs the first byte of a new connection to determine if it's actually doing native TLS. If so it checks the ALPN header for protocol determination (just like how the existing system uses the type-byte marker). * 2 new kinds of protocols are exclusively decoded via this native TLS mechanism: one for ferrying "packet" operations (udp-like) from the gossip layer and one for "stream" operations (tcp-like). The packet operations re-use sockets (using length-prefixing) to cut down on TLS re-negotiation overhead. * the server instances specially wrap the `memberlist.NetTransport` when running with gateway federation enabled (in a `wanfed.Transport`). The general gist is that if it tries to dial a node in the SAME datacenter (deduced by looking at the suffix of the node name) there is no change. If dialing a DIFFERENT datacenter it is wrapped up in a TLS+ALPN blob and sent through some mesh gateways to eventually end up in a server's :8300 port. * a new flag when launching a mesh gateway via `consul connect envoy` to indicate that the servers are to be exposed. This sets a special service meta when registering the gateway into the catalog. * `proxycfg/xds` notice this metadata blob to activate additional watches for the FederationState objects as well as the location of all of the consul servers in that datacenter. * `xds:` if the extra metadata is in place additional clusters are defined in a DC to bulk sink all traffic to another DC's gateways. For the current datacenter we listen on a wildcard name (`server.<dc>.consul`) that load balances all servers as well as one mini-cluster per node (`<node>.server.<dc>.consul`) * the `consul tls cert create` command got a new flag (`-node`) to help create an additional SAN in certs that can be used with this flavor of federation.	2020-03-09 15:59:02 -05:00
Matt Keeler	b684138882	Fix session backwards incompatibility with 1.6.x and earlier.	2020-03-05 15:34:55 -05:00
Kyle Havlovitz	b05ebe2507	Add stub methods for ACL/segment bug fix from enterprise	2020-03-02 10:30:23 -08:00
rerorero	b366a25179	fix: Destroying a session that doesn't exist returns status cod… (#6905 ) fix #6840	2020-02-18 11:13:15 -05:00
Matt Keeler	be0d6efac9	Allow the PolicyResolve and RoleResolve endpoints to process na… (#7296 )	2020-02-13 14:55:27 -05:00
R.B. Boyer	919741838d	fix use of hclog logger (#7264 )	2020-02-12 09:37:16 -06:00
ShimmerGlass	a27ccc7248	agent: add server raft.{last,applied}_index gauges (#6694 ) These metrics are useful for : * Tracking the rate of update to the db * Allow to have a rough idea of when an index originated	2020-02-11 10:50:18 +01:00
Hans Hasselberg	71ce832990	connect: add validations around intermediate cert ttl (#7213 )	2020-02-11 00:05:49 +01:00
R.B. Boyer	c37d00791c	make the TestRPC_RPCMaxConnsPerClient test less flaky (#7255 )	2020-02-10 15:13:53 -06:00
Sarah Christoff	85d2714c76	Fix flaky TestAutopilot_BootstrapExpect (#7242 )	2020-02-10 14:52:58 -06:00
Kit Patella	d28bc1acbe	rpc: measure blocking queries (#7224 ) * agent: measure blocking queries * agent.rpc: update docs to mention we only record blocking queries * agent.rpc: make go fmt happy * agent.rpc: fix non-atomic read and decrement with bitwise xor of uint64 0 * agent.rpc: clarify review question * agent.rpc: today I learned that one must declare all variables before interacting with goto labels * Update agent/consul/server.go agent.rpc: more precise comment on `Server.queriesBlocking` Co-Authored-By: Paul Banks <banks@banksco.de> * Update website/source/docs/agent/telemetry.html.md agent.rpc: improve queries_blocking description Co-Authored-By: Paul Banks <banks@banksco.de> * agent.rpc: fix some bugs found in review * add a note about the updated counter behavior to telemetry.md * docs: add upgrade-specific note on consul.rpc.quer{y,ies_blocking} behavior Co-authored-by: Paul Banks <banks@banksco.de>	2020-02-10 10:01:15 -08:00
Matt Keeler	966d085066	Catalog + Namespace OSS changes. (#7219 ) * Various Prepared Query + Namespace things * Last round of OSS changes for a namespaced catalog	2020-02-10 10:40:44 -05:00
R.B. Boyer	b4325dfbce	agent: ensure that we always use the same settings for msgpack (#7245 ) We set RawToString=true so that []uint8 => string when decoding an interface{}. We set the MapType so that map[interface{}]interface{} decodes to map[string]interface{}. Add tests to ensure that this doesn't break existing usages. Fixes #7223	2020-02-07 15:50:24 -06:00
Freddy	aca8b85440	Remove outdated TODO (#7244 )	2020-02-07 13:14:48 -07:00
Matt Keeler	f610d1d791	Fix a bug with ACL enforcement of reads on namespaced config entries. (#7239 )	2020-02-07 08:30:40 -05:00
Kit Patella	aa9db3f903	agent/consul server: fix LeaderTest_ChangeNodeID (#7236 ) * fix LeaderTest_ChangeNodeID to use StatusLeft and add waitForAnyLANLeave * unextract the waitFor... fn, simplify, and provide a more descriptive error	2020-02-06 16:37:53 -08:00
Matt Keeler	2524a028ea	OSS Changes for various config entry namespacing bugs (#7226 )	2020-02-06 10:52:25 -05:00
R.B. Boyer	a67001aa22	agent: differentiate wan vs lan loggers in memberlist and serf (#7205 ) This should be a helpful change until memberlist and serf can be properly switched to native hclog.	2020-02-05 09:52:43 -06:00
Matt Keeler	119168203b	Fix disco chain graph validation for namespaces (#7217 ) Previously this happened to be validating only the chains in the default namespace. Now it will validate all chains in all namespaces when the global proxy-defaults is changed.	2020-02-05 10:06:27 -05:00
Matt Keeler	3621f7090b	Minor Non-Functional Updates (#7215 ) * Cleanup the discovery chain compilation route handling Nothing functionally should be different here. The real difference is that when creating new targets or handling route destinations we use the router config entries name and namespace instead of that of the top level request. Today they SHOULD always be the same but that may not always be the case. This hopefully also makes it easier to understand how the router entries are handled. * Refactor a small bit of the service manager tests in oss We used to use the stringHash function to compute part of the filename where things would get persisted to. This has been changed in the core code to calling the StringHash method on the ServiceID type. It just so happens that the new method will output the same value for anything in the default namespace (by design actually). However, logically this filename computation in the test should do the same thing as the core code itself so I updated it here. Also of note is that newer enterprise-only tests for the service manager cannot use the old stringHash function at all because it will produce incorrect results for non-default namespaces.	2020-02-05 10:06:11 -05:00
Freddy	67e02a0752	Add managed service provider token (#7218 ) Stubs for enterprise-only ACL token to be used by managed service providers.	2020-02-04 13:58:56 -07:00
Hans Hasselberg	a9f9ed83cb	agent: increase watchLimit to 8192. (#7200 ) The previous value was too conservative and users with many instances were having problems because of it. This change increases the limit to 8192 which reportedly fixed most of the issues with that. Related: #4984, #4986, #5050.	2020-02-04 13:11:30 +01:00
Davor Kapsa	c280dd8549	auto_encrypt: check previously ignored error (#6604 )	2020-02-03 10:35:11 +01:00
Hans Hasselberg	50281032e0	Security fixes (#7182 ) * Mitigate HTTP/RPC Services Allow Unbounded Resource Usage Fixes #7159. Co-authored-by: Matt Keeler <mkeeler@users.noreply.github.com> Co-authored-by: Paul Banks <banks@banksco.de>	2020-01-31 11:19:37 -05:00
Matt Keeler	26bb1584c1	Updates to the Txn API for namespaces (#7172 ) * Updates to the Txn API for namespaces * Update agent/consul/txn_endpoint.go Co-Authored-By: R.B. Boyer <rb@hashicorp.com> Co-authored-by: R.B. Boyer <public@richardboyer.net>	2020-01-30 13:12:26 -05:00
Matt Keeler	3f253080a2	Sync some feature flag support from enterprise (#7167 )	2020-01-29 13:21:38 -05:00
R.B. Boyer	01ebdff2a9	various tweaks on top of the hclog work (#7165 )	2020-01-29 11:16:08 -06:00
Chris Piraino	3dd0b59793	Allow users to configure either unstructured or JSON logging (#7130 ) * hclog Allow users to choose between unstructured and JSON logging	2020-01-28 17:50:41 -06:00
Kit Patella	49e9bbbdf9	Add accessorID of token when ops are denied by ACL system (#7117 ) * agent: add and edit doc comments * agent: add ACL token accessorID to debugging traces * agent: polish acl debugging * agent: minor fix + string fmt over value interp * agent: undo export & fix logging field names * agent: remove note and migrate up to code review * Update agent/consul/acl.go Co-Authored-By: Matt Keeler <mkeeler@users.noreply.github.com> * agent: incorporate review feedback * Update agent/acl.go Co-Authored-By: R.B. Boyer <public@richardboyer.net> Co-authored-by: Matt Keeler <mkeeler@users.noreply.github.com> Co-authored-by: R.B. Boyer <public@richardboyer.net>	2020-01-27 11:54:32 -08:00
Matt Keeler	485a0a65ea	Updates to Config Entries and Connect for Namespaces (#7116 )	2020-01-24 10:04:58 -05:00
Matt Keeler	90b9f87160	Add the v1/catalog/node-services/:node endpoint (#7115 ) The backing RPC already existed but the endpoint will be useful for other service syncing processes such as consul-k8s as this endpoint can return all services registered with a node regardless of namespacing.	2020-01-24 09:27:25 -05:00
Hans Hasselberg	5379cf7c67	raft: increase raft notify buffer. (#6863 ) * Increase raft notify buffer. Fixes https://github.com/hashicorp/consul/issues/6852. Increasing the buffer helps recovering from leader flapping. It lowers the chances of the flapping leader to get into a deadlock situation like described in #6852.	2020-01-22 16:15:59 +01:00
Hans Hasselberg	d52a4e3b82	tests: fix autopilot test (#7092 )	2020-01-21 14:09:51 +01:00
Hans Hasselberg	43392d5db3	raft: update raft to v1.1.2 (#7079 ) * update raft * use hclogger for raft.	2020-01-20 13:58:02 +01:00
Hans Hasselberg	315ba7d6ad	connect: check if intermediate cert needs to be renewed. (#6835 ) Currently when using the built-in CA provider for Connect, root certificates are valid for 10 years, however secondary DCs get intermediates that are valid for only 1 year. There is no mechanism currently short of rotating the root in the primary that will cause the secondary DCs to renew their intermediates. This PR adds a check that renews the cert if it is half way through its validity period. In order to be able to test these changes, a new configuration option was added: IntermediateCertTTL which is set extremely low in the tests.	2020-01-17 23:27:13 +01:00
Hans Hasselberg	b6c83e06d5	auto_encrypt: set dns and ip san for k8s and provide configuration (#6944 ) * Add CreateCSRWithSAN * Use CreateCSRWithSAN in auto_encrypt and cache * Copy DNSNames and IPAddresses to cert * Verify auto_encrypt.sign returns cert with SAN * provide configuration options for auto_encrypt dnssan and ipsan * rename CreateCSRWithSAN to CreateCSR	2020-01-17 23:25:26 +01:00
Matej Urbas	d877e091d6	agent: configurable MaxQueryTime and DefaultQueryTime. (#3777 )	2020-01-17 14:20:57 +01:00
Matt Keeler	c8294b8595	AuthMethod updates to support alternate namespace logins (#7029 )	2020-01-14 10:09:29 -05:00
Matt Keeler	baa89c7c65	Intentions ACL enforcement updates (#7028 ) * Renamed structs.IntentionWildcard to structs.WildcardSpecifier * Refactor ACL Config Get rid of remnants of enterprise only renaming. Add a WildcardName field for specifying what string should be used to indicate a wildcard. * Add wildcard support in the ACL package For read operations they can call anyAllowed to determine if any read access to the given resource would be granted. For write operations they can call allAllowed to ensure that write access is granted to everything. * Make v1/agent/connect/authorize namespace aware * Update intention ACL enforcement This also changes how intention:read is granted. Before the Intention.List RPC would allow viewing an intention if the token had intention:read on the destination. However Intention.Match allowed viewing if access was allowed for either the source or dest side. Now Intention.List and Intention.Get fall in line with Intention.Matches previous behavior. Due to this being done a few different places ACL enforcement for a singular intention is now done with the CanRead and CanWrite methods on the intention itself. * Refactor Intention.Apply to make things easier to follow.	2020-01-13 15:51:40 -05:00
Pierre Souchay	61fc4f8253	rpc: log method when a server/server RPC call fails (#4548 ) Sometimes, we have lots of errors in cross calls between DCs (several hundreds / sec) Enrich the log in order to help diagnose the root cause of issue.	2020-01-13 19:55:29 +01:00
R.B. Boyer	20f51f9181	connect: derive connect certificate serial numbers from a memdb index instead of the provider table max index (#7011 )	2020-01-09 16:32:19 +01:00
R.B. Boyer	446f0533cd	connect: ensure that updates to the secondary root CA configuration use the correct signing key ID values for comparison (#7012 ) Fixes #6886	2020-01-09 16:28:16 +01:00
R.B. Boyer	42f80367be	Restore a few more service-kind index updates so blocking in ServiceDump works in more cases (#6948 ) Restore a few more service-kind index updates so blocking in ServiceDump works in more cases Namely one omission was that check updates for dumped services were not unblocking. Also adds a ServiceDump state store test and also fix a watch bug with the normal dump. Follow-on from #6916	2019-12-19 10:15:37 -06:00
Matt Keeler	6de4eb8569	OSS changes for implementing token based namespace inferencing remove debug log	2019-12-18 14:07:08 -05:00
Matt Keeler	185654b075	Unflake the TestACLEndpoint_TokenList test In order to do this I added a waitForLeaderEstablishment helper which does the right thing to ensure that leader establishment has finished. fixup	2019-12-18 14:07:07 -05:00
Matt Keeler	8af12bf4f4	Miscellaneous acl package cleanup • Renamed EnterpriseACLConfig to just Config • Removed chained_authorizer_oss.go as it was empty • Renamed acl.go to errors.go to more closely describe its contents	2019-12-18 13:44:32 -05:00
Matt Keeler	bdf025a758	Rename EnterpriseAuthorizerContext -> AuthorizerContext	2019-12-18 13:43:24 -05:00
Preetha	f607a00138	autopilot: fix dead server removal condition to use correct failure tolerance (#4017 ) * Make dead server removal condition in autopilot use correct failure tolerance rules * Introduce func with explanation	2019-12-16 23:35:13 +01:00
Matt Keeler	9812b32155	Fix blocking for ServiceDumping by kind (#6919 )	2019-12-10 13:58:30 -05:00
Matt Keeler	442924c35a	Sync of OSS changes to support namespaces (#6909 )	2019-12-09 21:26:41 -05:00
Hans Hasselberg	a36e58c964	agent: fewer file local differences between enterprise and oss (#6820 ) (#6898 ) * Increase number to test ignore. Consul Enterprise has more flags and since we are trying to reduce the differences between both code bases, we are increasing the number in oss. The semantics don't change, it is just a cosmetic thing. * Introduce agent.initEnterprise for enterprise related hooks. * Sync test with ent version. * Fix import order. * revert error wording.	2019-12-06 21:35:58 +01:00
Matt Keeler	609c9dab02	Miscellaneous Fixes (#6896 ) Ensure we close the Sentinel Evaluator so as not to leak go routines Fix a bunch of test logging so that various warnings when starting a test agent go to the ltest logger and not straight to stdout. Various canned ent meta types always return a valid pointer (no more nils). This allows us to blindly deref + assign in various places. Update ACL index tracking to ensure oss -> ent upgrades will work as expected. Update ent meta parsing to include function to disallow wildcarding.	2019-12-06 14:01:34 -05:00
Matt Keeler	c15c81a7ed	[Feature] API: Add a internal endpoint to query for ACL authori… (#6888 ) * Implement endpoint to query whether the given token is authorized for a set of operations * Updates to allow for remote ACL authorization via RPC This is only used when making an authorization request to a different datacenter.	2019-12-06 09:25:26 -05:00
Matt Keeler	f30af37d11	Fix the TestLeader_SecondaryCA_IntermediateRefresh test flakiness	2019-12-04 19:19:55 -05:00
Matt Keeler	90ae4a1f1e	OSS KV Modifications to Support Namespaces	2019-11-25 12:57:35 -05:00
Matt Keeler	68d79142c4	OSS Modifications necessary for sessions namespacing	2019-11-25 12:07:04 -05:00
Paul Banks	a84b82b3df	connect: Add AWS PCA provider (#6795 ) * Update AWS SDK to use PCA features. * Add AWS PCA provider * Add plumbing for config, config validation tests, add test for inheriting existing CA resources created by user * Unparallel the tests so we don't exhaust PCA limits * Merge updates * More aggressive polling; rate limit pass through on sign; Timeout on Sign and CA create * Add AWS PCA docs * Fix Vault doc typo too * Doc typo * Apply suggestions from code review Co-Authored-By: R.B. Boyer <rb@hashicorp.com> Co-Authored-By: kaitlincarter-hc <43049322+kaitlincarter-hc@users.noreply.github.com> * Doc fixes; tests for erroring if State is modified via API * More review cleanup * Uncomment tests! * Minor suggested clean ups	2019-11-21 17:40:29 +00:00
Paul Banks	9e17aa3b41	Change CA Configure struct to pass Datacenter through (#6775 ) * Change CA Configure struct to pass Datacenter through * Remove connect/ca/plugin as we don't have immediate plans to use it. We still intend to one day but there are likely to be several changes to the CA provider interface before we do so it's better to rebuild from history when we do that work properly. * Rename PrimaryDC; fix endpoint in secondary DCs	2019-11-18 14:22:19 +00:00
Paul Banks	1197b43c7b	Support Connect CAs that can't cross sign (#6726 ) * Support Connect CAs that can't cross sign * revert spurios mod changes from make tools * Add log warning when forcing CA rotation * Fixup SupportsCrossSigning to report errors and work with Plugin interface (fixes tests) * Fix failing snake_case test * Remove misleading comment * Revert "Remove misleading comment" This reverts commit bc4db9cabed8ad5d0e39b30e1fe79196d248349c. * Remove misleading comment * Regen proto files messed up by rebase	2019-11-11 21:36:22 +00:00
Paul Banks	ca96d5fa72	connect: Allow CA Providers to store small amount of state (#6751 ) * pass logger through to provider * test for proper operation of NeedsLogger * remove public testServer function * Ooops actually set the logger in all the places we need it - CA config set wasn't and causing segfault * Fix all the other places in tests where we set the logger * Allow CA Providers to persist some state * Update CA provider plugin interface * Fix plugin stubs to match provider changes * Update agent/connect/ca/provider.go Co-Authored-By: R.B. Boyer <rb@hashicorp.com> * Cleanup review comments	2019-11-11 20:57:16 +00:00
Todd Radel	19a3892f71	connect: Implement NeedsLogger interface for CA providers (#6556 ) * add NeedsLogger to Provider interface * implements NeedsLogger in default provider * pass logger through to provider * test for proper operation of NeedsLogger * remove public testServer function * Switch test to actually assert on logging output rather than reflection. --amend * Ooops actually set the logger in all the places we need it - CA config set wasn't and causing segfault * Fix all the other places in tests where we set the logger * Add TODO comment	2019-11-11 20:30:01 +00:00
Todd Radel	e100fda218	Make all Connect Cert Common Names valid FQDNs (#6423 )	2019-11-11 17:11:54 +00:00
Matt Keeler	7081643191	Fill the Authz Context with a Sentinel Scope (#6729 )	2019-11-01 17:05:22 -04:00
Matt Keeler	c71ea7056f	Miscellaneous fixes (#6727 )	2019-11-01 16:11:44 -04:00
Paul Banks	5f405c3277	Fix support for RSA CA keys in Connect. (#6638 ) * Allow RSA CA certs for consul and vault providers to correctly sign EC leaf certs. * Ensure key type ad bits are populated from CA cert and clean up tests * Add integration test and fix error when initializing secondary CA with RSA key. * Add more tests, fix review feedback * Update docs with key type config and output * Apply suggestions from code review Co-Authored-By: R.B. Boyer <rb@hashicorp.com>	2019-11-01 13:20:26 +00:00
Matt Keeler	21f98f426e	Add hook for validating the enterprise meta attached to a reque… (#6695 )	2019-10-30 12:42:39 -04:00
Matt Keeler	c2d9041c0f	PreVerify acl:read access for listing endpoints (#6696 ) We still will need to filter results based on the authorizer too but this helps to give an early 403.	2019-10-30 09:10:11 -04:00
Sarah Christoff	86b30bbfbe	Set MinQuorum variable in Autopilot (#6654 ) * Add MinQuorum to Autopilot	2019-10-29 09:04:41 -05:00
Matt Keeler	0fc2c95255	More Replication Abstractions (#6689 ) Also updated ACL replication to use a function to fill in the desired enterprise meta for all remote listing RPCs.	2019-10-28 13:49:57 -04:00
Matt Keeler	87c44a3b8d	Ensure that cache entries for tokens are prefixed “token-secret… (#6688 ) This will be necessary once we store other types of identities in here.	2019-10-25 13:05:43 -04:00
Matt Keeler	a688ea952d	Update the ACL Resolver to allow for Consul Enterprise specific hooks. (#6687 )	2019-10-25 11:06:16 -04:00
Matt Keeler	1270a93274	Updates to allow for Namespacing ACL resources in Consul Enterp… (#6675 ) Main Changes: • method signature updates everywhere to account for passing around enterprise meta. • populate the EnterpriseAuthorizerContext for all ACL related authorizations. • ACL resource listings now operate like the catalog or kv listings in that the returned entries are filtered down to what the token is allowed to see. With Namespaces its no longer all or nothing. • Modified the acl.Policy parsing to abstract away basic decoding so that enterprise can do it slightly differently. Also updated method signatures so that when parsing a policy it can take extra ent metadata to use during rules validation and policy creation. Secondary Changes: • Moved protobuf encoding functions out of the agentpb package to eliminate circular dependencies. • Added custom JSON unmarshalers for a few ACL resource types (to support snake case and to get rid of mapstructure) • AuthMethod validator cache is now an interface as these will be cached per-namespace for Consul Enterprise. • Added checks for policy/role link existence at the RPC API so we don’t push the request through raft to have it fail internally. • Forward ACL token delete request to the primary datacenter when the secondary DC doesn’t have the token. • Added a bunch of ACL test helpers for inserting ACL resource test data.	2019-10-24 14:38:09 -04:00
Freddy	caf658d0d3	Store check type in catalog (#6561 )	2019-10-17 20:33:11 +02:00
R.B. Boyer	e74a6c44f1	server: ensure the primary dc and ACL dc match (#6634 ) This is mostly a sanity check for server tests that skip the normal config builder equivalent fixup.	2019-10-17 10:57:17 -05:00
R.B. Boyer	bc22eb8090	unflake TestLeader_SecondaryCA_Initialize (#6631 )	2019-10-16 16:49:01 -05:00
R.B. Boyer	3ae748c7a4	fix flaky multidc acl tests that failed to wait for token replication (#6628 ) If acls have not yet replicated to the secondary then authz requests will be remotely resolved by the primary. Now these tests explicitly wait until replication has caught up first.	2019-10-16 12:24:29 -05:00
R.B. Boyer	a4c5b8e85c	appease the retry linter (#6629 )	2019-10-16 11:39:22 -05:00
Paul Banks	979ad7fecb	Allow time for secondary CA to initialize (#6627 )	2019-10-16 17:03:31 +01:00
Matt Keeler	f9a43a1e2d	ACL Authorizer overhaul (#6620 ) * ACL Authorizer overhaul To account for upcoming features every Authorization function can now take an extra acl.EnterpriseAuthorizerContext. These are unused in OSS and will always be nil. Additionally the acl package has received some thorough refactoring to enable all of the extra Consul Enterprise specific authorizations including moving sentinel enforcement into the stubbed structs. The Authorizer funcs now return an acl.EnforcementDecision instead of a boolean. This improves the overall interface as it makes multiple Authorizers easily chainable as they now indicate whether they had an authoritative decision or should use some other defaults. A ChainedAuthorizer was added to handle this Authorizer enforcement chain and will never itself return a non-authoritative decision. Include stub for extra enterprise rules in the global management policy * Allow for an upgrade of the global-management policy	2019-10-15 16:58:50 -04:00
R.B. Boyer	9a51ecc98b	agent: clients should only attempt to remove pruned nodes once per call (#6591 )	2019-10-07 16:15:23 -05:00
Sarah Christoff	9b93dd93c9	Prune Unhealthy Agents (#6571 ) * Add -prune flag to ForceLeave	2019-10-04 16:10:02 -05:00

1090 Commits