open-consul

Author	SHA1	Message	Date
Chris Piraino	e9b397005c	Update godoc string for memdb wrapper functions/structs	2020-09-02 10:24:22 -05:00
Chris Piraino	80f923a47a	Refactor state store usage to track unique service names This commit refactors the state store usage code to track unique service name changes on transaction commit. This means we only need to lookup usage entries when reading the information, as opposed to iterating over a large number of service indices. - Take into account a service instance's name being changed - Do not iterate through entire list of service instances, we only care about whether there is 0, 1, or more than 1.	2020-09-02 10:24:21 -05:00
Chris Piraino	79e6534345	Use ReadTxn interface in state store helper functions	2020-09-02 10:24:20 -05:00
Chris Piraino	d90d95421d	Add WriteTxn interface and convert more functions to ReadTxn We add a WriteTxn interface for use in updating the usage memdb table, with the forward-looking prospect of incrementally converting other functions to accept interfaces. As well, we use the ReadTxn in new usage code, and as a side effect convert a couple of existing functions to use that interface as well.	2020-09-02 10:24:19 -05:00
Chris Piraino	45a4057f60	Report node/service usage metrics from every server Using the newly provided state store methods, we periodically emit usage metrics from the servers. We decided to emit these metrics from all servers, not just the leader, because that means we do not have to care about leader election flapping causing metrics turbulence, and it seems reasonable for each server to emit its own view of the state, even if they should always converge rapidly.	2020-09-02 10:24:17 -05:00
Chris Piraino	3af96930eb	Add new usage memdb table that tracks usage counts of various elements We update the usage table on Commit() by using the TrackedChanges() API of memdb. Track memdb changes on restore so that usage data can be compiled	2020-09-02 10:24:16 -05:00
freddygv	d7bda050e0	Restructure structs and other PR comments	2020-09-02 09:10:50 -06:00
Matt Keeler	335c604ced	Merge of auto-config and auto-encrypt code (#8523 ) auto-encrypt is now handled as a special case of auto-config. This also is moving all the cert-monitor code into the auto-config package.	2020-08-31 13:12:17 -04:00
freddygv	afb14b6705	Compile down LB policy to disco chain nodes	2020-08-28 13:11:04 -06:00
Daniel Nephin	845661c8af	Merge pull request #8548 from edevil/fix_flake Fix flaky TestACLResolver_Client/Concurrent-Token-Resolve	2020-08-28 15:10:55 -04:00
R.B. Boyer	f2b8bf109c	xds: use envoy's rbac filter to handle intentions entirely within envoy (#8569 )	2020-08-27 12:20:58 -05:00
Matt Keeler	106e1d50bd	Move RPC router from Client/Server and into BaseDeps (#8559 ) This will allow it to be a shared component which is needed for AutoConfig	2020-08-27 11:23:52 -04:00
André Cruz	673bd69f36	Decrease test flakiness Fix flaky TestACLResolver_Client/Concurrent-Token-Resolve and TestCacheNotifyPolling	2020-08-24 20:30:02 +01:00
André Cruz	a64686fab6	testing: Fix govet errors	2020-08-21 18:01:55 +01:00
Hans Hasselberg	02de4c8b76	add primary keys to list keyring (#8522 ) During gossip encryption key rotation it would be nice to be able to see if all nodes are using the same key. This PR adds another field to the json response from `GET v1/operator/keyring` which lists the primary keys in use per dc. That way an operator can tell when a key was successfully setup as primary key. Based on https://github.com/hashicorp/serf/pull/611 to add primary key to list keyring output: ```json [ { "WAN": true, "Datacenter": "dc2", "Segment": "", "Keys": { "0OuM4oC3Os18OblWiBbZUaHA7Hk+tNs/6nhNYtaNduM=": 6, "SINm887hKTzmMWeBNKTJReaTLX3mBEJKriDyt88Ad+g=": 6 }, "PrimaryKeys": { "SINm887hKTzmMWeBNKTJReaTLX3mBEJKriDyt88Ad+g=": 6 }, "NumNodes": 6 }, { "WAN": false, "Datacenter": "dc2", "Segment": "", "Keys": { "0OuM4oC3Os18OblWiBbZUaHA7Hk+tNs/6nhNYtaNduM=": 8, "SINm887hKTzmMWeBNKTJReaTLX3mBEJKriDyt88Ad+g=": 8 }, "PrimaryKeys": { "SINm887hKTzmMWeBNKTJReaTLX3mBEJKriDyt88Ad+g=": 8 }, "NumNodes": 8 }, { "WAN": false, "Datacenter": "dc1", "Segment": "", "Keys": { "0OuM4oC3Os18OblWiBbZUaHA7Hk+tNs/6nhNYtaNduM=": 3, "SINm887hKTzmMWeBNKTJReaTLX3mBEJKriDyt88Ad+g=": 8 }, "PrimaryKeys": { "SINm887hKTzmMWeBNKTJReaTLX3mBEJKriDyt88Ad+g=": 8 }, "NumNodes": 8 } ] ``` I intentionally did not change the CLI output because I didn't find a good way of displaying this information. There are a couple of options that we could implement later: * add a flag to show the primary keys * add a flag to show json output Fixes #3393.	2020-08-18 09:50:24 +02:00
Daniel Nephin	8d35e37b3c	testing: Remove all the defer os.Removeall Now that testutil uses t.Cleanup to remove the directory the caller no longer has to manage the removal	2020-08-14 19:58:53 -04:00
Daniel Nephin	629c34085d	state: remove unused Store method receiver And use ReadTxn interface where appropriate.	2020-08-13 11:25:22 -04:00
Daniel Nephin	fc797a279a	Merge pull request #8461 from hashicorp/dnephin/remove-notify-shutdown agent/consul: Remove NotifyShutdown	2020-08-13 11:16:48 -04:00
Daniel Nephin	d8ffcd5686	Merge pull request #8365 from hashicorp/dnephin/fix-service-by-node-meta-flake state: speed up tests that use watchLimit	2020-08-13 11:16:12 -04:00
R.B. Boyer	63422ca9c5	connect: use stronger validation that ingress gateways have compatible protocols defined for their upstreams (#8470 ) Fixes #8466 Since Consul 1.8.0 there was a bug in how ingress gateway protocol compatibility was enforced. At the point in time that an ingress-gateway config entry was modified the discovery chain for each upstream was checked to ensure the ingress gateway protocol matched. Unfortunately future modifications of other config entries were not validated against existing ingress-gateway definitions, such as: 1. create tcp ingress-gateway pointing to 'api' (ok) 2. create service-defaults for 'api' setting protocol=http (worked, but not ok) 3. create service-splitter or service-router for 'api' (worked, but caused an agent panic) If you were to do these in a different order, it would fail without a crash: 1. create service-defaults for 'api' setting protocol=http (ok) 2. create service-splitter or service-router for 'api' (ok) 3. create tcp ingress-gateway pointing to 'api' (fail with message about protocol mismatch) This PR introduces the missing validation. The two new behaviors are: 1. create tcp ingress-gateway pointing to 'api' (ok) 2. (NEW) create service-defaults for 'api' setting protocol=http ("ok" for back compat) 3. (NEW) create service-splitter or service-router for 'api' (fail with message about protocol mismatch) In consideration for any existing users that may inadvertently be falling into item (2) above, that is now officially a valid configuration to be in. For anyone falling into item (3) above while you cannot use the API to manufacture that scenario anymore, anyone that has old (now bad) data will still be able to have the agent use them just enough to generate a new agent/proxycfg error message rather than a panic. Unfortunately we just don't have enough information to properly fix the config entries.	2020-08-12 11:19:20 -05:00
Hans Hasselberg	7a6d916ddc	Merge pull request #8471 from hashicorp/local_only thread local-only through the layers	2020-08-12 08:54:51 +02:00
Freddy	50fee12d62	Internal endpoint to query intentions associated with a gateway (#8400 )	2020-08-11 17:20:41 -06:00
Kyle Havlovitz	8118e3db40	Fix a state store comment about version	2020-08-11 13:46:12 -07:00
Kyle Havlovitz	2601585017	fsm: Fix snapshot bug with restoring node/service/check indexes	2020-08-11 11:49:52 -07:00
Hans Hasselberg	e0297b6e99	Refactor keyring ops: * changes some functions to return data instead of modifying pointer arguments * renames globalRPC() to keyringRPCs() to make its purpose more clear * restructures KeyringOperation() to make it more understandable	2020-08-11 13:42:03 +02:00
freddygv	6dcfa11c21	Update error handling	2020-08-10 17:48:22 -06:00
Daniel Nephin	bef9348ca8	testing: remove unnecessary defers in tests The data directory is now removed by the test helper that created it.	2020-08-07 17:28:16 -04:00
Daniel Nephin	f3b63514d5	testing: Remove NotifyShutdown NotifyShutdown was only used for testing. Now that t.Cleanup exists, we can use that instead of attaching cleanup to the Server shutdown. The Autopilot test which used NotifyShutdown doesn't need this notification because Shutdown is synchronous. Waiting for the function to return is equivalent.	2020-08-07 17:14:44 -04:00
Hans Hasselberg	fdceb24323	auto_config implies connect (#8433 )	2020-08-07 12:02:02 +02:00
freddygv	83f4e32376	PR comments and addtl tests	2020-08-05 16:07:11 -06:00
Daniel Nephin	061ae94c63	Rename NewClient/NewServer Now that duplicate constructors have been removed we can use the shorter names for the single constructor.	2020-08-05 14:00:55 -04:00
Daniel Nephin	e6c94c1411	Remove LogOutput from Server	2020-08-05 14:00:44 -04:00
Daniel Nephin	fdf966896f	Remove LogOutput from Client	2020-08-05 14:00:42 -04:00
Daniel Nephin	73493ca01b	Pass a logger to ConnPool and yamux, instead of an io.Writer Allowing us to remove the LogOutput field from config.	2020-08-05 13:25:08 -04:00
Daniel Nephin	c7c941811d	config: Remove unused field	2020-08-05 13:25:08 -04:00
freddygv	c87af29506	collect GatewayServices from iter in a function	2020-07-31 13:30:40 -06:00
Freddy	7c2c8815d7	Avoid panics during shutdown routine (#8412 )	2020-07-30 11:11:10 -06:00
freddygv	94d1f0a310	end to end changes to pass gatewayservices to /ui/services/	2020-07-30 10:21:11 -06:00
Matt Keeler	76add4f24c	Allow setting verify_incoming* when using auto_encrypt or auto_config (#8394 ) Ensure that enabling AutoConfig sets the tls configurator properly This also refactors the TLS configurator a bit so the naming doesn’t imply only AutoEncrypt as the source of the automatically setup TLS cert info.	2020-07-30 10:15:12 -04:00
Matt Keeler	dad0f189a2	Agent Auto Config: Implement Certificate Generation (#8360 ) Most of the groundwork was laid in previous PRs between adding the cert-monitor package to extracting the logic of signing certificates out of the connect_ca_endpoint.go code and into a method on the server. This also refactors the auto-config package a bit to split things out into multiple files.	2020-07-28 15:31:48 -04:00
Matt Keeler	3a1058a06b	Move connect root retrieval and cert signing logic out of the RPC endpoints (#8364 ) The code now lives on the Server type itself. This was done so that all of this could be shared with auto config certificate signing.	2020-07-24 10:00:51 -04:00
Matt Keeler	e7d8a02ae8	Move generation of the CA Configuration from the agent code into a method on the RuntimeConfig (#8363 ) This allows this to be reused elsewhere.	2020-07-23 16:05:28 -04:00
Daniel Nephin	597dcf2bfb	Merge pull request #8323 from hashicorp/dnephin/add-event-publisher-2 stream: close subscriptions on shutdown	2020-07-23 13:12:50 -04:00
Matt Keeler	c3e7d689b7	Refactor the agentpb package (#8362 ) First move the whole thing to the top-level proto package name. Secondly change some things around internally to have sub-packages.	2020-07-23 11:24:20 -04:00
Daniel Nephin	decba06b7d	stream: close all subs when EventProcessor is shutdown.	2020-07-22 19:04:10 -04:00
Daniel Nephin	e802689bbe	stream: fix overallocation in filter And add tests	2020-07-22 19:04:10 -04:00
Daniel Nephin	f64725f7aa	state: speed up TestStateStore_ServicesByNodeMeta Make watchLimit a var so that we can patch it in tests and reduce the time spent creating state.	2020-07-22 16:57:06 -04:00
Daniel Nephin	a44ddea9ba	state: Use subtests in TestStateStore_ServicesByNodeMeta These subtests make it much easier to identify the slow part of the test, but they also help enumerate all the different cases which are being tested.	2020-07-22 16:39:09 -04:00
Daniel Nephin	6d3b042872	Merge pull request #7948 from hashicorp/dnephin/buffer-test-logs testutil: NewLogBuffer - buffer logs until a test fails	2020-07-21 15:21:52 -04:00
Matt Keeler	8ea8a939f0	Merge pull request #8311 from hashicorp/bugfix/auto-encrypt-token-update	2020-07-21 13:15:27 -04:00
Daniel Nephin	80ff174880	testutil: NewLogBuffer - buffer logs until a test fails Replaces #7559 Running tests in parallel, with background goroutines, results in test output not being associated with the correct test. `go test` does not make any guarantees about output from goroutines being attributed to the correct test case. Attaching log output from background goroutines also causes data races. If the goroutine outlives the test, it will race with the test being marked done. Previously this was noticed as a panic when logging, but with the race detector enabled it is shown as a data race. The previous solution did not address the problem of correct test attribution because test output could still be hidden when it was associated with a test that did not fail. You would have to look at all of the log output to find the relevant lines. It also made debugging test failures more difficult because each log line was very long. This commit attempts a new approach. Instead of printing all the logs, only print when a test fails. This should work well when there are a small number of failures, but may not work well when there are many test failures at the same time. In those cases the failures are unlikely a result of a specific test, and the log output is likely less useful. All of the logs are printed from the test goroutine, so they should be associated with the correct test. Also removes some test helpers that were not used, or only had a single caller. Packages which expose many functions with similar names can be difficult to use correctly. Related: https://github.com/golang/go/issues/38458 (may be fixed in go1.15) https://github.com/golang/go/issues/38382#issuecomment-612940030	2020-07-21 12:50:40 -04:00
Matt Keeler	133a6d99f2	Fix issue with changing the agent token causing failure to renew the auto-encrypt certificate The fallback method would still work but it would get into a state where it would let the certificate expire for 10s before getting a new one. And the new one used the less secure RPC endpoint. This is also a pretty large refactoring of the auto encrypt code. I was going to write some tests around the certificate monitoring but it was going to be impossible to get a TestAgent configured in such a way that I could write a test that ran in less than an hour or two to exercise the functionality. Moving the certificate monitoring into its own package will allow for dependency injection and in particular mocking the cache types to control how it hands back certificates and how long those certificates should live. This will allow for exercising the main loop more than would be possible with it coupled so tightly with the Agent.	2020-07-21 12:19:25 -04:00
Daniel Nephin	7599e280de	stream: handle empty event in TestEventSnapshot When the race detector is enabled we see this test fail occasionally. The reordering of execution seems to make it possible for the snapshot splice to happen before any events are published to the topicBuffers. We can handle this case in the test the same way it is handled by a subscription, by proceeding to the next event.	2020-07-20 18:20:02 -04:00
Daniel Nephin	75f10fb191	state: update calls that are no longer state methods In a previous commit these methods were changed to functions, so remove the Store parameter.	2020-07-16 15:46:10 -04:00
Daniel Nephin	3fcb2e16f4	state: un-method funcs that don't use their receiver This change was mostly automated with the following. First generate a list of functions with: git grep -o 'Store) \([^(]\+\)(tx \*txn' ./agent/consul/state | awk '{print $2}' | grep -o '^[^(]\+' Then the list was curated a bit with trial/error to remove and add funcs as necessary. Finally the replacement was done with: dir=agent/consul/state; file=${1-funcnames}; while read fn; do echo "$fn"; sed -i -e "s/(s \*Store) $fn(/$fn(/" $dir/*.go; sed -i -e "s/s\.$fn(/$fn(/" $dir/*.go; sed -i -e "s/s\.store\.$fn(/$fn(/" $dir/*.go; done < $file	2020-07-16 15:30:39 -04:00
Daniel Nephin	edb0a4f1f8	store: convert methods that don't use their receiver to functions Making these functions allows them to be used without introducing an artificial dependency on the struct. Many of these will be called from streaming Event processors, which do not have a store. This change is being made ahead of the streaming work to reduce the size of the streaming diff.	2020-07-16 15:30:10 -04:00
Daniel Nephin	a2f8605c66	stream: Add forceClose and refactor subscription filtering Move the subscription context to Next. context.Context should generally never be stored in a struct because it makes that struct only valid while the context is valid. This is rarely obvious from the caller. Adds a forceClosed channel in place of the old context, and uses the new context as a way for the caller to stop the Subscription blocking. Remove some recursion out of bufferItem.Next. The caller is already looping so we can continue in that loop instead of recursing. This ensures currentItem is updated immediately (which probably does not matter in practice), and also removes the chance that we overflow the stack. NextNoBlock and FollowAfter do not need to handle bufferItem.Err, the caller already handles it. Moves filter to a method to simplify Next, and more explicitly separate filtering from looping. Also improve some godoc. Only unwrap itemBuffer.Err when necessary.	2020-07-14 15:57:47 -04:00
Daniel Nephin	2595436f62	stream: Improve docstrings Also rename ResumeStream to EndOfEmptySnapshot to be more consistent with other framing events Co-authored-by: Paul Banks <banks@banksco.de>	2020-07-14 15:57:47 -04:00
Daniel Nephin	16a2b3fafc	stream: change Topic to an interface Consumers of the package can decide on which type to use for the Topic. In the future we may use a gRPC type for the topic.	2020-07-14 15:57:47 -04:00
Daniel Nephin	aa571bd0ce	state: Move change processing out of EventPublisher EventPublisher was receiving TopicHandlers, which had a couple of problems: - ChangeProcessors were being grouped by Topic, but they completely ignored the topic and were performed on every change - ChangeProcessors required EventPublisher to be aware of database changes By moving ChangeProcessors out of EventPublisher, and having Publish accept events instead of changes, EventPublisher no longer needs to be aware of these things. Handlers is now only SnapshotHandlers, which are still mapped by Topic. Also allows us to remove the small 'db' package that had only two types. They can now be unexported types in state.	2020-07-14 15:57:47 -04:00
Daniel Nephin	23a940daad	server: Abandon state store to shutdown EventPublisher So that we don't leak goroutines	2020-07-14 15:57:47 -04:00
Daniel Nephin	e1305fe80c	stream: unexport identifiers Now that EventPublisher is part of stream a lot of the internals can be hidden	2020-07-14 15:57:47 -04:00
Daniel Nephin	9e37894778	stream: Move EventPublisher to stream package The EventPublisher is the central hub of the PubSub system. It is tightly coupled with much of stream. Some stream internals were exported exclusively for EventPublisher. The two Subscribe cases (with or without index) were also awkwardly split between two packages. By moving EventPublisher into stream they are now both in the same package (although still in different files).	2020-07-14 15:57:47 -04:00
Daniel Nephin	6e87e83d77	state: Make handleACLUpdate async once again So that we keep as much as possible out of the FSM commit hot path.	2020-07-14 15:57:47 -04:00
Daniel Nephin	a92dab724d	state: Use interface for Txn Also store the index in Changes instead of the Txn. This change is in preparation for moving EventPublisher to the stream package, and making handleACLUpdates async once again.	2020-07-14 15:57:46 -04:00
Daniel Nephin	c778d61b6a	stream.Subscription unexport fields and additional docstrings	2020-07-14 15:57:46 -04:00
Daniel Nephin	37a38629d7	Add a context for stopping EventPublisher goroutine	2020-07-14 15:57:46 -04:00
Daniel Nephin	02bc5a26e4	EventPublisher: Make Unsubscribe a function on Subscription It is critical that Unsubscribe be called with the same pointer to a SubscriptionRequest that was used to create the Subscription. The docstring made that clear, but it still allowed a caller to get it wrong by creating a new SubscriptionRequest. By hiding this detail from the caller, and only exposing an Unsubscribe method, it should be impossible to fail to Unsubscribe. Also update some godoc strings.	2020-07-14 15:57:46 -04:00
Daniel Nephin	86976cf23c	EventPublisher: handleACL changes synchronously Use a separate lock for subscriptions.ByToken to allow it to happen synchronously in the commit flow. This removes the need to create a new txn for the goroutine, and removes the need for EventPublisher to contain a reference to DB.	2020-07-14 15:57:46 -04:00
Daniel Nephin	606121fae6	stream.EventSnapshot: reduce the fields on the struct Many of the fields are only needed in one place, and by using a closure they can be removed from the struct. This reduces the scope of the variables making it easier to see how they are used.	2020-07-14 15:57:45 -04:00
Daniel Nephin	7196917051	stream.EventBuffer: Seed the fuzz test with time.Now() Otherwise the test will run with exactly the same values each time. By printing the seed we can attempt to reproduce the test by adding an env var to override the seed	2020-07-14 15:57:45 -04:00
Daniel Nephin	525b275a52	state: memdb_wrapper.go -> memdb.go Renaming in a separate commit so that git can merge changes to the file.	2020-07-14 15:57:45 -04:00
Daniel Nephin	b5d2bea770	state: publish changes from Commit Make topicRegistry use functions instead of unbound methods Use a regular memDB in EventPublisher to remove a reference cycle Removes the need for EventPublisher to use a store	2020-07-14 15:57:45 -04:00
Daniel Nephin	f626c3d6c5	EventPublisher: docstrings and getTopicBuffer also rename commitCh -> publishCh	2020-07-14 15:57:45 -04:00
Daniel Nephin	2020e9c7c7	ProcessChanges: use stream.Event Also remove secretHash, which was used to hash tokens. We don't expose these tokens anywhere, so we can use the string itself instead of a Hash. Fix acl_events_test.go for storing a structs type.	2020-07-14 15:57:45 -04:00
Daniel Nephin	2e45bbbb3e	stream: Use local types for Event Topic SubscriptionRequest	2020-07-14 15:57:45 -04:00
Daniel Nephin	3d62013062	Rename stream_publisher.go -> event_publisher.go	2020-07-14 15:57:44 -04:00
Daniel Nephin	526fb53f85	Add streaming package with Subscription and Snapshot components. The remaining files from 7965767de0bd62ab07669b85d6879bd5f815d157 Co-authored-by: Paul Banks <banks@banksco.de>	2020-07-14 15:57:44 -04:00
Chris Piraino	b80cbb499f	Set enterprise metadata after resolving the token (#8302 ) The token can encode enterprise metadata information, and we must make sure we set that on the reply so that we can correctly filter ACLs.	2020-07-13 13:39:57 -05:00
Daniel Nephin	13e0d258b5	Merge pull request #8237 from hashicorp/dnephin/remove-acls-enabled-from-delegate Remove ACLsEnabled from delegate interface	2020-07-09 16:35:43 -04:00
Matt Keeler	39d9babab3	Pass the Config and TLS Configurator into the AutoConfig constructor This is instead of having the AutoConfigBackend interface provide functions for retrieving them. NOTE: the config is not reloadable. For now this is fine as we don’t look at any reloadable fields. If that changes then we should provide a way to make it reloadable.	2020-07-08 12:36:11 -04:00
Matt Keeler	a77ed471c8	Rename (Server).forward to (Server).ForwardRPC Also get rid of the preexisting shim in server.go that existed before to have this name just call the unexported one.	2020-07-08 11:05:44 -04:00
Matt Keeler	386ec3a2a2	Refactor AutoConfig RPC to not have a direct dependency on the Server type Instead it has an interface which can be mocked for better unit testing that is deterministic and not prone to flakiness.	2020-07-08 11:05:44 -04:00
Daniel Nephin	8b6036c077	Remove ACLsEnabled from delegate interface In all cases (oss/ent, client/server) this method was returning a value from config. Since the value is consistent, it doesn't need to be part of the delegate interface.	2020-07-03 17:00:20 -04:00
Daniel Nephin	dfa8856e5f	agent/consul: Add support for NotModified to two endpoints A query made with AllowNotModifiedResponse and a MinIndex, where the result has the same Index as MinIndex, will return an empty response with QueryMeta.NotModified set to true. Co-authored-by: Pierre Souchay <pierresouchay@users.noreply.github.com>	2020-07-02 17:05:46 -04:00
Matt Keeler	87764e5bfb	Merge pull request #8211 from hashicorp/bugfix/auto-encrypt-various	2020-07-02 09:49:49 -04:00
Yury Evtikhov	dbf3c05fa5	DNS: add IsErrQueryNotFound function for easier error evaluation	2020-07-01 03:41:44 +01:00
Matt Keeler	a97f9ff386	Overwrite agent leaf cert trust domain on the servers	2020-06-30 09:59:08 -04:00
Matt Keeler	5600069d69	Store the Connect CA rate limiter on the server This fixes a bug where auto_encrypt was operating without utilizing a common rate limiter.	2020-06-30 09:59:07 -04:00
Matt Keeler	fa42d9b34f	Fix auto_encrypt IP/DNS SANs The initial auto encrypt CSR wasn’t containing the user supplied IP and DNS SANs. This fixes that. Also, we were configuring a default :: IP SAN. This should be ::1 instead and was fixed.	2020-06-30 09:59:07 -04:00
R.B. Boyer	72a515f5ec	connect: various changes to make namespaces for intentions work more like for other subsystems (#8194 ) Highlights: - add new endpoint to query for intentions by exact match - using this endpoint from the CLI instead of the dump+filter approach - enforcing that OSS can only read/write intentions with a SourceNS or DestinationNS field of "default". - preexisting OSS intentions with now-invalid namespace fields will delete those intentions on initial election or for wildcard namespaces an attempt will be made to downgrade them to "default" unless one exists. - also allow the '-namespace' CLI arg on all of the intention subcommands - update lots of docs	2020-06-26 16:59:15 -05:00
Daniel Nephin	7d5f1ba6bd	Merge pull request #8176 from hashicorp/dnephin/add-linter-unparam-1 lint: add unparam linter and fix some of the issues	2020-06-25 15:34:48 -04:00
Matt Keeler	d471977f62	Fix go routine leak in auto encrypt ca roots tracking	2020-06-24 17:09:50 -04:00
Matt Keeler	90e741c6d2	Allow cancelling blocking queries in response to shutting down.	2020-06-24 17:09:50 -04:00
Daniel Nephin	07c1081d39	Fix a bunch of unparam lint issues	2020-06-24 13:00:14 -04:00
Matt Keeler	341aedbce9	Ensure that retryLoopBackoff can be cancelled We needed to pass a cancellable context into the limiter.Wait instead of context.Background. So I made the func take a context instead of a chan as most places were just passing through a Done chan from a context anyways. Fix go routine leak in the gateway locator	2020-06-24 12:41:08 -04:00
Matt Keeler	9dc9f7df15	Allow cancelling startup when performing auto-config (#8157 ) Co-authored-by: Daniel Nephin <dnephin@hashicorp.com>	2020-06-19 15:16:00 -04:00
Daniel Nephin	b5ef9b7ea9	Remove bytesToUint64 from agent/consul	2020-06-18 12:45:43 -04:00
Daniel Nephin	81bc082b63	Remove unused private IP code from agent/consul	2020-06-18 12:40:38 -04:00
Matt Keeler	2c7844d220	Implement Client Agent Auto Config There are a couple of things in here. First, just like auto encrypt, any Cluster.AutoConfig RPC will implicitly use the less secure RPC mechanism. This drastically modifies how the Consul Agent starts up and moves most of the responsibilities (other than signal handling) from the cli command and into the Agent.	2020-06-17 16:49:46 -04:00
Matt Keeler	f5d57ccd48	Allow the Agent and its child Client/Server to share a connection pool This is needed so that we can make an AutoConfig RPC at the Agent level prior to creating the Client/Server.	2020-06-17 16:19:33 -04:00
Matt Keeler	8c601ad8db	Merge pull request #8035 from hashicorp/feature/auto-config/server-rpc	2020-06-17 16:07:25 -04:00
Chris Piraino	79d003d395	Remove ACLEnforceVersion8 from tests (#8138 ) The field had been deprecated for a while and was recently removed, however a PR which added these tests prior to removal was merged.	2020-06-17 14:58:01 -05:00
Daniel Nephin	1ef8279ac9	Merge pull request #8034 from hashicorp/dnephin/add-linter-staticcheck-4 ci: enable SA4006 staticcheck check and add ineffassign	2020-06-17 12:16:02 -04:00
Matt Keeler	eda8cb39fd	Implement the insecure version of the Cluster.AutoConfig RPC endpoint Right now this is only hooked into the insecure RPC server and requires JWT authorization. If no JWT authorizer is setup in the configuration then we inject a disabled “authorizer” to always report that JWT authorization is disabled.	2020-06-17 11:25:29 -04:00
Pierre Souchay	f7a1189dba	gossip: Ensure that metadata of Consul Service is updated (#7903 ) While upgrading servers to a new version, I saw that metadata of existing servers is not upgraded, so the version and raft meta is not up to date in catalog. The only way to do it was to: * update Consul server * make it leave the cluster, then metadata is accurate That's because the optimization to avoid updating catalog does not take into account metadata, so no update on catalog is performed.	2020-06-17 12:16:13 +02:00
Daniel Nephin	8753d1f1ba	ci: Add ineffassign linter And fix an additional ineffective assignment that was not caught by staticcheck	2020-06-16 17:32:50 -04:00
Daniel Nephin	97342de262	Merge pull request #8070 from hashicorp/dnephin/add-gofmt-simplify ci: Enable gofmt simplify	2020-06-16 17:18:38 -04:00
Matt Keeler	d994dc7b35	Agent Auto Configuration: Configuration Syntax Updates (#8003 )	2020-06-16 15:03:22 -04:00
Daniel Nephin	89d95561df	Enable gofmt simplify Code changes done automatically with 'gofmt -s -w'	2020-06-16 13:21:11 -04:00
Daniel Nephin	5f24171f13	ci: enable SA4006 staticcheck check And fix the 'value not used' issues. Many of these are not bugs, but a few are tests not checking errors, and one appears to be a missed error in non-test code.	2020-06-16 13:10:11 -04:00
Daniel Nephin	71e6534061	Rename txnWrapper to txn	2020-06-16 13:06:02 -04:00
Daniel Nephin	537ae1fd46	Rename db	2020-06-16 13:04:31 -04:00
Daniel Nephin	78c76f0773	Handle return value from txn.Commit	2020-06-16 13:04:31 -04:00
Daniel Nephin	50db8f409a	state: Update docstrings for changeTrackerDB and txn And un-embed memdb.DB to prevent accidental access to underlying methods.	2020-06-16 13:04:31 -04:00
Paul Banks	f9a6386c4a	state: track changes so that they may be used to produce change events	2020-06-16 13:04:29 -04:00
Matt Keeler	cdc4b20afa	ACL Node Identities (#7970 ) A Node Identity is very similar to a service identity. Its main targeted use is to allow creating tokens for use by Consul agents that will grant the necessary permissions for all the typical agent operations (node registration, coordinate updates, anti-entropy). Half of this commit is for golden file based tests of the acl token and role cli output. Another big update was to refactor many of the tests in agent/consul/acl_endpoint_test.go to use the same style of tests and the same helpers. Besides being less boilerplate in the tests it also uses a common way of starting a test server with ACLs that should operate without any warnings regarding deprecated non-uuid master tokens etc.	2020-06-16 12:54:27 -04:00
freddygv	cc4ff3ae02	Fixup stray sid references	2020-06-12 13:47:43 -06:00
freddygv	1e7e716742	Move compound service names to use ServiceName type	2020-06-12 13:47:43 -06:00
freddygv	806b1fb608	Move GatewayServices out of Internal	2020-06-12 13:46:47 -06:00
Daniel Nephin	6719f1a6fa	Merge pull request #7900 from hashicorp/dnephin/add-linter-staticcheck-2 intentions: fix a bug in Intention.SetHash	2020-06-09 15:40:20 -04:00
Daniel Nephin	5f14eb124c	Merge pull request #8037 from hashicorp/dnephin/add-linter-staticcheck-5 ci: Enabled SA2002 staticcheck check	2020-06-09 15:31:24 -04:00
Hans Hasselberg	7404712854	acl: do not resolve local tokens from remote dcs (#8068)	2020-06-09 21:13:09 +02:00
Hans Hasselberg	bec21c849d	Tokens converted from legacy ACLs get their Hash computed (#8047) * Fixes #5606: Tokens converted from legacy ACLs get their Hash computed This allows new style token replication to work for legacy tokens as well when they change. * tests: fix timestamp comparison Co-authored-by: Matt Keeler <mjkeeler7@gmail.com>	2020-06-08 21:44:06 +02:00
Daniel Nephin	1cdfc4f290	ci: Enabled SA2002 staticcheck check And handle errors in the main test goroutine	2020-06-05 17:50:11 -04:00
Daniel Nephin	b9e4544ec3	intentions: fix a bug in Intention.SetHash Found using staticcheck. binary.Write does not accept int types without a size. The error from binary.Write was ignored, so we never saw this error. Casting the data to uint64 produces a correct hash. Also deprecate the Default{Addr,Port} fields, and prevent them from being encoded. These fields will always be empty and are not used. Removing these would break backwards compatibility, so they are left in place for now. Co-authored-by: Hans Hasselberg <me@hans.io>	2020-06-05 14:51:43 -04:00
R.B. Boyer	3ad570ba99	server: don't activate federation state replication or anti-entropy until all servers are running 1.8.0+ (#8014)	2020-06-04 16:05:27 -05:00
Hans Hasselberg	dd8cd9bc24	Merge pull request #7966 from hashicorp/pool_improvements Agent connection pool cleanup	2020-06-04 08:56:26 +02:00
Matt Keeler	2c615807af	Fix legacy management tokens in unupgraded secondary dcs (#7908) The ACL.GetPolicy RPC endpoint was supposed to return the “parent” policy and not always the default policy. In the case of legacy management tokens the parent policy was supposed to be “manage”. The result of us not sending this properly was that operations that required specifically a management token such as saving a snapshot would not work in secondary DCs until they were upgraded.	2020-06-03 11:22:22 -04:00
Matt Keeler	9fa9ec4ba0	Fix segfault due to race condition for checking server versions (#7957) The ACL monitoring routine uses c.routers to check for server version updates. Therefore it needs to be started after initializing the routers.	2020-06-03 10:36:32 -04:00
Daniel Nephin	e8a883e829	Replace goe/verify.Values with testify/require.Equal (#7993) * testing: replace most goe/verify.Values with require.Equal One difference between these two comparisons is that goe/verify considers nil slices/maps to be equal to empty slices/maps, whereas testify/require does not, and does not appear to provide any way to enable that behaviour. Because of this difference some expected values were changed from empty slices to nil slices, and some calls to verify.Values were left. * Remove github.com/pascaldekloe/goe/verify Reduce the number of assertion packages we use from 2 to 1	2020-06-02 12:41:25 -04:00
R.B. Boyer	7bd7895047	acl: allow auth methods created in the primary datacenter to optionally create global tokens (#7899)	2020-06-01 11:44:47 -05:00
R.B. Boyer	16db20b1f3	acl: remove the deprecated `acl_enforce_version_8` option (#7991) Fixes #7292	2020-05-29 16:16:03 -05:00
Jono Sosulska	7a13c96a2a	Replace whitelist/blacklist terminology with allowlist/denylist (#7971) * Replace whitelist/blacklist terminology with allowlist/denylist	2020-05-29 14:19:16 -04:00
Hans Hasselberg	1ed91cbdf6	pool: remove timeout parameter Timeout was never used in a meaningful way by callers, which is why it is now entirely internal to the pool.	2020-05-29 08:21:28 +02:00
Hans Hasselberg	5cda505495	pool: remove useTLS and ForceTLS In the past TLS usage was enforced with these variables, but these days this decision is made by TLSConfigurator and there is no reason to keep using the variables.	2020-05-29 08:21:24 +02:00
Hans Hasselberg	9ef44ec3da	pool: remove version The version field has been used to decide which multiplexing to use. It was introduced in 2457293dceec95ecd12ef4f01442e13710ea131a. But that was 6 years ago and there is no need for this differentiation anymore.	2020-05-28 23:06:01 +02:00
Daniel Nephin	ea6c2b2adc	ci: Add staticcheck and fix most errors Three of the checks are temporarily disabled to limit the size of the diff, and allow us to enable all the other checks in CI. In a follow up we can fix the issues reported by the other checks one at a time, and enable them.	2020-05-28 11:59:58 -04:00
R.B. Boyer	54c7f825d6	create lib/stringslice package (#7934)	2020-05-27 11:47:32 -05:00
R.B. Boyer	813d69622e	agent: handle re-bootstrapping in a secondary datacenter when WAN federation via mesh gateways is configured (#7931) The main fix here is to always union the `primary-gateways` list with the list of mesh gateways in the primary returned from the replicated federation states list. This will allow any replicated (incorrect) state to be supplemented with user-configured (correct) state in the config file. Eventually the game of random selection whack-a-mole will pick a winning entry and re-replicate the latest federation states from the primary. If the user-configured state is actually the incorrect one, then the same eventual correct selection process will work in that case, too. The secondary fix is actually to finish making wanfed-via-mgws actually work as originally designed. Once a secondary datacenter has replicated federation states for the primary AND managed to stand up its own local mesh gateways then all of the RPCs from a secondary to the primary SHOULD go through two sets of mesh gateways to arrive in the consul servers in the primary (one hop for the secondary datacenter's mesh gateway, and one hop through the primary datacenter's mesh gateway). This was neglected in the initial implementation. While everything works, ideally we should treat communications that go around the mesh gateways as just provided for bootstrapping purposes. Now we heuristically use the success/failure history of the federation state replicator goroutine loop to determine if our current mesh gateway route is working as intended. If it is, we try using the local gateways, and if those don't work we fall back on trying the primary via the union of the replicated state and the go-discover configuration flags. This can be improved slightly in the future by possibly initializing the gateway choice to local on startup if we already have replicated state. This PR does not address that improvement. Fixes #7339	2020-05-27 11:31:10 -05:00
R.B. Boyer	7e42819a71	connect: ensure proxy-defaults protocol is used for upstreams (#7938)	2020-05-21 16:08:39 -05:00
Daniel Nephin	f9a89db86e	Update agent/consul/state/catalog.go Co-authored-by: Hans Hasselberg <me@hans.io>	2020-05-20 16:34:14 -04:00
Daniel Nephin	e1e1c13b35	state: use an error to indicate compare failed Errors are values. We can use the error value to identify the 'comparison failed' case which makes the function easier to use and should make it harder to mishandle the error case.	2020-05-20 12:43:33 -04:00
Pierre Souchay	3b548f0d77	Allow restricting the servers that can join a given Serf Consul cluster. (#7628) Based on work done in https://github.com/hashicorp/memberlist/pull/196 this allows restricting the IP ranges that can join a given Serf cluster and be a member of the cluster. Restrictions on IPs can be set separately for LAN and WAN Serf using 2 new flags and config options.	2020-05-20 11:31:19 +02:00
Daniel Nephin	6e3a7b0aa8	consul/state: refactor txnService to avoid missed cases Handling errors at the end of a long switch/case block is somewhat brittle. This block included a couple cases where errors were ignored, but it was not obvious the way it was written. This change moves all error handling into each case block. There is still potentially one case where err is ignored, which will be handled in a follow up.	2020-05-19 16:50:14 -04:00
Daniel Nephin	545bd766e7	Fix a number of problems found by staticcheck Some of these problems are minor (unused vars), but others are real bugs (ignored errors). Co-authored-by: Matt Keeler <mkeeler@users.noreply.github.com>	2020-05-19 16:50:14 -04:00
Daniel Nephin	aa8009ee45	Remove unused var The usage of this var was removed in b92f895c233bf8a99ae35361117a90416fea29b5. Found by using staticcheck	2020-05-19 16:50:14 -04:00
Chris Piraino	ce099c9aca	Do not return an error if requested service is not a gateway This commit converts the previous error into just a Warn-level log message. By returning an error when the requested service was not a gateway, we did not appropriately update envoy because the cache Fetch returned an error and thus did not propagate the update through proxycfg and xds packages.	2020-05-18 09:08:04 -05:00
Aleksandr Zagaevskiy	75f0607d3b	Preserve ModifyIndex for unchanged entry in KVS TXN (#7832)	2020-05-14 13:25:04 -06:00
Matt Keeler	849eedd142	Fix identity resolution on clients and in secondary dcs (#7862) Previously this happened to be using the method on the Server/Client that was meant to allow the ACLResolver to locally resolve tokens. On Servers that had tokens (primary or secondary dc + token replication) this function would lookup the token from raft and return the ACLIdentity. On clients this was always a noop. We inadvertently used this function instead of creating a new one when we added logging accessor ids for permission denied RPC requests. With this commit, a new method is used for resolving the identity properly via the ACLResolver which may still resolve locally in the case of being on a server with tokens but also supports remote token resolution.	2020-05-13 13:00:08 -04:00

1037 commits