open-consul

Author	SHA1	Message	Date
Daniel Nephin	87f0eb6790	agent/grpc: seed the rand for shuffling servers	2020-09-24 12:53:14 -04:00
Daniel Nephin	ff3610850e	agent/grpc: use router.Manager to handle the rebalance The router.Manager is already rebalancing servers for other connection pools, so it can call into our resolver to do the same. This change allows us to remove the serf dependency from resolverBuilder, and remove Datacenter from the config. Also revert the change to refreshServerRebalanceTimer	2020-09-24 12:53:14 -04:00
Daniel Nephin	5e7b9bffdb	grpc: restore integration tests for grpc client conn pool Add a fake rpc Listener	2020-09-24 12:53:14 -04:00
Daniel Nephin	e160475fba	router: remove grpcServerTracker from managers It only needs to be referenced from the Router, because there is only 1 instance, and the Router can call AddServer/RemoveServer like it does on the Manager.	2020-09-24 12:53:14 -04:00
Daniel Nephin	4b041a018d	grpc: reduce dependencies, unexport, and add godoc Rename GRPCClient to ClientConnPool. This type appears to be more of a conn pool than a client. The clients receive the connections from this pool. Reduce some dependencies by adjusting the interface boundaries. Remove the need to create a second slice of Servers, just to pick one and throw the rest away. Unexport serverResolver, it is not used outside the package. Use a RWMutex for ServerResolverBuilder, some locking is read-only. Add more godoc.	2020-09-24 12:53:10 -04:00
Daniel Nephin	4b24470887	grpc: move client conn pool to grpc package	2020-09-24 12:48:12 -04:00
Daniel Nephin	fad15171ec	grpc: client conn pool and resolver Extracted from 936522a13c07e8b732b6fde61bba23d05f7b9a70 Co-authored-by: Paul Banks <banks@banksco.de>	2020-09-24 12:46:22 -04:00
Daniel Nephin	e0119a6e92	Merge pull request #8680 from hashicorp/dnephin/replace-consul-opts-with-base-deps agent: Replace ConsulOptions with a new struct from agent.BaseDeps	2020-09-24 12:45:54 -04:00
Paul Banks	0594667c3a	Fix bad int -> string conversions caught by go vet changes in 1.15 (#8739)	2020-09-24 11:14:07 +01:00
Alexander Mykolaichuk	e4f9c19772	added permission denied error message (#8044)	2020-09-22 20:36:07 +02:00
Hans Hasselberg	d48d2bf550	use service datacenter for dns name (#8704) * Use args.Datacenter instead of configured datacenter	2020-09-22 20:34:09 +02:00
Aliaksandr Mianzhynski	c79180980c	Return grpc serving status in health check errors	2020-09-22 21:16:58 +03:00
Daniel Nephin	b3ec7df80f	api: rename HTTPServer to HTTPHandlers Resolves a TODO about naming. This type is a set of handlers for an http.Server, it is not itself a Server. It provides http.Handler functions.	2020-09-18 17:38:23 -04:00
Hans Hasselberg	c6fa758d6f	fix TestLeader_SecondaryCA_IntermediateRenew (#8702) * fix lessThanHalfTime * get lock for CAProvider() * make a var to relate both vars * rename to getCAProviderWithLock * move CertificateTimeDriftBuffer to agent/connect/ca	2020-09-18 10:13:29 +02:00
Daniel Nephin	2d3540d6b5	Merge pull request #8620 from hashicorp/dnephin/better-impl-of-TestAgent.HTTPAddr http: fix tests incorrectly using HTTPAddr to get the address of the https server	2020-09-17 11:48:57 -04:00
Mike Morris	fe984b3ee3	test: update tags for database service registrations and queries (#8693)	2020-09-16 14:05:01 -04:00
Kyle Havlovitz	8f83f7ac13	Merge pull request #8560 from hashicorp/vault-ca-renew-token Automatically renew the token used by the Vault CA provider	2020-09-16 07:30:30 -07:00
Daniel Nephin	9f83eb3dc9	Merge pull request #8685 from pierresouchay/do_not_flood_logs_with_Non-server_in_server-only_area [BUGFIX] Avoid GetDatacenter* methods to flood Consul servers logs	2020-09-15 17:57:05 -04:00
Kyle Havlovitz	c8fd61abc7	Merge branch 'master' into vault-ca-renew-token	2020-09-15 14:39:04 -07:00
Daniel Nephin	c621b4a420	agent/consul: pass dependencies directly from agent In an upcoming change we will need to pass a grpc.ClientConnPool from BaseDeps into Server. While looking at that change I noticed all of the existing consulOption fields are already on BaseDeps. Instead of duplicating the fields, we can create a struct used by agent/consul, and use that struct in BaseDeps. This allows us to pass along dependencies without translating them into different representations. I also looked at moving all of BaseDeps in agent/consul, however that created some circular imports. Resolving those cycles wouldn't be too bad (it was only an error in agent/consul being imported from cache-types), however this change seems a little better by starting to introduce some structure to BaseDeps. This change is also a small step in reducing the scope of Agent. Also remove some constants that were only used by tests, and move the relevant comment to where the live configuration is set. Removed some validation from NewServer and NewClient, as these are not really runtime errors. They would be code errors, which will cause a panic anyway, so no reason to handle them specially here.	2020-09-15 17:29:32 -04:00
Daniel Nephin	0536b2047e	agent/consul: make router required	2020-09-15 17:26:26 -04:00
Daniel Nephin	49086dd5ae	Merge pull request #8679 from hashicorp/streaming/fix-TestHandler_EmitsStats streaming: Fix TestHandler_EmitsStats	2020-09-15 17:04:55 -04:00
Kyle Havlovitz	316600a685	Update vault CA for latest api client	2020-09-15 13:33:55 -07:00
Paul Banks	0062106c46	Update UI Config passing to not use an inline script (#8645) * Update UI Config passing to not use an inline script * Update agent/http.go * Fix incorrect placeholder name	2020-09-15 20:57:37 +01:00
Kyle Havlovitz	63d3a5fc1f	Clean up CA shutdown logic and error	2020-09-15 12:28:58 -07:00
Kyle Havlovitz	6f1dd25139	Merge pull request #8646 from hashicorp/common-intermediate-ttl Move IntermediateCertTTL to common CA config	2020-09-15 12:03:29 -07:00
Pierre Souchay	617e0d2364	[BUGFIX] Avoid GetDatacenter* methods to flood Consul servers logs When calling `GetDatacentersByDistance()` or `GetDatacentersMap()`, an incorrect condition was used to display log message, thus flooding Consul's logs. Example of message: ``` [WARN] agent.router: Non-server in server-only area: non_server=myClientNode area=lan ``` This message is only valid for WAN areas, filter to avoid creating hundreds of logs/s on our clusters, each time someone is calling this method. Our logs were flooded by such messages when migrating our Consul servers from 1.7.7 to 1.8.4. This will fix issue #8663	2020-09-15 11:54:59 +02:00
Daniel Nephin	1e40f00567	agent/grpc: make TestHandler_EmitsStats predictable Occasionally this test would flake. The flakes were fixed by: 1. Stopping the service and retrying to check on metrics. This way we also include the active_streams going to 0 in the metric calls. 2. Using a reference to the global Metrics. This way when other tests have background goroutines that are still shutting down, they won't emit metrics to the metric instance with the fake Sink. The stats test can patch the local reference to the global, so the existing statHandlers will continue to emit to the global, but the stats test will send all metrics to the replacement.	2020-09-14 19:05:22 -04:00
Daniel Nephin	c827e7f1e9	grpc: add Datacenter field to testing service response	2020-09-14 19:02:09 -04:00
freddygv	e0db834148	Fix text type assertion	2020-09-14 16:28:40 -06:00
freddygv	43efb4809c	Merge master	2020-09-14 16:17:43 -06:00
freddygv	66e5c5989a	Fix type assertion	2020-09-14 16:12:21 -06:00
Daniel Nephin	75515f3431	Merge pull request #8587 from hashicorp/streaming/add-grpc-server streaming: add gRPC server for handling connections	2020-09-14 15:24:54 -04:00
freddygv	33af8dab9a	Resolve conflicts against master	2020-09-11 18:41:58 -06:00
freddygv	60cb306524	Add session flag to cookie config	2020-09-11 18:34:03 -06:00
freddygv	ae8c609f10	PR comments	2020-09-11 10:49:26 -06:00
Kyle Havlovitz	1595add842	Clean up Vault renew tests and shutdown	2020-09-11 08:41:05 -07:00
freddygv	5871b667a5	Revert EnvoyConfig nesting	2020-09-11 09:21:43 -06:00
Kyle Havlovitz	dc393336d1	Use mapstructure for decoding vault data	2020-09-10 06:31:04 -07:00
Kyle Havlovitz	7588e22739	Add a stop function to make sure the renewer is shut down on leader change	2020-09-10 06:12:48 -07:00
Kyle Havlovitz	c52dfeb633	Move IntermediateCertTTL to common CA config	2020-09-10 00:23:22 -07:00
Kyle Havlovitz	1c57b72a9f	Add a test for token renewal	2020-09-09 16:36:37 -07:00
Daniel Nephin	1a930fcf75	grpc: Add a simple test service for testing the gRPC server	2020-09-08 12:10:43 -04:00
Daniel Nephin	863a9df951	server: add gRPC server for streaming events Includes a stats handler and stream interceptor for grpc metrics. Co-authored-by: Paul Banks <banks@banksco.de>	2020-09-08 12:10:41 -04:00
Daniel Nephin	4eb514a59f	http: fix tests incorrectly using HTTPAddr to get the address of the https server. In #8234 I changed a few tests to use TestAgent.HTTPAddr() to find the addr used in the test. Due to the way HTTPAddr() was implemented these tests were passing, but I think the pass was incidental. HTTPAddr() was not matching any servers, and was instead returning the last server, which happened to be the one these tests wanted. This commit fixes the implementation of HTTPAddr to panic if no match was found. The tests which require an HTTPS server are changed to use a new firstAddr() to look up the correct address.	2020-09-04 15:29:17 -04:00
freddygv	1ee039ed95	Set tgw filter router config name to cluster name	2020-09-04 12:45:05 -06:00
Hans Hasselberg	51f079dcdd	secondaryIntermediateCertRenewalWatch abort on success (#8588) secondaryIntermediateCertRenewalWatch was using `retryLoopBackoff` to renew the intermediate certificate. Once it entered the inner loop and started `retryLoopBackoff` it would never leave that. `retryLoopBackoffAbortOnSuccess` will return when renewing is successful, like it was intended originally.	2020-09-04 11:47:16 +02:00
freddygv	3e4bc36941	Add server receiver to routes and log tgw err	2020-09-03 16:19:58 -06:00
Daniel Nephin	670b7cbd99	Merge pull request #8357 from hashicorp/streaming/add-service-health-events streaming: add ServiceHealth events	2020-09-03 17:53:56 -04:00
Daniel Nephin	ec5d20b0de	Merge pull request #8554 from hashicorp/dnephin/agent-setup-persisted-tokens agent: move token persistence from agent into token.Store	2020-09-03 17:29:21 -04:00
Daniel Nephin	c17a5b0628	state: handle terminating gateways in service health events	2020-09-03 16:58:05 -04:00
Daniel Nephin	b241debee7	state: improve comments in catalog_events.go Co-authored-by: Paul Banks <banks@banksco.de>	2020-09-03 16:58:05 -04:00
Daniel Nephin	870823e8ed	state: use changeType in serviceChanges To be a little more explicit, instead of nil implying an indirect change	2020-09-03 16:58:05 -04:00
Daniel Nephin	68682e7e83	don't over allocate slice	2020-09-03 16:58:04 -04:00
Daniel Nephin	5f52220f53	state: fix a bug in building service health events The nodeCheck slice was being used as the first arg in append, which in some cases will modify the array backing the slice. This would lead to service checks for other services in the wrong event. Also refactor some things to reduce the arguments to functions.	2020-09-03 16:58:04 -04:00
Daniel Nephin	c61313b78a	state: Remove unused args and return values Also rename some functions to identify them as constructors for events	2020-09-03 16:58:04 -04:00
Daniel Nephin	668b98bcce	state: use an enum for tracking node changes	2020-09-03 16:58:04 -04:00
Daniel Nephin	7c3c627028	state: serviceHealthSnapshot refactored to remove unused return value and remove duplication	2020-09-03 16:58:04 -04:00
Daniel Nephin	fdfe176deb	state: Add Change processor and snapshotter for service health Co-authored-by: Paul Banks <banks@banksco.de>	2020-09-03 16:58:04 -04:00
Daniel Nephin	6a1a43721d	state: fix bug in changeTrackerDB.publish Creating a new readTxn does not work because it will not see the newly created objects that are about to be committed. Instead use the active write Txn.	2020-09-03 16:58:01 -04:00
Daniel Nephin	81cc3daf69	stream: have SnapshotFunc accept a non-pointer SubscribeRequest The value is not expected to be modified. Passing a value makes that explicit.	2020-09-03 16:54:02 -04:00
freddygv	56fdae9ace	Update resolver defaulting	2020-09-03 13:08:44 -06:00
freddygv	b149185794	Update golden files after default route fix for tgw	2020-09-03 12:35:11 -06:00
Daniel Nephin	330b73725f	agent: add apiServers type for managing HTTP servers Remove Server field from HTTPServer. The field is no longer used.	2020-09-03 13:40:12 -04:00
freddygv	23147c1d5b	Fix http assertion in route creation	2020-09-03 10:21:20 -06:00
freddygv	0c50b8e769	Add explicit protocol overrides in tgw xds test cases	2020-09-03 08:57:48 -06:00
freddygv	02d6acd8fc	Ensure resolver node with LB isn't considered default	2020-09-03 08:55:57 -06:00
freddygv	c4bce2154b	Move valid policies to pkg level	2020-09-02 15:49:03 -06:00
freddygv	daad3b9210	Remove LB infix and move injection to xds	2020-09-02 15:13:50 -06:00
R.B. Boyer	b0bde51e70	connect: all config entries pick up a meta field (#8596) Fixes #8595	2020-09-02 14:10:25 -05:00
Chris Piraino	df1381f77f	Merge pull request #8603 from hashicorp/feature/usage-metrics Track node and service counts in the state store and emit them periodically as metrics	2020-09-02 13:23:39 -05:00
R.B. Boyer	4197bed23b	connect: fix bug in preventing some namespaced config entry modifications (#8601) Whenever an upsert/deletion of a config entry happens, within the open state store transaction we speculatively test compile all discovery chains that may be affected by the pending modification to verify that the write would not create an erroneous scenario (such as splitting traffic to a subset that did not exist). If a single discovery chain evaluation references two config entries with the same kind and name in different namespaces then sometimes the upsert/deletion would be falsely rejected. It does not appear as though this bug would've let invalid writes through to the state store so the correction does not require a cleanup phase.	2020-09-02 10:47:19 -05:00
Chris Piraino	b245d60200	Set metrics reporting interval to 9 seconds This is below the 10 second interval that lib/telemetry.go implements as its aggregation interval, ensuring that we always report these metrics.	2020-09-02 10:24:23 -05:00
Chris Piraino	e9b397005c	Update godoc string for memdb wrapper functions/structs	2020-09-02 10:24:22 -05:00
Chris Piraino	80f923a47a	Refactor state store usage to track unique service names This commit refactors the state store usage code to track unique service name changes on transaction commit. This means we only need to lookup usage entries when reading the information, as opposed to iterating over a large number of service indices. - Take into account a service instance's name being changed - Do not iterate through entire list of service instances, we only care about whether there is 0, 1, or more than 1.	2020-09-02 10:24:21 -05:00
Chris Piraino	79e6534345	Use ReadTxn interface in state store helper functions	2020-09-02 10:24:20 -05:00
Chris Piraino	d90d95421d	Add WriteTxn interface and convert more functions to ReadTxn We add a WriteTxn interface for use in updating the usage memdb table, with the forward-looking prospect of incrementally converting other functions to accept interfaces. As well, we use the ReadTxn in new usage code, and as a side effect convert a couple of existing functions to use that interface as well.	2020-09-02 10:24:19 -05:00
Chris Piraino	45a4057f60	Report node/service usage metrics from every server Using the newly provided state store methods, we periodically emit usage metrics from the servers. We decided to emit these metrics from all servers, not just the leader, because that means we do not have to care about leader election flapping causing metrics turbulence, and it seems reasonable for each server to emit its own view of the state, even if they should always converge rapidly.	2020-09-02 10:24:17 -05:00
Chris Piraino	3af96930eb	Add new usage memdb table that tracks usage counts of various elements We update the usage table on Commit() by using the TrackedChanges() API of memdb. Track memdb changes on restore so that usage data can be compiled	2020-09-02 10:24:16 -05:00
freddygv	d7bda050e0	Restructure structs and other PR comments	2020-09-02 09:10:50 -06:00
Daniel Nephin	9535a1b57d	token: OSS support for enterprise tokens	2020-08-31 15:10:15 -04:00
Daniel Nephin	8e477feb22	config: use token.Config for ACLToken config Using the target Config struct reduces the amount of copying and translating of configuration structs.	2020-08-31 15:10:15 -04:00
Daniel Nephin	b64ce07ef7	agent/token: Move token persistence out of agent And into token.Store. This change isolates any awareness of token persistence in a single place. It is a small step in allowing Agent.New to accept its dependencies.	2020-08-31 15:00:34 -04:00
Daniel Nephin	fbae521775	fix TestStore_RegularTokens This test was only passing because t.Parallel was causing every subtest to run with the last value in the iteration, which sets a value for all tokens. The test started to fail once t.Parallel was removed, but the same failure could have been produced by adding 'tt := tt' to the t.Run() func. These tests run in under 10ms, so there is no reason to use t.Parallel.	2020-08-31 14:59:14 -04:00
Matt Keeler	335c604ced	Merge of auto-config and auto-encrypt code (#8523) auto-encrypt is now handled as a special case of auto-config. This also is moving all the cert-monitor code into the auto-config package.	2020-08-31 13:12:17 -04:00
freddygv	58a018c20b	Add documentation for resolver LB cfg	2020-08-28 14:46:13 -06:00
freddygv	194d34b09d	Pass LB config to Envoy via xDS	2020-08-28 14:27:40 -06:00
freddygv	8f470b30d7	Log error as error	2020-08-28 13:11:55 -06:00
freddygv	afb14b6705	Compile down LB policy to disco chain nodes	2020-08-28 13:11:04 -06:00
Daniel Nephin	845661c8af	Merge pull request #8548 from edevil/fix_flake Fix flaky TestACLResolver_Client/Concurrent-Token-Resolve	2020-08-28 15:10:55 -04:00
Daniel Nephin	39b06a0c0b	Merge pull request #8552 from pierresouchay/reload_cache_throttling_config Ensure that Cache options are reloaded when `consul reload` is performed	2020-08-28 15:04:42 -04:00
Pierre Souchay	ee50b55163	Added Unit test for cache reloading	2020-08-28 13:03:58 +02:00
freddygv	391d569a45	Add LB policy to service-resolver	2020-08-27 19:44:02 -06:00
Jack	145bcdc2bb	Add http2 and grpc support to ingress gateways (#8458)	2020-08-27 15:34:08 -06:00
R.B. Boyer	f2b8bf109c	xds: use envoy's rbac filter to handle intentions entirely within envoy (#8569)	2020-08-27 12:20:58 -05:00
R.B. Boyer	a7a8b8d6d9	agent: ensure that we normalize bootstrapped config entries (#8547)	2020-08-27 11:37:25 -05:00
Pierre Souchay	f92ae5e6ca	Also test reload of EntryFetchMaxBurst	2020-08-27 18:14:05 +02:00
Matt Keeler	106e1d50bd	Move RPC router from Client/Server and into BaseDeps (#8559) This will allow it to be a shared component which is needed for AutoConfig	2020-08-27 11:23:52 -04:00
Pierre Souchay	4983e093a0	Tests that changes in rate limit are taken into account by agent	2020-08-27 16:41:20 +02:00
Pierre Souchay	084d0e8015	Added `options.Equals()` and minor indentation fixes	2020-08-27 13:44:45 +02:00
R.B. Boyer	6fad634512	agent: expose the list of supported envoy versions on /v1/agent/self (#8545)	2020-08-26 10:04:11 -05:00
Kyle Havlovitz	6f7152841f	Automatically renew the token used by the Vault CA provider	2020-08-25 10:34:49 -07:00
Pierre Souchay	dd385f05e6	Ensure that Cache options are reloaded when `consul reload` is performed. This makes sure the cache throttling parameters are properly applied: * cache.EntryFetchMaxBurst * cache.EntryFetchRate When values are updated, a message is logged at info level.	2020-08-24 23:33:10 +02:00
André Cruz	673bd69f36	Decrease test flakiness Fix flaky TestACLResolver_Client/Concurrent-Token-Resolve and TestCacheNotifyPolling	2020-08-24 20:30:02 +01:00
André Cruz	a64686fab6	testing: Fix govet errors	2020-08-21 18:01:55 +01:00
Daniel Nephin	1c3a638d69	Merge pull request #8537 from hashicorp/dnephin/fix-panic-on-connect-nil Fix panic when decoding 'Connect: null'	2020-08-20 18:00:25 -04:00
Daniel Nephin	4155cae1cb	Fix panic when decoding 'Connect: null' Surprisingly the json Unmarshal updates the aux pointer to a nil.	2020-08-20 17:52:14 -04:00
Daniel Nephin	a97adadd2b	config: use logging.Config in RuntimeConfig To add structure to RuntimeConfig, and remove the need to translate into a third type.	2020-08-19 13:21:00 -04:00
Daniel Nephin	e4578aace8	logging: move init of grpclog This line initializes global state. Moving it out of the constructor and closer to where logging is setup helps keep related things together.	2020-08-19 13:21:00 -04:00
Daniel Nephin	7349018ff3	logging: Setup accept io.Writer instead of []io.Writer Also accept a non-pointer Config, since the config is not modified	2020-08-19 13:20:41 -04:00
Daniel Nephin	a520cf3ea7	testing: disable global metrics sink in tests This might be better handled by allowing configuration for the InMemSink interval and retain, and disabling the global. For now this is a smaller change to remove the goroutine leak caused by tests because go-metrics does not provide any way of shutting down the global goroutine.	2020-08-18 19:04:57 -04:00
Daniel Nephin	84642486b9	agent: extract dependency creation from New With this change, Agent.New() accepts many of the dependencies instead of creating them in New. Accepting fully constructed dependencies from a constructor makes the type easier to test, and easier to change. There are still a number of dependencies created in Start() which can be addressed in a follow up.	2020-08-18 19:04:55 -04:00
Daniel Nephin	b204e342c5	Merge pull request #8514 from hashicorp/dnephin/testing-improvements-1 testing: small improvements to TestSessionCreate and testutil.retry	2020-08-18 18:26:05 -04:00
Daniel Nephin	ab0d206eac	Merge pull request #8528 from hashicorp/dnephin/move-node-name-validation config: Move some config validation from Agent.Start to config.Builder.Validate	2020-08-18 18:25:41 -04:00
Hans Hasselberg	02de4c8b76	add primary keys to list keyring (#8522 ) During gossip encryption key rotation it would be nice to be able to see if all nodes are using the same key. This PR adds another field to the json response from `GET v1/operator/keyring` which lists the primary keys in use per dc. That way an operator can tell when a key was successfully setup as primary key. Based on https://github.com/hashicorp/serf/pull/611 to add primary key to list keyring output: ```json [ { "WAN": true, "Datacenter": "dc2", "Segment": "", "Keys": { "0OuM4oC3Os18OblWiBbZUaHA7Hk+tNs/6nhNYtaNduM=": 6, "SINm887hKTzmMWeBNKTJReaTLX3mBEJKriDyt88Ad+g=": 6 }, "PrimaryKeys": { "SINm887hKTzmMWeBNKTJReaTLX3mBEJKriDyt88Ad+g=": 6 }, "NumNodes": 6 }, { "WAN": false, "Datacenter": "dc2", "Segment": "", "Keys": { "0OuM4oC3Os18OblWiBbZUaHA7Hk+tNs/6nhNYtaNduM=": 8, "SINm887hKTzmMWeBNKTJReaTLX3mBEJKriDyt88Ad+g=": 8 }, "PrimaryKeys": { "SINm887hKTzmMWeBNKTJReaTLX3mBEJKriDyt88Ad+g=": 8 }, "NumNodes": 8 }, { "WAN": false, "Datacenter": "dc1", "Segment": "", "Keys": { "0OuM4oC3Os18OblWiBbZUaHA7Hk+tNs/6nhNYtaNduM=": 3, "SINm887hKTzmMWeBNKTJReaTLX3mBEJKriDyt88Ad+g=": 8 }, "PrimaryKeys": { "SINm887hKTzmMWeBNKTJReaTLX3mBEJKriDyt88Ad+g=": 8 }, "NumNodes": 8 } ] ``` I intentionally did not change the CLI output because I didn't find a good way of displaying this information. There are a couple of options that we could implement later: * add a flag to show the primary keys * add a flag to show json output Fixes #3393.	2020-08-18 09:50:24 +02:00
Daniel Nephin	7078ca07fa	config: Move remote-script-checks warning to config Previously it was done in Agent.Start, but it can be done much earlier	2020-08-17 17:39:49 -04:00
Daniel Nephin	a0dc4222b6	config: move NodeName validation to config validation Previously it was done in Agent.Start, which is much later than it needs to be. The new 'dns' package was required, because otherwise there would be an import cycle. In the future we should move more of the dns server into the dns package.	2020-08-17 17:25:02 -04:00
Daniel Nephin	647236bb17	Merge pull request #8515 from hashicorp/dnephin/unexport-testing-shims config: unexport fields and resolve TODOs in config.Builder	2020-08-17 16:03:07 -04:00
Daniel Nephin	3e0d63a6b7	testing: use t.Cleanup in testutil.TempFile So that it has the same behaviour as TempDir. Also remove the now unnecessary 'defer os.Remove'	2020-08-14 20:06:01 -04:00
Daniel Nephin	8d35e37b3c	testing: Remove all the defer os.Removeall Now that testutil uses t.Cleanup to remove the directory the caller no longer has to manage the removal	2020-08-14 19:58:53 -04:00
Daniel Nephin	fe8790da9e	config: unexport and resolve TODOs in config.Builder - unexport testing shims, and document their purpose - resolve a TODO by moving validation to NewBuilder and storing the one field that is used instead of all of Options - create a slice with the correct size to avoid extra allocations	2020-08-14 19:23:32 -04:00
Daniel Nephin	85655098be	testing: Improve session_endpoint_test While working on another change I caused a bunch of these tests to fail. Unfortunately the failure messages were not super helpful at first. One problem was that the request and response were created outside of the retry. This meant that when the second attempt happened, the request body was empty (because the buffer had been consumed), and so the request was not actually being retried. This was fixed by moving more of the request creation into the retry block. Another problem was that these functions can return errors in two ways, and are not consistent about which way they use. Some errors are returned to the response writer, but the tests were not checking those errors, which was causing a panic later on. This was fixed by adding a check for the response code. Also adds some missing t.Helper(), and has assertIndex use checkIndex so that it is clear these are the same implementation.	2020-08-14 18:55:52 -04:00
Daniel Nephin	2725513eea	testutil: Add t.Cleanup to TempDir TempDir registers a Cleanup so that the directory is always removed. To disable the cleanup, set the TEST_NOCLEANUP env var.	2020-08-14 13:19:10 -04:00
Daniel Nephin	6b0ac22c1b	testing: fix flaky test TestDNS_NonExistentDC_RPC I saw this test flake locally, and it was easy to reproduce with -count=10. The failure was: 'TestAgent.dns: rpc error: error=No known Consul servers'. Waiting for the agent seems to fix it.	2020-08-13 18:03:04 -04:00
Daniel Nephin	512a523a3e	testing: wait until monitor has started before shutdown This commit fixes a test that I saw flake locally while running tests. The test output from the monitor started immediately after the line the test was looking for. To fix the problem a channel is closed when the goroutine starts. Shutdown is not called until this channel is closed, which seems to greatly reduce the chance of a flake.	2020-08-13 17:53:29 -04:00
Daniel Nephin	b6d91d59f3	testing: Remove TestAgent.Key and change TestAgent.DataDir TestAgent.Key was only used by 3 tests. Extracting it from the common helper that is used in hundreds of tests helps keep the shared part small and more focused. This required a second change (which I was planning on making anyway), which was to change the behaviour of DataDir. Now in all cases the TestAgent will use the DataDir, and clean it up once the test is complete.	2020-08-13 17:53:24 -04:00
Daniel Nephin	3abe4e43d3	testing: use t.Cleanup in TestAgent for returnPorts	2020-08-13 17:09:37 -04:00
Daniel Nephin	edf6e74a14	testing: remove unused fields from TestACLAgent	2020-08-13 17:03:55 -04:00
Daniel Nephin	24d8db906f	agent: rename vars in newConsulConfig 'base' is a bit misleading, since it is the return value. Renamed to cfg.	2020-08-13 11:58:21 -04:00
Daniel Nephin	88ddd8e0e7	agent: Move setupKeyring functions to keyring.go There are a couple reasons for this change: 1. agent.go is way too big. Smaller files make code easier to read because tools that show usage also include filename which can give a lot more context to someone trying to understand which functions call other functions. 2. these two functions call into a large number of functions already in keyring.go.	2020-08-13 11:58:21 -04:00
Daniel Nephin	119df79d7c	agent: unmethod consulConfig To allow us to move newConsulConfig out of Agent.	2020-08-13 11:58:21 -04:00
Daniel Nephin	055e7e8ca3	Fix conflict in merged PRs One PR renamed the var from config->cfg, and another used the old name config, which caused the build to fail on master.	2020-08-13 11:28:26 -04:00
Daniel Nephin	629c34085d	state: remove unused Store method receiver And use ReadTxn interface where appropriate.	2020-08-13 11:25:22 -04:00
Daniel Nephin	20f94bf9ab	Merge pull request #8463 from hashicorp/dnephin/unmethod-make-node-id agent: convert NodeID methods to functions	2020-08-13 11:18:11 -04:00
Daniel Nephin	fc797a279a	Merge pull request #8461 from hashicorp/dnephin/remove-notify-shutdown agent/consul: Remove NotifyShutdown	2020-08-13 11:16:48 -04:00
Daniel Nephin	d8ffcd5686	Merge pull request #8365 from hashicorp/dnephin/fix-service-by-node-meta-flake state: speed up tests that use watchLimit	2020-08-13 11:16:12 -04:00
Daniel Nephin	45dae87ee7	auto-config: reduce awareness of config This is a small step to allowing Agent to accept its dependencies instead of creating them in New. There were two fields in autoconfig.Config that were used exclusively to load config. These were replaced with a single function, allowing us to move LoadConfig back to the config package. Also removed the WithX functions for building a Config. Since these were simple assignment, it appeared we were not getting much value from them.	2020-08-12 13:23:23 -04:00
Daniel Nephin	62e402c4f9	Remove check that hostID is a uuid. Immediately afterward we hash the ID, so it does not need to be a uuid anymore.	2020-08-12 13:05:10 -04:00
Daniel Nephin	55b074f0eb	agent: convert NodeID methods to functions Making these functions allows us to cleanup how an agent is initialized. They only make use of a config and a logger, so they do not need to be agent methods. Also cleanup the testing to use t.Run and require.	2020-08-12 13:05:10 -04:00
Daniel Nephin	6be568119b	Extract nodeID functions to a different file In preparation for turning them into functions. To reduce the scope of Agent, and refactor how Agent is created and started.	2020-08-12 13:05:10 -04:00
R.B. Boyer	63422ca9c5	connect: use stronger validation that ingress gateways have compatible protocols defined for their upstreams (#8470) Fixes #8466 Since Consul 1.8.0 there was a bug in how ingress gateway protocol compatibility was enforced. At the point in time that an ingress-gateway config entry was modified the discovery chain for each upstream was checked to ensure the ingress gateway protocol matched. Unfortunately future modifications of other config entries were not validated against existing ingress-gateway definitions, such as: 1. create tcp ingress-gateway pointing to 'api' (ok) 2. create service-defaults for 'api' setting protocol=http (worked, but not ok) 3. create service-splitter or service-router for 'api' (worked, but caused an agent panic) If you were to do these in a different order, it would fail without a crash: 1. create service-defaults for 'api' setting protocol=http (ok) 2. create service-splitter or service-router for 'api' (ok) 3. create tcp ingress-gateway pointing to 'api' (fail with message about protocol mismatch) This PR introduces the missing validation. The two new behaviors are: 1. create tcp ingress-gateway pointing to 'api' (ok) 2. (NEW) create service-defaults for 'api' setting protocol=http ("ok" for back compat) 3. (NEW) create service-splitter or service-router for 'api' (fail with message about protocol mismatch) In consideration for any existing users that may inadvertently be falling into item (2) above, that is now officially a valid configuration to be in. For anyone falling into item (3) above while you cannot use the API to manufacture that scenario anymore, anyone that has old (now bad) data will still be able to have the agent use them just enough to generate a new agent/proxycfg error message rather than a panic. Unfortunately we just don't have enough information to properly fix the config entries.	2020-08-12 11:19:20 -05:00
Freddy	77de9bbe22	Notify alias checks when aliased service is [de]registered (#8456)	2020-08-12 09:47:41 -06:00
Daniel Nephin	b936f84c07	Merge pull request #8469 from hashicorp/dnephin/config-source config: make Source an interface to avoid the marshal/unmarshal cycle in auto-config	2020-08-12 11:17:15 -04:00
Hans Hasselberg	7a6d916ddc	Merge pull request #8471 from hashicorp/local_only thread local-only through the layers	2020-08-12 08:54:51 +02:00
Freddy	50fee12d62	Internal endpoint to query intentions associated with a gateway (#8400 )	2020-08-11 17:20:41 -06:00
Kyle Havlovitz	8118e3db40	Fix a state store comment about version	2020-08-11 13:46:12 -07:00
Kyle Havlovitz	2601585017	fsm: Fix snapshot bug with restoring node/service/check indexes	2020-08-11 11:49:52 -07:00
Hans Hasselberg	e0297b6e99	Refactor keyring ops: * changes some functions to return data instead of modifying pointer arguments * renames globalRPC() to keyringRPCs() to make its purpose more clear * restructures KeyringOperation() to make it more understandable	2020-08-11 13:42:03 +02:00
Hans Hasselberg	08b1fea379	thread local-only through the layers $ consul keyring -list -local-only ==> Gathering installed encryption keys... dc1 (LAN): aUlAW4ST3+vwseI61so24CoORkyjZofcmHk+j7QPSYQ= [1/1]	2020-08-11 13:41:53 +02:00
Daniel Nephin	3a4242c121	auto-config: Avoid the marshal/unmarshal cycle in auto-config Use a LiteralConfig and return a config.Config from translate.	2020-08-10 20:07:52 -04:00
freddygv	6dcfa11c21	Update error handling	2020-08-10 17:48:22 -06:00
Daniel Nephin	cbdceeb044	config: Make Source an interface This will allow us to accept config from auto-config without needing to go through a serialization cycle.	2020-08-10 12:46:28 -04:00
Mike Morris	d9ef146d82	changelog: Update for 1.8.2, 1.7.6, 1.7.5 and 1.6.7 (#8462 ) * update bindata_assetfs.go * Release v1.8.2 * Putting source back into Dev Mode * changelog: add entries for 1.7.6, 1.7.5 and 1.6.7 Co-authored-by: hashicorp-ci <hashicorp-ci@users.noreply.github.com>	2020-08-07 18:58:09 -04:00
Daniel Nephin	bef9348ca8	testing: remove unnecessary defers in tests The data directory is now removed by the test helper that created it.	2020-08-07 17:28:16 -04:00
Daniel Nephin	f3b63514d5	testing: Remove NotifyShutdown NotifyShutdown was only used for testing. Now that t.Cleanup exists, we can use that instead of attaching cleanup to the Server shutdown. The Autopilot test which used NotifyShutdown doesn't need this notification because Shutdown is synchronous. Waiting for the function to return is equivalent.	2020-08-07 17:14:44 -04:00
Matt Keeler	05ebf9b8c5	Require token replication to be enabled in secondary dcs when ACLs are enabled with AutoConfig (#8451 ) AutoConfig will generate local tokens for clients and the ability to use local tokens is gated off of token replication being enabled and being configured with a replication token. Therefore we already have a hard requirement on having token replication enabled, this commit just makes sure to surface that to the operator instead of having to discern what the issue is from RPC errors.	2020-08-07 10:20:27 -04:00
Hans Hasselberg	fdceb24323	auto_config implies connect (#8433 )	2020-08-07 12:02:02 +02:00
freddygv	4ac644f401	Fix test build	2020-08-06 11:31:56 -06:00
Hans Hasselberg	417d4adfb7	Mark its own cluster as healthy when rebalancing. (#8406 ) This code started as an optimization to avoid doing an RPC Ping to itself. But in a single server cluster the rebalancing was led to believe that there were no healthy servers because foundHealthyServer was not set. Now this is being set properly. Fixes #8401 and #8403.	2020-08-06 10:42:09 +02:00
freddygv	83f4e32376	PR comments and addtl tests	2020-08-05 16:07:11 -06:00
Daniel Nephin	62641b820a	Merge pull request #8404 from hashicorp/dnephin/remove-log-output-field Use Logger consistently, instead of LogOutput	2020-08-05 14:31:43 -04:00
Daniel Nephin	061ae94c63	Rename NewClient/NewServer Now that duplicate constructors have been removed we can use the shorter names for the single constructor.	2020-08-05 14:00:55 -04:00
Daniel Nephin	67c505cd90	Remove LogOutput from Agent Now that it is no longer used, we can remove this unnecessary field. This is a pre-step in cleaning up RuntimeConfig->Consul.Config, which is a pre-step to adding a gRPCHandler component to Server for streaming. Removing this field also allows us to remove one of the return values from logging.Setup.	2020-08-05 14:00:44 -04:00
Daniel Nephin	e6c94c1411	Remove LogOutput from Server	2020-08-05 14:00:44 -04:00
Daniel Nephin	fdf966896f	Remove LogOutput from Client	2020-08-05 14:00:42 -04:00
Daniel Nephin	73493ca01b	Pass a logger to ConnPool and yamux, instead of an io.Writer Allowing us to remove the LogOutput field from config.	2020-08-05 13:25:08 -04:00
Daniel Nephin	7c4566f116	api: Use a Logger instead of an io.Writer in api.Watch So that we can pass around only a Logger, not a LogOutput	2020-08-05 13:25:08 -04:00
Daniel Nephin	c7c941811d	config: Remove unused field	2020-08-05 13:25:08 -04:00
Daniel Nephin	21fa99a83b	Return nil value on error. The main bug was fixed in cb050b280ceb4186de765118611a7a92d8158c3f, but the return value of 'result' is still misleading. Change the return value to nil to make the code more clear.	2020-08-05 13:10:17 -04:00
R.B. Boyer	8ea4c482b3	xds: add support for envoy 1.15.0 and drop support for 1.11.x (#8424 ) Related changes: - hard-fail the xDS connection attempt if the envoy version is known to be too old to be supported - remove the RouterMatchSafeRegex proxy feature since all supported envoy versions have it - stop using --max-obj-name-len (due to: envoyproxy/envoy#11740)	2020-07-31 15:52:49 -05:00
freddygv	c87af29506	collect GatewayServices from iter in a function	2020-07-31 13:30:40 -06:00
Freddy	7c2c8815d7	Avoid panics during shutdown routine (#8412 )	2020-07-30 11:11:10 -06:00
freddygv	94d1f0a310	end to end changes to pass gatewayservices to /ui/services/	2020-07-30 10:21:11 -06:00
Matt Keeler	a0ebab5bac	Ensure certificates retrieved through the cache get persisted with auto-config (#8409 )	2020-07-30 11:37:18 -04:00
freddygv	521ec63339	Support ConnectedWithProxy	2020-07-30 09:32:12 -06:00
Matt Keeler	76add4f24c	Allow setting verify_incoming* when using auto_encrypt or auto_config (#8394 ) Ensure that enabling AutoConfig sets the tls configurator properly This also refactors the TLS configurator a bit so the naming doesn’t imply only AutoEncrypt as the source of the automatically setup TLS cert info.	2020-07-30 10:15:12 -04:00
Hans Hasselberg	ffdc3057fe	agent/cache test for cache throttling. (#8396 )	2020-07-30 14:41:13 +02:00
Matt Keeler	dad0f189a2	Agent Auto Config: Implement Certificate Generation (#8360 ) Most of the groundwork was laid in previous PRs between adding the cert-monitor package to extracting the logic of signing certificates out of the connect_ca_endpoint.go code and into a method on the server. This also refactors the auto-config package a bit to split things out into multiple files.	2020-07-28 15:31:48 -04:00
Matt Keeler	2ec4e46eb2	Default Cache rate limiting options in New Also get rid of the TestCache helper which was where these defaults were happening previously.	2020-07-28 12:34:35 -04:00
Matt Keeler	8df112526d	Fix some broken code in master There were several PRs that while all passed CI independently, when they all got merged into the same branch caused compilation errors in test code. The main changes that caused issues were changing agent/cache.Cache.New to require a concrete options struct instead of a pointer. This broke the cert monitor tests and the catalog_list_services_test.go. Another change was made to unembed the http.Server from the agent.HTTPServer struct. That coupled with another change to add a test to ensure cache rate limiting coming from HTTP requests was working as expected caused compilation failures.	2020-07-28 09:50:10 -04:00
Pierre Souchay	947d8eb039	Added ratelimit to handle throttling cache (#8226 ) This implements a solution for #7863 It does: Add a new config cache.entry_fetch_rate to limit the number of calls/s for a given cache entry, default value = rate.Inf Add cache.entry_fetch_max_burst size of rate limit (default value = 2) The new configuration now supports the following syntax for instance to allow 1 query every 3s: command line HCL: -hcl 'cache = { entry_fetch_rate = 0.333}' in JSON { "cache": { "entry_fetch_rate": 0.333 } }	2020-07-27 23:11:11 +02:00
Matt Keeler	3a1058a06b	Move connect root retrieval and cert signing logic out of the RPC endpoints (#8364 ) The code now lives on the Server type itself. This was done so that all of this could be shared with auto config certificate signing.	2020-07-24 10:00:51 -04:00
Matt Keeler	e7d8a02ae8	Move generation of the CA Configuration from the agent code into a method on the RuntimeConfig (#8363 ) This allows this to be reused elsewhere.	2020-07-23 16:05:28 -04:00
Daniel Nephin	597dcf2bfb	Merge pull request #8323 from hashicorp/dnephin/add-event-publisher-2 stream: close subscriptions on shutdown	2020-07-23 13:12:50 -04:00
Matt Keeler	c3e7d689b7	Refactor the agentpb package (#8362 ) First move the whole thing to the top-level proto package name. Secondly change some things around internally to have sub-packages.	2020-07-23 11:24:20 -04:00
Paul Coignet	9faa0a2a02	Fix tests	2020-07-23 11:04:10 +02:00
Daniel Nephin	decba06b7d	stream: close all subs when EventProcessor is shutdown.	2020-07-22 19:04:10 -04:00
Daniel Nephin	e802689bbe	stream: fix overallocation in filter And add tests	2020-07-22 19:04:10 -04:00
Daniel Nephin	f64725f7aa	state: speed up TestStateStore_ServicesByNodeMeta Make watchLimit a var so that we can patch it in tests and reduce the time spent creating state.	2020-07-22 16:57:06 -04:00
Daniel Nephin	a44ddea9ba	state: Use subtests in TestStateStore_ServicesByNodeMeta These subtests make it much easier to identify the slow part of the test, but they also help enumerate all the different cases which are being tested.	2020-07-22 16:39:09 -04:00
Daniel Nephin	6d3b042872	Merge pull request #7948 from hashicorp/dnephin/buffer-test-logs testutil: NewLogBuffer - buffer logs until a test fails	2020-07-21 15:21:52 -04:00
Matt Keeler	8ea8a939f0	Merge pull request #8311 from hashicorp/bugfix/auto-encrypt-token-update	2020-07-21 13:15:27 -04:00
Daniel Nephin	dbd7e9da39	Merge pull request #8344 from hashicorp/dnephin/fix-flakes-in-stream stream: handle empty event in TestEventSnapshot	2020-07-21 13:14:35 -04:00
Daniel Nephin	80ff174880	testutil: NewLogBuffer - buffer logs until a test fails Replaces #7559 Running tests in parallel, with background goroutines, results in test output not being associated with the correct test. `go test` does not make any guarantees about output from goroutines being attributed to the correct test case. Attaching log output from background goroutines also cause data races. If the goroutine outlives the test, it will race with the test being marked done. Previously this was noticed as a panic when logging, but with the race detector enabled it is shown as a data race. The previous solution did not address the problem of correct test attribution because test output could still be hidden when it was associated with a test that did not fail. You would have to look at all of the log output to find the relevant lines. It also made debugging test failures more difficult because each log line was very long. This commit attempts a new approach. Instead of printing all the logs, only print when a test fails. This should work well when there are a small number of failures, but may not work well when there are many test failures at the same time. In those cases the failures are unlikely a result of a specific test, and the log output is likely less useful. All of the logs are printed from the test goroutine, so they should be associated with the correct test. Also removes some test helpers that were not used, or only had a single caller. Packages which expose many functions with similar names can be difficult to use correctly. Related: https://github.com/golang/go/issues/38458 (may be fixed in go1.15) https://github.com/golang/go/issues/38382#issuecomment-612940030	2020-07-21 12:50:40 -04:00
Matt Keeler	6d94900cd7	Disable background cache refresh for Connect Leaf Certs The rationale behind removing them is that all of our own code (xDS, builtin connect proxy) use the cache notification mechanism. This ensures that the blocking fetch behind the scenes is always executing. Therefore the only way you might go to get a certificate and have to wait is when 1) the request has never been made for that cert before or 2) you are using the v1/agent/connect/ca/leaf API for retrieving the cert yourself. In the first case, the refresh change doesn’t alter the behavior. In the second case, it can be mitigated by using blocking queries with that API which just like normal cache notification mechanism will cause the blocking fetch to be initiated and to get leaf certs as soon as needed. If you are not using blocking queries, or Envoy/xDS, or the builtin connect proxy but are retrieving the certs yourself then the HTTP endpoint might take a little longer to respond. This also renames the RefreshTimeout field on the register options to QueryTimeout to more accurately reflect that it is used for any type that supports blocking queries.	2020-07-21 12:19:25 -04:00
Matt Keeler	133a6d99f2	Fix issue with changing the agent token causing failure to renew the auto-encrypt certificate The fallback method would still work but it would get into a state where it would let the certificate expire for 10s before getting a new one. And the new one used the less secure RPC endpoint. This is also a pretty large refactoring of the auto encrypt code. I was going to write some tests around the certificate monitoring but it was going to be impossible to get a TestAgent configured in such a way that I could write a test that ran in less than an hour or two to exercise the functionality. Moving the certificate monitoring into its own package will allow for dependency injection and in particular mocking the cache types to control how it hands back certificates and how long those certificates should live. This will allow for exercising the main loop more than would be possible with it coupled so tightly with the Agent.	2020-07-21 12:19:25 -04:00
Daniel Nephin	f913bed325	checks: wait for goroutine to complete CheckAlias already had a waitGroup, but the Add() call was happening too late, which was causing a race in tests. The add must happen before the goroutine is started. CheckHTTP did not have a waitGroup, so I added it to match CheckAlias. It looks like a lot of the implementation could be shared, and may not need all of channel, waitgroup and bool, but I will leave that refactor for another time.	2020-07-20 18:55:39 -04:00
Daniel Nephin	7599e280de	stream: handle empty event in TestEventSnapshot When the race detector is enabled we see this test fail occasionally. The reordering of execution seems to make it possible for the snapshot splice to happen before any events are published to the topicBuffers. We can handle this case in the test the same way it is handled by a subscription, by proceeding to the next event.	2020-07-20 18:20:02 -04:00
Daniel Nephin	9f5f8abcbf	Merge pull request #8245 from hashicorp/dnephin/use-not-modified-in-cache agent/cache: Use AllowNotModified in CatalogListServices	2020-07-20 15:30:52 -04:00
Daniel Nephin	235845af21	Merge pull request #8290 from hashicorp/dnephin/watch-decode watch: fix script watches with single arg	2020-07-20 14:41:17 -04:00
Paul Coignet	6f80d33a22	Add default prefix_filter	2020-07-20 10:39:58 +02:00
Daniel Nephin	75f10fb191	state: update calls that are no longer state methods In a previous commit these methods were changed to functions, so remove the Store parameter.	2020-07-16 15:46:10 -04:00
Daniel Nephin	3fcb2e16f4	state: un-method funcs that don't use their receiver This change was mostly automated with the following First generate a list of functions with: git grep -o 'Store) $[^(]\+$(tx \txn' ./agent/consul/state \| awk '{print $2}' \| grep -o '^[^(]\+' Then the list was curated a bit with trial/error to remove and add funcs as necessary. Finally the replacement was done with: dir=agent/consul/state file=${1-funcnames} while read fn; do echo "$fn" sed -i -e "s/(s \Store) $fn(/$fn(/" $dir/.go sed -i -e "s/s\.$fn(/$fn(/" $dir/.go sed -i -e "s/s\.store\.$fn(/$fn(/" $dir/*.go done < $file	2020-07-16 15:30:39 -04:00
Daniel Nephin	edb0a4f1f8	store: convert methods that don't use their receiver to functions Making these functions allows them to be used without introducing an artificial dependency on the struct. Many of these will be called from streaming Event processors, which do not have a store. This change is being made ahead of the streaming work to reduce the size of the streaming diff.	2020-07-16 15:30:10 -04:00
André	e93084be6c	minor: fix docstring of DNSOnlyPassing (#8318 ) In runtime.go it had "duration" but it is actually a boolean.	2020-07-16 09:47:33 -04:00
Daniel Nephin	797abe1f00	agent/cache: Use AllowNotModifiedResponse in CatalogListServices Co-authored-by: Pierre Souchay <pierresouchay@users.noreply.github.com>	2020-07-14 18:58:20 -04:00
Daniel Nephin	8aa3335b22	agent/cache: Update some docstrings	2020-07-14 18:58:20 -04:00
Daniel Nephin	a2f8605c66	stream: Add forceClose and refactor subscription filtering Move the subscription context to Next. context.Context should generally never be stored in a struct because it makes that struct only valid while the context is valid. This is rarely obvious from the caller. Adds a forceClosed channel in place of the old context, and uses the new context as a way for the caller to stop the Subscription blocking. Remove some recursion out of bufferItem.Next. The caller is already looping so we can continue in that loop instead of recursing. This ensures currentItem is updated immediately (which probably does not matter in practice), and also removes the chance that we overflow the stack. NextNoBlock and FollowAfter do not need to handle bufferItem.Err, the caller already handles it. Moves filter to a method to simplify Next, and more explicitly separate filtering from looping. Also improve some godoc Only unwrap itemBuffer.Err when necessary	2020-07-14 15:57:47 -04:00
Daniel Nephin	2595436f62	stream: Improve docstrings Also rename ResumeStream to EndOfEmptySnapshot to be more consistent with other framing events Co-authored-by: Paul Banks <banks@banksco.de>	2020-07-14 15:57:47 -04:00
Daniel Nephin	16a2b3fafc	stream: change Topic to an interface Consumers of the package can decide on which type to use for the Topic. In the future we may use a gRPC type for the topic.	2020-07-14 15:57:47 -04:00
Daniel Nephin	aa571bd0ce	state: Move change processing out of EventPublisher EventPublisher was receiving TopicHandlers, which had a couple of problems: - ChangeProcessors were being grouped by Topic, but they completely ignored the topic and were performed on every change - ChangeProcessors required EventPublisher to be aware of database changes By moving ChangeProcesors out of EventPublisher, and having Publish accept events instead of changes, EventPublisher no longer needs to be aware of these things. Handlers is now only SnapshotHandlers, which are still mapped by Topic. Also allows us to remove the small 'db' package that had only two types. They can now be unexported types in state.	2020-07-14 15:57:47 -04:00
Daniel Nephin	23a940daad	server: Abandon state store to shutdown EventPublisher So that we don't leak goroutines	2020-07-14 15:57:47 -04:00
Daniel Nephin	e1305fe80c	stream: unexport identifiers Now that EventPublisher is part of stream a lot of the internals can be hidden	2020-07-14 15:57:47 -04:00
Daniel Nephin	9e37894778	stream: Move EventPublisher to stream package The EventPublisher is the central hub of the PubSub system. It is tightly coupled with much of stream. Some stream internals were exported exclusively for EventPublisher. The two Subscribe cases (with or without index) were also awkwardly split between two packages. By moving EventPublisher into stream they are now both in the same package (although still in different files).	2020-07-14 15:57:47 -04:00
Daniel Nephin	6e87e83d77	state: Make handleACLUpdate async once again So that we keep as much as possible out of the FSM commit hot path.	2020-07-14 15:57:47 -04:00
Daniel Nephin	a92dab724d	state: Use interface for Txn Also store the index in Changes instead of the Txn. This change is in preparation for moving EventPublisher to the stream package, and making handleACLUpdates async once again.	2020-07-14 15:57:46 -04:00
Daniel Nephin	c778d61b6a	stream.Subscription unexport fields and additional docstrings	2020-07-14 15:57:46 -04:00
Daniel Nephin	37a38629d7	Add a context for stopping EventPublisher goroutine	2020-07-14 15:57:46 -04:00
Daniel Nephin	02bc5a26e4	EventPublisher: Make Unsubscribe a function on Subscription It is critical that Unsubscribe be called with the same pointer to a SubscriptionRequest that was used to create the Subscription. The docstring made that clear, but it still allowed a caller to get it wrong by creating a new SubscriptionRequest. By hiding this detail from the caller, and only exposing an Unsubscribe method, it should be impossible to fail to Unsubscribe. Also update some godoc strings.	2020-07-14 15:57:46 -04:00
Daniel Nephin	86976cf23c	EventPublisher: handleACL changes synchronously Use a separate lock for subscriptions.ByToken to allow it to happen synchronously in the commit flow. This removes the need to create a new txn for the goroutine, and removes the need for EventPublisher to contain a reference to DB.	2020-07-14 15:57:46 -04:00
Daniel Nephin	606121fae6	stream.EventSnapshot: reduce the fields on the struct Many of the fields are only needed in one place, and by using a closure they can be removed from the struct. This reduces the scope of the variables making it easier to see how they are used.	2020-07-14 15:57:45 -04:00
Daniel Nephin	7196917051	stream.EventBuffer: Seed the fuzz test with time.Now() Otherwise the test will run with exactly the same values each time. By printing the seed we can attempt to reproduce the test by adding an env var to override the seed	2020-07-14 15:57:45 -04:00
Daniel Nephin	525b275a52	state: memdb_wrapper.go -> memdb.go Renaming in a separate commit so that git can merge changes to the file.	2020-07-14 15:57:45 -04:00
Daniel Nephin	b5d2bea770	state: publish changes from Commit Make topicRegistry use functions instead of unbound methods Use a regular memDB in EventPublisher to remove a reference cycle Removes the need for EventPublisher to use a store	2020-07-14 15:57:45 -04:00
Daniel Nephin	f626c3d6c5	EventPublisher: docstrings and getTopicBuffer also rename commitCh -> publishCh	2020-07-14 15:57:45 -04:00
Daniel Nephin	2020e9c7c7	ProcessChanges: use stream.Event Also remove secretHash, which was used to hash tokens. We don't expose these tokens anywhere, so we can use the string itself instead of a Hash. Fix acl_events_test.go for storing a structs type.	2020-07-14 15:57:45 -04:00
Daniel Nephin	2e45bbbb3e	stream: Use local types for Event Topic SubscriptionRequest	2020-07-14 15:57:45 -04:00
Daniel Nephin	3d62013062	Rename stream_publisher.go -> event_publisher.go	2020-07-14 15:57:44 -04:00
Daniel Nephin	526fb53f85	Add streaming package with Subscription and Snapshot components. The remaining files from 7965767de0bd62ab07669b85d6879bd5f815d157 Co-authored-by: Paul Banks <banks@banksco.de>	2020-07-14 15:57:44 -04:00
Matt Keeler	adf53f5113	Add ability for notifications when one of the agent tokens is updated (#8301 ) Co-authored-by: Chris Piraino <cpiraino@hashicorp.com>	2020-07-14 09:53:55 -04:00
Hans Hasselberg	0c39b2c820	add support for envoy 1.14.4, 1.13.4, 1.12.6 (#8216 )	2020-07-13 15:44:44 -05:00
Chris Piraino	b80cbb499f	Set enterprise metadata after resolving the token (#8302 ) The token can encode enterprise metadata information, and we must make sure we set that on the reply so that we can correctly filter ACLs.	2020-07-13 13:39:57 -05:00
Daniel Nephin	979a3d2844	watch: Allow args from different types Fixes a bug where specifying a slice of args with a single item was being converted to a string when config was loaded, causing an error.	2020-07-10 17:18:32 -04:00
Freddy	a30c10a18c	Add api mod support for /catalog/gateway-services (#8278 )	2020-07-10 13:01:45 -06:00
Daniel Nephin	90acc6a920	watch: extract makeWatchPlan to facilitate testing There is a bug in here now that slices in opaque config are unsliced. But to test that bug fix we need a function that can be easily tested.	2020-07-10 13:33:45 -04:00
Paul Coignet	8f373e7d6e	Keep both metrics	2020-07-10 11:27:22 +02:00
R.B. Boyer	6e3d07c995	xds: version sniff envoy and switch regular expressions from 'regex' to 'safe_regex' on newer envoy versions (#8222 ) - cut down on extra node metadata transmission - split the golden file generation to compare all envoy version	2020-07-09 17:04:51 -05:00
Daniel Nephin	43cd6e91d1	Merge pull request #8231 from hashicorp/dnephin/unembed-HTTPServer-Server agent/http: un-embed the http.Server	2020-07-09 17:42:33 -04:00
Daniel Nephin	b80a0e0def	agent/http: Update TestSetupHTTPServer_HTTP2 To remove the need to store the http.Server. This will allow us to remove the http.Server field from the HTTPServer struct.	2020-07-09 16:42:19 -04:00
Daniel Nephin	13e0d258b5	Merge pull request #8237 from hashicorp/dnephin/remove-acls-enabled-from-delegate Remove ACLsEnabled from delegate interface	2020-07-09 16:35:43 -04:00
Paul Coignet	a1e698dcff	Use method and path as labels	2020-07-09 10:31:27 +02:00
Matt Keeler	39d9babab3	Pass the Config and TLS Configurator into the AutoConfig constructor This is instead of having the AutoConfigBackend interface provide functions for retrieving them. NOTE: the config is not reloadable. For now this is fine as we don’t look at any reloadable fields. If that changes then we should provide a way to make it reloadable.	2020-07-08 12:36:11 -04:00
Matt Keeler	a77ed471c8	Rename (Server).forward to (Server).ForwardRPC Also get rid of the preexisting shim in server.go that existed before to have this name just call the unexported one.	2020-07-08 11:05:44 -04:00
Matt Keeler	386ec3a2a2	Refactor AutoConfig RPC to not have a direct dependency on the Server type Instead it has an interface which can be mocked for better unit testing that is deterministic and not prone to flakiness.	2020-07-08 11:05:44 -04:00
Chris Piraino	9d92c42c90	Append port number to ingress host domain (#8190 ) A port can be sent in the Host header as defined in the HTTP RFC, so we take any hosts that we want to match traffic to and also add another host with the listener port added. Also fix an issue with envoy integration tests not running the case-ingress-gateway-tls test.	2020-07-07 10:43:04 -05:00
Daniel Nephin	8b6036c077	Remove ACLsEnabled from delegate interface In all cases (oss/ent, client/server) this method was returning a value from config. Since the value is consistent, it doesn't need to be part of the delegate interface.	2020-07-03 17:00:20 -04:00
Daniel Nephin	24c6bcfbe8	Merge pull request #8215 from hashicorp/dnephin/support-not-modified-response-server agent/consul: Add support for NotModified to two endpoints	2020-07-03 16:15:31 -04:00
Pierre Souchay	f77182aa51	Upgrade go-connlimit to v0.3.0 / return http 429 on too many connections (#8221 ) Fixes #7527 I want to highlight this and explain what I think the implications are and make sure we are aware: * `HTTPConnStateFunc` closes the connection when it is beyond the limit. `Close` does not block. * `HTTPConnStateFuncWithDefault429Handler(10 * time.Millisecond)` blocks until the following is done (worst case): 1) `conn.SetDeadline(10*time.Millisecond)` so that 2) `conn.Write(429error)` is guaranteed to timeout after 10ms, so that the http 429 can be written and 3) `conn.Close` can happen The implication of this change is that accepting any new connection is worst case delayed by 10ms. But only after a client reached the limit already.	2020-07-03 09:25:07 +02:00
Daniel Nephin	8a8644edd5	agent/http: un-embed the HTTPServer The embedded HTTPServer struct is not used by the large HTTPServer struct. It is used by tests and the agent. This change is a small first step in the process of removing that field. The eventual goal is to reduce the scope of HTTPServer making it easier to test, and split into separate packages.	2020-07-02 17:21:12 -04:00
Daniel Nephin	dfa8856e5f	agent/consul: Add support for NotModified to two endpoints A query made with AllowNotModifiedResponse and a MinIndex, where the result has the same Index as MinIndex, will return an empty response with QueryMeta.NotModified set to true. Co-authored-by: Pierre Souchay <pierresouchay@users.noreply.github.com>	2020-07-02 17:05:46 -04:00
Matt Keeler	87764e5bfb	Merge pull request #8211 from hashicorp/bugfix/auto-encrypt-various	2020-07-02 09:49:49 -04:00
Yury Evtikhov	dbf3c05fa5	DNS: add IsErrQueryNotFound function for easier error evaluation	2020-07-01 03:41:44 +01:00
Yury Evtikhov	c594dfa1e6	DNS: fix agent returning SERVFAIL where NXDOMAIN should be returned	2020-07-01 01:51:21 +01:00
Yury Evtikhov	4e5ce6d4a7	DNS: add test to verify NXDOMAIN is returned when a non-existent domain is queried over RPC	2020-07-01 01:51:16 +01:00
Matt Keeler	fa6a2b38d9	Add an AutoEncrypt “integration” test Also fix a bug where Consul could segfault if TLS was enabled but no client certificate was provided. How no one has reported this as a problem I am not sure.	2020-06-30 15:23:29 -04:00
Matt Keeler	a97f9ff386	Overwrite agent leaf cert trust domain on the servers	2020-06-30 09:59:08 -04:00
Matt Keeler	5600069d69	Store the Connect CA rate limiter on the server This fixes a bug where auto_encrypt was operating without utilizing a common rate limiter.	2020-06-30 09:59:07 -04:00
Matt Keeler	e9e88e4527	Initialize the agent leaf cert cache result with a state to prevent unnecessary second certificate signing	2020-06-30 09:59:07 -04:00
Matt Keeler	fa42d9b34f	Fix auto_encrypt IP/DNS SANs The initial auto encrypt CSR wasn’t containing the user supplied IP and DNS SANs. This fixes that. Also, we were configuring a default :: IP SAN. This should be ::1 instead and was fixed.	2020-06-30 09:59:07 -04:00
Matt Keeler	3d9ca75e34	Merge pull request #8193 from hashicorp/feature/auto-config/suppress-config-warnings	2020-06-27 10:06:52 -04:00
R.B. Boyer	72a515f5ec	connect: various changes to make namespaces for intentions work more like for other subsystems (#8194 ) Highlights: - add new endpoint to query for intentions by exact match - using this endpoint from the CLI instead of the dump+filter approach - enforcing that OSS can only read/write intentions with a SourceNS or DestinationNS field of "default". - preexisting OSS intentions with now-invalid namespace fields will delete those intentions on initial election or for wildcard namespaces an attempt will be made to downgrade them to "default" unless one exists. - also allow the '-namespace' CLI arg on all of the intention subcommands - update lots of docs	2020-06-26 16:59:15 -05:00
Matt Keeler	a1275c6e7d	Use the DNS and IP SANs from the auto config stanza when set	2020-06-26 16:01:30 -04:00
Matt Keeler	1fcb9ef03f	Overhaul the auto-config translation This fixes some issues around spurious warnings about using enterprise configuration in OSS.	2020-06-26 15:25:21 -04:00
Freddy	6ef032fefe	Split up unused key validation for oss/ent (#8189 ) Split up unused key validation in config entry decode for oss/ent. This is needed so that we can return an informative error in OSS if namespaces are provided.	2020-06-25 13:58:29 -06:00
Daniel Nephin	7d5f1ba6bd	Merge pull request #8176 from hashicorp/dnephin/add-linter-unparam-1 lint: add unparam linter and fix some of the issues	2020-06-25 15:34:48 -04:00
Matt Keeler	a448e3e636	Merge pull request #8184 from hashicorp/bugfix/goroutine-leaks	2020-06-25 09:22:19 -04:00
Chris Piraino	8db22c166a	Merge pull request #7932 from hashicorp/ingress/internal-ui-endpoint-multiple-ports Update gateway-services-nodes API endpoint to allow multiple addresses	2020-06-24 17:11:01 -05:00
Chris Piraino	58eb3710fc	remove obsolete comments about test parallelization	2020-06-24 16:36:13 -05:00
Chris Piraino	abe2ca94c5	Update gateway-services-nodes API endpoint to allow multiple addresses Previously, we were only returning a single ListenerPort for a single service. However, we actually allow a single service to be serviced over multiple ports, as well as allow users to define what hostnames they expect their services to be contacted over. When no hosts are defined, we return the default ingress domain for any configured DNS domain. To show this in the UI, we modify the gateway-services-nodes API to return a GatewayConfig.Addresses field, which is a list of addresses over which the specific service can be contacted.	2020-06-24 16:35:23 -05:00
Matt Keeler	2ab8af4093	Add a test for go routine leaks This is in its own separate package so that it will be a separate test binary that runs thus isolating the go runtime from other tests and allowing accurate go routine leak checking. This test would ideally use goleak.VerifyTestMain but that will fail 100% of the time due to some architectural things (blocking queries and net/rpc uncancellability). This test is not comprehensive. We should enable/exercise more features and more cluster configurations. However, it's a start.	2020-06-24 17:09:50 -04:00
Matt Keeler	d471977f62	Fix go routine leak in auto encrypt ca roots tracking	2020-06-24 17:09:50 -04:00
Matt Keeler	90e741c6d2	Allow cancelling blocking queries in response to shutting down.	2020-06-24 17:09:50 -04:00
Daniel Nephin	0285956fac	Update TestAgent_GetCoordinate The old test case was a very specific regression test for a case that is no longer possible. Replaced with a new test that checks the default coordinate is returned.	2020-06-24 13:00:15 -04:00
Daniel Nephin	56ecfce5fa	Remove unused return values	2020-06-24 13:00:15 -04:00
Daniel Nephin	07c1081d39	Fix a bunch of unparam lint issues	2020-06-24 13:00:14 -04:00
Matt Keeler	341aedbce9	Ensure that retryLoopBackoff can be cancelled We needed to pass a cancellable context into the limiter.Wait instead of context.Background. So I made the func take a context instead of a chan as most places were just passing through a Done chan from a context anyways. Fix go routine leak in the gateway locator	2020-06-24 12:41:08 -04:00
Matt Keeler	934f92f83b	Don’t leak metrics go routines in tests (#8182 )	2020-06-24 10:15:25 -04:00
gitforbit	aebb8e09a7	agent-http: cleanup: return nil instead of err (#8043 ) Since err is already checked, it should return `nil`	2020-06-24 14:29:21 +02:00
R.B. Boyer	ba83b52b32	connect: upgrade github.com/envoyproxy/go-control-plane to v0.9.5 (#8165 )	2020-06-23 15:19:56 -05:00
freddygv	224f486aef	Update namespaces subject-verb agreement	2020-06-23 10:57:30 -06:00
freddygv	c631e7496a	Remove break	2020-06-22 19:59:04 -06:00
freddygv	e81b6d25ad	Let users know namespaces are ent only in config entry decode	2020-06-22 19:59:04 -06:00
Pierre Souchay	667b36cdba	Returns DNS Error NXDOMAIN when DC does not exist (#8103 ) This will allow to increase cache value when DC is not valid (aka return SOA to avoid too many consecutive requests) and will distinguish DC being temporarily not available from DC not existing. Implements https://github.com/hashicorp/consul/issues/8102	2020-06-22 09:01:48 -04:00
Matt Keeler	7b49fc1529	Require enabling TLS to enable Auto Config (#8159 ) On the servers they must have a certificate. On the clients they just have to set verify_outgoing to true to attempt TLS connections for RPCs. Eventually we may relax these restrictions but right now all of the settings we push down (acl tokens, acl related settings, certificates, gossip key) are sensitive and shouldn’t be transmitted over an unencrypted connection. Our guides and docs should recommend verify_server_hostname on the clients as well. Another reason to do this is weird things happen when making an insecure RPC when TLS is not enabled. Basically it tries TLS anyways. We should probably fix that to make it clearer what is going on.	2020-06-19 16:38:14 -04:00
Freddy	7e7c783c8f	Always return a gateway cluster (#8158 )	2020-06-19 13:31:39 -06:00
Matt Keeler	9dc9f7df15	Allow cancelling startup when performing auto-config (#8157 ) Co-authored-by: Daniel Nephin <dnephin@hashicorp.com>	2020-06-19 15:16:00 -04:00
Daniel Nephin	896b057a9f	Merge pull request #8147 from hashicorp/dnephin/remove-private-ip-2 Remove some dead code from agent/consul/util.go	2020-06-18 15:51:09 -04:00
Matt Keeler	7086a50353	Change auto config authorizer to allow for future extension The envisioned changes would allow extra settings to enable dynamically defined auth methods to be used instead of or in addition to the statically defined one in the configuration.	2020-06-18 15:22:24 -04:00
Daniel Nephin	b5ef9b7ea9	Remove bytesToUint64 from agent/consul	2020-06-18 12:45:43 -04:00
Daniel Nephin	81bc082b63	Remove unused private IP code from agent/consul	2020-06-18 12:40:38 -04:00
Matt Keeler	2c7844d220	Implement Client Agent Auto Config There are a couple of things in here. First, just like auto encrypt, any Cluster.AutoConfig RPC will implicitly use the less secure RPC mechanism. This drastically modifies how the Consul Agent starts up and moves most of the responsibilities (other than signal handling) from the cli command and into the Agent.	2020-06-17 16:49:46 -04:00
Matt Keeler	f5d57ccd48	Allow the Agent and its child Client/Server to share a connection pool This is needed so that we can make an AutoConfig RPC at the Agent level prior to creating the Client/Server.	2020-06-17 16:19:33 -04:00
Matt Keeler	8c601ad8db	Merge pull request #8035 from hashicorp/feature/auto-config/server-rpc	2020-06-17 16:07:25 -04:00
Chris Piraino	79d003d395	Remove ACLEnforceVersion8 from tests (#8138 ) The field had been deprecated for a while and was recently removed, however a PR which added these tests prior to removal was merged.	2020-06-17 14:58:01 -05:00
Daniel Nephin	3d03d72727	Merge pull request #7762 from hashicorp/dnephin/warn-on-unknown-service-file config: warn if a config file is being skipped because of its file extension	2020-06-17 15:14:40 -04:00
Daniel Nephin	cb736b6947	config: warn when a config file is skipped All commands which read config (agent, services, and validate) will now print warnings when one of the config files is skipped because it did not match an expected format. Also ensures that config validate prints all warnings.	2020-06-17 13:08:54 -04:00
Daniel Nephin	1ef8279ac9	Merge pull request #8034 from hashicorp/dnephin/add-linter-staticcheck-4 ci: enable SA4006 staticcheck check and add ineffassign	2020-06-17 12:16:02 -04:00
Matt Keeler	eda8cb39fd	Implement the insecure version of the Cluster.AutoConfig RPC endpoint Right now this is only hooked into the insecure RPC server and requires JWT authorization. If no JWT authorizer is setup in the configuration then we inject a disabled “authorizer” to always report that JWT authorization is disabled.	2020-06-17 11:25:29 -04:00
Pierre Souchay	f7a1189dba	gossip: Ensure that metadata of Consul Service is updated (#7903 ) While upgrading servers to a new version, I saw that metadata of existing servers are not upgraded, so the version and raft meta is not up to date in catalog. The only way to do it was to: * update Consul server * make it leave the cluster, then metadata is accurate That's because the optimization to avoid updating catalog does not take into account metadata, so no update on catalog is performed.	2020-06-17 12:16:13 +02:00
Daniel Nephin	8753d1f1ba	ci: Add ineffassign linter And fix an additional ineffective assignment that was not caught by staticcheck	2020-06-16 17:32:50 -04:00

2646 commits