Commit Graph

203 Commits

Author SHA1 Message Date
Kyle Havlovitz 475afd0300 docs: deprecate acl_datacenter and replace it with primary_datacenter 2018-10-10 12:16:47 -07:00
Paul Banks 7a8023a57f Fix up tests broken by master merge; add proxy tests to services command (and fix it!); actually run the proxycfg.Manager 2018-10-10 16:55:34 +01:00
Paul Banks 1e4c5a1811 Connect Envoy Command (#4735)
* Plumb xDS server and proxyxfg into the agent startup

* Add `consul connect envoy` command to allow running Envoy as a connect sidecar.

* Add test for help tabs; typos and style fixups from review
2018-10-10 16:55:34 +01:00
Paul Banks 979e1c9c94 Add -sidecar-for and new /agent/service/:service_id endpoint (#4691)
- A new endpoint `/v1/agent/service/:service_id` which is a generic way to look up the service for a single instance. The primary value here is that it:
   - **supports hash-based blocking** and so;
   - **replaces `/agent/connect/proxy/:proxy_id`** as the mechanism the built-in proxy uses to read its config.
   - It's not proxy specific and so works for any service.
   - It has a temporary shim to call through to the existing endpoint to preserve current managed proxy config defaulting behaviour until that is removed entirely (tested).
 - The built-in proxy now uses the new endpoint exclusively for it's config
 - The built-in proxy now has a `-sidecar-for` flag that allows the service ID of the _target_ service to be specified, on the condition that there is exactly one "sidecar" proxy (that is one that has `Proxy.DestinationServiceID` set) for the service registered.
 - Several fixes for edge cases for SidecarService
 - A fix for `Alias` checks - when running locally they didn't update their state until some external thing updated the target. If the target service has no checks registered as below, then the alias never made it past critical.
2018-10-10 16:55:34 +01:00
Paul Banks 7038fe6b71 Add SidecarService Syntax sugar to Service Definition (#4686)
* Added new Config for SidecarService in ServiceDefinitions.

* WIP: all the code needed for SidecarService is written... none of it is tested other than config :). Need API updates too.

* Test coverage for the new sidecarServiceFromNodeService method.

* Test API registratrion with SidecarService

* Recursive Key Translation 🤦

* Add tests for nested sidecar defintion arrays to ensure they are translated correctly

* Use dedicated internal state rather than Service Meta for tracking sidecars for deregistration.

Add tests for deregistration.

* API struct for agent register. No other endpoint should be affected yet.

* Additional test cases to cover updates to API registrations
2018-10-10 16:55:34 +01:00
Paul Banks bed72f6078 Rename proxy package (re-run of #4550) (#4638)
* Rename agent/proxy package to reflect that it is limited to managed proxy processes

Rationale: we have several other components of the agent that relate to Connect proxies for example the ProxyConfigManager component needed for Envoy work. Those things are pretty separate from the focus of this package so far which is only concerned with managing external proxy processes so it's nota good fit to put code for that in here, yet there is a naming clash if we have other packages related to proxy functionality that are not in the `agent/proxy` package.

Happy to bikeshed the name. I started by calling it `managedproxy` but `managedproxy.Manager` is especially unpleasant. `proxyprocess` seems good in that it's more specific about purpose but less clearly connected with the concept of "managed proxies". The names in use are cleaner though e.g. `proxyprocess.Manager`.

This rename was completed automatically using golang.org/x/tools/cmd/gomvpkg.

Depends on #4541

* Fix missed windows tagged files
2018-10-10 16:55:34 +01:00
Paul Banks 5b0d4db6bc Support Agent Caching for Service Discovery Results (#4541)
* Add cache types for catalog/services and health/services and basic test that caching works

* Support non-blocking cache types with Cache-Control semantics.

* Update API docs to include caching info for every endpoint.

* Comment updates per PR feedback.

* Add note on caching to the 10,000 foot view on the architecture page to make the new data path more clear.

* Document prepared query staleness quirk and force all background requests to AllowStale so we can spread service discovery load across servers.
2018-10-10 16:55:34 +01:00
Paul Banks 77e0577ff6
Add a Close method to cache that stops background goroutines. (#4746)
In a real agent the `cache` instance is alive until the agent shuts down so this is not a real leak in production, however in out test suite, every testAgent that is started and stops leaks goroutines that never get cleaned up which accumulate consuming CPU and memory through subsequent test in the `agent` package which doesn't help our test flakiness.

This adds a Close method that doesn't invalidate or clean up the cache, and still allows concurrent blocking queries to run (for up to 10 mins which might still affect tests). But at least it doesn't maintain them forever with background refresh and an expiry watcher routine.

It would be nice to cancel any outstanding blocking requests as well when we close but that requires much more invasive surgery right into our RPC protocol since we don't have a way to cancel requests currently.

Unscientifically this seems to make tests pass a bit quicker and more reliably locally but I can't really be sure of that!
2018-10-04 11:27:11 +01:00
Hans Hasselberg 318bcb9bbb
Allow disabling the HTTP API again. (#4655)
If you provide an invalid HTTP configuration consul will still start again instead of failing. But if you do so the build-in proxy won't be able to start which you might need for connect.
2018-09-13 16:06:04 +02:00
Pierre Souchay 508b67c32a Ensure that Proxies ARE always cleaned up, event with DeregisterCriticalServiceAfter (#4649)
This fixes https://github.com/hashicorp/consul/issues/4648
2018-09-11 17:34:09 +01:00
Matt Keeler 61a5c965c9
Ensure that errors setting up the DNS servers get propagated back to the shell (#4598)
Fixes: #4578 

Prior to this fix if there was an error binding to ports for the DNS servers the error would be swallowed by the gated log writer and never output. This fix propagates the DNS server errors back to the shell with a multierror.
2018-09-07 10:48:29 -04:00
Matt Keeler da3ed5dc76
Fix #4515: Segfault when serf_wan port was -1 but reconnect_time_wan was set (#4531)
Fixes #4515 

This just slightly refactors the logic to only attempt to set the serf wan reconnect timeout when the rest of the serf wan settings are configured - thus avoiding a segfault.
2018-08-17 14:44:25 -04:00
Matt Keeler 5c7c58ed26
Gossip tuneables (#4444)
Expose a few gossip tuneables for both lan and wan interfaces

gossip_nodes
gossip_interval
probe_timeout
probe_interval
retransmit_mult
suspicion_mult
2018-07-26 11:39:49 -04:00
Mitchell Hashimoto 5c42dacef4
Merge pull request #4320 from hashicorp/f-alias-check
Add "Alias" Check Type
2018-07-20 13:01:33 -05:00
Matt Keeler 95e8f795df Use the agent logger instead of log module 2018-07-19 11:22:01 -04:00
Matt Keeler 953b72318f Persist proxies from config files
Also change how loadProxies works. Now it will load all persisted proxies into a map, then when loading config file proxies will look up the previous proxy token in that map.
2018-07-18 17:04:35 -04:00
Matt Keeler 9f8991e0cc Fix issue with choosing a client addr that is 0.0.0.0 or :: 2018-07-16 16:30:15 -04:00
Mitchell Hashimoto 65bbc12d69
agent: use the correct ACL token for alias checks 2018-07-12 10:17:53 -07:00
Mitchell Hashimoto 99ead8324f
agent: alias checks have no interval 2018-07-12 09:36:11 -07:00
Mitchell Hashimoto 75ea0a1ee7
agent: run alias checks 2018-07-12 09:36:10 -07:00
Paul Banks 6fe7faa554
Merge pull request #4381 from hashicorp/proxy-check-default
Proxy check default
2018-07-12 17:08:35 +01:00
Matt Keeler 0a365b1a4f
Merge pull request #4374 from hashicorp/feature/proxy-env-vars
Setup managed proxy environment with API client env vars
2018-07-12 09:13:54 -04:00
Paul Banks 9223102331
Default managed proxy TCP check address sanely when proxy is bound to 0.0.0.0.
This also provides a mechanism to configure custom address or disable the check entirely from managed proxy config.
2018-07-12 12:57:10 +01:00
Matt Keeler 1e5e9fd8cd PR Updates
Proxy now doesn’t need to know anything about the api as we pass env vars to it instead of the api config.
2018-07-11 09:44:54 -04:00
Matt Keeler 358e6c8f6a Pass around an API Config object and convert to env vars for the managed proxy 2018-07-10 12:13:51 -04:00
Matt Keeler 115893b7d8 Remove https://prefix from TLSConfig.Address 2018-07-09 12:31:15 -04:00
mkeeler 1da3c42867 Merge remote-tracking branch 'connect/f-connect' 2018-06-25 19:42:51 +00:00
Mitchell Hashimoto 54ad6fc050 agent: convert the proxy bind_port to int if it is a float 2018-06-25 12:26:18 -07:00
Paul Banks 1d6e1ace11 register TCP check for managed proxies 2018-06-25 12:25:40 -07:00
Paul Banks d1810ba338 Make proxy only listen after initial certs are fetched 2018-06-25 12:25:40 -07:00
Paul Banks 42e28fa4d1 Limit proxy telemetry config to only be visible with authenticated with a proxy token 2018-06-25 12:25:39 -07:00
Paul Banks ca68136ac7 Refactor to use embedded struct. 2018-06-25 12:25:39 -07:00
Paul Banks 2df422e1e5 Disable TestAgent proxy execution properly 2018-06-25 12:25:38 -07:00
Mitchell Hashimoto 0d457a3e71 agent: RemoveProxy also removes the proxy service 2018-06-25 12:25:12 -07:00
Mitchell Hashimoto c30affa4b6 agent/proxy: AllowRoot to disable executing managed proxies when root 2018-06-25 12:25:11 -07:00
Paul Banks d0674cdd7a Warn about killing proxies in dev mode 2018-06-25 12:24:16 -07:00
Paul Banks d140612350 Fixs a few issues that stopped this working in real life but not caught by tests:
- Dev mode assumed no persistence of services although proxy state is persisted which caused proxies to be killed on startup as their services were no longer registered. Fixed.
 - Didn't snapshot the ProxyID which meant that proxies were adopted OK from snapshot but failed to restart if they died since there was no proxyID in the ENV on restart
 - Dev mode with no persistence just kills all proxies on shutdown since it can't recover them later
 - Naming things
2018-06-25 12:24:14 -07:00
Paul Banks 3df45ac7f1 Don't kill proxies on agent shutdown; backport manager close fix 2018-06-25 12:24:13 -07:00
Paul Banks 3bac52480e Abandon daemonize for simpler solution (preserving history):
Reverts:
  - bdb274852ae469c89092d6050697c0ff97178465
  - 2c689179c4f61c11f0016214c0fc127a0b813bfe
  - d62e25c4a7ab753914b6baccd66f88ffd10949a3
  - c727ffbcc98e3e0bf41e1a7bdd40169bd2d22191
  - 31b4d18933fd0acbe157e28d03ad59c2abf9a1fb
  - 85c3f8df3eabc00f490cd392213c3b928a85aa44
2018-06-25 12:24:10 -07:00
Paul Banks 9cea27c66e Sanity check that we are never trying to self-exec a test binary. Add daemonize bypass for TestAgent so that we don't have to jump through ridiculous self-execution hooks for every package that might possibly invoke a managed proxy 2018-06-25 12:24:09 -07:00
Paul Banks c97db00903 Run daemon processes as a detached child.
This turns out to have a lot more subtelty than we accounted for. The test suite is especially prone to races now we can only poll the child and many extra levels of indirectoin are needed to correctly run daemon process without it becoming a Zombie.

I ran this test suite in a loop with parallel enabled to verify for races (-race doesn't find any as they are logical inter-process ones not actual data races). I made it through ~50 runs before hitting an error due to timing which is much better than before. I want to go back and see if we can do better though. Just getting this up.
2018-06-25 12:24:08 -07:00
Paul Banks 3a00574a13 Persist proxy state through agent restart 2018-06-25 12:24:08 -07:00
Mitchell Hashimoto 9249662c6c
agent: leaf endpoint accepts name, not service ID
This change is important so that requests can made representing a
service that may not be registered with the same local agent.
2018-06-14 09:42:20 -07:00
Paul Banks bd5e569dc7
Make invalid clusterID be fatal 2018-06-14 09:42:17 -07:00
Paul Banks 834ed1d25f
Fixed many tests after rebase. Some still failing and seem unrelated to any connect changes. 2018-06-14 09:42:16 -07:00
Mitchell Hashimoto c42510e1ec
agent/cache: implement refresh backoff 2018-06-14 09:42:14 -07:00
Mitchell Hashimoto b4f990bc6c
agent: verify local proxy tokens for CA leaf + tests 2018-06-14 09:42:14 -07:00
Mitchell Hashimoto 8f7b5f93cd
agent: verify proxy token for ProxyConfig endpoint + tests 2018-06-14 09:42:14 -07:00
Mitchell Hashimoto 1dfb4762f5
agent: increase timer for blocking cache endpoints 2018-06-14 09:42:12 -07:00
Mitchell Hashimoto 7bb13246a8
agent: clarify why we Kill still 2018-06-14 09:42:12 -07:00