open-consul

Commit Graph

Author	SHA1	Message	Date
R.B. Boyer	9a51ecc98b	agent: clients should only attempt to remove pruned nodes once per call (#6591 )	2019-10-07 16:15:23 -05:00
Sarah Christoff	9b93dd93c9	Prune Unhealthy Agents (#6571 ) * Add -prune flag to ForceLeave	2019-10-04 16:10:02 -05:00
Matt Keeler	b0b57588d1	Implement Leader Routine Management (#6580 ) * Implement leader routine manager Switch over the following to use it for go routine management: • Config entry Replication • ACL replication - tokens, policies, roles and legacy tokens • ACL legacy token upgrade • ACL token reaping • Intention Replication • Secondary CA Roots Watching • CA Root Pruning Also added the StopAll call into the Server Shutdown method to ensure all leader routines get killed off when shutting down. This should be mostly unnecessary as `revokeLeadership` should manually stop each one but just in case we really want these to go away (eventually).	2019-10-04 13:08:45 -04:00
Matt Keeler	9bd378a95c	Add EnterpriseConfig stubs (#6566 )	2019-10-01 14:34:55 -04:00
R.B. Boyer	8433ef02a8	connect: connect CA Roots in secondary datacenters should use a SigningKeyID derived from their local intermediate (#6513 ) This fixes an issue where leaf certificates issued in secondary datacenters would be reissued very frequently (every ~20 seconds) because the logic meant to detect root rotation was errantly triggering because a hash of the ultimate root (in the primary) was being compared against a hash of the local intermediate root (in the secondary) and always failing.	2019-09-26 11:54:14 -05:00
Matt Keeler	5b83f589da	Expand the QueryOptions and QueryMeta interfaces (#6545 ) In a previous PR I made it so that we had interfaces that would work enough to allow blockingQueries to work. However to complete this we need all fields to be settable and gettable. Notes: • If Go ever gets contracts/generics then we could get rid of all the Getters/Setters • protoc / protoc-gen-gogo are going to generate all the getters for us. • I copied all the getters/setters from the protobuf funcs into agent/structs/protobuf_compat.go • Also added JSON marshaling funcs that use jsonpb for protobuf types.	2019-09-26 09:55:02 -04:00
Freddy	5eace88ce2	Expose HTTP-based paths through Connect proxy (#6446 ) Fixes: #5396 This PR adds a proxy configuration stanza called expose. These flags register listeners in Connect sidecar proxies to allow requests to specific HTTP paths from outside of the node. This allows services to protect themselves by only listening on the loopback interface, while still accepting traffic from non Connect-enabled services. Under expose there is a boolean checks flag that would automatically expose all registered HTTP and gRPC check paths. This stanza also accepts a paths list to expose individual paths. The primary use case for this functionality would be to expose paths for third parties like Prometheus or the kubelet. Listeners for requests to exposed paths are configured dynamically at run time. Any time a proxy or check can be registered, a listener can also be created. In this initial implementation requests to these paths are not authenticated/encrypted.	2019-09-25 20:55:52 -06:00
Matt Keeler	8885c8d318	Allow for enterprise only leader routines (#6533 ) Eventually I am thinking we may need a way to register these at different priority levels but for now sticking this here is fine	2019-09-23 20:09:56 -04:00
R.B. Boyer	cc889443a5	connect: don't colon-hex-encode the AuthorityKeyId and SubjectKeyId fields in connect certs (#6492 ) The fields in the certs are meant to hold the original binary representation of this data, not some ascii-encoded version. The only time we should be colon-hex-encoding fields is for display purposes or marshaling through non-TLS mediums (like RPC).	2019-09-23 12:52:35 -05:00
Matt Keeler	8431c5f533	Add support for implementing new requests with protobufs instea… (#6502 ) * Add build system support for protobuf generation This is done generically so that we don’t have to keep updating the makefile to add another proto generation. Note: anything not in the vendor directory and with a .proto extension will be run through protoc if the corresponding namespace.pb.go file is not up to date. If you want to rebuild just a single proto file you can do so with: make proto-rebuild PROTOFILES=<list of proto files to rebuild> Providing the PROTOFILES var will override the default behavior of finding all the .proto files. * Start adding types to the agent/proto package These will be needed for some other work and are by no means comprehensive. * Add ability to resolve/fixup the agentpb.ACLLinks structure in the state store. * Use protobuf marshalling of raft requests instead of msgpack for protoc generated types. This does not change any encoding of existing types. * Removed structs package automatically encoding with protobuf marshalling Instead the caller of raftApply that wants to opt-in to protobuf encoding will have to call `raftApplyProtobuf` * Run update-vendor to fixup modules.txt Nothing changed as far as dependencies go but the ordering of modules in that file depends on the time they are first seen and it's not alphabetical. * Rename some things and implement the structs.RPCInfo interface bits agentpb.QueryOptions and agentpb.WriteRequest implement 3 of the 4 RPCInfo funcs and the new TargetDatacenter message type implements the fourth. * Use the right encoding function. * Renamed agent/proto package to agent/agentpb to prevent package name conflicts * Update modules.txt to fix ordering * Change blockingQuery to take in interfaces for the query options and meta * Add %T to error output. * Add/Update some comments	2019-09-20 14:37:22 -04:00
R.B. Boyer	5c5f21088c	sdk: add freelist tracking and ephemeral port range skipping to freeport This should cut down on test flakiness. Problems handled: - If you had enough parallel test cases running, the former circular approach to handling the port block could hand out the same port to multiple cases before they each had a chance to bind them, causing one of the two tests to fail. - The freeport library would allocate out of the ephemeral port range. This has been corrected for Linux (which should cover CI). - The library now waits until a formerly-in-use port is verified to be free before putting it back into circulation.	2019-09-17 14:30:43 -05:00
R.B. Boyer	edf5347d3c	fix typo of 'unknown' in log messages	2019-09-13 15:59:49 -05:00
Hans Hasselberg	f025a7440d	agent: handleEnterpriseLeave (#6453 )	2019-09-11 11:01:37 +02:00
Pierre Souchay	6d13efa828	Distinguish between DC not existing and not being available (#6399 )	2019-09-03 09:46:24 -06:00
Matt Keeler	31d9d2e557	Store primaries root in secondary after intermediate signature (#6333 ) * Store primaries root in secondary after intermediate signature This ensures that the intermediate exists within the CA root stored in raft and not just in the CA provider state. This has the very nice benefit of actually outputting the intermediate cert within the ca roots HTTP/RPC endpoints. This change means that if signing the intermediate fails it will not set the root within raft. So far I have not come up with a reason why that is bad. The secondary CA roots watch will pull the root again and go through all the motions. So as soon as getting an intermediate CA works the root will get set. * Make TestAgentAntiEntropy_Check_DeferSync less flaky I am not sure this is the full fix but it seems to help for me.	2019-08-30 11:38:46 -04:00
Pierre Souchay	35d90fc899	Display IPs of machines when node names conflict to ease troubleshooting When there is a node name conflict, messages such as this are displayed within Consul: `consul.fsm: EnsureRegistration failed: failed inserting node: Error while renaming Node ID: "e1d456bc-f72d-98e5-ebb3-26ae80d785cf": Node name node001 is reserved by node 05f10209-1b9c-b90c-e3e2-059e64556d4a with name node001` While it is easy to find the node that has reserved the name, it is hard to find the node trying to acquire the name: since it is not registered, it is not part of `consul members` output. This PR displays the IP of the offender, which makes those issues far easier to solve.	2019-08-28 15:57:05 -04:00
Alvin Huang	e4e9381851	revert commits on master (#6413 )	2019-08-27 17:45:58 -04:00
tradel	2838a1550a	update tests to match new method signatures	2019-08-27 14:16:39 -07:00
tradel	93c839b76c	configure providers with DC and domain	2019-08-27 14:16:25 -07:00
tradel	1acde6e30a	create a common name for autoTLS agent certs	2019-08-27 14:15:53 -07:00
Alvin Huang	9662b7c01a	add nil pointer check for pointer to ACLToken struct (#6407 )	2019-08-27 11:23:28 -04:00
Hans Hasselberg	4f7a3e8fa8	make sure auto_encrypt has private key type and bits	2019-08-26 13:09:50 +02:00
R.B. Boyer	2d4a3b51d0	Merge pull request #6388 from hashicorp/release/1-6 merging release/1-6 into master	2019-08-23 13:44:46 -05:00
Matt Keeler	89ac998e8b	Secondary CA `establishLeadership` fix (#6383 ) This prevents ACL issues (or other issues) during intermediate CA cert signing from failing leader establishment.	2019-08-23 11:32:37 -04:00
Hans Hasselberg	aada537d87	auto_encrypt: use server-port (#6287 ) AutoEncrypt needs the server-port because it wants to talk via RPC. Information from gossip might not be available at that point and that's why the server-port is being used.	2019-08-23 10:18:46 +02:00
Matt Keeler	8cb0560f52	Ensure that config entry writes are forwarded to the primary DC (#6339 )	2019-08-20 12:01:13 -04:00
R.B. Boyer	0675e0606e	connect: generate the full SNI names for discovery targets in the compiler rather than in the xds package (#6340 )	2019-08-19 13:03:03 -05:00
R.B. Boyer	d6456fddeb	connect: introduce ExternalSNI field on service-defaults (#6324 ) Compiling this will set an optional SNI field on each DiscoveryTarget. When set this value should be used for TLS connections to the instances of the target. If not set the default should be used. Setting ExternalSNI will disable mesh gateway use for that target. It also disables several service-resolver features that do not make sense for an external service.	2019-08-19 12:19:44 -05:00
R.B. Boyer	f84f509ce4	connect: updating a service-defaults config entry should leave an unset protocol alone (#6342 ) If the entry is updated for reasons other than protocol it is surprising that the value is explicitly persisted as 'tcp' rather than leaving it empty and letting it fall back dynamically on the proxy-defaults value.	2019-08-19 10:44:06 -05:00
Matt Keeler	73888eed36	Filter out left/leaving serf members when determining if new AC… (#6332 )	2019-08-16 10:34:18 -04:00
R.B. Boyer	22ee60d1ba	agent: blocking central config RPCs iterations should not interfere with each other (#6316 )	2019-08-14 09:08:46 -05:00
hashicorp-ci	29767157ed	Merge Consul OSS branch 'master' at commit 8f7586b339dbb518eff3a2eec27d7b8eae7a3fbb	2019-08-13 02:00:43 +00:00
Sarah Adams	2f7a90bc52	add flag to allow /operator/keyring requests to only hit local servers (#6279 ) Add parameter local-only to operator keyring list requests to force queries to only hit local servers (no WAN traffic). HTTP API: GET /operator/keyring?local-only=true CLI: consul keyring -list --local-only Sending the local-only flag with any non-GET/list request will result in an error.	2019-08-12 11:11:11 -07:00
Mike Morris	88df658243	connect: remove managed proxies (#6220 ) * connect: remove managed proxies implementation and all supporting config options and structs * connect: remove deprecated ProxyDestination * command: remove CONNECT_PROXY_TOKEN env var * agent: remove entire proxyprocess proxy manager * test: remove all managed proxy tests * test: remove irrelevant managed proxy note from TestService_ServerTLSConfig * test: update ContentHash to reflect managed proxy removal * test: remove deprecated ProxyDestination test * telemetry: remove managed proxy note * http: remove /v1/agent/connect/proxy endpoint * ci: remove deprecated test exclusion * website: update managed proxies deprecation page to note removal * website: remove managed proxy configuration API docs * website: remove managed proxy note from built-in proxy config * website: add note on removing proxy subdirectory of data_dir	2019-08-09 15:19:30 -04:00
R.B. Boyer	357ca39868	connect: ensure intention replication continues to work when the replication ACL token changes (#6288 )	2019-08-07 11:34:09 -05:00
hashicorp-ci	3ac803da5e	Merge Consul OSS branch 'master' at commit d84863799deca45ccf4bec5ab9f645ccae6b3aeb	2019-08-06 02:00:30 +00:00
Sarah Adams	9ed3e64510	fallback to proxy config global protocol when upstream services' protocol is unset (#6277 ) fallback to proxy config global protocol when upstream services' protocol is unset Fixes #5857	2019-08-05 12:52:35 -07:00
R.B. Boyer	64fc002e03	connect: fix failover through a mesh gateway to a remote datacenter (#6259 ) Failover is pushed entirely down to the data plane by creating envoy clusters and putting each successive destination in a different load assignment priority band. For example this shows that normally requests go to 1.2.3.4:8080 but when that fails they go to 6.7.8.9:8080: - name: foo load_assignment: cluster_name: foo policy: overprovisioning_factor: 100000 endpoints: - priority: 0 lb_endpoints: - endpoint: address: socket_address: address: 1.2.3.4 port_value: 8080 - priority: 1 lb_endpoints: - endpoint: address: socket_address: address: 6.7.8.9 port_value: 8080 Mesh gateways route requests based solely on the SNI header tacked onto the TLS layer. Envoy currently only lets you configure the outbound SNI header at the cluster layer. If you try to failover through a mesh gateway you ideally would configure the SNI value per endpoint, but that's not possible in envoy today. This PR introduces a simpler way around the problem for now: 1. We identify any target of failover that will use mesh gateway mode local or remote and then further isolate any resolver node in the compiled discovery chain that has a failover destination set to one of those targets. 2. For each of these resolvers we will perform a small measurement of comparative healths of the endpoints that come back from the health API for the set of primary target and serial failover targets. We walk the list of targets in order and if any endpoint is healthy we return that target, otherwise we move on to the next target. 3. The CDS and EDS endpoints both perform the measurements in (2) for the affected resolver nodes. 4. For CDS this measurement selects which TLS SNI field to use for the cluster (note the cluster is always going to be named for the primary target) 5. For EDS this measurement selects which set of endpoints will populate the cluster. Priority tiered failover is ignored. One of the big downsides to this approach to failover is that the failover detection and correction is going to be controlled by consul rather than deferring that entirely to the data plane as with the prior version. This also means that we are bound to only failover using official health signals and cannot make use of data plane signals like outlier detection to affect failover. In this specific scenario the lack of data plane signals is ok because the effectiveness is already muted by the fact that the ultimate destination endpoints will have their data plane signals scrambled when they pass through the mesh gateway wrapper anyway so we're not losing much. Another related fix is that we now use the endpoint health from the underlying service, not the health of the gateway (regardless of failover mode).	2019-08-05 13:30:35 -05:00
R.B. Boyer	0165e93517	connect: expose an API endpoint to compile the discovery chain (#6248 ) In addition to exposing compilation over the API cleaned up the structures that would be exchanged to be cleaner and easier to support and understand. Also removed ability to configure the envoy OverprovisioningFactor.	2019-08-02 15:34:54 -05:00
Todd Radel	295abd82c3	connect: generate intermediate at same time as root (#6272 ) Generate intermediate at same time as root Co-Authored-By: Freddy <freddygv@users.noreply.github.com>	2019-08-02 15:36:03 -04:00
R.B. Boyer	4e2fb5730c	connect: detect and prevent circular discovery chain references (#6246 )	2019-08-02 09:18:45 -05:00
R.B. Boyer	6c9edb17c2	server: if inserting bootstrap config entries fails don't silence the errors (#6256 )	2019-08-01 23:07:11 -05:00
R.B. Boyer	782c647bf4	connect: simplify the compiled discovery chain data structures (#6242 ) This should make them better for sending over RPC or the API. Instead of a chain implemented explicitly like a linked list (nodes holding pointers to other nodes), switch to a flat map of named nodes with nodes linking to other nodes by name. The shipped structure is just a map and a string to indicate which key to start from. Other changes: * inline the compiler option InferDefaults as true * introduce compiled target config to avoid needing to send back additional maps of Resolvers; future target-specific compiled state can go here * move compiled MeshGateway out of the Resolver and into the TargetConfig where it makes more sense.	2019-08-01 22:44:05 -05:00
R.B. Boyer	4666599e18	connect: reconcile how upstream configuration works with discovery chains (#6225 ) * connect: reconcile how upstream configuration works with discovery chains The following upstream config fields for connect sidecars sanely integrate into discovery chain resolution: - Destination Namespace/Datacenter: Compilation occurs locally but using different default values for namespaces and datacenters. The xDS clusters that are created are named as they normally would be. - Mesh Gateway Mode (single upstream): If set this value overrides any value computed for any resolver for the entire discovery chain. The xDS clusters that are created may be named differently (see below). - Mesh Gateway Mode (whole sidecar): If set this value overrides any value computed for any resolver for the entire discovery chain. If this is specifically overridden for a single upstream this value is ignored in that case. The xDS clusters that are created may be named differently (see below). - Protocol (in opaque config): If set this value overrides the value computed when evaluating the entire discovery chain. If the normal chain would be TCP or if this override is set to TCP then the result is that we explicitly disable L7 Routing and Splitting. The xDS clusters that are created may be named differently (see below). - Connect Timeout (in opaque config): If set this value overrides the value for any resolver in the entire discovery chain. The xDS clusters that are created may be named differently (see below). If any of the above overrides affect the actual result of compiling the discovery chain (i.e. "tcp" becomes "grpc" instead of being a no-op override to "tcp") then the relevant parameters are hashed and provided to the xDS layer as a prefix for use in naming the Clusters. This is to ensure that if one Upstream discovery chain has no overrides and tangentially needs a cluster named "api.default.XXX", and another Upstream does have overrides for "api.default.XXX" that they won't cross-pollinate against the operator's wishes. Fixes #6159	2019-08-01 22:03:34 -05:00
Paul Banks	a5c70d79d0	Revert "connect: support AWS PCA as a CA provider" (#6251 ) This reverts commit 3497b7c00d49c4acbbf951d84f2bba93f3da7510.	2019-07-31 09:08:10 -04:00
Todd Radel	d3b7fd83fe	connect: support AWS PCA as a CA provider (#6189 ) Port AWS PCA provider from consul-ent	2019-07-30 22:57:51 -04:00
Todd Radel	1b14d6595e	connect: Support RSA keys in addition to ECDSA (#6055 ) Support RSA keys in addition to ECDSA	2019-07-30 17:47:39 -04:00
Matt Keeler	a7c4b7af7c	Fix CA Replication when ACLs are enabled (#6201 ) Secondary CA initialization steps are: • Wait until the primary will be capable of signing intermediate certs. We use serf metadata to check the versions of servers in the primary which avoids needing a token like the previous implementation that used RPCs. We require at least one alive server in the primary and all alive servers to meet the version requirement. • Initialize the secondary CA by getting the primary to sign an intermediate When a primary dc is configured, if no existing CA is initialized and for whatever reason we cannot initialize a secondary CA the secondary DC will remain without a CA. As soon as it can it will initialize the secondary CA by pulling the primary's roots and getting the primary to sign an intermediate. This also fixes a segfault that can happen during leadership revocation. There was a spot in the secondaryCARootsWatch that was getting the CA Provider and executing methods on it without nil checking. Under normal circumstances it won't be nil but during leadership revocation it gets nil'ed out. Therefore there is a period of time between closing the stop chan and when the go routine is actually stopped where it could read a nil provider and cause a segfault.	2019-07-26 15:57:57 -04:00
R.B. Boyer	1b95d2e5e3	Merge Consul OSS branch master at commit b3541c4f34d43ab92fe52256420759f17ea0ed73	2019-07-26 10:34:24 -05:00
Matt Keeler	c4a34602b6	Allow forwarding of some status RPCs (#6198 ) * Allow forwarding of some status RPCs * Update docs * add comments about not using the regular forward	2019-07-25 14:26:22 -04:00

642 Commits