open-consul

Commit Graph

Author	SHA1	Message	Date
R.B. Boyer	a7fb26f50f	wan federation via mesh gateways (#6884) This is like a Möbius strip of code due to the fact that low-level components (serf/memberlist) are connected to high-level components (the catalog and mesh-gateways) in a twisty maze of references which make it hard to dive into. With that in mind here's a high level summary of what you'll find in the patch: There are several distinct chunks of code that are affected: * new flags and config options for the server * retry join WAN is slightly different * retry join code is shared to discover primary mesh gateways from secondary datacenters * because retry join logic runs in the agent and the results of that operation for primary mesh gateways are needed in the server there are some methods like `RefreshPrimaryGatewayFallbackAddresses` that must occur at multiple layers of abstraction just to pass the data down to the right layer. * new cache type `FederationStateListMeshGatewaysName` for use in `proxycfg/xds` layers * the function signature for RPC dialing picked up a new required field (the node name of the destination) * several new RPCs for manipulating a FederationState object: `FederationState:{Apply,Get,List,ListMeshGateways}` * 3 read-only internal APIs for debugging use to invoke those RPCs from curl * raft and fsm changes to persist these FederationStates * replication for FederationStates as they are canonically stored in the Primary and replicated to the Secondaries. * a special derivative of anti-entropy that runs in secondaries to snapshot their local mesh gateway `CheckServiceNodes` and sync them into their upstream FederationState in the primary (this works in conjunction with the replication to distribute addresses for all mesh gateways in all DCs to all other DCs) * a "gateway locator" convenience object to make use of this data to choose the addresses of gateways to use for any given RPC or gossip operation to a remote DC. This gets data from the "retry join" logic in the agent and also directly calls into the FSM. * RPC (`:8300`) on the server sniffs the first byte of a new connection to determine if it's actually doing native TLS. If so it checks the ALPN header for protocol determination (just like how the existing system uses the type-byte marker). * 2 new kinds of protocols are exclusively decoded via this native TLS mechanism: one for ferrying "packet" operations (udp-like) from the gossip layer and one for "stream" operations (tcp-like). The packet operations re-use sockets (using length-prefixing) to cut down on TLS re-negotiation overhead. * the server instances specially wrap the `memberlist.NetTransport` when running with gateway federation enabled (in a `wanfed.Transport`). The general gist is that if it tries to dial a node in the SAME datacenter (deduced by looking at the suffix of the node name) there is no change. If dialing a DIFFERENT datacenter it is wrapped up in a TLS+ALPN blob and sent through some mesh gateways to eventually end up in a server's :8300 port. * a new flag when launching a mesh gateway via `consul connect envoy` to indicate that the servers are to be exposed. This sets a special service meta when registering the gateway into the catalog. * `proxycfg/xds` notice this metadata blob to activate additional watches for the FederationState objects as well as the location of all of the consul servers in that datacenter. * `xds:` if the extra metadata is in place additional clusters are defined in a DC to bulk sink all traffic to another DC's gateways. For the current datacenter we listen on a wildcard name (`server.<dc>.consul`) that load balances all servers as well as one mini-cluster per node (`<node>.server.<dc>.consul`) * the `consul tls cert create` command got a new flag (`-node`) to help create an additional SAN in certs that can be used with this flavor of federation.	2020-03-09 15:59:02 -05:00
Pierre Souchay	49dc891737	agent: configuration reload preserves check's statuses for services (#7345) This fixes issue #7318. Between versions 1.5.2 and 1.5.3, a regression was introduced regarding the health of services. A patch (#6144) had been issued for health checks of nodes, but not for health checks of services. What happened on a reload was: 1. save all healthcheck statuses 2. cleanup everything 3. add new services with healthchecks In step 3, the state of healthchecks was taken into account locally, but since we cleaned up at step 2, the state was lost. This PR introduces the snap parameter, so step 3 can use information from step 1.	2020-03-09 12:59:41 +01:00
Hans Hasselberg	2bba591906	agent: sensible keyring error (#7272) Fixes #7231. Before, an agent would always emit a warning when there was an encrypt key in the configuration and an existing keyring stored, which happens on restart. Now it only emits that warning when the encrypt key from the configuration is not part of the keyring.	2020-02-13 20:35:09 +01:00
Akshay Ganeshen	fd32016ce9	feat: support sending body in HTTP checks (#6602)	2020-02-10 09:27:12 -07:00
Freddy	67e02a0752	Add managed service provider token (#7218) Stubs for enterprise-only ACL token to be used by managed service providers.	2020-02-04 13:58:56 -07:00
Hans Hasselberg	50281032e0	Security fixes (#7182) * Mitigate HTTP/RPC Services Allow Unbounded Resource Usage Fixes #7159. Co-authored-by: Matt Keeler <mkeeler@users.noreply.github.com> Co-authored-by: Paul Banks <banks@banksco.de>	2020-01-31 11:19:37 -05:00
R.B. Boyer	01ebdff2a9	various tweaks on top of the hclog work (#7165)	2020-01-29 11:16:08 -06:00
Chris Piraino	3dd0b59793	Allow users to configure either unstructured or JSON logging (#7130) * hclog Allow users to choose between unstructured and JSON logging	2020-01-28 17:50:41 -06:00
Kit Patella	49e9bbbdf9	Add accessorID of token when ops are denied by ACL system (#7117) * agent: add and edit doc comments * agent: add ACL token accessorID to debugging traces * agent: polish acl debugging * agent: minor fix + string fmt over value interp * agent: undo export & fix logging field names * agent: remove note and migrate up to code review * Update agent/consul/acl.go Co-Authored-By: Matt Keeler <mkeeler@users.noreply.github.com> * agent: incorporate review feedback * Update agent/acl.go Co-Authored-By: R.B. Boyer <public@richardboyer.net> Co-authored-by: Matt Keeler <mkeeler@users.noreply.github.com> Co-authored-by: R.B. Boyer <public@richardboyer.net>	2020-01-27 11:54:32 -08:00
Matt Keeler	485a0a65ea	Updates to Config Entries and Connect for Namespaces (#7116)	2020-01-24 10:04:58 -05:00
Hans Hasselberg	e00effa325	agent: setup grpc server with auto_encrypt certs and add -https-port (#7086) * setup grpc server with TLS config used across consul. * add -https-port flag	2020-01-22 11:32:17 +01:00
Aestek	8c799447cf	agent: remove service sidecars in Agent.cleanupRegistration (#7022) Sidecar proxies were left behind when cleaning up after an unsuccessful registration. They are now also removed when the service is cleaned up.	2020-01-20 14:01:40 +01:00
Hans Hasselberg	b6c83e06d5	auto_encrypt: set dns and ip san for k8s and provide configuration (#6944) * Add CreateCSRWithSAN * Use CreateCSRWithSAN in auto_encrypt and cache * Copy DNSNames and IPAddresses to cert * Verify auto_encrypt.sign returns cert with SAN * provide configuration options for auto_encrypt dnssan and ipsan * rename CreateCSRWithSAN to CreateCSR	2020-01-17 23:25:26 +01:00
Aestek	9329cbac0a	Add support for dual stack IPv4/IPv6 network (#6640) * Use consts for well known tagged address keys * Add ipv4 and ipv6 tagged addresses for node lan and wan * Add ipv4 and ipv6 tagged addresses for service lan and wan * Use IPv4 and IPv6 address in DNS	2020-01-17 09:54:17 -05:00
Matej Urbas	d877e091d6	agent: configurable MaxQueryTime and DefaultQueryTime. (#3777)	2020-01-17 14:20:57 +01:00
Matt Keeler	6de4eb8569	OSS changes for implementing token based namespace inferencing remove debug log	2019-12-18 14:07:08 -05:00
Matt Keeler	442924c35a	Sync of OSS changes to support namespaces (#6909)	2019-12-09 21:26:41 -05:00
Hans Hasselberg	a36e58c964	agent: fewer file local differences between enterprise and oss (#6820) (#6898) * Increase number to test ignore. Consul Enterprise has more flags and since we are trying to reduce the differences between both code bases, we are increasing the number in oss. The semantics don't change, it is just a cosmetic thing. * Introduce agent.initEnterprise for enterprise related hooks. * Sync test with ent version. * Fix import order. * revert error wording.	2019-12-06 21:35:58 +01:00
Sarah Adams	1f5b333290	give feedback to CLI user on forceleave command if node does not exist (#6841)	2019-12-02 11:06:15 -08:00
Matt Keeler	036ab56f17	Track the correct check id for idempotent service/check updates	2019-11-14 11:30:44 -05:00
Sarah Christoff	86b30bbfbe	Set MinQuorum variable in Autopilot (#6654) * Add MinQuorum to Autopilot	2019-10-29 09:04:41 -05:00
Freddy	caf658d0d3	Store check type in catalog (#6561)	2019-10-17 20:33:11 +02:00
PHBourquin	16ca8340c1	Checks to passing/critical only after reaching a consecutive success/failure threshold (#5739) A check may be set to become passing/critical only if a specified number of consecutive checks return passing/critical. The status stays unchanged until the threshold is reached. This feature is available for HTTP, TCP, gRPC, Docker & Monitor checks.	2019-10-14 21:49:49 +01:00
Sarah Christoff	9b93dd93c9	Prune Unhealthy Agents (#6571) * Add -prune flag to ForceLeave	2019-10-04 16:10:02 -05:00
Freddy	5eace88ce2	Expose HTTP-based paths through Connect proxy (#6446) Fixes: #5396 This PR adds a proxy configuration stanza called expose. These flags register listeners in Connect sidecar proxies to allow requests to specific HTTP paths from outside of the node. This allows services to protect themselves by only listening on the loopback interface, while still accepting traffic from non Connect-enabled services. Under expose there is a boolean checks flag that would automatically expose all registered HTTP and gRPC check paths. This stanza also accepts a paths list to expose individual paths. The primary use case for this functionality would be to expose paths for third parties like Prometheus or the kubelet. Listeners for requests to exposed paths are configured dynamically at run time. Any time a proxy or check can be registered, a listener can also be created. In this initial implementation requests to these paths are not authenticated/encrypted.	2019-09-25 20:55:52 -06:00
R.B. Boyer	682b5370c9	agent: tolerate more failure scenarios during service registration with central config enabled (#6472) Also: * Finished threading replaceExistingChecks setting (from GH-4905) through service manager. * Respected the original configSource value that was used to register a service or a check when restoring persisted data. * Run several existing tests with and without central config enabled (not exhaustive yet). * Switch to ioutil.ReadFile for all types of agent persistence.	2019-09-24 10:04:48 -05:00
Aestek	19c4459d19	Add option to register services and their checks idempotently (#4905)	2019-09-02 09:38:29 -06:00
Alvin Huang	e4e9381851	revert commits on master (#6413)	2019-08-27 17:45:58 -04:00
tradel	b0bbcd8b94	add domain and nodeName to agent cert request	2019-08-27 14:11:40 -07:00
R.B. Boyer	0c5409d172	test: fix TestAgent.Start() to not segfault if the DNSServer cannot ListenAndServe (#6409) The embedded `Server` field on a `DNSServer` is only set inside of the `ListenAndServe` method. If that method fails for reasons like the address being in use and is not bindable, then the `Server` field will not be set and the overall `Agent.Start()` will fail. This will trigger the inner loop of `TestAgent.Start()` to invoke `ShutdownEndpoints` which will attempt to pretty print the DNS servers using fields on that inner `Server` field. Because it was never set, this causes a nil pointer dereference and crashes the test.	2019-08-27 10:45:05 -05:00
Mike Morris	88df658243	connect: remove managed proxies (#6220) * connect: remove managed proxies implementation and all supporting config options and structs * connect: remove deprecated ProxyDestination * command: remove CONNECT_PROXY_TOKEN env var * agent: remove entire proxyprocess proxy manager * test: remove all managed proxy tests * test: remove irrelevant managed proxy note from TestService_ServerTLSConfig * test: update ContentHash to reflect managed proxy removal * test: remove deprecated ProxyDestination test * telemetry: remove managed proxy note * http: remove /v1/agent/connect/proxy endpoint * ci: remove deprecated test exclusion * website: update managed proxies deprecation page to note removal * website: remove managed proxy configuration API docs * website: remove managed proxy note from built-in proxy config * website: add note on removing proxy subdirectory of data_dir	2019-08-09 15:19:30 -04:00
Alvin Huang	5b6fa58453	resolve circleci config conflicts	2019-07-23 20:18:36 -04:00
Paul Banks	42296292a4	Allow raft TrailingLogs to be configured. (#6186) This fixes pathological cases where the write throughput and snapshot size are both so large that more than 10k log entries are written in the time it takes to restore the snapshot from disk. In this case followers that restart can never catch up with leader replication again and enter a loop of constantly downloading a full snapshot and restoring it only to find that snapshot is already out of date and the leader has truncated its logs so a new snapshot is sent etc. In general if you need to adjust this, you are probably abusing Consul for purposes outside its design envelope and should reconsider your usage to reduce data size and/or write volume.	2019-07-23 15:19:57 +01:00
Alvin Huang	17654c6292	Merge branch 'master' into release/1-6	2019-07-17 15:43:30 -04:00
R.B. Boyer	4c05f1f519	agent: avoid reverting any check updates that occur while a service is being added or the config is reloaded (#6144)	2019-07-17 14:06:50 -05:00
Matt Keeler	fc27eb973a	Implement caching for config entry lists Update agent/cache-types/config_entry.go Co-Authored-By: R.B. Boyer <public@richardboyer.net>	2019-07-02 10:11:19 -04:00
R.B. Boyer	bccbb2b4ae	activate most discovery chain features in xDS for envoy (#6024)	2019-07-01 22:10:51 -05:00
Matt Keeler	24749bc7e5	Implement Kind based ServiceDump and caching of the ServiceDump RPC	2019-07-01 16:28:30 -04:00
hashicorp-ci	e36792395e	Merge Consul OSS branch 'master' at commit e91f73f59249f5756896b10890e9298e7c1fbacc	2019-06-30 02:00:31 +00:00
Hans Hasselberg	73c4e9f07c	tls: auto_encrypt enables automatic RPC cert provisioning for consul clients (#5597)	2019-06-27 22:22:07 +02:00
hashicorp-ci	3224bea082	Merge Consul OSS branch 'master' at commit 4eb73973b6e53336fd505dc727ac84c1f7e78872	2019-06-27 02:00:41 +00:00
Pierre Souchay	e394a9469b	Support for maximum size for Output of checks (#5233) * Support for maximum size for Output of checks This PR allows users to limit the size of output produced by checks at the agent and check level. When set at the agent level, it will limit the output for all checks monitored by the agent. When set at the check level, it can override the agent max for a specific check but only if it is lower than the agent max. Default value is 4k, and input must be at least 1.	2019-06-26 09:43:25 -06:00
hashicorp-ci	d237e86d83	Merge Consul OSS branch 'master' at commit 88b15d84f9fdb58ceed3dc971eb0390be85e3c15 skip-checks: true	2019-06-25 02:00:26 +00:00
Matt Keeler	f0f28707bc	New Cache Types (#5995) * Add a cache type for the Catalog.ListServices endpoint * Add a cache type for the Catalog.ListDatacenters endpoint	2019-06-24 14:11:34 -04:00
Hans Hasselberg	0d8d7ae052	agent: transfer leadership when establishLeadership fails (#5247)	2019-06-19 14:50:48 +02:00
Pierre Souchay	1da1825056	Ensure Consul is IPv6 compliant (#5468)	2019-06-04 10:02:38 -04:00
Pierre Souchay	27207fdaed	agent: Improve startup message to avoid confusing users when no error occurs (#5896) * Improve startup message to avoid confusing users when no error occurs Users who are not very familiar with Consul are often confused by this message at startup: `[INFO] agent: (LAN) joined: 1 Err: <nil>` Having `Err: <nil>` seems weird to many users, so I propose the following instead: * Success: `[INFO] agent: (LAN) joined: 1` * Error: `[WARN] agent: (LAN) couldn't join: %d Err: ERROR`	2019-05-24 16:50:18 +02:00
R.B. Boyer	9542fdc9bc	acl: adding Roles to Tokens (#5514) Roles are named and can express the same bundle of permissions that can currently be assigned to a Token (lists of Policies and Service Identities). The difference with a Role is that it is not itself a bearer token, but just another entity that can be tied to a Token. This lets an operator potentially curate a set of smaller reusable Policies and compose them together into reusable Roles, rather than always exploding that same list of Policies on any Token that needs similar permissions. This also refactors the acl replication code to be semi-generic to avoid 3x copypasta.	2019-04-26 14:49:12 -05:00
Matt Keeler	3ea9fe3bff	Implement bootstrapping proxy defaults from the config file (#5714)	2019-04-26 14:25:03 -04:00
Matt Keeler	2831c8993d	Move the watch package into the api module (#5664) * Move the watch package into the api module It was already just a thin wrapper around the API anyways. The biggest change was to the testing. Instead of using a test agent directly from the agent package it now uses the binary on the PATH just like the other API tests. The other big changes were to fix up the connect based watch tests so that we didn't need to pull in the connect package (and therefore all of Consul).	2019-04-26 12:33:01 -04:00


282 Commits