Commit Graph

1021 Commits

Author SHA1 Message Date
Florian Apolloner cb5389cc89
Allow Operator Generated bootstrap token (#14437)
Add support to provide an initial token via the bootstrap HTTP API, similar to hashicorp/nomad#12520
2023-01-04 20:19:33 +00:00
Dan Upton 4719b717ea
grpc/acl: relax permissions required for "core" endpoints (#15346)
Previously, these endpoints required `service:write` permission on _any_
service as a sort of proxy for "is the caller allowed to participate in
the mesh?".

Now, they're called as part of the process of establishing a server
connection by any consumer of the consul-server-connection-manager
library, which will include non-mesh workloads (e.g. Consul KV as a
storage backend for Vault) as well as ancillary components such as
consul-k8s' acl-init process, which likely won't have `service:write`
permission.

So this commit relaxes those requirements to accept *any* valid ACL token
on the following gRPC endpoints:

- `hashicorp.consul.dataplane.DataplaneService/GetSupportedDataplaneFeatures`
- `hashicorp.consul.serverdiscovery.ServerDiscoveryService/WatchServers`
- `hashicorp.consul.connectca.ConnectCAService/WatchRoots`
2023-01-04 12:40:34 +00:00
Derek Menteer 2af14c0084
Fix issue with incorrect proxycfg watch on upstream peer-targets. (#15865)
This fixes an issue where the incorrect partition was given to the
upstream target watch, which meant that failover logic would not
work correctly.
2023-01-03 10:44:08 -06:00
Derek Menteer 76cc876ac3
Fix agent cache incorrectly notifying unchanged protobufs. (#15866)
Fix agent cache incorrectly notifying unchanged protobufs.

This change fixes a situation where the protobuf private fields
would be read by reflect.DeepEqual() and indicate data was modified.
This resulted in change notifications being fired every time, which
could cause performance problems in proxycfg.
2023-01-03 10:11:56 -06:00
Dan Stough 38d65efb72
[OSS] feat: access logs for listeners and listener filters (#15864)
* feat: access logs for listeners and listener filters

* changelog

* fix integration test
2022-12-22 15:18:15 -05:00
Michael Wilkerson ebed9e048f
Enhancement: Consul Compatibility Checking (#15818)
* add functions for returning the max and min Envoy major versions
- added an UnsupportedEnvoyVersions list
- removed an unused error from TestDetermineSupportedProxyFeaturesFromString
- modified minSupportedVersion to use the function for getting the Min Envoy major version. Using just the major version without the patch is equivalent to using `.0`

* added a function for executing the envoy --version command
- added a new exec.go file to not be locked to unix system

* added envoy version check when using consul connect envoy

* added changelog entry

* added docs change
2022-12-20 09:58:19 -08:00
Derek Menteer e25f7313e4
Fix incorrect protocol check on discovery chains with peer targets. (#15833) 2022-12-20 10:15:03 -06:00
Nitya Dhanushkodi 8386bf19bf
extensions: refactor serverless plugin to use extensions from config entry fields (#15817)
docs: update config entry docs and the Lambda manual registration docs

Co-authored-by: Nitya Dhanushkodi <nitya@hashicorp.com>
Co-authored-by: Eric <eric@haberkorn.co>
2022-12-19 12:19:37 -08:00
Chris S. Kim 87485d05bd
Error out `consul connect envoy` if agent explicitly disabled grpc (#15794)
Co-authored-by: Paul Glass <pglass@hashicorp.com>
2022-12-19 14:37:27 -05:00
Chris S. Kim f8868c7ccf
Add custom balancer to always remove subConns (#15701)
The new balancer is a patched version of gRPC's default pick_first balancer
which removes the behavior of preserving the active subconnection if
a list of new addresses contains the currently active address.
2022-12-19 17:39:31 +00:00
Dan Stough 480a7d2ff6
docs: update changelog from 1.14.3, 1.13.5, 1.12.8 (#15804) 2022-12-14 18:47:35 -05:00
Pier-Luc Caron St-Pierre 97b859a690
connect: Add support for ConsulResolver to specifies a filter expression (#15659)
* connect: Add support for ConsulResolver to specifies a filter expression
2022-12-14 12:41:07 -08:00
Paul Glass 62df6a7513
Deprecate -join and -join-wan (#15598) 2022-12-14 20:28:25 +00:00
cskh 3e37a449c8
feat(ingress-gateway): support outlier detection of upstream service for ingress gateway (#15614)
* feat(ingress-gateway): support outlier detection of upstream service for ingress gateway

* changelog

Co-authored-by: Eric Haberkorn <erichaberkorn@gmail.com>
2022-12-13 11:51:37 -05:00
Derek Menteer 50a5549f8a
Fix DialedDirectly configuration for Consul dataplane. (#15760)
Fix DialedDirectly configuration for Consul dataplane.
2022-12-13 09:16:31 -06:00
Kyle Schochenmaier 83fb137d4f
add changelog for enterprise 3846 (#15773) 2022-12-12 15:08:02 -06:00
John Murret fe0432ade5
agent: Fix assignment of error when auto-reloading cert and key file changes. (#15769)
* Adding the setting of errors missing in config file watcher code in agent.

* add changelog
2022-12-12 12:24:39 -07:00
Dan Stough c01b9d5bfe
[OSS] security: update x/net module (#15737)
Co-authored-by: Dhia Ayachi <dhia@hashicorp.com>
2022-12-08 16:45:44 -05:00
Eric Haberkorn 5dd131fee8
Remove the `connect.enable_serverless_plugin` agent configuration option (#15710) 2022-12-08 14:46:42 -05:00
Dhia Ayachi addf4e2346
update go version to 1.19.4 (#15705)
* update go version to 1.19.4

* add changelog
2022-12-07 15:11:22 -05:00
Derek Menteer f17a4f07c5
Fix local mesh gateway with peering discovery chains. (#15690)
Fix local mesh gateway with peering discovery chains.

Prior to this patch, discovery chains with peers would not
properly honor the mesh gateway mode for two reasons.

1. An incorrect target upstream ID was used to lookup the
mesh gateway mode. To fix this, the parent upstream uid is
now used instead of the discovery-chain-target-uid to find
the intended mesh gateway mode.

2. The watch for local mesh gateways was never initialized
for discovery chains. To fix this, the discovery chains are
now scanned, and a local GW watch is spawned if: the mesh
gateway mode is local and the target is a peering connection.
2022-12-07 13:07:42 -06:00
R.B. Boyer ba6b24babf
connect: ensure all vault connect CA tests use limited privilege tokens (#15669)
All of the current integration tests where Vault is the Connect CA now use non-root tokens for the test. This helps us detect privilege changes in the vault model so we can keep our guides up to date.

One larger change was that the RenewIntermediate function got refactored slightly so it could be used from a test, rather than the large duplicated function we were testing in a test which seemed error prone.
2022-12-06 10:06:36 -06:00
R.B. Boyer a88d1239e3
Detect Vault 1.11+ import in secondary datacenters and update default issuer (#15661)
The fix outlined and merged in #15253 fixed the issue as it occurs in the primary
DC. There is a similar issue that arises when vault is used as the Connect CA in a
secondary datacenter that is fixed by this PR.

Additionally: this PR adds support to run the existing suite of vault related integration
tests against the last 4 versions of vault (1.9, 1.10, 1.11, 1.12)
2022-12-05 15:39:21 -06:00
Jared Kirschner 145289dc08
Clarify Vault CA changelog entry (#15662) 2022-12-02 20:16:49 -05:00
Dao Thanh Tung 6d519c7343
Fixing CLI ACL token processing unexpected precedence (#15274)
* Fixing CLI ACL token processing unexpected precedence

* Minor flow format and add Changelog

* Fixed failed tests and improve error logging message

* Add unit test cases and minor changes from code review

* Unset env var once the test case finishes running

* remove label FINISH
2022-12-02 12:19:52 -05:00
Michael Wilkerson 862e4410a8
added changelog for enterprise only change (#15621) 2022-11-30 11:39:20 -08:00
Tyler Wendlandt 4a7fe5625a
ui: Add ServerExternalAddresses to peer token create form (#15555)
* ui: Add ServerExternalAddresses field to token generation

* Add test for ServerExternalAddresses on peer token create

* Add changelog entry

* Update translations

* Format hbs files

* Update translations
2022-11-30 11:42:36 -07:00
R.B. Boyer a8411976a8
peering: better represent non-passing states during peer check flattening (#15615)
During peer stream replication we flatten checks from the source cluster and build one thin overall check to hide the irrelevant details from the consuming cluster. This flattening logic did correctly flip to non-passing if there were any non-passing checks, but WHICH status it got during that was random (warn/error).

Also it didn't represent "maintenance" operations. There is an api package call AggregatedStatus which more correctly flattened check statuses.

This PR replicated the more complete logic into the peer stream package.
2022-11-30 11:29:21 -06:00
Freddy 7641d10184
Remove log line about server mgmt token init (#15610)
* Remove log line about server mgmt token init

Currently the server management token is only being bootstrapped in the
primary datacenter. That means that servers on the secondary datacenter
will never have this token available, and would log this line any time a
token is resolved.

Bootstrapping the token in secondary datacenters will be done in a
follow-up.

* Add changelog entry
2022-11-29 17:56:03 -05:00
James Oulman 71f7f2e3dc
Add support for configuring Envoys route idle_timeout (#14340)
* Add idleTimeout

Co-authored-by: Jeff Boruszak <104028618+boruszak@users.noreply.github.com>
Co-authored-by: Dhia Ayachi <dhia@hashicorp.com>
2022-11-29 17:43:15 -05:00
Derek Menteer 79bef1982f
Add peering `.service` and `.node` DNS lookups. (#15596)
Add peering `.service` and `.node` DNS lookups.
2022-11-29 12:23:18 -06:00
cskh 92e71318c1
fix(peering): increase the gRPC limit to 8MB (#15503)
* fix(peering): increase the gRPC limit to 50MB

* changelog

* update gRPC limit to 8MB
2022-11-28 17:48:43 -05:00
Chris S. Kim efffcd56d0
Fix Vault managed intermediate PKI bug (#15525) 2022-11-28 16:17:58 -05:00
Chris S. Kim 450c8b9eff
[OSS] Add boilerplate for proto files implementing BlockableQuery (#15554) 2022-11-25 15:46:56 -05:00
Chris S. Kim d146a3d542
Use rpcHoldTimeout to calculate blocking timeout (#15541)
Adds buffer to clients so that servers have time to respond to blocking queries.
2022-11-24 10:13:02 -05:00
Chris Thain f9b85cbee6
Add changelog for snapshot agent updates (#15516) 2022-11-22 06:11:46 -08:00
Jared Kirschner b97acfb107
Support RFC 2782 for prepared query DNS lookups (#14465)
Format:
	_<query id or name>._tcp.query[.<datacenter>].<domain>
2022-11-20 17:21:24 -05:00
Derek Menteer 3d82afcb01
Fix issue with connect Envoy choosing incorrect TLS settings. (#15466)
This commit fixes a situation where the API TLS configuration
incorrectly influences the GRPC port TLS configuration for XDS.
2022-11-18 14:36:20 -06:00
Derek Menteer d71f071bec
Fix SDK to support older versions of Consul. (#15423)
This change was necessary, because the configuration was always
generated with a gRPC TLS port, which did not exist in Consul 1.13,
and would result in the server failing to launch with an error.

This code checks the version of Consul and conditionally adds the
gRPC TLS port, only if the version number is greater than 1.14.
2022-11-18 10:32:01 -06:00
Alexander Scheel 8ef3fe3812
Detect Vault 1.11+ import, update default issuer (#15253)
Consul used to rely on implicit issuer selection when calling Vault endpoints to issue new CSRs. Vault 1.11+ changed that behavior, which caused Consul to check the wrong (previous) issuer when renewing its Intermediate CA. This patch allows Consul to explicitly set a default issuer when it detects that the response from Vault is 1.11+.

Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
Co-authored-by: Chris S. Kim <ckim@hashicorp.com>
2022-11-17 16:29:49 -05:00
Derek Menteer 1f21364f7c
Consul 1.14 post-release updates (#15382)
* Update changelog with 1.14 notes.

* gomod version bumps for 1.14 release.
2022-11-15 14:22:43 -06:00
Kyle Havlovitz f5c5d2f5c6
auto-config: relax node name validation for JWT authorization (#15370)
* auto-config: relax node name validation for JWT authorization

This changes the JWT authorization logic to allow all non-whitespace,
non-quote characters when validating node names. Consul had previously
allowed these characters in node names, until this validation was added
to fix a security vulnerability with whitespace/quotes being passed to
the `bexpr` library. This unintentionally broke node names with
characters like `.` which aren't related to this vulnerability.

* Update website/content/docs/agent/config/cli-flags.mdx

Co-authored-by: trujillo-adam <47586768+trujillo-adam@users.noreply.github.com>

Co-authored-by: trujillo-adam <47586768+trujillo-adam@users.noreply.github.com>
2022-11-14 18:24:40 -06:00
Dhia Ayachi 219a3c5bd3
Leadership transfer cmd (#14132)
* add leadership transfer command

* add RPC call test (flaky)

* add missing import

* add changelog

* add command registration

* Apply suggestions from code review

Co-authored-by: Matt Keeler <mkeeler@users.noreply.github.com>

* add the possibility of providing an id to raft leadership transfer. Add few tests.

* delete old file from cherry pick

* rename changelog filename to PR #

* rename changelog and fix import

* fix failing test

* check for OperatorWrite

Co-authored-by: Matt Keeler <mkeeler@users.noreply.github.com>

* rename from leader-transfer to transfer-leader

* remove version check and add test for operator read

* move struct to operator.go

* first pass

* add code for leader transfer in the grpc backend and tests

* wire the http endpoint to the new grpc endpoint

* remove the RPC endpoint

* remove non needed struct

* fix naming

* add mog glue to API

* fix comment

* remove dead code

* fix linter error

* change package name for proto file

* remove error wrapping

* fix failing test

* add command registration

* add grpc service mock tests

* fix receiver to be pointer

* use defined values

Co-authored-by: Matt Keeler <mkeeler@users.noreply.github.com>

* reuse MockAclAuthorizer

* add documentation

* remove usage of external.TokenFromContext

* fix failing tests

* fix proto generation

* Apply suggestions from code review

Co-authored-by: Jared Kirschner <85913323+jkirschner-hashicorp@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: Jared Kirschner <85913323+jkirschner-hashicorp@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: Jared Kirschner <85913323+jkirschner-hashicorp@users.noreply.github.com>

* Apply suggestions from code review

* add more context in doc for the reason

* Apply suggestions from docs code review

Co-authored-by: Jeff Boruszak <104028618+boruszak@users.noreply.github.com>

* regenerate proto

* fix linter errors

Co-authored-by: github-team-consul-core <github-team-consul-core@hashicorp.com>
Co-authored-by: Matt Keeler <mkeeler@users.noreply.github.com>
Co-authored-by: Jared Kirschner <85913323+jkirschner-hashicorp@users.noreply.github.com>
Co-authored-by: Jeff Boruszak <104028618+boruszak@users.noreply.github.com>
2022-11-14 15:35:12 -05:00
Freddy 0cc3fac6c4
Ensure that NodeDump imported nodes are filtered (#15356) 2022-11-14 12:35:20 -07:00
Nitya Dhanushkodi a4c64ab006
add changelog (#15351) 2022-11-14 13:23:09 -06:00
Kyle Havlovitz 7be442ee63
connect: strip port from DNS SANs for ingress gateway leaf cert (#15320)
* connect: strip port from DNS SANs for ingress gateway leaf cert

* connect: format DNS SANs in CreateCSR

* connect: Test wildcard case when formatting SANs
2022-11-14 10:27:03 -08:00
Chris S. Kim 95d5f4de69
Add changelog (#15327) 2022-11-14 11:23:02 -05:00
Derek Menteer 0c07a36408
Prevent serving TLS via ports.grpc (#15339)
Prevent serving TLS via ports.grpc

We remove the ability to run the ports.grpc in TLS mode to avoid
confusion and to simplify configuration. This breaking change
ensures that any user currently using ports.grpc in an encrypted
mode will receive an error message indicating that ports.grpc_tls
must be explicitly used.

The suggested action for these users is to simply swap their ports.grpc
to ports.grpc_tls in the configuration file. If both ports are defined,
or if the user has not configured TLS for grpc, then the error message
will not be printed.
2022-11-11 14:29:22 -06:00
Kyle Schochenmaier 2b1e5f69e2
removes ioutil usage everywhere which was deprecated in go1.16 (#15297)
* update go version to 1.18 for api and sdk, go mod tidy
* removes ioutil usage everywhere which was deprecated in go1.16 in favour of io and os packages. Also introduces a lint rule which forbids use of ioutil going forward.
Co-authored-by: R.B. Boyer <4903+rboyer@users.noreply.github.com>
2022-11-10 10:26:01 -06:00
malizz 8d2ed1999d
update ACLs for cluster peering (#15317)
* update ACLs for cluster peering

* add changelog

* Update .changelog/15317.txt

Co-authored-by: Eric Haberkorn <erichaberkorn@gmail.com>

Co-authored-by: Eric Haberkorn <erichaberkorn@gmail.com>
2022-11-09 13:02:58 -08:00
malizz b823d79fcf
update config defaults, add docs (#15302)
* update config defaults, add docs

* update grpc tls port for non-default values

* add changelog

* Update website/content/docs/upgrading/upgrade-specific.mdx

Co-authored-by: Derek Menteer <105233703+hashi-derek@users.noreply.github.com>

* Update website/content/docs/agent/config/config-files.mdx

Co-authored-by: Derek Menteer <105233703+hashi-derek@users.noreply.github.com>

* update logic for setting grpc tls port value

* move default config to default.go, update changelog

* update docs

* Fix config tests.

* Fix linter error.

* Fix ConnectCA tests.

* Cleanup markdown on upgrade notes.

Co-authored-by: Derek Menteer <105233703+hashi-derek@users.noreply.github.com>
Co-authored-by: Derek Menteer <derek.menteer@hashicorp.com>
2022-11-09 09:29:55 -08:00
Derek Menteer 9e76d274ec
Fix mesh gateway configuration with proxy-defaults (#15186)
* Fix mesh gateway proxy-defaults not affecting upstreams.

* Clarify distinction with upstream settings

Top-level mesh gateway mode in proxy-defaults and service-defaults gets
merged into NodeService.Proxy.MeshGateway, and only gets merged with
the mode attached to an an upstream in proxycfg/xds.

* Fix mgw mode usage for peered upstreams

There were a couple issues with how mgw mode was being handled for
peered upstreams.

For starters, mesh gateway mode from proxy-defaults
and the top-level of service-defaults gets stored in
NodeService.Proxy.MeshGateway, but the upstream watch for peered data
was only considering the mesh gateway config attached in
NodeService.Proxy.Upstreams[i]. This means that applying a mesh gateway
mode via global proxy-defaults or service-defaults on the downstream
would not have an effect.

Separately, transparent proxy watches for peered upstreams didn't
consider mesh gateway mode at all.

This commit addresses the first issue by ensuring that we overlay the
upstream config for peered upstreams as we do for non-peered. The second
issue is addressed by re-using setupWatchesForPeeredUpstream when
handling transparent proxy updates.

Note that for transparent proxies we do not yet support mesh gateway
mode per upstream, so the NodeService.Proxy.MeshGateway mode is used.

* Fix upstream mesh gateway mode handling in xds

This commit ensures that when determining the mesh gateway mode for
peered upstreams we consider the NodeService.Proxy.MeshGateway config as
a baseline.

In absense of this change, setting a mesh gateway mode via
proxy-defaults or the top-level of service-defaults will not have an
effect for peered upstreams.

* Merge service/proxy defaults in cfg resolver

Previously the mesh gateway mode for connect proxies would be
merged at three points:

1. On servers, in ComputeResolvedServiceConfig.
2. On clients, in MergeServiceConfig.
3. On clients, in proxycfg/xds.

The first merge returns a ServiceConfigResponse where there is a
top-level MeshGateway config from proxy/service-defaults, along with
per-upstream config.

The second merge combines per-upstream config specified at the service
instance with per-upstream config specified centrally.

The third merge combines the NodeService.Proxy.MeshGateway
config containing proxy/service-defaults data with the per-upstream
mode. This third merge is easy to miss, which led to peered upstreams
not considering the mesh gateway mode from proxy-defaults.

This commit removes the third merge, and ensures that all mesh gateway
config is available at the upstream. This way proxycfg/xds do not need
to do additional overlays.

* Ensure that proxy-defaults is considered in wc

Upstream defaults become a synthetic Upstream definition under a
wildcard key "*". Now that proxycfg/xds expect Upstream definitions to
have the final MeshGateway values, this commit ensures that values from
proxy-defaults/service-defaults are the default for this synthetic
upstream.

* Add changelog.

Co-authored-by: freddygv <freddy@hashicorp.com>
2022-11-09 10:14:29 -06:00
Derek Menteer a8eb047ee6
Bring back parameter ServerExternalAddresses in GenerateToken endpoint (#15267)
Re-add ServerExternalAddresses parameter in GenerateToken endpoint

This reverts commit 5e156772f6a7fba5324eb6804ae4e93c091229a6
and adds extra functionality to support newer peering behaviors.
2022-11-08 14:55:18 -06:00
cskh 3d2d7a77cb
fix(mesh-gateway): remove deregistered service from mesh gateway (#15272)
* fix(mesh-gateway): remove deregistered service from mesh gateway

* changelog

Co-authored-by: Derek Menteer <105233703+hashi-derek@users.noreply.github.com>
Co-authored-by: Evan Culver <eculver@users.noreply.github.com>
2022-11-07 20:30:15 -05:00
Freddy eee0fb1035
Avoid blocking child type updates on parent ack (#15083) 2022-11-07 18:10:42 -07:00
cskh 0a3dbb1c6e
integ test: reduce flakiness due to compound output from retry (#15233)
* integ test: avoid flakiness due to compound output from retry

* changelog
2022-11-02 14:08:17 -04:00
Evan Culver 548cf6f7a4
connect: Add Envoy 1.24 to integration tests, remove Envoy 1.20 (#15093) 2022-10-31 10:50:45 -05:00
Eric Haberkorn 57fb729547
Fix peering metrics bug (#15178)
This bug was caused by the peering health metric being set to NaN.
2022-10-28 10:51:12 -04:00
Chris S. Kim a0ac76ecf5
Allow consul debug on non-ACL consul servers (#15155) 2022-10-27 09:25:18 -04:00
cskh 57380ea752
fix(peering): nil pointer in calling handleUpdateService (#15160)
* fix(peering): nil pointer in calling handleUpdateService

* changelog
2022-10-26 11:50:34 -04:00
Luke Kysow 4956b81333
ingress-gateways: don't log error when registering gateway (#15001)
* ingress-gateways: don't log error when registering gateway

Previously, when an ingress gateway was registered without a
corresponding ingress gateway config entry, an error was logged
because the watch on the config entry returned a nil result.
This is expected so don't log an error.
2022-10-25 10:55:44 -07:00
Luke Kysow 1be7b5bf35
config entry: hardcode proxy-defaults name as global (#14833)
* config entry: hardcode proxy-defaults name as global

proxy-defaults can only have the name global. Because of this,
we support not even setting the name in the config file:

```
kind = "proxy-defaults"
```

Previously, writing this would result in the output:

```
Config entry written: proxy-defaults/
```

Now it will output:

```
Config entry written: proxy-defaults/global
```

This change follows what was done for the new Mesh config entry.
2022-10-25 10:55:15 -07:00
Luke Kysow 6b1ec05470
autoencrypt: helpful error for clients with wrong dc (#14832)
* autoencrypt: helpful error for clients with wrong dc

If clients have set a different datacenter than the servers they're
connecting with for autoencrypt, give a helpful error message.
2022-10-25 10:13:41 -07:00
R.B. Boyer a01936442c
cache: refactor agent cache fetching to prevent unnecessary fetches on error (#14956)
This continues the work done in #14908 where a crude solution to prevent a
goroutine leak was implemented. The former code would launch a perpetual
goroutine family every iteration (+1 +1) and the fixed code simply caused a
new goroutine family to first cancel the prior one to prevent the
leak (-1 +1 == 0).

This PR refactors this code completely to:

- make it more understandable
- remove the recursion-via-goroutine strangeness
- prevent unnecessary RPC fetches when the prior one has errored.

The core issue arose from a conflation of the entry.Fetching field to mean:

- there is an RPC (blocking query) in flight right now
- there is a goroutine running to manage the RPC fetch retry loop

The problem is that the goroutine-leak-avoidance check would treat
Fetching like (2), but within the body of a goroutine it would flip that
boolean back to false before the retry sleep. This would cause a new
chain of goroutines to launch which #14908 would correct crudely.

The refactored code uses a plain for-loop and changes the semantics
to track state for "is there a goroutine associated with this cache entry"
instead of the former.

We use a uint64 unique identity per goroutine instead of a boolean so
that any orphaned goroutines can tell when they've been replaced when
the expiry loop deletes a cache entry while the goroutine is still running
and is later replaced.
2022-10-25 10:27:26 -05:00
Chris S. Kim 89f92b3cce Add changelog 2022-10-24 16:12:08 -04:00
cskh a5acb987fa
fix(peering): replicating wan address (#15108)
* fix(peering): replicating wan address

* add changelog

* unit test
2022-10-24 15:44:57 -04:00
Iryna Shustava 46fe21a204
cli/sdk: Allow redirection to a different consul dns port (#15050) 2022-10-21 13:15:32 -06:00
Chris S. Kim 9dfd8386ff Add changelog 2022-10-20 14:32:42 -04:00
cskh 8379ef8d52 changelog 2022-10-19 16:37:50 -04:00
Nitya Dhanushkodi 598670e376
Remove ability to specify external addresses in GenerateToken endpoint (#14930)
* Reverts "update generate token endpoint to take external addresses (#13844)"

This reverts commit f47319b7c6b6e7c7dd720a5af927ad2d33fa536d.
2022-10-19 09:31:36 -07:00
Tyler Wendlandt b790131289
Merge pull request #14971 from hashicorp/ui/feature/agentless-nodes-banner
ui: agentless nodes notice banner banner
2022-10-19 09:06:46 -06:00
Kyle Havlovitz 3c13cf0994
Merge pull request #15035 from hashicorp/vault-ttl-update-warn
Warn instead of returning error when missing intermediate mount tune permissions
2022-10-18 15:41:52 -07:00
Chris S. Kim e4c20ec190
Refactor client RPC timeouts (#14965)
Fix an issue where rpc_hold_timeout was being used as the timeout for non-blocking queries. Users should be able to tune read timeouts without fiddling with rpc_hold_timeout. A new configuration `rpc_read_timeout` is created.

Refactor some implementation from the original PR 11500 to remove the misleading linkage between RPCInfo's timeout (used to retry in case of certain modes of failures) and the client RPC timeouts.
2022-10-18 15:05:09 -04:00
Kyle Havlovitz 0a968e53b5 Warn instead of returning an error when intermediate mount tune permission is missing 2022-10-18 12:01:25 -07:00
R.B. Boyer 9f41cc4a25
cache: prevent goroutine leak in agent cache (#14908)
There is a bug in the error handling code for the Agent cache subsystem discovered:

1. NotifyCallback calls notifyBlockingQuery which calls getWithIndex in
   a loop (which backs off on-error up to 1 minute)

2. getWithIndex calls fetch if there’s no valid entry in the cache

3. fetch starts a goroutine which calls Fetch on the cache-type, waits
   for a while (again with backoff up to 1 minute for errors) and then
   calls fetch to trigger a refresh

The end result being that every 1 minute notifyBlockingQuery spawns an
ancestry of goroutines that essentially lives forever.

This PR ensures that the goroutine started by `fetch` cancels any prior
goroutine spawned by the same line for the same key.

In isolated testing where a cache type was tweaked to indefinitely
error, this patch prevented goroutine counts from skyrocketing.
2022-10-17 14:38:10 -05:00
R.B. Boyer ca916eec32
ca: fix a masked bug in leaf cert generation that would not be notified of root cert rotation after the first one (#15005)
In practice this was masked by #14956 and was only uncovered fixing the
other bug.

  go test ./agent -run TestAgentConnectCALeafCert_goodNotLocal

would fail when only #14956 was fixed.
2022-10-17 13:24:27 -05:00
Chris S. Kim 58c041eb6e
Merge pull request #13388 from deblasis/feature/health-checks_windows_service
Feature: Health checks windows service
2022-10-17 09:26:19 -04:00
Kyle Havlovitz 096ca5e4b0 Extend tcp keepalive settings to work for terminating gateways as well 2022-10-14 17:05:46 -07:00
Kyle Havlovitz f8e745315f Update docs and add tcp_keepalive_probes setting 2022-10-14 17:05:46 -07:00
Freddy bbf6b17e44
Merge pull request #14981 from hashicorp/peering/dial-through-gateways 2022-10-14 09:44:56 -06:00
Tyler Wendlandt 4a3801385d
Merge pull request #14986 from hashicorp/ui/feature/filter-node-healthchecks-agentless
UI: filter node healthchecks on agentless service instances
2022-10-14 09:33:45 -06:00
Dan Upton 3b9297f95a
proxycfg: rate-limit delivery of config snapshots (#14960)
Adds a user-configurable rate limiter to proxycfg snapshot delivery,
with a default limit of 250 updates per second.

This addresses a problem observed in our load testing of Consul
Dataplane where updating a "global" resource such as a wildcard
intention or the proxy-defaults config entry could starve the Raft or
Memberlist goroutines of CPU time, causing general cluster instability.
2022-10-14 15:52:00 +01:00
Dan Upton 0a0534a094
perf: remove expensive reflection from xDS hot path (#14934)
Replaces the reflection-based implementation of proxycfg's
ConfigSnapshot.Clone with code generated by deep-copy.

While load testing server-based xDS (for consul-dataplane) we discovered
this method is extremely expensive. The ConfigSnapshot struct, directly
or indirectly, contains a copy of many of the structs in the agent/structs
package, which creates a large graph for copystructure.Copy to traverse
at runtime, on every proxy reconfiguration.
2022-10-14 10:26:42 +01:00
wenincode 229a97967a Add changelog entry 2022-10-13 18:54:39 -06:00
Freddy 909fc33271
Merge pull request #14935 from hashicorp/fix/alias-leak 2022-10-13 16:31:15 -06:00
freddygv c5040b8111 Add changelog entry 2022-10-13 16:09:32 -06:00
freddygv a468cbcce9 Add changelog entry 2022-10-13 16:03:15 -06:00
wenincode 3006025bbe Add changelog 2022-10-13 10:43:57 -06:00
Derek Menteer bfa4adbfce Add remote peer partition and datacenter info. 2022-10-13 10:37:41 -05:00
Tyler Wendlandt a6b9219808
Merge pull request #14970 from hashicorp/ui/feature/filter-synthetic-nodes
ui: Filter synthetic nodes on nodes list page
2022-10-13 09:12:03 -06:00
Michael Klein e6cce385e7
Merge pull request #14947 from hashicorp/ui/feat/peer-detail-page
ui: peer detail view
2022-10-13 17:03:57 +02:00
Michael Klein b7f5e02109 Add changelog for peers detail page 2022-10-13 16:45:03 +02:00
Dan Upton de7f380385
xds: properly merge central config for "agentless" services (#14962) 2022-10-13 12:04:59 +01:00
Dan Upton 36a3d00f0d
bug: fix goroutine leaks caused by incorrect usage of `WatchCh` (#14916)
memdb's `WatchCh` method creates a goroutine that will publish to the
returned channel when the watchset is triggered or the given context
is canceled. Although this is called out in its godoc comment, it's
not obvious that this method creates a goroutine who's lifecycle you
need to manage.

In the xDS capacity controller, we were calling `WatchCh` on each
iteration of the control loop, meaning the number of goroutines would
grow on each autopilot event until there was catalog churn.

In the catalog config source, we were calling `WatchCh` with the
background context, meaning that the goroutine would keep running after
the sync loop had terminated.
2022-10-13 12:04:27 +01:00
Hans Hasselberg 56580d6fa6
adding configuration option cloud.scada_address (#14936)
* adding scada_address

* config tests

* add changelog entry
2022-10-13 11:31:28 +02:00
wenincode ef681d06dc Add changelog 2022-10-12 17:16:01 -06:00
Paul Glass be1a4438a9
Add consul.xds.server.streamStart metric (#14957)
This adds a new consul.xds.server.streamStart metric to measure the time taken to first generate xDS resources after an xDS stream is opened.
2022-10-12 14:17:58 -05:00
Iryna Shustava 28b7dea973
cli: Add -node-name flag to redirect-traffic command (#14933) 2022-10-12 11:53:41 -06:00
Riddhi Shah 474d9cfcdc
Service http checks data source for agentless proxies (#14924)
Adds another datasource for proxycfg.HTTPChecks, for use on server agents. Typically these checks are performed by local client agents and there is no equivalent of this in agentless (where servers configure consul-dataplane proxies).
Hence, the data source is mostly a no-op on servers but in the case where the service is present within the local state, it delegates to the cache data source.
2022-10-12 07:49:56 -07:00
Dhia Ayachi 2bd2054e09
bump raft version to v1.3.11 (#14897)
* bump raft version to v1.3.11

* Add change log

* fix go.sum
2022-10-12 08:51:52 -04:00