Commit Graph

2173 Commits

Author SHA1 Message Date
Daniel Nephin 56ecfce5fa Remove unused return values 2020-06-24 13:00:15 -04:00
Daniel Nephin 07c1081d39 Fix a bunch of unparam lint issues 2020-06-24 13:00:14 -04:00
Matt Keeler 341aedbce9
Ensure that retryLoopBackoff can be cancelled
We needed to pass a cancellable context into the limiter.Wait instead of context.Background. So I made the func take a context instead of a chan as most places were just passing through a Done chan from a context anyways.

Fix go routine leak in the gateway locator
2020-06-24 12:41:08 -04:00
Matt Keeler 934f92f83b
Don’t leak metrics go routines in tests (#8182) 2020-06-24 10:15:25 -04:00
gitforbit aebb8e09a7
agent-http: cleanup: return nil instead of err (#8043)
Since err is already checked, it should return `nil`
2020-06-24 14:29:21 +02:00
R.B. Boyer ba83b52b32
connect: upgrade github.com/envoyproxy/go-control-plane to v0.9.5 (#8165) 2020-06-23 15:19:56 -05:00
freddygv 224f486aef Update namespaces subject-verb agreement 2020-06-23 10:57:30 -06:00
freddygv c631e7496a Remove break 2020-06-22 19:59:04 -06:00
freddygv e81b6d25ad Let users know namespaces are ent only in config entry decode 2020-06-22 19:59:04 -06:00
Pierre Souchay 667b36cdba
Returns DNS Error NSDOMAIN when DC does not exists (#8103)
This will allow to increase cache value when DC is not valid (aka
return SOA to avoid too many consecutive requests) and will
distinguish DC being temporarily not available from DC not existing.

Implements https://github.com/hashicorp/consul/issues/8102
2020-06-22 09:01:48 -04:00
Matt Keeler 7b49fc1529
Require enabling TLS to enable Auto Config (#8159)
On the servers they must have a certificate.

On the clients they just have to set verify_outgoing to true to attempt TLS connections for RPCs.

Eventually we may relax these restrictions but right now all of the settings we push down (acl tokens, acl related settings, certificates, gossip key) are sensitive and shouldn’t be transmitted over an unencrypted connection. Our guides and docs should recoommend verify_server_hostname on the clients as well.

Another reason to do this is weird things happen when making an insecure RPC when TLS is not enabled. Basically it tries TLS anyways. We should probably fix that to make it clearer what is going on.
2020-06-19 16:38:14 -04:00
Freddy 7e7c783c8f
Always return a gateway cluster (#8158) 2020-06-19 13:31:39 -06:00
Matt Keeler 9dc9f7df15
Allow cancelling startup when performing auto-config (#8157)
Co-authored-by: Daniel Nephin <dnephin@hashicorp.com>
2020-06-19 15:16:00 -04:00
Daniel Nephin 896b057a9f
Merge pull request #8147 from hashicorp/dnephin/remove-private-ip-2
Remove some dead code from agent/consul/util.go
2020-06-18 15:51:09 -04:00
Matt Keeler 7086a50353 Change auto config authorizer to allow for future extension
The envisioned changes would allow extra settings to enable dynamically defined auth methods to be used instead of  or in addition to the statically defined one in the configuration.
2020-06-18 15:22:24 -04:00
Daniel Nephin b5ef9b7ea9 Remove bytesToUint64 from agent/consul 2020-06-18 12:45:43 -04:00
Daniel Nephin 81bc082b63 Remove unused private IP code from agent/consul 2020-06-18 12:40:38 -04:00
Matt Keeler 2c7844d220
Implement Client Agent Auto Config
There are a couple of things in here.

First, just like auto encrypt, any Cluster.AutoConfig RPC will implicitly use the less secure RPC mechanism.

This drastically modifies how the Consul Agent starts up and moves most of the responsibilities (other than signal handling) from the cli command and into the Agent.
2020-06-17 16:49:46 -04:00
Matt Keeler f5d57ccd48
Allow the Agent its its child Client/Server to share a connection pool
This is needed so that we can make an AutoConfig RPC at the Agent level prior to creating the Client/Server.
2020-06-17 16:19:33 -04:00
Matt Keeler 8c601ad8db
Merge pull request #8035 from hashicorp/feature/auto-config/server-rpc 2020-06-17 16:07:25 -04:00
Chris Piraino 79d003d395
Remove ACLEnforceVersion8 from tests (#8138)
The field had been deprecated for a while and was recently removed,
however a PR which added these tests prior to removal was merged.
2020-06-17 14:58:01 -05:00
Daniel Nephin 3d03d72727
Merge pull request #7762 from hashicorp/dnephin/warn-on-unknown-service-file
config: warn if a config file is being skipped because of its file extension
2020-06-17 15:14:40 -04:00
Daniel Nephin cb736b6947 config: warn when a config file is skipped
All commands which read config (agent, services, and validate) will now
print warnings when one of the config files is skipped because it did
not match an expected format.

Also ensures that config validate prints all warnings.
2020-06-17 13:08:54 -04:00
Daniel Nephin 1ef8279ac9
Merge pull request #8034 from hashicorp/dnephin/add-linter-staticcheck-4
ci: enable SA4006 staticcheck check and add ineffassign
2020-06-17 12:16:02 -04:00
Matt Keeler eda8cb39fd
Implement the insecure version of the Cluster.AutoConfig RPC endpoint
Right now this is only hooked into the insecure RPC server and requires JWT authorization. If no JWT authorizer is setup in the configuration then we inject a disabled “authorizer” to always report that JWT authorization is disabled.
2020-06-17 11:25:29 -04:00
Pierre Souchay f7a1189dba
gossip: Ensure that metadata of Consul Service is updated (#7903)
While upgrading servers to a new version, I saw that metadata of
existing servers are not upgraded, so the version and raft meta
is not up to date in catalog.

The only way to do it was to:
 * update Consul server
 * make it leave the cluster, then metadata is accurate

That's because the optimization to avoid updating catalog does
not take into account metadata, so no update on catalog is performed.
2020-06-17 12:16:13 +02:00
Daniel Nephin 8753d1f1ba ci: Add ineffsign linter
And fix an additional ineffective assignment that was not caught by staticcheck
2020-06-16 17:32:50 -04:00
Daniel Nephin 97342de262
Merge pull request #8070 from hashicorp/dnephin/add-gofmt-simplify
ci: Enable gofmt simplify
2020-06-16 17:18:38 -04:00
Matt Keeler d994dc7b35
Agent Auto Configuration: Configuration Syntax Updates (#8003) 2020-06-16 15:03:22 -04:00
Daniel Nephin 89d95561df Enable gofmt simplify
Code changes done automatically with 'gofmt -s -w'
2020-06-16 13:21:11 -04:00
Daniel Nephin 5f24171f13 ci: enable SA4006 staticcheck check
And fix the 'value not used' issues.

Many of these are not bugs, but a few are tests not checking errors, and
one appears to be a missed error in non-test code.
2020-06-16 13:10:11 -04:00
Daniel Nephin 71e6534061 Rename txnWrapper to txn 2020-06-16 13:06:02 -04:00
Daniel Nephin 537ae1fd46 Rename db 2020-06-16 13:04:31 -04:00
Daniel Nephin 78c76f0773 Handle return value from txn.Commit 2020-06-16 13:04:31 -04:00
Daniel Nephin 50db8f409a state: Update docstrings for changeTrackerDB and txn
And un-embed memdb.DB to prevent accidental access to underlying
methods.
2020-06-16 13:04:31 -04:00
Paul Banks f9a6386c4a state: track changes so that they may be used to produce change events 2020-06-16 13:04:29 -04:00
Matt Keeler cdc4b20afa
ACL Node Identities (#7970)
A Node Identity is very similar to a service identity. Its main targeted use is to allow creating tokens for use by Consul agents that will grant the necessary permissions for all the typical agent operations (node registration, coordinate updates, anti-entropy).

Half of this commit is for golden file based tests of the acl token and role cli output. Another big updates was to refactor many of the tests in agent/consul/acl_endpoint_test.go to use the same style of tests and the same helpers. Besides being less boiler plate in the tests it also uses a common way of starting a test server with ACLs that should operate without any warnings regarding deprecated non-uuid master tokens etc.
2020-06-16 12:54:27 -04:00
Daniel Nephin 2c2da41b3d config: refactor to consolidate all File->Source loading
Previously the logic for reading ConfigFiles and produces Sources was split
between NewBuilder and Build. This commit moves all of the logic into NewBuilder
so that Build() can operate entirely on Sources.

This change is in preparation for logging warnings when files have an
unsupported extension.

It also reduces the scope of BuilderOpts, and gets us very close to removing
Builder.options.
2020-06-16 12:52:23 -04:00
Daniel Nephin c820a8de88 config: Make ConfigFormat not a pointer
The nil value was never used. We can avoid a bunch of complications by
making the field a string value instead of a pointer.

This change is in preparation for fixing a silent config failure.
2020-06-16 12:52:22 -04:00
Daniel Nephin b7b652e8c9 config: rename Flags to BuilderOpts
Flags is an overloaded term in this context. It generally is used to
refer to command line flags. This struct, however, is a data object
used as input to the construction.

It happens to be partially populated by command line flags, but
otherwise has very little to do with them.

Renaming this struct should make the actual responsibility of this struct
more obvious, and remove the possibility that it is confused with
command line flags.

This change is in preparation for adding additional fields to
BuilderOpts.
2020-06-16 12:51:19 -04:00
Daniel Nephin 5ac012dddf config: remove Args field from Flags
This field was populated for one reason, to test that it was empty.
Of all the callers, only a single one used this functionality. The rest
constructed a `Flags{}` struct which did not set Args.

I think this shows that the logic was in the wrong place. Only the agent
command needs to care about validating the args.

This commit removes the field, and moves the logic to the one caller
that cares.

Also fix some comments.
2020-06-16 12:49:53 -04:00
Daniel Nephin bd866d694c agent/service_manager: remove 'updateCh' field from serviceConfigWatch
Passing the channel to the function which uses it significantly
reduces the scope of the variable, and makes its usage more explicit. It
also moves the initialization of the channel closer to where it is used.

Also includes a couple very small cleanups to remove a local var and
read the error from `ctx.Err()` directly instead of creating a channel
to check for an error.
2020-06-16 12:15:57 -04:00
Daniel Nephin 1c7fce73a8 agent/service_manager: remove 'defaults' field from serviceConfigWatch
This field was always read by the same function that populated the field,
so it does not need to be a field. Passing the value as an argument to
functions makes it more obvious where the value comes from, and also reduces
the scope of the variable significantly.
2020-06-16 12:15:52 -04:00
Daniel Nephin 93d7debbeb agent/service_manager: Pass ctx around
[The documentation for context](https://golang.org/pkg/context/)
recommends not storing context in a struct field:

> Do not store Contexts inside a struct type; instead, pass a Context
> explicitly to each function that needs it. The Context should be the
> first parameter, typically named ctx...

Sometimes there are good reasons to not follow this recommendation, but
in this case it seems easy enough to follow.

Also moved the ctx argument to be the first in one of the function calls
to follow the same recommendation.
2020-06-16 12:14:00 -04:00
Daniel Nephin 13f564bdd4
Merge pull request #8074 from hashicorp/dnephin/remove-references-to-PatchSliceOfMaps
Update comments that reference PatchSliceOfMaps
2020-06-15 14:33:10 -04:00
Matt Keeler 976f922abf
Make the Agent Cache more Context aware (#8092)
Blocking queries issues will still be uncancellable (that cannot be helped until we get rid of net/rpc). However this makes it so that if calling getWithIndex (like during a cache Notify go routine) we can cancell the outer routine. Previously it would keep issuing more blocking queries until the result state actually changed.
2020-06-15 11:01:25 -04:00
freddygv f4179b400d Update telemetry for gateway-services endpoint 2020-06-12 14:44:36 -06:00
freddygv 51f8f76b3b Remove unused method and fixup docs ref 2020-06-12 13:47:43 -06:00
freddygv cc4ff3ae02 Fixup stray sid references 2020-06-12 13:47:43 -06:00
freddygv 1e7e716742 Move compound service names to use ServiceName type 2020-06-12 13:47:43 -06:00
freddygv 1cab73e609 Create HTTP endpoint 2020-06-12 13:46:47 -06:00
freddygv 806b1fb608 Move GatewayServices out of Internal 2020-06-12 13:46:47 -06:00
Freddy 66e2def461
Only pass one hostname via EDS and prefer healthy ones (#8084)
Co-authored-by: Matt Keeler <mkeeler@users.noreply.github.com>

Currently when passing hostname clusters to Envoy, we set each service instance registered with Consul as an LbEndpoint for the cluster.

However, Envoy can only handle one per cluster:
[2020-06-04 18:32:34.094][1][warning][config] [source/common/config/grpc_subscription_impl.cc:87] gRPC config for type.googleapis.com/envoy.api.v2.Cluster rejected: Error adding/updating cluster(s) dc2.internal.ddd90499-9b47-91c5-4616-c0cbf0fc358a.consul: LOGICAL_DNS clusters must have a single locality_lb_endpoint and a single lb_endpoint, server.dc2.consul: LOGICAL_DNS clusters must have a single locality_lb_endpoint and a single lb_endpoint

Envoy is currently handling this gracefully by only picking one of the endpoints. However, we should avoid passing multiple to avoid these warning logs.

This PR:

* Ensures we only pass one endpoint, which is tied to one service instance.
* We prefer sending an endpoint which is marked as Healthy by Consul.
* If no endpoints are healthy we emit a warning and skip the cluster.
* If multiple unique hostnames are spread across service instances we emit a warning and let the user know which will be resolved.
2020-06-12 13:46:17 -06:00
Chris Piraino cd93e0cd99
Allow users to set hosts to the wildcard specifier when TLS is disabled (#8083)
This allows easier demoing/testing of ingress gateways, while still
preserving the validation we have for DNSSANs
2020-06-11 10:03:06 -05:00
Chris Piraino cba863af84
Move ingress param to a new endpoint (#8081)
In discussion with team, it was pointed out that query parameters tend
to be filter mechanism, and that semantically the "/v1/health/connect"
endpoint should return "all healthy connect-enabled endpoints (e.g.
could be side car proxies or native instances) for this service so I can
connect with mTLS".

That does not fit an ingress gateway, so we remove the query parameter
and add a new endpoint "/v1/health/ingress" that semantically means
"all the healthy ingress gateway instances that I can connect to
to access this connect-enabled service without mTLS"
2020-06-10 13:07:15 -05:00
Daniel Nephin af063a5692 Update comments that reference PatchSliceOfMaps
To reference decode.HookWeakDecodeFromSlice instead.

Also removes a step from the adding config fields checklist which is
no longer necessary.
2020-06-09 17:43:05 -04:00
Chris Piraino b8a43e164a
Merge pull request #8064 from hashicorp/ingress/health-query-param
Add API query parameter ?ingress to allow users to find ingress gateways associated to a service
2020-06-09 16:08:28 -05:00
Chris Piraino 4fe1f5fdb9 Remove TODO note about ingress API, it is done! 2020-06-09 14:58:30 -05:00
Chris Piraino 0e72da76db Set connect or ingress boolean after checking for query param 2020-06-09 14:45:21 -05:00
Daniel Nephin 6719f1a6fa
Merge pull request #7900 from hashicorp/dnephin/add-linter-staticcheck-2
intentions: fix a bug in Intention.SetHash
2020-06-09 15:40:20 -04:00
Daniel Nephin 5f14eb124c
Merge pull request #8037 from hashicorp/dnephin/add-linter-staticcheck-5
ci: Enabled SA2002 staticcheck check
2020-06-09 15:31:24 -04:00
Hans Hasselberg 7404712854
acl: do not resolve local tokens from remote dcs (#8068) 2020-06-09 21:13:09 +02:00
Kyle Havlovitz b7cf5139dd
Merge pull request #8040 from hashicorp/ingress/expose-cli
Ingress expose CLI command
2020-06-09 12:11:23 -07:00
Chris Piraino 5f28de5654 Add ?ingress query parameter on /v1/health/connect
Refactor boolean query parameter logic from ?passing value to re-use
with ingress
2020-06-09 11:44:31 -05:00
Daniel Nephin c1feec176f
Merge pull request #7964 from hashicorp/dnephin/remove-patch-slice-of-maps-forward-compat
config: Use HookWeakDecodeFromSlice in place of PatchSliceOfMaps
2020-06-08 19:53:04 -04:00
Daniel Nephin 7b99d9a25d config: add HookWeakDecodeFromSlice
Currently opaque config blocks (config entries, and CA provider config) are
modified by PatchSliceOfMaps, making it impossible for these opaque
config sections to contain slices of maps.

In order to fix this problem, any lazy-decoding of these blocks needs to support
weak decoding of []map[string]interface{} to a struct type before
PatchSliceOfMaps is replaces. This is necessary because these config
blobs are persisted, and during an upgrade an older version of Consul
could read one of the new configuration values, which would cause an error.

To support the upgrade path, this commit first introduces the new hooks
for weak decoding of []map[string]interface{} and uses them only in the
lazy-decode paths. That way, in a future release, new style
configuration will be supported by the older version of Consul.

This decode hook has a number of advantages:

1. It no longer panics. It allows mapstructure to report the error
2. It no longer requires the user to declare which fields are slices of
   structs. It can deduce that information from the 'to' value.
3. It will make it possible to preserve opaque configuration, allowing
   for structured opaque config.
2020-06-08 17:05:09 -04:00
Hans Hasselberg bec21c849d
Tokens converted from legacy ACLs get their Hash computed (#8047)
* Fixes #5606: Tokens converted from legacy ACLs get their Hash computed

This allows new style token replication to work for legacy tokens as well when they change.

* tests: fix timestamp comparison

Co-authored-by: Matt Keeler <mjkeeler7@gmail.com>
2020-06-08 21:44:06 +02:00
Chris Piraino 5d0cb00ec3
Always require Host header values for http services (#7990)
Previously, we did not require the 'service-name.*' host header value
when on a single http service was exposed. However, this allows a user
to get into a situation where, if they add another service to the
listener, suddenly the previous service's traffic might not be routed
correctly. Thus, we always require the Host header, even if there is
only 1 service.

Also, we add the make the default domain matching more restrictive by
matching "service-name.ingress.*" by default. This lines up better with
the namespace case and more accurately matches the Consul DNS value we
expect people to use in this case.
2020-06-08 13:16:24 -05:00
Hans Hasselberg 4790f97bb6
http: use default minsize for gzip handler. (#7354)
Fixes #6306
2020-06-08 10:10:08 +02:00
Hans Hasselberg 532e14fdc4
agent: add option to disable agent cache for HTTP endpoints (#8023)
This allows the operator to disable agent caching for the http endpoint.
It is on by default for backwards compatibility and if disabled will
ignore the url parameter `cached`.
2020-06-08 10:08:12 +02:00
Kyle Havlovitz 3f7c072b75 Add connect expose CLI command 2020-06-05 14:54:29 -07:00
Daniel Nephin 1cdfc4f290 ci: Enabled SA2002 staticcheck check
And handle errors in the main test goroutine
2020-06-05 17:50:11 -04:00
Hans Hasselberg 681124eaff
Setup intermediate_pki_path on secondary when using vault (#8001)
Make sure to mount vault backend for intermediate_pki_path on secondary
dc.
2020-06-05 21:36:22 +02:00
Daniel Nephin b9e4544ec3 intentions: fix a bug in Intention.SetHash
Found using staticcheck.

binary.Write does not accept int types without a size. The error from binary.Write was ignored, so we never saw this error. Casting the data to uint64 produces a correct hash.

Also deprecate the Default{Addr,Port} fields, and prevent them from being encoded. These fields will always be empty and are not used.
Removing these would break backwards compatibility, so they are left in place for now.

Co-authored-by: Hans Hasselberg <me@hans.io>
2020-06-05 14:51:43 -04:00
R.B. Boyer 2867730e8a
tests: ensure that the ServiceExists helper function normalizes entmeta (#8025)
This fixes a unit test failure over in enterprise due to https://github.com/hashicorp/consul/pull/7384
2020-06-05 10:41:39 +02:00
R.B. Boyer 3ad570ba99
server: don't activate federation state replication or anti-entropy until all servers are running 1.8.0+ (#8014) 2020-06-04 16:05:27 -05:00
Hans Hasselberg 7f14d3ac8a
tests: use constructor instead init (#8024) 2020-06-04 22:59:06 +02:00
Pierre Souchay 7cd5477c3c
checks: when a service does not exists in an alias, consider it failing (#7384)
In current implementation of Consul, check alias cannot determine
if a service exists or not. Because a service without any check
is semantically considered as passing, so when no healthchecks
are found for an agent, the check was considered as passing.

But this make little sense as the current implementation does not
make any difference between:
 * a non-existing service (passing)
 * a service without any check (passing as well)

In order to make it work, we have to ensure that when a check did
not find any healthcheck, the service does indeed exists. If it
does not, lets consider the check as failing.
2020-06-04 14:50:52 +02:00
Hans Hasselberg dd8cd9bc24
Merge pull request #7966 from hashicorp/pool_improvements
Agent connection pool cleanup
2020-06-04 08:56:26 +02:00
Freddy f759a48726
Enable gateways to resolve hostnames to IPv4 addresses (#7999)
The DNS resolution will be handled by Envoy and defaults to LOGICAL_DNS. This discovery type can be overridden on a per-gateway basis with the envoy_dns_discovery_type Gateway Option.

If a service contains an instance with a hostname as an address we set the Envoy cluster to use DNS as the discovery type rather than EDS. Since both mesh gateways and terminating gateways route to clusters using SNI, whenever there is a mix of hostnames and IP addresses associated with a service we use the hostname + CDS rather than the IPs + EDS.

Note that we detect hostnames by attempting to parse the service instance's address as an IP. If it is not a valid IP we assume it is a hostname.
2020-06-03 15:28:45 -06:00
Matt Keeler 2c615807af
Fix legacy management tokens in unupgraded secondary dcs (#7908)
The ACL.GetPolicy RPC endpoint was supposed to return the “parent” policy and not always the default policy. In the case of legacy management tokens the parent policy was supposed to be “manage”. The result of us not sending this properly was that operations that required specifically a management token such as saving a snapshot would not work in secondary DCs until they were upgraded.
2020-06-03 11:22:22 -04:00
Matt Keeler 9fa9ec4ba0
Fix segfault due to race condition for checking server versions (#7957)
The ACL monitoring routine uses c.routers to check for server version updates. Therefore it needs to be started after initializing the routers.
2020-06-03 10:36:32 -04:00
Daniel Nephin e8a883e829
Replace goe/verify.Values with testify/require.Equal (#7993)
* testing: replace most goe/verify.Values with require.Equal

One difference between these two comparisons is that go/verify considers
nil slices/maps to be equal to empty slices/maps, where as testify/require
does not, and does not appear to provide any way to enable that behaviour.

Because of this difference some expected values were changed from empty
slices to nil slices, and some calls to verify.Values were left.

* Remove github.com/pascaldekloe/goe/verify

Reduce the number of assertion packages we use from 2 to 1
2020-06-02 12:41:25 -04:00
Alvin Huang 1cb06ce7f0
Merge pull request #7956 from hashicorp/update-master-to-1.8.0-beta2
Update master to 1.8.0 beta2
2020-06-01 16:52:19 -04:00
R.B. Boyer 7bd7895047
acl: allow auth methods created in the primary datacenter to optionally create global tokens (#7899) 2020-06-01 11:44:47 -05:00
R.B. Boyer 16db20b1f3
acl: remove the deprecated `acl_enforce_version_8` option (#7991)
Fixes #7292
2020-05-29 16:16:03 -05:00
Jono Sosulska 7a13c96a2a
Replace whitelist/blacklist terminology with allowlist/denylist (#7971)
* Replace whitelist/blacklist terminology with allowlist/denylist
2020-05-29 14:19:16 -04:00
Hans Hasselberg 1ed91cbdf6 pool: remove timeout parameter
Timeout was never used in a meaningful way by callers, which is why it
is now entirely internal to the pool.
2020-05-29 08:21:28 +02:00
Hans Hasselberg 5cda505495 pool: remove useTLS and ForceTLS
In the past TLS usage was enforced with these variables, but these days
this decision is made by TLSConfigurator and there is no reason to keep
using the variables.
2020-05-29 08:21:24 +02:00
Hans Hasselberg 9ef44ec3da pool: remove version
The version field has been used to decide which multiplexing to use. It
was introduced in 2457293dceec95ecd12ef4f01442e13710ea131a. But this is
6y ago and there is no need for this differentiation anymore.
2020-05-28 23:06:01 +02:00
hashicorp-ci 6b8bedd004 update bindata_assetfs.go 2020-05-28 14:39:37 -04:00
hashicorp-ci 634a0e630a update bindata_assetfs.go 2020-05-28 14:39:28 -04:00
Daniel Nephin ea6c2b2adc ci: Add staticcheck and fix most errors
Three of the checks are temporarily disabled to limit the size of the
diff, and allow us to enable all the other checks in CI.

In a follow up we can fix the issues reported by the other checks one
at a time, and enable them.
2020-05-28 11:59:58 -04:00
Daniel Nephin e359b10f77
Merge pull request #7963 from hashicorp/dnephin/replace-lib-translate-keys
Replace lib.TranslateKeys with a mapstructure decode hook
2020-05-27 16:51:26 -04:00
Daniel Nephin 8f939da431 config: use the new HookTranslateKeys instead of lib.TranslateKeys
With the exception of CA provider config, which will be migrated at some
later time.
2020-05-27 16:24:47 -04:00
Daniel Nephin 644eb3b33a Add alias struct tags for new decode hook 2020-05-27 16:24:47 -04:00
R.B. Boyer 54c7f825d6
create lib/stringslice package (#7934) 2020-05-27 11:47:32 -05:00
R.B. Boyer 813d69622e
agent: handle re-bootstrapping in a secondary datacenter when WAN federation via mesh gateways is configured (#7931)
The main fix here is to always union the `primary-gateways` list with
the list of mesh gateways in the primary returned from the replicated
federation states list. This will allow any replicated (incorrect) state
to be supplemented with user-configured (correct) state in the config
file. Eventually the game of random selection whack-a-mole will pick a
winning entry and re-replicate the latest federation states from the
primary. If the user-configured state is actually the incorrect one,
then the same eventual correct selection process will work in that case,
too.

The secondary fix is actually to finish making wanfed-via-mgws actually
work as originally designed. Once a secondary datacenter has replicated
federation states for the primary AND managed to stand up its own local
mesh gateways then all of the RPCs from a secondary to the primary
SHOULD go through two sets of mesh gateways to arrive in the consul
servers in the primary (one hop for the secondary datacenter's mesh
gateway, and one hop through the primary datacenter's mesh gateway).
This was neglected in the initial implementation. While everything
works, ideally we should treat communications that go around the mesh
gateways as just provided for bootstrapping purposes.

Now we heuristically use the success/failure history of the federation
state replicator goroutine loop to determine if our current mesh gateway
route is working as intended. If it is, we try using the local gateways,
and if those don't work we fall back on trying the primary via the union
of the replicated state and the go-discover configuration flags.

This can be improved slightly in the future by possibly initializing the
gateway choice to local on startup if we already have replicated state.
This PR does not address that improvement.

Fixes #7339
2020-05-27 11:31:10 -05:00
Raphaël Rondeau b799471e29
connect: fix endpoints clusterName when using cluster escape hatch (#7319)
```changelog
* fix(connect): fix endpoints clusterName when using cluster escape hatch
```
2020-05-26 10:57:22 +02:00
Pierre Souchay fa43d427cb
Stop all watches before shuting down anything dring shutdown. (#7526)
This will prevent watches from being triggered.

```changelog
* fix(agent):  stop all watches before shuting down
```
2020-05-26 10:01:49 +02:00