Commit Graph

2952 Commits

Author SHA1 Message Date
hashicorp-ci 3b442c77f4 auto-updated agent/uiserver/bindata_assetfs.go from commit 96204a21d 2021-02-11 11:42:07 +00:00
hashicorp-ci 7c29524b95 auto-updated agent/uiserver/bindata_assetfs.go from commit 369637aff 2021-02-11 10:10:21 +00:00
hashicorp-ci cd87ee21f8 auto-updated agent/uiserver/bindata_assetfs.go from commit eda38a0ba 2021-02-11 09:54:29 +00:00
R.B. Boyer 194fb0d144
connect: update supported envoy point releases to 1.16.2, 1.15.3, 1.14.6, 1.13.7 (#9737) 2021-02-10 13:11:15 -06:00
Pierre Souchay 54dbcd0bb9 Do not filter tags unless req.TagFilter is set
Send empty array [] instead of [""] in DNS requests when TagFilter is not set

Do not change case sensitivity of services anymore in `getServiceNodes()` since
cache keys are now case insensitive
2021-02-10 10:36:11 +01:00
Daniel Nephin bd122bb9f5 streaming: double the cache TTL
10 minutes is the default blocking query timeout. Using the same value results in us hitting
the expired cache entry bug frequently. By extending this TTL we at least mitigate the problem.

The underlying bug still needs to be fixed.
2021-02-09 14:36:26 -05:00
Daniel Nephin 3ab546623f submatview: do not reset retry waiter when materializer is reset
The materializer is often reset when an error is received. By resetting
the retryWaiter we effectively never wait. The retryWaiter should only
be reset when we get an event without error. This is done in
Materializer.updateView().
2021-02-09 13:56:50 -05:00
Daniel Nephin 041f96639d api: Use blocking query for health when near is set
Streaming can not be used for these queries because the near query
paramter indicates a specific sort of the results, and that sort
requires data that is not available to the client from the streaming
API.
2021-02-09 13:55:33 -05:00
Pierre Souchay 2c44a731c8 Use lower case for serviceName computation of cache keys 2021-02-09 19:19:40 +01:00
Matt Keeler 19c99dc104
Stop background refresh of cached data for requests that result in ACL not found errors (#9738) 2021-02-09 10:15:53 -05:00
Freddy 5a50b26767
Avoid potential proxycfg/xDS deadlock using non-blocking send 2021-02-08 16:14:06 -07:00
R.B. Boyer 91d9544803
connect: connect CA Roots in the primary datacenter should use a SigningKeyID derived from their local intermediate (#9428)
This fixes an issue where leaf certificates issued in primary
datacenters using Vault as a Connect CA would be reissued very
frequently (every ~20 seconds) because the logic meant to detect root
rotation was errantly triggering.

The hash of the rootCA was being compared against a hash of the
intermediateCA and always failing. This doesn't apply to the Consul
built-in CA provider because there is no intermediate in use in the
primary DC.

This is reminiscent of #6513
2021-02-08 13:18:51 -06:00
Pierre Souchay cec640a1f2 [Streaming] Properly filters node-meta queries on health
This wil fix https://github.com/hashicorp/consul/issues/9730
2021-02-08 17:53:18 +01:00
freddygv a417f88e44 Update comments on avoiding proxycfg deadlock 2021-02-08 09:45:45 -07:00
R.B. Boyer 77424e179a
xds: prevent LDS flaps in mesh gateways due to unstable datacenter lists (#9651)
Also fix a similar issue in Terminating Gateways that was masked by an overzealous test.
2021-02-08 10:19:57 -06:00
Mohammad Banikazemi 45b11c713a Correcting the changed function name in comment
Signed-off-by: Mohammad Banikazemi <mbanikazemi@gmail.com>
2021-02-06 20:23:40 -05:00
freddygv 0a8f2f2105 Retry send after timer fires, in case no updates occur 2021-02-05 18:00:59 -07:00
Daniel Nephin cdda3b9321 state: Use the tableIndex constant 2021-02-05 18:37:45 -05:00
Daniel Nephin de841bd459 state: Document index table
And move the IndexEntry (which is stored in the table) next to the table
schema definition.
2021-02-05 18:37:45 -05:00
R.B. Boyer 05d767b8d6
xds: deduplicate mesh gateway listeners in a stable way (#9650)
In a situation where the mesh gateway is configured to bind to multiple
network interfaces, we use a feature called 'tagged addresses'.
Sometimes an address is duplicated across multiple tags such as 'lan'
and 'lan_ipv4'.

There is code to deduplicate these things when creating envoy listeners,
but that code doesn't ensure that the same tag wins every time. If the
winning tag flaps between xDS discovery requests it will cause the
listener to be drained and replaced.
2021-02-05 16:28:07 -06:00
freddygv 8de6b2590c Make xDS labeling consistent with proxycfg 2021-02-05 15:15:52 -07:00
freddygv 57c29aba5d Update proxycfg logging, labels were already attached 2021-02-05 15:14:49 -07:00
Daniel Nephin 23cfbc8f8d
Merge pull request #9719 from hashicorp/oss/state-store-4
state: remove registerSchema
2021-02-05 14:02:38 -05:00
Daniel Nephin dc70f583d4
Merge pull request #9718 from hashicorp/oss/dnephin/ent-meta-in-state-store-3
state: convert all table name constants to the new prefix pattern
2021-02-05 14:02:07 -05:00
Daniel Nephin eb5d71fd19
Merge pull request #9665 from hashicorp/dnephin/state-store-indexes-2
state: move config-entries table definition to config_entries_schema.go
2021-02-05 14:01:08 -05:00
Daniel Nephin 9beadc578b
Merge pull request #9664 from hashicorp/dnephin/state-store-indexes
state: move ACL schema and index definitions to acl_schema.go
2021-02-05 13:38:31 -05:00
Daniel Nephin b747b27afd state: remove the need for registerSchema
registerSchema creates some indirection which is not necessary in this
case. newDBSchema can call each of the tables.

Enterprise tables can be added from the existing withEnterpriseSchema
shim.
2021-02-05 12:19:56 -05:00
Daniel Nephin 33621706ac state: rename table name constants to use pattern
the 'table' prefix is shorter, and also reads better in queries.
2021-02-05 12:12:19 -05:00
Daniel Nephin 8569295116 state: rename connect constants 2021-02-05 12:12:19 -05:00
Daniel Nephin afdbf2a8ef state: rename table name constants to new pattern
Using Apps Hungarian Notation for these constants makes the memdb queries more readable.
2021-02-05 12:12:18 -05:00
Pierre Souchay c466b08481 Streaming filter tags + case insensitive lookups for Service Names
Will fix:
 * https://github.com/hashicorp/consul/issues/9695
 * https://github.com/hashicorp/consul/issues/9702
2021-02-04 11:00:51 +01:00
Daniel Nephin f929a7117e state: Remove unnecessary entMeta arg to EnsureConfigEntry 2021-02-03 18:10:38 -05:00
freddygv a0be7dcc1d Add trace logs to proxycfg state runner and xds srv 2021-02-02 12:26:38 -07:00
freddygv 0fb96afe31 Avoid potential deadlock using non-blocking send
Deadlock scenario:
    1. Due to scheduling, the state runner sends one snapshot into
    snapCh and then attempts to send a second. The first send succeeds
    because the channel is buffered, but the second blocks.
    2. Separately, Manager.Watch is called by the xDS server after
    getting a discovery request from Envoy. This function acquires the
    manager lock and then blocks on receiving the CurrentSnapshot from
    the state runner.
    3. Separately, there is a Manager goroutine that reads the snapshots
    from the channel in step 1. These reads are done to notify proxy
    watchers, but they require holding the manager lock. This goroutine
    goes to acquire that lock, but can't because it is held by step 2.

Now, the goroutine from step 3 is waiting on the one from step 2 to
release the lock. The goroutine from step 2 won't release the lock until
the goroutine in step 1 advances. But the goroutine in step 1 is waiting
for the one in step 3. Deadlock.

By making this send non-blocking step 1 above can proceed. The coalesce
timer will be reset and a new valid snapshot will be delivered after it
elapses or when one is requested by xDS.
2021-02-02 11:31:14 -07:00
hashicorp-ci 7e7b56e13e auto-updated agent/uiserver/bindata_assetfs.go from commit e0ff7080a 2021-02-02 10:08:48 +00:00
hashicorp-ci d979b67d63 auto-updated agent/uiserver/bindata_assetfs.go from commit 0b7d676dc 2021-02-01 17:55:03 +00:00
hashicorp-ci 083fa1693b auto-updated agent/uiserver/bindata_assetfs.go from commit 3aef5cde2 2021-02-01 17:35:20 +00:00
hashicorp-ci d156596776 auto-updated agent/uiserver/bindata_assetfs.go from commit 3477b1de7 2021-01-29 16:03:41 +00:00
Daniel Nephin 09425b22a1 state: rename config-entries table const to match new pattern 2021-01-28 20:34:34 -05:00
Daniel Nephin 7d17e20270 state: move config-entries table to new pattern 2021-01-28 20:34:15 -05:00
Daniel Nephin 825b8ade39 state: use indexID
this change was already made to enterprise, so backporting it.
2021-01-28 20:30:08 -05:00
Daniel Nephin 2a262f07fc state: Move ACL schema indexes to match Ent
and use constants for table and index names.
2021-01-28 20:05:09 -05:00
Daniel Nephin 2eea58bcc4
Merge pull request #9302 from hashicorp/dnephin/add-service-3
agent: remove ServiceManager.Start goroutine
2021-01-28 16:59:41 -05:00
Daniel Nephin fc88c28bb9 config: make config.TestLoad_FullConfig use config.Load
This commit makes a number of changes that should make
TestLoad_FullConfig easier to work with, and make the test more like
real world scenarios.

* use separate files in testdata/ dir to store the config source.
  Separate files are much easier to edit because editors can syntax
  highlight json/hcl, and it makes strings easier to find. Previously
  trying to find strings would match strings used in other tests.
* use the exported config.Load interface instead of internal NewBuilder
  and BuildAndValidate.
* remove the tail config overrides, which are only necessary with
  nonZero works.
2021-01-27 17:51:53 -05:00
Daniel Nephin aa42ff5d81 config: Unexport Builder and NewBuilder
This type and constructor are implementation details of config loading.
All callers should use config.Load.
2021-01-27 17:41:53 -05:00
Daniel Nephin 08b2039f5a config: replace calls to config.NewBuilder with config.Load
This is another incremental change to reduce config loading to a single
small interface. All calls to NewBuilder can be replaced with Load.
2021-01-27 17:34:43 -05:00
Daniel Nephin bd545963cc config: improve the interface of Load
This commit reduces the interface to Load() a bit, in preparation for
unexporting NewBuilder and having everything call Load.

The three arguments are reduced to a single argument by moving the other
two into the options struct.

The three return values are reduced to two by moving the RuntimeConfig
and Warnings into a LoadResult struct.
2021-01-27 17:34:43 -05:00
Daniel Nephin 63ba836257
Merge pull request #9252 from hashicorp/dnephin/config-unmethod
config: remove Builder receiver from funcs that dont use it
2021-01-27 17:31:17 -05:00
Matt Keeler 1379b5f7d6
Upgrade raft-autopilot and wait for autopilot it to stop when revoking leadership (#9644)
Fixes: 9626
2021-01-27 11:14:52 -05:00
hashicorp-ci 614c57a9c6 auto-updated agent/uiserver/bindata_assetfs.go from commit 25f989753 2021-01-27 10:47:58 +00:00
Hans Hasselberg 623aab5880
Add flags to support CA generation for Connect (#9585) 2021-01-27 08:52:15 +01:00
hashicorp-ci 773f1c6b21 auto-updated agent/uiserver/bindata_assetfs.go from commit 92f0eb3bd 2021-01-26 18:00:09 +00:00
hashicorp-ci 68f65fd4a9 auto-updated agent/uiserver/bindata_assetfs.go from commit 82a62cd2e 2021-01-26 17:47:18 +00:00
Daniel Nephin 18fcce575a
Merge pull request #9301 from hashicorp/dnephin/add-service-2
agent: reduce AddService 2
2021-01-26 12:01:34 -05:00
R.B. Boyer 5777fa1f59
server: initialize mgw-wanfed to use local gateways more on startup (#9528)
Fixes #9342
2021-01-25 17:30:38 -06:00
Daniel Nephin e1e94ce69c agent: use the new lib/mutex for stateLock
Previously the ServiceManager had to run a separate goroutine so that it could block on a channel
send/receive instead of a lock. Using this mutex with TryLock allows us to cancel the lock when
the serviceConfigWatch is stopped.

Without this change removing the ServiceManager.Start goroutine would not be possible because
when AddService is called it acquires the stateLock. While that lock is held, if there are
existing watches for the service, the old watch will be stopped, and the goroutine holding the
lock will attempt to wait for that watcher goroutine to exit.

If the goroutine is handling an update (serviceConfigWatch.handleUpdate) then it can block on
acquiring the stateLock and deadlock the agent.  With this change the context is cancelled as
and the goroutine will exit instead of waiting on the stateLock.
2021-01-25 18:01:47 -05:00
Daniel Nephin 08e8ed0a7e agent: remove ServiceManager goroutine
The ServiceManager.Start goroutine was used to serialize calls to
agent.addServiceInternal.

All the goroutines which sent events to the channel would block waiting
for a response from that same goroutine, which is effectively the same
as a synchronous call without any channels.

This commit removes the goroutine and channels, and instead calls
addServiceInternal directly. Since all of these goroutines will need to
take the agent.stateLock, the mutex handles the serializing of calls.
2021-01-25 18:01:47 -05:00
Daniel Nephin 081173652c agent: Minor cosmetic changes in ServiceManager
Also use the non-deprecated func in a test
2021-01-25 18:01:47 -05:00
Daniel Nephin 627371bc37 agent: update godoc for AddServiceRequest 2021-01-25 18:01:03 -05:00
Daniel Nephin 121fd18f9a agent: move checkStateSnapshot
Move the field into the struct for addServiceLocked. Also don't require
setting a default value, so that the callers can leave it as nil if they
don't already have a snapshot.
2021-01-25 18:01:03 -05:00
Daniel Nephin 4d0dd9f5ff agent: move two fields off of AddServiceRequest 2021-01-25 18:01:03 -05:00
Daniel Nephin 76ba6c2198 agent: Replace two fields on AddServiceRequest with a func field
The two previous fields were mutually exclusive. They can be represented
with a single function which provides the value.
2021-01-25 18:01:03 -05:00
Daniel Nephin 5e31bdf51a agent: remove serviceRegiration type
Replace with the existing AddServiceRequest struct. These structs are
almost identical. Additionally, the only reason the serviceRegistration
struct existed was to recreate an AddServiceRequest.

By storing and re-using the AddServiceRequest we remove the need to
translate into one type and back to the original type.

We also remove the extra parameters to a function, because those values
are already available from the AddServiceRequest field.

Also a minor optimization to only call tokens.AgentToken() when
necessary. Previous it was being called every time, but the value was
being ignored if the AddServiceRequest had a token.
2021-01-25 18:01:03 -05:00
Daniel Nephin 75b2c55291 agent: remove an a branch in the AddService flow
Handle the decision to use ServiceManager in a single place. Instead of
calling ServiceManager.AddService, then calling back into
addServiceInternal, only call ServiceManager.AddService if we are going
to use it.

This change removes some small duplication and removes a branch from the
AddService flow.
2021-01-25 18:01:03 -05:00
Daniel Nephin 1ce7cdd892 agent: use fields directly, not temp variables
The temprorary variables make it much harder to trace where and how struct
fields are used. If a field is only used a small number of times than
refer to the field directly.
2021-01-25 17:25:04 -05:00
Daniel Nephin 493e987a88 agent: addServiceIternalRequest
Move fields that are only relevant for addServiceInternal onto a new
struct and embed AddServiceRequest.
2021-01-25 17:25:04 -05:00
Daniel Nephin a0b11b4c20 agent: move deprecated AddServiceFromSource to a test file
The method is only used in tests, and only exists for legacy calls.

There was one other package which used this method in tests. Export
the AddServiceRequest and a couple of its fields so the new function can
be used in those tests.
2021-01-25 17:25:03 -05:00
Daniel Nephin de1a80b368 agent: use a single method for Agent.AddService 2021-01-25 17:25:03 -05:00
Daniel Nephin 1c8eca2bfd agent: rename AddService->AddServiceFromSource
In preparation for extracting a single AddService func that accepts a request struct.
2021-01-25 17:25:01 -05:00
Daniel Nephin d7d081f402
Merge pull request #9420 from hashicorp/dnephin/reduce-duplicate-in-catalog-schema
state: reduce interface for Enterprise schema
2021-01-25 17:04:25 -05:00
Chris Boulton 448212060a
connect: add local_request_timeout_ms to configure local_app http timeouts (#9554) 2021-01-25 13:50:00 -06:00
R.B. Boyer 6622185d64
server: use the presense of stored federation state data as a sign that we already activated the federation state feature flag (#9519)
This way we only have to wait for the serf barrier to pass once before
we can make use of federation state APIs Without this patch every
restart needs to re-compute the change.
2021-01-25 13:24:32 -06:00
hashicorp-ci 7c8cfb7a2a auto-updated agent/uiserver/bindata_assetfs.go from commit 09d917618 2021-01-25 19:09:57 +00:00
hashicorp-ci b9d7f04760 auto-updated agent/uiserver/bindata_assetfs.go from commit 88d5e00f9 2021-01-25 18:52:54 +00:00
hashicorp-ci eefa22b79b auto-updated agent/uiserver/bindata_assetfs.go from commit bb9573832 2021-01-25 18:19:10 +00:00
R.B. Boyer e3f5a77ade
server: add OSS stubs supporting validation of source namespaces in service-intentions config entries (#9527) 2021-01-25 11:27:38 -06:00
R.B. Boyer 0247f409a0
server: when wan federating via mesh gateways only do heuristic primary DC bypass on the leader (#9366)
Fixes #9341
2021-01-22 10:03:24 -06:00
hashicorp-ci cae9b96b00 auto-updated agent/uiserver/bindata_assetfs.go from commit 02772e46a 2021-01-20 18:46:55 +00:00
John Cowen d3ecb6d7a0
Fix -ui-content-path without regex (#9569)
* Add templating to inject JSON into an application/json script tag

Plus an external script in order to pick it out and inject the values we
need injecting into ember's environment meta tag.

The UI still uses env style naming (CONSUL_*) but we uses the new style
JSON/golang props behind the scenes.

Co-authored-by: Paul Banks <banks@banksco.de>
2021-01-20 18:40:46 +00:00
hashicorp-ci 30d26fa90e auto-updated agent/uiserver/bindata_assetfs.go from commit ffb6680e7 2021-01-20 17:08:19 +00:00
John Cowen 82a5e6d6fc
api: Ensure the internal/ui/gateway-service-nodes endpoint responds with an array (#9593)
In some circumstances this endpoint will have no results in it (dues to
ACLs, Namespaces, filtering or missing configuration).

This ensures that the response is at least an empty array (`[]`) rather
than `null`
2021-01-20 16:59:02 +00:00
Matt Keeler d871a6f260
Fix flaky test by marking mock expectations as optional (#9596)
These expectations are optional because in a slow CI environment the deadline to cancell the context might occur before the go routine reaches issuing the RPC. Either way we are successfully ensuring context cancellation is working.
2021-01-20 10:58:27 -05:00
hashicorp-ci 884a2d61dc auto-updated agent/uiserver/bindata_assetfs.go from commit 30014ff8f 2021-01-20 15:43:19 +00:00
Freddy 5519051c84
Update topology mapping Refs on all proxy instance deletions (#9589)
* Insert new upstream/downstream mapping to persist new Refs

* Avoid upserting mapping copy if it's a no-op

* Add test with panic repro

* Avoid deleting up/downstreams from inside memdb iterator

* Avoid deleting gateway mappings from inside memdb iterator

* Add CHANGELOG entry

* Tweak changelog entry

Co-authored-by: Paul Banks <banks@banksco.de>
2021-01-20 15:17:26 +00:00
Daniel Nephin 979749d86e state: do not delete from inside an iteration
Deleting from memdb inside an interation can cause a panic from Iterator.Next. This
case is technically safe (for now) because the iterator is using the root radix tree
not a modified one.

However this could break at any time if someone adds an insert or delete to the coordinates table
before this place in the function.

It also sets a bad example, because generally deletes in an interator are not safe. So this
commit uses the pattern we have in other places to move the deletes out of the iteration.
2021-01-19 17:00:07 -05:00
Matt Keeler 2d7a4073e3
Merge pull request #9570 from hashicorp/bugfix/9498 2021-01-19 16:30:04 -05:00
Matt Keeler 2d2ce1fb0c
Ensure that CA initialization does not block leader election.
After fixing that bug I uncovered a couple more:

Fix an issue where we might try to cross sign a cert when we never had a valid root.
Fix a potential issue where reconfiguring the CA could cause either the Vault or AWS PCA CA providers to delete resources that are still required by the new incarnation of the CA.
2021-01-19 15:27:48 -05:00
hashicorp-ci 4b3361fb03 auto-updated agent/uiserver/bindata_assetfs.go from commit be694366a 2021-01-19 15:47:02 +00:00
hashicorp-ci 88ab4cd159 auto-updated agent/uiserver/bindata_assetfs.go from commit 41a4a9f4f 2021-01-19 15:29:55 +00:00
Daniel Nephin 52a1d78e39 state: add a regression test for state store schema
To allow the index to be refactored without accidental changes.

To update the expected value run: 'go test ./agent/consul/state -update'
2021-01-15 18:49:55 -05:00
Daniel Nephin aa21c1ea04 state: reduce interface for Enterprise schema
Using withEnterpriseSchema() we can apply any enterprise schema changes
with a single shim, removing the need to duplicate all of the table
definitions.

Also move all the catalog schemas to a new file to shrink catalog.go a bit.
2021-01-15 18:49:55 -05:00
Daniel Nephin 45c0a71e7e
Merge pull request #8696 from hashicorp/dnephin/fix-load-limits
agent/consul: make Client/Server config reloading more obvious
2021-01-14 17:40:42 -05:00
Daniel Nephin 27a59f86d0
Merge pull request #9436 from hashicorp/dnephin/fix-service-health-req-cache-key
structs: fix caching of ServiceSpecificRequest when ingress=true
2021-01-14 17:26:25 -05:00
Daniel Nephin e8427a48ab agent/consuk: Rename RPCRate -> RPCRateLimit
so that the field name is consistent across config structs.
2021-01-14 17:26:00 -05:00
Daniel Nephin e5320c2db6 agent/consul: make Client/Server config reloading more obvious
I believe this commit also fixes a bug. Previously RPCMaxConnsPerClient was not being re-read from the RuntimeConfig, so passing it to Server.ReloadConfig was never changing the value.

Also improve the test runtime by not doing a lot of unnecessary work.
2021-01-14 17:21:10 -05:00
Daniel Nephin f2b504873a
Merge pull request #9460 from hashicorp/dnephin/fix-data-races
Fix a couple data races in tests
2021-01-14 17:07:01 -05:00
Daniel Nephin ef9d44fdfe structs: fix caching of ServiceSpecificRequest when ingress=true
The field was not being included in the cache info key. This would result in a DNS request for
web.service.consul returning the same result as web.ingress.consul, when those results should
not be the same.
2021-01-14 17:01:40 -05:00
hashicorp-ci 66ec8e5e1f auto-updated agent/uiserver/bindata_assetfs.go from commit 1e30503ec 2021-01-13 09:47:00 +00:00
kevinkengne 18a12d6517
add completeness test for types with CacheInfo method (#9480)
include all fields when fuzzing in tests
split tests by struct type

Ensure the new value for the field is different

fuzzer.Fuzz could produce the same value again in some cases.

Use a custom fuzz function for QueryOptions. That type is an embedded struct in the request types
but only one of the fields is important to include in the cache key.

Move enterpriseMetaField to an oss file so that we can change it in enterprise.
2021-01-12 19:45:46 -05:00
Chris Piraino baad708929
Fix bug in usage metrics when multiple service instances are changed in a single transaction (#9440)
* Fix bug in usage metrics that caused a negative count to occur

There were a couple of instances were usage metrics would do the wrong
thing and result in incorrect counts, causing the count to attempt to
decrement below zero and return an error. The usage metrics did not
account for various places where a single transaction could
delete/update/add multiple service instances at once.

We also remove the error when attempting to decrement below zero, and
instead just make sure we do not accidentally underflow the unsigned
integer. This is a more graceful failure than returning an error and not
allowing a transaction to commit.

* Add changelog
2021-01-12 15:31:47 -06:00