open-consul/agent/consul
R.B. Boyer d65008700a
acl: reduce complexity of token resolution process with alternative singleflighting (#5480)

Switches acl resolution to use golang.org/x/sync/singleflight. For the
identity/legacy lookups this is a drop-in replacement with the same
overall approach to request coalescing.

For policies this is technically a change in behavior, but when
considered holistically is approximately performance neutral (with the
benefit of less code).

There are two goals with this blob of code (speaking specifically of
policy resolution here):

  1) Minimize cross-DC requests.
  2) Minimize client-to-server LAN requests.

The previous iteration of this code was optimizing for the case of many
possibly different tokens being resolved concurrently that have a
significant overlap in linked policies such that deduplication would be
worth the complexity. While this is laudable there are some things to
consider that can help to adjust expectations:

  1) For v1.4+ policies are always replicated, and once a single policy
  shows up in a secondary DC the replicated data is considered
  authoritative for requests made in that DC. This means that our
  earlier concerns about minimizing cross-DC requests are irrelevant
  because there will be no cross-DC policy reads that occur.

  2) For Server nodes the in-memory ACL policy cache is capped at zero,
  meaning it has no caching. Only Client nodes run with a cache. This
  means that instead of coalescing policy resolutions across an entire
  DC's worth of tokens (what a Server might see), these nodes will only
  ever see node-local token resolutions. In a reasonable worst-case
  scenario where a scheduler like Kubernetes has "filled" a node with
  Connect services, that is still only ~100 Connect services per node.
  If every service has a unique token there will only be 100 tokens to
  coalesce, and even then those requests have to occur concurrently AND
  hit an empty Consul cache.

Instead of seeing a great coalescing opportunity for cutting down on
redundant Policy resolutions, in practice it's far more likely given
node densities that you'd see requests for the same token concurrently
than you would for two tokens sharing a policy concurrently (to a degree
that would warrant the overhead of the current variation of
singleflighting).

Given that, this patch switches the Policy resolution process to only
singleflight by requesting token (but keeps the cache as by-policy).
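
A minimal sketch of that shape, not the code in acl.go: lookups are
coalesced per requesting token while the cache stays keyed by policy ID.
To keep it standard-library only, flightGroup is a simplified stand-in
for singleflight.Group from golang.org/x/sync/singleflight, and
resolver, newResolver, resolvePolicies and the fetch callback are all
hypothetical names chosen for illustration.

```go
package main

import (
	"fmt"
	"sync"
)

// call tracks one in-flight lookup that later callers can wait on.
type call struct {
	wg  sync.WaitGroup
	val []string
	err error
}

// flightGroup is a simplified stand-in for singleflight.Group:
// concurrent Do calls with the same key share one execution of fn.
type flightGroup struct {
	mu sync.Mutex
	m  map[string]*call
}

func (g *flightGroup) Do(key string, fn func() ([]string, error)) ([]string, error) {
	g.mu.Lock()
	if g.m == nil {
		g.m = make(map[string]*call)
	}
	if c, ok := g.m[key]; ok {
		// Someone is already resolving this key: wait and share the result.
		g.mu.Unlock()
		c.wg.Wait()
		return c.val, c.err
	}
	c := &call{}
	c.wg.Add(1)
	g.m[key] = c
	g.mu.Unlock()

	c.val, c.err = fn()
	c.wg.Done()

	g.mu.Lock()
	delete(g.m, key)
	g.mu.Unlock()
	return c.val, c.err
}

// resolver coalesces by token but caches by policy ID (hypothetical type).
type resolver struct {
	group flightGroup
	mu    sync.Mutex
	cache map[string]string // policy ID -> rules
	fetch func(ids []string) (map[string]string, error)
}

func newResolver(fetch func(ids []string) (map[string]string, error)) *resolver {
	return &resolver{cache: make(map[string]string), fetch: fetch}
}

// resolvePolicies returns the rules for policyIDs in order. Only policies
// missing from the cache are fetched; an ID the fetch does not return
// yields an empty string in this sketch.
func (r *resolver) resolvePolicies(token string, policyIDs []string) ([]string, error) {
	return r.group.Do(token, func() ([]string, error) {
		var missing []string
		r.mu.Lock()
		for _, id := range policyIDs {
			if _, ok := r.cache[id]; !ok {
				missing = append(missing, id)
			}
		}
		r.mu.Unlock()

		if len(missing) > 0 {
			fetched, err := r.fetch(missing)
			if err != nil {
				return nil, err
			}
			r.mu.Lock()
			for id, rules := range fetched {
				r.cache[id] = rules
			}
			r.mu.Unlock()
		}

		out := make([]string, len(policyIDs))
		r.mu.Lock()
		for i, id := range policyIDs {
			out[i] = r.cache[id]
		}
		r.mu.Unlock()
		return out, nil
	})
}

func main() {
	r := newResolver(func(ids []string) (map[string]string, error) {
		fmt.Println("fetching", ids)
		out := make(map[string]string)
		for _, id := range ids {
			out[id] = "rules:" + id
		}
		return out, nil
	})
	a, _ := r.resolvePolicies("token-a", []string{"p1", "p2"})
	b, _ := r.resolvePolicies("token-b", []string{"p2", "p3"})
	fmt.Println(a, b)
}
```

In this sketch two different tokens that share a policy are not
coalesced with each other; the second token simply finds the shared
policy in the by-policy cache and fetches only what is still missing.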
2019-03-14 09:35:34 -05:00
autopilot autopilot: don't follow the normal server removal rules for nonvoters 2018-08-14 14:24:51 -07:00
fsm Store leaf cert indexes in raft and use for the ModifyIndex on the returned certs (#5211) 2019-01-11 16:04:57 -05:00
prepared_query Allow ignoring checks by ID when defining a PreparedQuery. Fixes #3727. 2018-04-10 14:04:16 +01:00
state [catalog] Update the node's services indexes on update (#5458) 2019-03-11 14:48:19 +00:00
acl.go acl: reduce complexity of token resolution process with alternative singleflighting (#5480) 2019-03-14 09:35:34 -05:00
acl_client.go New ACLs (#4791) 2018-10-19 12:04:07 -04:00
acl_endpoint.go fix typos reported by golangci-lint:misspell (#5434) 2019-03-06 11:13:28 -06:00
acl_endpoint_legacy.go acl: add stub hooks to support some plumbing in enterprise (#4951) 2018-11-13 15:35:54 -06:00
acl_endpoint_test.go fix typos 2019-03-06 14:47:33 -05:00
acl_replication.go Fixes for CVE-2019-8336 2019-03-04 19:13:24 +00:00
acl_replication_legacy.go ACL Token Persistence and Reloading (#5328) 2019-02-27 14:28:31 -05:00
acl_replication_legacy_test.go ACL Token Persistence and Reloading (#5328) 2019-02-27 14:28:31 -05:00
acl_replication_test.go Fixes for CVE-2019-8336 2019-03-04 19:13:24 +00:00
acl_server.go ACL Token Persistence and Reloading (#5328) 2019-02-27 14:28:31 -05:00
acl_test.go acl: reduce complexity of token resolution process with alternative singleflighting (#5480) 2019-03-14 09:35:34 -05:00
autopilot.go Remove deprecated metric names 2018-05-08 16:23:15 -07:00
autopilot_oss.go Fix vet error 2017-12-18 18:04:42 -08:00
autopilot_test.go autopilot: don't follow the normal server removal rules for nonvoters 2018-08-14 14:24:51 -07:00
catalog_endpoint.go Merge pull request #4869 from hashicorp/txn-checks 2019-01-22 11:16:09 -08:00
catalog_endpoint_test.go Avoid to have infinite recursion in DNS lookups when resolving CNAMEs (#4918) 2019-01-07 16:53:54 -05:00
client.go agent: enable reloading of tls config (#5419) 2019-03-13 10:29:06 +01:00
client_serf.go Call RemoveServer for reap events (#5317) 2019-03-04 09:19:35 -05:00
client_test.go improve flaky LANReap tests by expliciting configuring the tombstone timeout 2019-03-05 14:34:03 -06:00
config.go agent: enable reloading of tls config (#5419) 2019-03-13 10:29:06 +01:00
connect_ca_endpoint.go fix typos reported by golangci-lint:misspell (#5434) 2019-03-06 11:13:28 -06:00
connect_ca_endpoint_test.go fix typos reported by golangci-lint:misspell (#5434) 2019-03-06 11:13:28 -06:00
consul_ca_delegate.go Move connect CA provider to separate package 2018-06-14 09:42:15 -07:00
coordinate_endpoint.go New ACLs (#4791) 2018-10-19 12:04:07 -04:00
coordinate_endpoint_test.go New ACLs (#4791) 2018-10-19 12:04:07 -04:00
enterprise_client_oss.go Allow for easy enterprise/oss coexistence 2018-05-24 10:36:42 -04:00
enterprise_server_oss.go oss: add the enterprise server stub for intention replication check 2019-01-18 17:32:10 -08:00
filter.go txn: add ACL enforcement/validation to new txn ops 2018-12-12 10:04:10 -08:00
filter_test.go New ACLs (#4791) 2018-10-19 12:04:07 -04:00
flood.go Add segment addr field to tags for LAN flood joiner 2017-08-30 11:58:29 -07:00
health_endpoint.go bugfix: use ServiceTags to generate cache key hash (#4987) 2019-01-07 21:30:47 +00:00
health_endpoint_test.go bugfix: use ServiceTags to generate cache key hash (#4987) 2019-01-07 21:30:47 +00:00
helper_test.go acl: Prevent tokens from deleting themselves (#5210) 2019-01-10 09:22:51 -05:00
intention_endpoint.go connect: Forward intention RPCs if this isn't the primary 2019-01-22 11:29:21 -08:00
intention_endpoint_test.go New ACLs (#4791) 2018-10-19 12:04:07 -04:00
internal_endpoint.go New ACLs (#4791) 2018-10-19 12:04:07 -04:00
internal_endpoint_test.go New ACLs (#4791) 2018-10-19 12:04:07 -04:00
issue_test.go Moves the FSM into its own package. 2017-11-29 18:36:53 -08:00
kvs_endpoint.go New ACLs (#4791) 2018-10-19 12:04:07 -04:00
kvs_endpoint_test.go New ACLs (#4791) 2018-10-19 12:04:07 -04:00
leader.go Fixes for CVE-2019-8336 2019-03-04 19:13:24 +00:00
leader_oss.go re-add Connect multi-dc config changes 2018-10-19 08:41:03 -07:00
leader_test.go Avoid to have infinite recursion in DNS lookups when resolving CNAMEs (#4918) 2019-01-07 16:53:54 -05:00
merge.go Takes the skip out of the client check. 2017-09-06 17:05:40 -07:00
merge_test.go Skips unique node ID check for old versions of Consul. 2017-09-05 22:57:29 -07:00
operator_autopilot_endpoint.go New ACLs (#4791) 2018-10-19 12:04:07 -04:00
operator_autopilot_endpoint_test.go New ACLs (#4791) 2018-10-19 12:04:07 -04:00
operator_endpoint.go pkg refactor 2017-06-10 18:52:45 +02:00
operator_raft_endpoint.go New ACLs (#4791) 2018-10-19 12:04:07 -04:00
operator_raft_endpoint_test.go [Travis][UnstableTests] Fixed unstable tests in travis (#5013) 2018-12-12 12:09:42 -08:00
prepared_query_endpoint.go Improve Connect with Prepared Queries (#5291) 2019-02-04 09:36:51 -05:00
prepared_query_endpoint_test.go Improve Connect with Prepared Queries (#5291) 2019-02-04 09:36:51 -05:00
raft_rpc.go agent: move conn pool for muxed connections into separate pkg 2017-06-21 05:42:39 +02:00
rpc.go fix typos reported by golangci-lint:misspell (#5434) 2019-03-06 11:13:28 -06:00
rpc_test.go Fixed another list of unstable unit tests in travis (#4915) 2018-11-20 11:27:26 +00:00
rtt.go Added Coordinate.Node rpc endpoint and client api method 2017-10-26 19:16:40 -07:00
rtt_test.go Fix more unstable tests in agent and command 2018-09-12 14:49:27 +01:00
segment_oss.go Remove deprecated metric names 2018-05-08 16:23:15 -07:00
serf_test.go pkg refactor 2017-06-10 18:52:45 +02:00
server.go agent: enable reloading of tls config (#5419) 2019-03-13 10:29:06 +01:00
server_lookup.go More cleanup from code review 2017-08-30 12:31:36 -05:00
server_lookup_test.go More cleanup from code review 2017-08-30 12:31:36 -05:00
server_oss.go connect: tame thundering herd of CSRs on CA rotation (#5228) 2019-01-22 17:19:36 +00:00
server_serf.go Call RemoveServer for reap events (#5317) 2019-03-04 09:19:35 -05:00
server_test.go agent: enable reloading of tls config (#5419) 2019-03-13 10:29:06 +01:00
session_endpoint.go New ACLs (#4791) 2018-10-19 12:04:07 -04:00
session_endpoint_test.go New ACLs (#4791) 2018-10-19 12:04:07 -04:00
session_timers.go address review comments 2017-07-07 09:22:34 +02:00
session_timers_test.go rpc: refactor sessionTimers and fix racy tests 2017-07-07 09:22:34 +02:00
session_ttl.go Remove deprecated metric names 2018-05-08 16:23:15 -07:00
session_ttl_test.go [Travis][UnstableTests] Fixed unstable tests in travis (#5013) 2018-12-12 12:09:42 -08:00
snapshot_endpoint.go New ACLs (#4791) 2018-10-19 12:04:07 -04:00
snapshot_endpoint_test.go New ACLs (#4791) 2018-10-19 12:04:07 -04:00
stats_fetcher.go Fix stats fetcher healthcheck RPCs not being independent 2018-08-14 14:23:52 -07:00
stats_fetcher_test.go Fix more unstable tests in agent and command 2018-09-12 14:49:27 +01:00
status_endpoint.go Move autopilot to a standalone package 2017-12-11 16:45:33 -08:00
status_endpoint_test.go Fixed another list of unstable unit tests in travis (#4915) 2018-11-20 11:27:26 +00:00
txn_endpoint.go fix typos reported by golangci-lint:misspell (#5434) 2019-03-06 11:13:28 -06:00
txn_endpoint_test.go txn: add tests for RPC endpoint 2018-12-12 10:04:10 -08:00
util.go Remaining ACL Unit Tests (#4852) 2018-10-31 13:00:46 -07:00
util_test.go pkg refactor 2017-06-10 18:52:45 +02:00