Commit Graph

1666 Commits

Author SHA1 Message Date
Daniel Nephin e03b7e4c68
Merge pull request #11182 from hashicorp/dnephin/acl-legacy-remove-upgrade
acl: remove upgrade from legacy, start in non-legacy mode
2021-10-04 17:25:39 -04:00
Daniel Nephin b9f0014d70 acl: remove updateEnterpriseSerfTags
The only remaining caller is a test helper, and the tests don't use the enterprise gossip
pools.
2021-10-04 17:01:51 -04:00
Daniel Nephin 5ac360b22d
Merge pull request #11126 from hashicorp/dnephin/acl-legacy-remove-resolve-and-get-policy
acl: remove ACL.GetPolicy RPC endpoint and ACLResolver.resolveTokenLegacy
2021-10-04 16:29:51 -04:00
Connor Kelly ed5693b537
Add metrics to count the number of service-mesh config entries 2021-10-04 14:50:17 -05:00
Connor Kelly 9c487389cf
Add metrics to count connect native service mesh instances
This will add the counts of the service mesh instances tagged by
whether or not it is connect native
2021-10-04 14:37:05 -05:00
Connor Kelly 8000ea45ca
Add metrics to count service mesh Kind instance counts
This will add the counts of service mesh instances tagged by the
different ServiceKind's.
2021-10-04 14:36:59 -05:00
Daniel Nephin b6435259c3 acl: fix test failures caused by remocving legacy ACLs
This commit two test failures:

1. Remove check for "in legacy ACL mode", the actual upgrade will be removed in a following commit.
2. Remove the early WaitForLeader in dc2, because with it the test was
   failing with ACL not found.
2021-10-01 18:03:10 -04:00
Dhia Ayachi 8bd52995d1
fix token list by auth method (#11196)
* add tests to OIDC authmethod and fix entMeta when retrieving auth-methods

* fix oss compilation error
2021-10-01 12:00:43 -04:00
Daniel Nephin ec935a2486 acl: call stop for the upgrade goroutine when done
TestAgentLeaks_Server was reporting a goroutine leak without this. Not sure if it would actually
be a leak in production or if this is due to the test setup, but seems easy enough to call it
this way until we remove legacyACLTokenUpgrade.
2021-09-29 17:36:43 -04:00
Daniel Nephin 0c077d0527 acl: only run startACLUpgrade once
Since legacy ACL tokens can no longer be created we only need to run this upgrade a single
time when leadership is estalbished.
2021-09-29 16:22:01 -04:00
Daniel Nephin f21097beda acl: remove reading of serf acl tags
We no long need to read the acl serf tag, because servers are always either ACL enabled or
ACL disabled.

We continue to write the tag so that during an upgarde older servers will see the tag.
2021-09-29 15:45:11 -04:00
Daniel Nephin b866e3c4f4 acl: fix test failure
For some reason removing legacy ACL upgrade requires using an ACL token now
for this WaitForLeader.
2021-09-29 15:21:30 -04:00
Daniel Nephin ebb2388605 acl: remove legacy ACL upgrades from Server
As part of removing the legacy ACL system
2021-09-29 15:19:23 -04:00
Daniel Nephin 41a97360ca acl: fix test failures caused by remocving legacy ACLs
This commit two test failures:

1. Remove check for "in legacy ACL mode", the actual upgrade will be removed in a following commit.
2. Use the root token in WaitForLeader, because without it the test was
   failing with ACL not found.
2021-09-29 15:15:50 -04:00
Daniel Nephin b73b68d696 acl: remove ACL.GetPolicy endpoint and resolve legacy acls
And all code that was no longer used once those two were removed.
2021-09-29 14:33:19 -04:00
Daniel Nephin b8da06a34d acl: remove ACL upgrading from Clients
As part of removing the legacy ACL system ACL upgrading and the flag for
legacy ACLs is removed from Clients.

This commit also removes the 'acls' serf tag from client nodes. The tag is only ever read
from server nodes.

This commit also introduces a constant for the acl serf tag, to make it easier to track where
it is used.
2021-09-29 14:02:38 -04:00
Daniel Nephin 33a5448604
Merge pull request #11136 from hashicorp/dnephin/acl-resolver-fix-default-authz
acl: fix default Authorizer for down_policy extend-cache/async-cache
2021-09-29 13:45:12 -04:00
Daniel Nephin 2995ac61f2 acl: remove the last of the legacy FSM
Replace it with an implementation that returns an error, and rename some symbols
to use a Deprecated suffix to make it clear.

Also remove the ACLRequest struct, which is no longer referenced.
2021-09-29 12:42:23 -04:00
Daniel Nephin a8358f7575 acl: remove bootstrap-init FSM operation 2021-09-29 12:42:23 -04:00
Daniel Nephin ea2e0ad2ec acl: remove initializeLegacyACL from leader init 2021-09-29 12:42:23 -04:00
Daniel Nephin 4e36442583 acl: remove ACLDelete FSM command, and state store function
These are no longer used now that ACL.Apply has been removed.
2021-09-29 12:42:23 -04:00
Daniel Nephin 7e37c9a765 acl: remove legacy field to ACLBoostrap 2021-09-29 12:42:23 -04:00
Daniel Nephin d4c48a3f23
Merge pull request #11101 from hashicorp/dnephin/acl-legacy-remove-rpc-2
acl: remove legacy ACL.Apply RPC
2021-09-29 12:23:55 -04:00
Daniel Nephin 69a83aefcf
Merge pull request #11177 from hashicorp/dnephin/remove-entmeta-methods
structs: remove EnterpriseMeta helper methods
2021-09-29 12:08:07 -04:00
Daniel Nephin acb62aa896
Merge pull request #10986 from hashicorp/dnephin/acl-legacy-remove-rpc
acl: remove legacy ACL RPC - part 1
2021-09-29 12:04:09 -04:00
Daniel Nephin 1bc07c5166 structs: rename the last helper method.
This one gets used a bunch, but we can rename it to make the behaviour more obvious.
2021-09-29 11:48:38 -04:00
Daniel Nephin 93b3e110b6 structs: remove another helper
We already have a helper funtion.
2021-09-29 11:48:03 -04:00
Chris S. Kim 3f79aaf509
Cleanup unnecessary normalizing method (#11169) 2021-09-28 15:31:12 -04:00
Daniel Nephin 4ed9476a61
Merge pull request #11084 from krastin/krastin-autopilot-loggingtypo
Fix a tiny typo in logging in autopilot.go
2021-09-28 15:11:11 -04:00
Daniel Nephin 30fe14eed3 acl: fix default authorizer for down_policy
This was causing a nil panic because a nil authorizer is no longer valid after the cleanup done
in https://github.com/hashicorp/consul/pull/10632.
2021-09-23 18:12:22 -04:00
Daniel Nephin a6a7069ecf Remove t.Parallel from TestACLResolver_DownPolicy
These tests run in under 10ms, t.Parallel does nothing but slow them down and
make failures harder to debug when one panics.
2021-09-23 18:12:22 -04:00
Dhia Ayachi 4505cb2920
Refactor table index acl phase 2 (#11133)
* extract common methods from oss and ent

* remove unreachable code

* add missing normalize for binding rules

* fix oss to use Query
2021-09-23 15:26:09 -04:00
Dhia Ayachi ebe333b947
Refactor table index (#11131)
* convert tableIndex to use the new pattern

* make `indexFromString` available for oss as well

* refactor `indexUpdateMaxTxn`
2021-09-23 11:06:23 -04:00
Daniel Nephin 3e6dc2a843 acl: remove ACL.Apply
As part of removing the legacy ACL system.
2021-09-22 18:28:08 -04:00
Daniel Nephin 2ce64e2837 acl: made acl rules in tests slightly more specific
When converting these tests from the legacy ACL system to the new RPC endpoints I
initially changed most things to use _prefix rules, because that was equivalent to
the old legacy rules.

This commit modifies a few of those rules to be a bit more specific by replacing the _prefix
rule with a non-prefix one where possible.
2021-09-22 18:24:56 -04:00
Mark Anderson c87d57bfeb
partitions/authmethod-index work from enterprise (#11056)
* partitions/authmethod-index work from enterprise

Signed-off-by: Mark Anderson <manderson@hashicorp.com>
2021-09-22 13:19:20 -07:00
R.B. Boyer ba13416b57
grpc: strip local ACL tokens from RPCs during forwarding if crossing datacenters (#11099)
Fixes #11086
2021-09-22 13:14:26 -05:00
Connor bc04a155fb
Merge pull request #11090 from hashicorp/clly/kv-usage-metrics
Add KVUsage to consul state usage metrics
2021-09-22 11:26:56 -05:00
Connor Kelly bfe6b64ca7
Strip out go 1.17 bits 2021-09-22 11:04:48 -05:00
Daniel Nephin b40bdc9e98 acl: remove remaining tests that use ACL.Apply
In preparation for removing ACL.Apply.

Tests for ACL.Apply, ACL.GetPolicy, and ACL upgrades were removed
because all 3 of those will be removed shortly.

The forth test appears to be for the ACLResolver cache, so the test was moved to the correct
test file, and the name was updated to make it obvious what is being tested.
2021-09-21 19:35:26 -04:00
Daniel Nephin ab91d254a3 fsm: restore the legacy commands
and emit a helpful error message.
2021-09-21 18:35:12 -04:00
Daniel Nephin 0180dd67ff Convert tests to the new ACL system
In preparation for removing ACL.Apply
2021-09-21 18:35:12 -04:00
Daniel Nephin b639f47e3c config: use the new ACL system in tests
In preparation for removing ACL.Apply
2021-09-21 17:57:29 -04:00
Daniel Nephin 2702aecc27 catalog: use the new ACL system in tests
In preparation for removing ACL.Apply
2021-09-21 17:57:29 -04:00
Daniel Nephin ad9748adc3 acl: remove two commented out tests for legacy ACL replication
They were commented out in 2018.
2021-09-21 17:57:29 -04:00
Daniel Nephin 5a31a2e167 acl: replace legacy Get and List RPCs with an error impl
These endpoints are being removed as part of the legacy ACL system.
2021-09-21 17:57:29 -04:00
Daniel Nephin 26f3380688 acl: remove a couple legacy ACL operation constants
structs.ACLForceSet was deprecated 4 years ago, it should be safe to remove now.
ACLBootstrapNow was removed in a recent commit. While it is technically possible that a cluster with mixed version
could still attempt a legacy boostrap, we documented that the legacy system was deprecated in 1.4, so no
clusters that are being upgraded should be attempting a legacy boostrap.
2021-09-21 17:57:29 -04:00
Daniel Nephin 5493ff06cc
Merge pull request #10985 from hashicorp/dnephin/acl-legacy-remove-replication
acl: remove legacy ACL replication
2021-09-21 17:56:54 -04:00
Connor 64852cd3e5
Apply suggestions from code review
Co-authored-by: Matt Keeler <mkeeler@users.noreply.github.com>
2021-09-21 10:52:46 -05:00
Connor Kelly 973b7b5c78
Fix test 2021-09-20 13:44:43 -05:00
Connor Kelly 698fc291a9
Add KVUsage to consul state usage metrics
This change will add the number of entries in the consul KV store to the
already existing usage metrics.
2021-09-20 12:41:54 -05:00
Krastin Krastev ba13dbf24c
Update autopilot.go
Fixing a minuscule typo in logging
2021-09-20 14:40:58 +02:00
Freddy f1b2ef30d1
Merge pull request #11071 from hashicorp/partitions/ixn-decisions 2021-09-16 15:18:23 -06:00
R.B. Boyer 7fa8f19077
acl: ensure the global management policy grants all necessary partition privileges (#11072) 2021-09-16 15:53:10 -05:00
freddygv b5a8935bb8 Default the partition in ixn check 2021-09-16 14:39:01 -06:00
freddygv caafc1905e Fixup test 2021-09-16 14:39:01 -06:00
freddygv 8a9bf3748c Account for partitions in ixn match/decision 2021-09-16 14:39:01 -06:00
Jeff Widman a8f396c55f
Bump `go-discover` to fix broken dep tree (#10898) 2021-09-16 15:31:22 -04:00
R.B. Boyer 4e7b6888e3
acl: fix intention:*:write checks (#11061)
This is a partial revert of #10793
2021-09-16 11:08:45 -05:00
Freddy 88627700d0
Merge pull request #11051 from hashicorp/partitions/fixes 2021-09-16 09:29:00 -06:00
Freddy 494764ee2d
acl: small resolver changes to account for partitions (#11052)
Also refactoring the enterprise side of a test to make it easier to reason about.
2021-09-16 09:17:02 -05:00
freddygv dc549eca30 Default partition in match endpoint 2021-09-15 17:23:52 -06:00
Mark Anderson 08b222cfc3
ACL Binding Rules table partitioning (#11044)
* ACL Binding Rules table partitioning

Signed-off-by: Mark Anderson <manderson@hashicorp.com>
2021-09-15 13:26:08 -07:00
Dhia Ayachi 25ea1a9276
use const instead of literals for `tableIndex` (#11039) 2021-09-15 10:24:04 -04:00
Mark Anderson ffe3806aaf
Refactor `indexAuthMethod` in `tableACLBindingRules` (#11029)
* Port consul-enterprise #1123 to OSS

Signed-off-by: Mark Anderson <manderson@hashicorp.com>

* Fixup missing query field

Signed-off-by: Mark Anderson <manderson@hashicorp.com>

* change to re-trigger ci system

Signed-off-by: Mark Anderson <manderson@hashicorp.com>
2021-09-15 09:34:19 -04:00
Dhia Ayachi 4992218676
convert expiration indexed in ACLToken table to use `indexerSingle` (#11018)
* move intFromBool to be available for oss

* add expiry indexes

* remove dead code: `TokenExpirationIndex`

* fix remove indexer `TokenExpirationIndex`

* fix rebase issue
2021-09-13 14:37:16 -04:00
Dhia Ayachi 1f23bdf388
add locality indexer partitioning (#11016)
* convert `Roles` index to use `indexerSingle`

* split authmethod write indexer to oss and ent

* add index locality

* add locality unit tests

* move intFromBool to be available for oss

* use Bool func

* refactor `aclTokenList` to merge func
2021-09-13 11:53:00 -04:00
Dhia Ayachi 3638825db8
convert `indexAuthMethod` index to use `indexerSingle` (#11014)
* convert `Roles` index to use `indexerSingle`

* fix oss build

* split authmethod write indexer to oss and ent

* add auth method unit tests
2021-09-10 16:56:56 -04:00
Paul Banks 3484d77b18 Fix enterprise discovery chain tests; Fix multi-level split merging 2021-09-10 21:11:00 +01:00
Paul Banks 5c6d27555b Fix discovery chain test fixtures 2021-09-10 21:09:24 +01:00
Paul Banks 1dd1683ed9 Header manip for split legs plumbing 2021-09-10 21:09:24 +01:00
Dhia Ayachi 82b30f8020
convert `Roles` index to use `indexerMulti` (#11013)
* convert `Roles` index to use `indexerMulti`

* add role test in oss

* fix oss to use the right index func

* preallocate slice
2021-09-10 16:04:33 -04:00
Dhia Ayachi 569e18d002
convert indexPolicies in ACLTokens table to the new index (#11011) 2021-09-10 14:57:37 -04:00
Dhia Ayachi 0d0edeec27
convert indexSecret to the new index (#11007) 2021-09-10 09:10:11 -04:00
Dhia Ayachi f0cbe25ca6
convert indexAccessor to the new index (#11002) 2021-09-09 16:28:04 -04:00
Hans Hasselberg 24c6ce0be0
tls: consider presented intermediates during server connection tls handshake. (#10964)
* use intermediates when verifying

* extract connection state

* remove useless import

* add changelog entry

* golint

* better error

* wording

* collect errors

* use SAN.DNSName instead of CommonName

* Add test for unknown intermediate

* improve changelog entry
2021-09-09 21:48:54 +02:00
Chris S. Kim 3fb797382b
Sync enterprise changes to oss (#10994)
This commit updates OSS with files for enterprise-specific admin partitions feature work
2021-09-08 11:59:30 -04:00
Kyle Havlovitz a7b5a5d1b4
Merge pull request #10984 from hashicorp/mesh-resource
acl: adding a new mesh resource
2021-09-07 15:06:20 -07:00
Dhia Ayachi 96d7842118
partition dicovery chains (#10983)
* partition dicovery chains

* fix default partition for OSS
2021-09-07 16:29:32 -04:00
Daniel Nephin 4dd5bb8e3b acl: remove legacy ACL replication 2021-09-03 12:42:06 -04:00
R.B. Boyer 4206f585f0 acl: adding a new mesh resource 2021-09-03 09:12:03 -04:00
Evan Culver 93f94ac24f
rpc: authorize raft requests (#10925) 2021-08-26 15:04:32 -07:00
Chris S. Kim 86de20c975
ent->oss test fix (#10926) 2021-08-26 14:06:49 -04:00
Chris S. Kim efbdf7e117
api: expose upstream routing configurations in topology view (#10811)
Some users are defining routing configurations that do not have associated services. This commit surfaces these configs in the topology visualization. Also fixes a minor internal bug with non-transparent proxy upstream/downstream references.
2021-08-25 15:20:32 -04:00
R.B. Boyer 6b5a58de50
acl: some acl authz refactors for nodes (#10909) 2021-08-25 13:43:11 -05:00
R.B. Boyer a84f5fa25d
grpc: ensure that streaming gRPC requests work over mesh gateway based wan federation (#10838)
Fixes #10796
2021-08-24 16:28:44 -05:00
Giulio Micheloni 10b03c3f4e
Merge branch 'main' into serve-panic-recovery 2021-08-22 20:31:11 +02:00
Giulio Micheloni 465e9fecda grpc, xds: recovery middleware to return and log error in case of panic
1) xds and grpc servers:
   1.1) to use recovery middleware with callback that prints stack trace to log
   1.2) callback turn the panic into a core.Internal error
2) added unit test for grpc server
2021-08-22 19:06:26 +01:00
R.B. Boyer b6be94e7fa
fixing various bits of enterprise meta plumbing to be more correct (#10889) 2021-08-20 14:34:23 -05:00
Dhia Ayachi f766b6dff7
oss portion of ent #1069 (#10883) 2021-08-20 12:57:45 -04:00
R.B. Boyer d730298f59
state: partition the nodes.uuid and nodes.meta indexes as well (#10882) 2021-08-19 16:17:59 -05:00
R.B. Boyer 61f1c01b83
agent: ensure that most agent behavior correctly respects partition configuration (#10880) 2021-08-19 15:09:42 -05:00
R.B. Boyer e565409c6a
state: partition the usage metrics subsystem (#10867) 2021-08-18 09:27:15 -05:00
R.B. Boyer 1cef3c99c2
state: adjust streaming event generation to account for partitioned nodes (#10860)
Also re-enabled some tests that had to be disabled in the prior PR.
2021-08-17 16:49:26 -05:00
R.B. Boyer e50e13d2ab
state: partition nodes and coordinates in the state store (#10859)
Additionally:

- partitioned the catalog indexes appropriately for partitioning
- removed a stray reference to a non-existent index named "node.checks"
2021-08-17 13:29:39 -05:00
Daniel Nephin 5a82859ee1 acl: small improvements to ACLResolver disable due to RPC error
Remove the error return, so that not handling is not reported as an
error by errcheck. It was returning the error passed as an arg
unmodified so there is no reason to return the same value that was
passed in.

Remove the term upstreams to remove any confusion with the term used in
service mesh.

Remove the AutoDisable field, and replace it with the TTL value, using 0
to indicate the setting is turned off.

Replace "not Before" with "After".

Add some test coverage to show the behaviour is still correct.
2021-08-17 13:34:18 -04:00
Daniel Nephin 09ae0ab94a acl: make ACLDisabledTTL a constant
This field was never user-configurable. We always overwrote the value with 120s from
NonUserSource. However, we also never copied the value from RuntimeConfig to consul.Config,
So the value in NonUserSource was always ignored, and we used the default value of 30s
set by consul.DefaultConfig.

All of this code is an unnecessary distraction because a user can not actually configure
this value.

This commit removes the fields and uses a constant value instad. Someone attempting to set
acl.disabled_ttl in their config will now get an error about an unknown field, but previously
the value was completely ignored, so the new behaviour seems more correct.

We have to keep this field in the AutoConfig response for backwards compatibility, but the value
will be ignored by the client, so it doesn't really matter what value we set.
2021-08-17 13:34:18 -04:00
Daniel Nephin a8bc964241 Fix test failures
Tests only specified one of the fields, but in production we copy the
value from a single place, so we can do the same in tests.

The AutoConfig test broke because of the problem noticed in a previous
commit. The DisabledTTL is not wired up properly so it reports 0s here.
Changed the test to use an explicit value.
2021-08-17 13:32:52 -04:00
Daniel Nephin 0d69b49f41 config: remove ACLResolver settings from RuntimeConfig 2021-08-17 13:32:52 -04:00
Daniel Nephin 75baa22e64 acl: remove ACLResolver config fields from consul.Config 2021-08-17 13:32:52 -04:00
Daniel Nephin 454f62eacc acl: replace ACLResolver.Config with its own struct
This is step toward decoupling ACLResolver from the agent/consul
package.
2021-08-17 13:32:52 -04:00
Daniel Nephin be0358df02 acl: remove legacy bootstrap
Return an explicit error from the RPC, and remove the flag from the HTTP API.
2021-08-17 13:10:00 -04:00
Daniel Nephin 4f54d9708c acl: add some notes about removing legacy ACL system 2021-08-17 13:08:29 -04:00
Daniel Nephin e4c6bee7e6
Merge pull request #10792 from hashicorp/dnephin/rename-authz-vars
acl: use authz consistently as the variable name for an acl.Authorizer
2021-08-17 13:07:17 -04:00
Daniel Nephin 7f71a672f3
Merge pull request #10807 from hashicorp/dnephin/remove-acl-datacenter
config: remove ACLDatacenter
2021-08-17 13:07:09 -04:00
Daniel Nephin 608b291565 acl: use authz consistently as the variable name for an acl.Authorizer
Follow up to https://github.com/hashicorp/consul/pull/10737#discussion_r682147950

Renames all variables for acl.Authorizer to use `authz`. Previously some
places used `rule` which I believe was an old name carried over from the
legacy ACL system.

A couple places also used authorizer.

This commit also removes another couple of authorizer nil checks that
are no longer necessary.
2021-08-17 12:14:10 -04:00
Daniel Nephin 364ef3d052 server: remove defaulting of PrimaryDatacenter
The constructor for Server is not at all the appropriate place to be setting default
values for a config struct that was passed in.

In production this value is always set from agent/config. In tests we should set the
default in a test helper.
2021-08-06 18:45:24 -04:00
Daniel Nephin 87fb26fd65
Merge pull request #10612 from bigmikes/acl-replication-fix
acl: acl replication routine to report the last error message
2021-08-06 18:29:51 -04:00
Daniel Nephin 047abdd73c acl: remove ACLDatacenter
This field has been unnecessary for a while now. It was always set to the same value
as PrimaryDatacenter. So we can remove the duplicate field and use PrimaryDatacenter
directly.

This change was made by GoLand refactor, which did most of the work for me.
2021-08-06 18:27:00 -04:00
Giulio Micheloni 5c34a48d45 String type instead of error type and changelog. 2021-08-06 22:35:27 +01:00
Daniel Nephin 9435118179 acl: remove Server.ResolveTokenIdentityAndDefaultMeta
This method suffered from similar naming to a couple other methods on Server, and had not great
re-use (2 callers). By copying a few of the lines into one of the callers we can move the
implementation into the second caller.

Once moved, we can see that ResolveTokenAndDefaultMeta is identical in both Client and Server, and
likely should be further refactored, possibly into ACLResolver.

This change is being made to make ACL resolution easier to trace.
2021-08-05 15:20:13 -04:00
Daniel Nephin 25f40de163 acl: remove Server.ResolveTokenToIdentityAndAuthorizer
This method was an alias for ACLResolver.ResolveTokenToIdentityAndAuthorizer. By removing the
method that does nothing the code becomes easier to trace.
2021-08-05 15:20:13 -04:00
Daniel Nephin 695963acb7 acl: recouple acl filtering from ACLResolver
ACL filtering only needs an authorizer and a logger. We can decouple filtering from
the ACLResolver by passing in the necessary logger.

This change is being made in preparation for moving the ACLResolver into an acl package
2021-08-05 15:20:13 -04:00
Daniel Nephin ba2f9a65d1 acl: remove unused error return
filterACLWithAuthorizer could never return an error. This change moves us a little bit
closer to being able to enable errcheck and catch problems caused by unhandled error
return values.
2021-08-05 15:20:13 -04:00
Daniel Nephin c80b9565e2 acl: rename acl.Authorizer vars to authz
For consistency
2021-08-05 15:19:47 -04:00
Daniel Nephin 37c67cb280 acl: move vet functions
These functions are moved to the one place they are called to improve code locality.

They are being moved out of agent/consul/acl.go in preparation for moving
ACLResolver to an acl package.
2021-08-05 15:19:24 -04:00
Daniel Nephin c8eedabc7c acl: move vetRegisterWithACL and vetDeregisterWithACL
These functions are used in only one place. Move the functions next to their one caller
to improve code locality.

This change is being made in preparation for moving the ACLResolver into an
acl package. The moved functions were previously in the same file as the ACLResolver.
By moving them out of that file we may be able to move the entire file
with fewer modifications.
2021-08-05 15:17:54 -04:00
Daniel Nephin b223c2bc25
Merge pull request #10770 from hashicorp/dnephin/log-cert-expiration
telemetry: add log message when certs are about to expire
2021-08-05 15:17:20 -04:00
Daniel Nephin c866f1041a
Merge pull request #10793 from hashicorp/dnephin/acl-intentions
acl: small cleanup of a couple Authorization flows
2021-08-05 15:16:49 -04:00
Dhia Ayachi 40baf98159
defer setting the state before returning to avoid stuck in `INITIALIZING` state (#10630)
* defer setting the state before returning to avoid being stuck in `INITIALIZING` state

* add changelog

* move comment with the right if statement

* ca: report state transition error from setSTate

* update comment to reflect state transition

Co-authored-by: Daniel Nephin <dnephin@hashicorp.com>
2021-08-05 14:51:19 -04:00
Daniel Nephin 79ab48ef81
Merge pull request #10768 from hashicorp/dnephin/agent-tls-cert-expiration-metric
telemetry: add Agent TLS Certificate expiration metric
2021-08-04 18:42:02 -04:00
Daniel Nephin 0ca9e875e2 acl: remove special handling of services in txn_endpoint
Follow up to: https://github.com/hashicorp/consul/pull/10738#discussion_r680190210

Previously we were passing an Authorizer that would always allow the
operation, then later checking the authorization using vetServiceTxnOp.

On the surface this seemed strange, but I think it was actually masking
a bug as well. Over time `servicePreApply` was changed to add additional
authorization for `service.Proxy.DestinationServiceName`, but because
we were passing a nil Authorizer, that authorization was not handled on
the txn_endpoint.

`TxnServiceOp.FillAuthzContext` has some special handling in enterprise,
so we need to make sure to continue to use that from the Txn endpoint.

This commit removes the `vetServiceTxnOp` function, and passes in the
`FillAuthzContext` function so that `servicePreApply` can be used by
both the catalog and txn endpoints. This should be much less error prone
and prevent bugs like this in the future.
2021-08-04 18:32:20 -04:00
Daniel Nephin 3dc113ada6
Merge pull request #10738 from hashicorp/dnephin/remove-authorizer-nil-checks-2
acl: remove the last of the authz == nil checks
2021-08-04 17:41:40 -04:00
Daniel Nephin 2e9aa91256
Merge pull request #10737 from hashicorp/dnephin/remove-authorizer-nil-checks
acl: remove authz == nil checks
2021-08-04 17:39:34 -04:00
Daniel Nephin 210a850353 telemetry: add log message when certs are about to expire 2021-08-04 14:18:59 -04:00
Daniel Nephin 13aa7b70d5 telemetry: fix a couple bugs in cert expiry metrics
1. do not emit the metric if Query fails
2. properly check for PrimaryUsersIntermediate, the logic was inverted

Also improve the logging by including the metric name in the log message
2021-08-04 13:51:44 -04:00
Daniel Nephin 1673b3a68c telemetry: add a metric for agent TLS cert expiry 2021-08-04 13:51:44 -04:00
Dhia Ayachi 6ed6966a1f
fix state index for `CAOpSetRootsAndConfig` op (#10675)
* fix state index for `CAOpSetRootsAndConfig` op

* add changelog

* Update changelog

Co-authored-by: Daniel Nephin <dnephin@hashicorp.com>

* remove the change log as it's not needed

Co-authored-by: Daniel Nephin <dnephin@hashicorp.com>
2021-08-04 13:07:49 -04:00
Daniel Nephin 953c9bee4f acl: Remove the remaining authz == nil checks
These checks were a bit more involved. They were previously skipping some code paths
when the authorizer was nil. After looking through these it seems correct to remove the
authz == nil check, since it will never evaluate to true.
2021-07-30 14:55:35 -04:00
Daniel Nephin e4821a58ee acl: remove acl == nil checks 2021-07-30 14:28:19 -04:00
Daniel Nephin fbaeac9ecf acl: remove authz == nil checks
These case are already impossible conditions, because most of these functions already start
with a check for ACLs being disabled. So the code path being removed could never be reached.

The one other case (ConnectAuthorized) was already changed in a previous commit. This commit
removes an impossible branch because authz == nil can never be true.
2021-07-30 13:58:35 -04:00
Daniel Nephin b6d9d0d9f7 acl: remove many instances of authz == nil 2021-07-30 13:58:35 -04:00
Daniel Nephin 2503f27a36 acl: remove rule == nil checks 2021-07-30 13:58:35 -04:00
Daniel Nephin 9b41e7287f acl: use acl.ManangeAll when ACLs are disabled
Instead of returning nil and checking for nilness

Removes a bunch of nil checks, and fixes one test failures.
2021-07-30 12:58:24 -04:00
Freddy b136b1795a
Reset root prune interval after TestLeader_CARootPruning completes
#10645

Co-authored-by: Daniel Nephin <dnephin@hashicorp.com>
2021-07-26 15:43:40 -06:00
R.B. Boyer c271976445
state: refactor some node/coordinate state store functions to take an EnterpriseMeta (#10687)
Note the field is not used yet.
2021-07-23 13:42:23 -05:00
R.B. Boyer 254557a1f6
sync changes to oss files made in enterprise (#10670) 2021-07-22 13:58:08 -05:00
R.B. Boyer 62ac98b564
agent/structs: add a bunch more EnterpriseMeta helper functions to help with partitioning (#10669) 2021-07-22 13:20:45 -05:00
Dhia Ayachi b725605fe4
config raft apply silent error (#10657)
* return an error when the index is not valid

* check response as bool when applying `CAOpSetConfig`

* remove check for bool response

* fix error message and add check to test

* fix comment

* add changelog
2021-07-22 10:32:27 -04:00
Daniel Nephin db29c51cd2 acl: use SetHash consistently in testPolicyForID
A previous commit used SetHash on two of the cases to fix a data race. This commit applies
that change to all cases. Using SetHash in this test helper should ensure that the
test helper behaves closer to production.
2021-07-16 17:59:56 -04:00
Giulio Micheloni 3a1afd8f57 acl: fix error type into a string type for serialization issue
acl_endpoint_test.go:507:
        	Error Trace:	acl_endpoint_test.go:507
        	            				retry.go:148
        	            				retry.go:149
        	            				retry.go:103
        	            				acl_endpoint_test.go:504
        	Error:      	Received unexpected error:
        	            	codec.decoder: decodeValue: Cannot decode non-nil codec value into nil error (1 methods)
        	Test:       	TestACLEndpoint_ReplicationStatus
2021-07-15 11:31:44 +02:00
Daniel Nephin 27871498f0 Fix a data race in TestACLResolver_Client
By setting the hash when we create the policy.

```
WARNING: DATA RACE
Read at 0x00c0028b4b10 by goroutine 1182:
  github.com/hashicorp/consul/agent/structs.(*ACLPolicy).SetHash()
      /home/daniel/pers/code/consul/agent/structs/acl.go:701 +0x40d
  github.com/hashicorp/consul/agent/structs.ACLPolicies.resolveWithCache()
      /home/daniel/pers/code/consul/agent/structs/acl.go:779 +0xfe
  github.com/hashicorp/consul/agent/structs.ACLPolicies.Compile()
      /home/daniel/pers/code/consul/agent/structs/acl.go:809 +0xf1
  github.com/hashicorp/consul/agent/consul.(*ACLResolver).ResolveTokenToIdentityAndAuthorizer()
      /home/daniel/pers/code/consul/agent/consul/acl.go:1226 +0x6ef
  github.com/hashicorp/consul/agent/consul.resolveTokenAsync()
      /home/daniel/pers/code/consul/agent/consul/acl_test.go:66 +0x5c

Previous write at 0x00c0028b4b10 by goroutine 1509:
  github.com/hashicorp/consul/agent/structs.(*ACLPolicy).SetHash()
      /home/daniel/pers/code/consul/agent/structs/acl.go:730 +0x3a8
  github.com/hashicorp/consul/agent/structs.ACLPolicies.resolveWithCache()
      /home/daniel/pers/code/consul/agent/structs/acl.go:779 +0xfe
  github.com/hashicorp/consul/agent/structs.ACLPolicies.Compile()
      /home/daniel/pers/code/consul/agent/structs/acl.go:809 +0xf1
  github.com/hashicorp/consul/agent/consul.(*ACLResolver).ResolveTokenToIdentityAndAuthorizer()
      /home/daniel/pers/code/consul/agent/consul/acl.go:1226 +0x6ef
  github.com/hashicorp/consul/agent/consul.resolveTokenAsync()
      /home/daniel/pers/code/consul/agent/consul/acl_test.go:66 +0x5c

Goroutine 1182 (running) created at:
  github.com/hashicorp/consul/agent/consul.TestACLResolver_Client.func4()
      /home/daniel/pers/code/consul/agent/consul/acl_test.go:1669 +0x459
  testing.tRunner()
      /usr/lib/go/src/testing/testing.go:1193 +0x202

Goroutine 1509 (running) created at:
  github.com/hashicorp/consul/agent/consul.TestACLResolver_Client.func4()
      /home/daniel/pers/code/consul/agent/consul/acl_test.go:1668 +0x415
  testing.tRunner()
      /usr/lib/go/src/testing/testing.go:1193 +0x202
```
2021-07-14 18:58:16 -04:00
Daniel Nephin ff26294d63 consul: fix data race in leader CA tests
Some global variables are patched to shorter values in these tests. But the goroutines that read
them can outlive the test because nothing waited for them to exit.

This commit adds a Wait() method to the routine manager, so that tests can wait for the goroutines
to exit. This prevents the data race because the 'reset to original value' can happen
after all other goroutines have stopped.
2021-07-14 18:58:15 -04:00
Giulio Micheloni 96fe1f4078 acl: acl replication routine to report the last error message 2021-07-14 11:50:23 +02:00
Dhia Ayachi 53b45a8441
check expiry date of the root/intermediate before using it to sign a leaf (#10500)
* ca: move provider creation into CAManager

This further decouples the CAManager from Server. It reduces the interface between them and
removes the need for the SetLogger method on providers.

* ca: move SignCertificate to CAManager

To reduce the scope of Server, and keep all the CA logic together

* ca: move SignCertificate to the file where it is used

* auto-config: move autoConfigBackend impl off of Server

Most of these methods are used exclusively for the AutoConfig RPC
endpoint. This PR uses a pattern that we've used in other places as an
incremental step to reducing the scope of Server.

* fix linter issues

* check error when `raftApplyMsgpack`

* ca: move SignCertificate to CAManager

To reduce the scope of Server, and keep all the CA logic together

* check expiry date of the intermediate before using it to sign a leaf

* fix typo in comment

Co-authored-by: Kyle Havlovitz <kylehav@gmail.com>

* Fix test name

* do not check cert start date

* wrap error to mention it is the intermediate expired

* Fix failing test

* update comment

Co-authored-by: Daniel Nephin <dnephin@hashicorp.com>

* use shim to avoid sleep in test

* add root cert validation

* remove duplicate code

* Revert "fix linter issues"

This reverts commit 6356302b54f06c8f2dee8e59740409d49e84ef24.

* fix import issue

* gofmt leader_connect_ca

* add changelog entry

* update error message

Co-authored-by: Freddy <freddygv@users.noreply.github.com>

* fix error message in test

Co-authored-by: Daniel Nephin <dnephin@hashicorp.com>
Co-authored-by: Kyle Havlovitz <kylehav@gmail.com>
Co-authored-by: Freddy <freddygv@users.noreply.github.com>
2021-07-13 12:15:06 -04:00
R.B. Boyer ae8b526be8
connect/ca: ensure edits to the key type/bits for the connect builtin CA will regenerate the roots (#10330)
progress on #9572
2021-07-13 11:12:07 -05:00
R.B. Boyer 0537922c6c
connect/ca: require new vault mount points when updating the key type/bits for the vault connect CA provider (#10331)
progress on #9572
2021-07-13 11:11:46 -05:00
Daniel Nephin 58cf5767a8
Merge pull request #10479 from hashicorp/dnephin/ca-provider-explore-2
ca: move Server.SignIntermediate to CAManager
2021-07-12 19:03:43 -04:00
Daniel Nephin a22bdb2ac9
Merge pull request #10445 from hashicorp/dnephin/ca-provider-explore
ca: isolate more of the CA logic in CAManager
2021-07-12 15:26:23 -04:00
Daniel Nephin fdb0ba8041 ca: use provider constructors to be more consistent
Adds a contructor for the one provider that did not have one.
2021-07-12 14:04:34 -04:00
Dhia Ayachi 3eac4ffda4 check error when `raftApplyMsgpack` 2021-07-12 13:42:51 -04:00
Daniel Nephin 34c8585b29 auto-config: move autoConfigBackend impl off of Server
Most of these methods are used exclusively for the AutoConfig RPC
endpoint. This PR uses a pattern that we've used in other places as an
incremental step to reducing the scope of Server.
2021-07-12 13:42:40 -04:00
Daniel Nephin 605275b4dc ca: move SignCertificate to the file where it is used 2021-07-12 13:42:39 -04:00
Daniel Nephin c2e85f25d4 ca: move SignCertificate to CAManager
To reduce the scope of Server, and keep all the CA logic together
2021-07-12 13:42:39 -04:00
Dhia Ayachi a0320169fe add missing state reset when stopping ca manager 2021-07-12 09:32:36 -04:00
Daniel Nephin 68d5f7769a ca: fix mockCAServerDelegate to work with the new interface
raftApply was removed so ApplyCARequest needs to handle all the possible operations

Also set the providerShim to use the mock provider.

other changes are small test improvements that were necessary to debug the failures.
2021-07-12 09:32:36 -04:00
Daniel Nephin 6d4b0ce194 ca: remove unused method
and small refactor to getCAProvider so that GoLand is less confused about what it is doing.
Previously it was reporting that the for condition was always true, which was not the case.
2021-07-12 09:32:35 -04:00
Daniel Nephin 4330122d9a ca: remove raftApply from delegate interface
After moving ca.ConsulProviderStateDelegate into the interface we now
have the ApplyCARequest method which does the same thing. Use this more
specific method instead of raftApply.
2021-07-12 09:32:35 -04:00
Daniel Nephin fae0a8f851 ca: move generateCASignRequest to the delegate
This method on Server was only used by the caDelegateWithState, so move it there
until we can move it entirely into CAManager.
2021-07-12 09:32:35 -04:00
Daniel Nephin d4bb9fd97a ca: move provider creation into CAManager
This further decouples the CAManager from Server. It reduces the interface between them and
removes the need for the SetLogger method on providers.
2021-07-12 09:32:33 -04:00
Daniel Nephin fc629d9eaa ca-manager: move provider shutdown into CAManager
Reducing the coupling between Server and CAManager
2021-07-12 09:27:28 -04:00
Daniel Nephin 1e23d181b5 config: remove misleading UseTLS field
This field was documented as enabling TLS for outgoing RPC, but that was not the case.
All this field did was set the use_tls serf tag.

Instead of setting this field in a place far from where it is used, move the logic to where
the serf tag is set, so that the code is much more obvious.
2021-07-09 19:01:45 -04:00
Daniel Nephin 3c60a46376 config: remove duplicate TLSConfig fields from agent/consul.Config
tlsutil.Config already presents an excellent structure for this
configuration. Copying the runtime config fields to agent/consul.Config
makes code harder to trace, and provides no advantage.

Instead of copying the fields around, use the tlsutil.Config struct
directly instead.

This is one small step in removing the many layers of duplicate
configuration.
2021-07-09 18:49:42 -04:00
Evan Culver 5ff191ad99
Add support for returning ACL secret IDs for accessors with acl:write (#10546) 2021-07-08 15:13:08 -07:00
Dhia Ayachi e5dbf5e55b
Add ca certificate metrics (#10504)
* add intermediate ca metric routine

* add Gauge config for intermediate cert

* Stop metrics routine when stopping leader

* add changelog entry

* updage changelog

Co-authored-by: Daniel Nephin <dnephin@hashicorp.com>

* use variables instead of a map

* go imports sort

* Add metrics for primary and secondary ca

* start metrics routine in the right DC

* add telemetry documentation

* update docs

* extract expiry fetching in a func

* merge metrics for primary and secondary into signing ca metric

Co-authored-by: Daniel Nephin <dnephin@hashicorp.com>
2021-07-07 09:41:01 -04:00
Daniel Nephin 14527dd005
Merge pull request #10552 from hashicorp/dnephin/ca-remove-rotation-period
ca: remove unused RotationPeriod field
2021-07-06 18:49:33 -04:00
jkirschner-hashicorp 31bbab8ae7
Merge pull request #10560 from jkirschner-hashicorp/change-sane-to-reasonable
Replace use of 'sane' where appropriate
2021-07-06 11:46:04 -04:00
Daniel Nephin b4a10443d1 ca: remove unused RotationPeriod field
This field was never used. Since it is persisted as part of a map[string]interface{} it
is pretty easy to remove it.
2021-07-05 19:15:44 -04:00
Jared Kirschner 4c3b1b8b7b Replace use of 'sane' where appropriate
HashiCorp voice, style, and language guidelines recommend avoiding ableist
language unless its reference to ability is accurate in a particular use.
2021-07-02 12:18:46 -04:00
Dhia Ayachi b57cf27e8f
Format certificates properly (rfc7468) with a trailing new line (#10411)
* trim carriage return from certificates when inserting rootCA in the inMemDB

* format rootCA properly when returning the CA on the connect CA endpoint

* Fix linter warnings

* Fix providers to trim certs before returning it

* trim newlines on write when possible

* add changelog

* make sure all provider return a trailing newline after the root and intermediate certs

* Fix endpoint to return trailing new line

* Fix failing test with vault provider

* make test more robust

* make sure all provider return a trailing newline after the leaf certs

* Check for suffix before removing newline and use function

* Add comment to consul provider

* Update change log

Co-authored-by: R.B. Boyer <4903+rboyer@users.noreply.github.com>

* fix typo

* simplify code callflow

Co-authored-by: R.B. Boyer <4903+rboyer@users.noreply.github.com>

* extract requireNewLine as shared func

* remove dependency to testify in testing file

* remove extra newline in vault provider

* Add cert newline fix to envoy xds

* remove new line from mock provider

* Remove adding a new line from provider and fix it when the cert is read

* Add a comment to explain the fix

* Add missing for leaf certs

* fix missing new line

* fix missing new line in leaf certs

* remove extra new line in test

* updage changelog

Co-authored-by: Daniel Nephin <dnephin@hashicorp.com>

* fix in vault provider and when reading cache (RPC call)

* fix AWS provider

* fix failing test in the provider

* remove comments and empty lines

* add check for empty cert in test

* fix linter warnings

* add new line for leaf and private key

* use string concat instead of Sprintf

* fix new lines for leaf signing

* preallocate slice and remove append

* Add new line to `SignIntermediate` and `CrossSignCA`

Co-authored-by: R.B. Boyer <4903+rboyer@users.noreply.github.com>
Co-authored-by: Daniel Nephin <dnephin@hashicorp.com>
2021-06-30 20:48:29 -04:00
Daniel Nephin 86244967c5 docs: correct some misleading telemetry docs
The query metrics are actually reported for all read queries, not only
ones that use a MinIndex to block for updates.

Also clarify the raft.apply metric is only on the leader.
2021-06-28 12:20:53 -04:00
R.B. Boyer c3d5a2a5ab
connect/ca: cease including the common name field in generated certs (#10424)
As part of this change, we ensure that the SAN extensions are marked as
critical when the subject is empty so that AWS PCA tolerates the loss of
common names well and continues to function as a Connect CA provider.

Parts of this currently hack around a bug in crypto/x509 and can be
removed after https://go-review.googlesource.com/c/go/+/329129 lands in
a Go release.

Note: the AWS PCA tests do not run automatically, but the following
passed locally for me:

    ENABLE_AWS_PCA_TESTS=1 go test ./agent/connect/ca -run TestAWS
2021-06-25 13:00:00 -05:00
Daniel Nephin 72b30174fa ca: replace ca.PrimaryIntermediateProviders
With an optional interface that providers can use to indicate if they
use an intermediate cert in the primary DC.

This removes the need to look up the provider config when renewing the
intermediate.
2021-06-23 15:47:30 -04:00
Daniel Nephin f4c1f982d1
Merge pull request #9924 from hashicorp/dnephin/cert-expiration-metric
connect: emit a metric for the seconds until root CA expiry
2021-06-18 14:18:55 -04:00
Daniel Nephin e36800cefa Update metric name
and handle the case where there is no active root CA.
2021-06-14 17:01:16 -04:00
Daniel Nephin 548796ae13 connect: emit a metric for the number of seconds until root CA expiration 2021-06-14 16:57:01 -04:00
Freddy f399fd2add
Rename CatalogDestinationsOnly (#10397)
CatalogDestinationsOnly is a passthrough that would enable dialing
addresses outside of Consul's catalog. However, when this flag is set to
true only _connect_ endpoints for services can be dialed.

This flag is being renamed to signal that non-Connect endpoints can't be
dialed by transparent proxies when the value is set to true.
2021-06-14 14:15:09 -06:00
Freddy 61ae2995b7
Add flag for transparent proxies to dial individual instances (#10329) 2021-06-09 14:34:17 -06:00
Daniel Nephin 20f7a72792 stream: remove bufferItem.NextLink
Both NextLink and NextNoBlock had the same logic, with slightly
different return values. By adding a bool return value (similar to map
lookups) we can remove the duplicate method.
2021-06-07 17:04:46 -04:00
Daniel Nephin 48f388f590 stream: fix a bug with creating a snapshot
The head of the topic buffer was being ignored when creating a snapshot. This commit fixes
the bug by ensuring that the head of the topic buffer is included in the snapshot
before handing it off to the subscription.
2021-06-04 18:33:04 -04:00
Paul Ewing e454a9aae0
usagemetrics: add cluster members to metrics API (#10340)
This PR adds cluster members to the metrics API. The number of members per
segment are reported as well as the total number of members.

Tested by running a multi-node cluster locally and ensuring the numbers were
correct. Also added unit test coverage to add the new expected gauges to
existing test cases.
2021-06-03 08:25:53 -07:00
Daniel Nephin 0dfb7da610 grpc: fix a data race by using a static resolver
We have seen test flakes caused by 'concurrent map read and map write', and the race detector
reports the problem as well (prevent us from running some tests with -race).

The root of the problem is the grpc expects resolvers to be registered at init time
before any requests are made, but we were using a separate resolver for each test.

This commit introduces a resolver registry. The registry is registered as the single
resolver for the consul scheme. Each test uses the Authority section of the target
(instead of the scheme) to identify the resolver that should be used for the test.
The scheme is used for lookup, which is why it can no longer be used as the unique
key.

This allows us to use a lock around the map of resolvers, preventing the data race.
2021-06-02 11:35:38 -04:00
Dhia Ayachi 0c13f80d5a
RPC Timeout/Retries account for blocking requests (#8978) 2021-05-27 17:29:43 -04:00
Matt Keeler 7e4ea16149 Move some things around to allow for license updating via config reload
The bulk of this commit is moving the LeaderRoutineManager from the agent/consul package into its own package: lib/gort. It also got a renaming and its Start method now requires a context. Requiring that context required updating a whole bunch of other places in the code.
2021-05-25 09:57:50 -04:00
Matt Keeler 58b934133d hcs-1936: Prepare for adding license auto-retrieval to auto-config in enterprise 2021-05-24 13:20:30 -04:00
Matt Keeler 82f5cb3f08 Preparation for changing where license management is done. 2021-05-24 10:19:31 -04:00
Daniel Nephin f2cf586414 Refactor of serf feature flag tags.
This refactor is to make it easier to see how serf feature flags are
encoded as serf tags, and where those feature flags are read.

- use constants for both the prefix and feature flag name. A constant
  makes it much easier for an IDE to locate the read and write location.
- isolate the feature-flag encoding logic in the metadata package, so
  that the feature flag prefix can be unexported. Only expose a function
  for encoding the flags into tags. This logic is now next to the logic
  which reads the tags.
- remove the duplicate `addEnterpriseSerfTags` functions. Both Client
  and Server structs had the same implementation. And neither
  implementation needed the method receiver.
2021-05-20 12:57:06 -04:00
R.B. Boyer b90877b440
server: ensure that central service config flattening properly resets the state each time (#10239)
The prior solution to call reply.Reset() aged poorly since newer fields
were added to the reply, but not added to Reset() leading serial
blocking query loops on the server to blend replies.

This could manifest as a service-defaults protocol change from
default=>http not reverting back to default after the config entry
reponsible was deleted.
2021-05-14 10:21:44 -05:00
Daniel Nephin 3dd951ab1e testing: don't run t.Parallel in a goroutine
TestACLEndpoint_Login_with_TokenLocality was reguardly being reported as failed even though
it was not failing. I took another look and I suspect it is because t.Parllel was being
called in a goroutine.

This would lead to strange behaviour which apparently confused the 'go test' runner.
2021-05-10 13:30:10 -04:00
Daniel Nephin c9ae72e72f
Merge pull request #10075 from hashicorp/dnephin/handle-raft-apply-errors
rpc: some cleanup of canRetry and ForwardRPC
2021-05-06 16:59:53 -04:00
Daniel Nephin 39d7d07922 state: reduce arguments to validateProposedConfigEntryInServiceGraph 2021-05-06 13:47:40 -04:00
Daniel Nephin 4905ac6f44 rpc: add tests for canRetry
Also accept an RPCInfo instead of interface{}. Accepting an interface
lead to a bug where the caller was expecting the arg to be the response
when in fact it was always passed the request. By accepting RPCInfo
it should indicate that this is actually the request value.

One caller of canRetry already passed an RPCInfo, the second handles
the type assertion before calling canRetry.
2021-05-06 13:30:07 -04:00
Daniel Nephin c38f4869ad rpc: remove unnecessary arg to ForwardRPC 2021-05-06 13:30:07 -04:00
Daniel Nephin 55f620d636
Merge pull request #10155 from hashicorp/dnephin/config-entry-remove-fields
config-entry: remove Kind and Name field from Mesh config entry
2021-05-04 17:27:56 -04:00
Daniel Nephin 0e5e1270b6 config-entries: add a test for the API client
Also fixes a bug with listing kind=mesh config entries. ValidateConfigEntryKind was only being used by
the List endpoint, and was yet another place where we have to enumerate all the kinds.

This commit removes ValidateConfigEntryKind and uses MakeConfigEntry instead. This change removes
the need to maintain two separate functions at the cost of creating an instance of the config entry which will be thrown away immediately.
2021-05-04 17:14:21 -04:00
Daniel Nephin df98027ad1 lint: fix warning by removing reference to deprecated interface 2021-05-04 14:09:14 -04:00
Paul Banks d47eea3a3f
Make Raft trailing logs and snapshot timing reloadable (#10129)
* WIP reloadable raft config

* Pre-define new raft gauges

* Update go-metrics to change gauge reset behaviour

* Update raft to pull in new metric and reloadable config

* Add snapshot persistance timing and installSnapshot to our 'protected' list as they can be infrequent but are important

* Update telemetry docs

* Update config and telemetry docs

* Add note to oldestLogAge on when it is visible

* Add changelog entry

* Update website/content/docs/agent/options.mdx

Co-authored-by: Matt Keeler <mkeeler@users.noreply.github.com>

Co-authored-by: Matt Keeler <mkeeler@users.noreply.github.com>
2021-05-04 15:36:53 +01:00
Luke Kysow eb84a856c4
Give descriptive error if auth method not found (#10163)
* Give descriptive error if auth method not found

Previously during a `consul login -method=blah`, if the auth method was not found, the
error returned would be "ACL not found". This is potentially confusing
because there may be many different ACLs involved in a login: the ACL of
the Consul client, perhaps the binding rule or the auth method.

Now the error will be "auth method blah not found", which is much easier
to debug.
2021-05-03 13:39:13 -07:00
Daniel Nephin bf4c289804 config-entry: remove Kind and Name field from Mesh config entry
No config entry needs a Kind field. It is only used to determine the Go type to
target. As we introduce new config entries (like this one) we can remove the kind field
and have the GetKind method return the single supported value.

In this case (similar to proxy-defaults) the Name field is also unnecessary. We always
use the same value. So we can omit the name field entirely.
2021-04-29 17:11:21 -04:00
Freddy 401f3010e0
Rename "cluster" config entry to "mesh" (#10127)
This config entry is being renamed primarily because in k8s the name
cluster could be confusing given that the config entry applies across
federated datacenters.

Additionally, this config entry will only apply to Consul as a service
mesh, so the more generic "cluster" name is not needed.
2021-04-28 16:13:29 -06:00
Daniel Nephin 3cda0a7cc4 health: create health.Client in Agent.New 2021-04-27 19:03:16 -04:00
Matt Keeler 6c639be8ec
Add prometheus guage definitions for replication metrics. (#10109) 2021-04-23 17:05:33 -04:00
Matt Keeler 09bf05ec5d
Add replication metrics (#10073) 2021-04-22 11:20:53 -04:00
Daniel Nephin 281d7616fa
Merge pull request #10045 from hashicorp/dnephin/state-proxy-defaults
state: remove config-entries kind index
2021-04-20 17:12:50 -04:00
Daniel Nephin 6d1a5b3629 Handle ErrChunkingResubmit.Error properly
Previously canRetry was attempting to retrieve this error from args, however there was never
any callers that would pass an error to args.

With the change to raftApply to move this error to the error return value, it is now possible
to receive this error from the err argument.

This commit updates canRetry to check for ErrChunkingResubmit in err.
2021-04-20 13:29:31 -04:00
Daniel Nephin 8654adfc53 Handle FSM.Apply errors in raftApply
Previously we were inconsistently checking the response for errors. This
PR moves the response-is-error check into raftApply, so that all callers
can look at only the error response, instead of having to know that
errors could come from two places.

This should expose a few more errors that were previously hidden because
in some calls to raftApply we were ignoring the response return value.

Also handle errors more consistently. In some cases we would log the
error before returning it. This can be very confusing because it can
result in the same error being logged multiple times. Instead return
a wrapped error.
2021-04-20 13:29:29 -04:00
Daniel Nephin 95b361ecc8 state: remove unnecessary kind index
The query can be performed using a prefix query on the ID index.

Also backport some enterprise changes to prevent conflicts.
2021-04-15 17:37:28 -04:00
Daniel Nephin eb7f4b7ea4 state: use index constants for ConfigEntry indexes 2021-04-15 17:30:07 -04:00
Freddy 5a9b75a443
Merge pull request #10016 from hashicorp/topology-update 2021-04-15 14:11:23 -06:00
R.B. Boyer c88512fe14
connect: update centralized upstreams representation in service-defaults (#10015) 2021-04-15 14:21:44 -05:00
Daniel Nephin 2a10f01bf5 snapshot: fix saving of auth methods
Previously only a single auth method would be saved to the snapshot. This commit fixes the typo
and adds to the test, to show that all auth methods are now saved.
2021-04-14 16:51:21 -04:00
freddygv 2ff8b9f2f5 Avoid returning a nil slice 2021-04-14 10:52:05 -06:00
Matt Keeler aa0eb60f57
Move static token resolution into the ACLResolver (#10013) 2021-04-14 12:39:35 -04:00
freddygv 7fd4c569ce Update viz endpoint to include topology from intentions 2021-04-14 10:20:15 -06:00
freddygv 83501d5415 Augment intention decision summary with DefaultAllow mode 2021-04-12 19:32:09 -06:00
freddygv eeccba945d Replace TransparentProxy bool with ProxyMode
This PR replaces the original boolean used to configure transparent
proxy mode. It was replaced with a string mode that can be set to:

- "": Empty string is the default for when the setting should be
defaulted from other configuration like config entries.
- "direct": Direct mode is how applications originally opted into the
mesh. Proxy listeners need to be dialed directly.
- "transparent": Transparent mode enables configuring Envoy as a
transparent proxy. Traffic must be captured and redirected to the
inbound and outbound listeners.

This PR also adds a struct for transparent proxy specific configuration.
Initially this is not stored as a pointer. Will revisit that decision
before GA.
2021-04-12 09:35:14 -06:00
Freddy 920ba3db39
Merge pull request #9976 from hashicorp/centralized-upstream-fixups 2021-04-08 12:26:56 -06:00
Daniel Nephin 93818ebc5a
Merge pull request #9950 from hashicorp/dnephin/state-use-txn-everywhere
state: use Txn interface everywhere
2021-04-08 12:02:03 -04:00
Daniel Nephin 9db8ffb1c5
Merge pull request #9880 from hashicorp/dnephin/catalog-events-test-pattern
state: use runCase pattern for large test
2021-04-08 11:54:41 -04:00
freddygv a1fd3b0271 Pass down upstream defaults to client proxies
This is needed in case the client proxy is in TransparentProxy mode.
Typically they won't have explicit configuration for every upstream, so
this ensures the settings can be applied to all of them when generating
xDS config.
2021-04-07 09:32:47 -06:00
freddygv c2e74e21bc Prevent requests without UpstreamIDs from being flagged as legacy.
New clients in transparent proxy mode can send requests for service
config resolution without any upstream args because they do not have
explicitly defined upstreams.

Old clients on the other hand will never send requests without the
Upstreams args unless they don't have upstreams, in which case we do not
send back upstream config.
2021-04-07 09:32:47 -06:00
R.B. Boyer 82245585c6
connect: add toggle to globally disable wildcard outbound network access when transparent proxy is enabled (#9973)
This adds a new config entry kind "cluster" with a single special name "cluster" where this can be controlled.
2021-04-06 13:19:59 -05:00
Daniel Nephin f0590e7c18 state: support additional test cases in indexer tests
And add a few additional cases.
2021-03-31 14:39:33 -04:00
Kyle Havlovitz 3cdd495600 Backport enterprise changes to prevent merge conflicts
Co-Authored-By: Kyle Havlovitz <kylehav@gmail.com>
2021-03-31 14:05:26 -04:00
Daniel Nephin e4a60a2a8d state: use tableIndex constant 2021-03-29 18:52:20 -04:00
Daniel Nephin 7cb2255838 state: use ReadTxn and WriteTxn interface
Instead of *txn, so that we can replace the txn implementation with others, and so
that the function is easily documented as a read or write function.
2021-03-29 18:52:16 -04:00
Daniel Nephin d785c86db1 state: convert checks.service index to new pattern 2021-03-29 16:38:53 -04:00
Daniel Nephin f859ba6d4b state: convert checks.status indexer
As part of this change the indexer will now be case insensitive by using
the lower case value. This should be safe because previously we always
had lower case strings.

This change was made out of convenience. All the other indexers use
lowercase, so we can re-use the indexFromQuery function by using
lowercase here as well.
2021-03-29 16:38:50 -04:00
Daniel Nephin 9251ac881a state: add tests for checks indexers 2021-03-29 16:38:47 -04:00
Daniel Nephin 98d6dcbdf8 state: use constants for table checks 2021-03-29 16:38:43 -04:00
Daniel Nephin bcbdc9cab3 state: pass Query in from caller
To reduce the number of arguments
2021-03-29 15:42:30 -04:00
Daniel Nephin 372d274b34 state: convert services.kind to functional indexer pattern 2021-03-29 15:42:30 -04:00
Daniel Nephin bcde8d2fad state: add tests for services.kind indexer 2021-03-29 15:42:27 -04:00
Daniel Nephin 9f9eadd569 state: convert services table service and connect indexer
To the new functional indexer pattern
2021-03-29 15:42:24 -04:00
Daniel Nephin 11311c1fcc state: add tests for services table service and connect indexers 2021-03-29 15:42:22 -04:00
Daniel Nephin 9a3daf3100 state: use constant for tableServices 2021-03-29 15:42:18 -04:00
Daniel Nephin ec04df66bd state: remove duplication of Query indexer 2021-03-29 14:35:11 -04:00
Daniel Nephin 28866e48ad state: remove duplication in acl tables schema 2021-03-29 14:21:27 -04:00
Daniel Nephin c6a1ca701d state: reduce duplication in catalog table schema 2021-03-29 14:21:23 -04:00
Daniel Nephin d9dacb8388 state: share more indexer functions for config_entries 2021-03-29 14:21:20 -04:00
Daniel Nephin f303120f2d state: remove old schema test
This test has been replaced by TestNewDBSchema_Indexers
2021-03-29 14:21:13 -04:00
Daniel Nephin 150decff2a state: use addNamespaceIndex again 2021-03-29 14:21:02 -04:00
Daniel Nephin 4a3b462c28
Merge pull request #9911 from hashicorp/dnephin/state-index-acl-roles
state: convert ACLRoles policies index to new functional indexer pattern
2021-03-24 18:28:19 -04:00
Daniel Nephin 25b791ba47 state: add tests for checks.ID indexer 2021-03-22 18:06:43 -04:00
Daniel Nephin abbe5c3701 state: use tx.First instead of tx.FirstWatch
Where appropriate. After removing the helper function a bunch of  these calls can
be changed to tx.First.
2021-03-22 18:06:33 -04:00
Daniel Nephin 49938bc472 state: convert checks.ID index to new pattern 2021-03-22 18:06:08 -04:00
Hans Hasselberg 052662bcf9
introduce certopts (#9606)
* introduce cert opts

* it should be using the same signer

* lint and omit serial
2021-03-22 10:16:41 +01:00
Daniel Nephin 1d3fe64bba state: use uuid for acl-roles.policies index
Previously we were encoding the UUID as a string, but the index it references uses a UUID
so this index can also use an encoded UUID to save a bit of memory.
2021-03-19 19:45:37 -04:00
Daniel Nephin 3c01bb1156 state: convert acl-roles.policies index to new pattern 2021-03-19 19:45:37 -04:00
Daniel Nephin 474e95b9f5 state: convert acl-roles.name index to the functional indexer pattern 2021-03-19 19:45:37 -04:00