* Add ability to migrate autoseal to autoseal
This adds the ability to migrate from shamir to autoseal, autoseal to
shamir, or autoseal to autoseal, by allowing multiple seal stanzas. A
disabled stanza will be used as the config being migrated from; this can
also be used to provide an unwrap seal on ent over multiple unseals.
A new test is added to ensure that autoseal to autoseal works as
expected.
* Fix test
* Provide default shamir info if not given in config
* Linting feedback
* Remove context var that isn't used
* Don't run auto unseal watcher when in migration, and move SetCores to SetSealsForMigration func
* Slight logic cleanup
* Fix test build and fix bug
* Updates
* remove GetRecoveryKey function
* First pass at filtered-path endpoint. It seems to be working, but there are tests missing, and possibly some optimization to handle large key sets.
* Vendor go-cmp.
* Fix incomplete vendoring of go-cmp.
* Improve test coverage. Fix bug whereby access to a subtree named X would expose existence of a the key named X at the same level.
* Add benchmarks, which showed that hasNonDenyCapability would be "expensive" to call for every member of a large folder. Made a couple of minor tweaks so that now it can be done without allocations.
* Comment cleanup.
* Review requested changes: rename some funcs, use routeCommon instead of
querying storage directly.
* Keep the same endpoint for now, but move it from a LIST to a POST and allow multiple paths to be queried in one operation.
* Modify test to pass multiple paths in at once.
* Add endpoint to default policy.
* Move endpoint to /sys/access/filtered-path.
* Handle ns lease and token renew/revoke via relative paths
* s/usin/using/
* add token and lease lookup paths; set ctx only on non-nil ns
Addtionally, use client token's ns for auth/token/lookup if no token is provided
* Adding Transit Autoseal
* adding tests
* adding more tests
* updating seal info
* send a value to test and set current key id
* updating message
* cleanup
* Adding tls config, addressing some feedback
* adding tls testing
* renaming config fields for tls
* Path globbing
* Add glob support at the beginning
* Ensure when evaluating an ACL that our path never has a leading slash. This already happens in the normal request path but not in tests; putting it here provides it for tests and extra safety in case the request path changes
* Simplify the algorithm, we don't really need to validate the prefix first as glob won't apply if it doesn't
* Add path segment wildcarding
* Disable path globbing for now
* Remove now-unneeded test
* Remove commented out globbing bits
* Remove more holdover glob bits
* Rename k var to something more clear
* Port over OSS cluster port refactor components
* Start forwarding
* Cleanup a bit
* Fix copy error
* Return error from perf standby creation
* Add some more comments
* Fix copy/paste error
* initial commit for prometheus and sys/metrics support
* Throw an error if prometheusRetentionTime is 0,add prometheus in devmode
* return when format=prometheus is used and prom is disable
* parse prometheus_retention_time from string instead of int
* Initialize config.Telemetry if nil
* address PR issues
* add sys/metrics framework.Path in a factory
* Apply requiredMountTable entries's MountConfig to existing core table
* address pr comments
* enable prometheus sink by default
* Move Metric-related code in a separate metricsutil helper
* Fixes a regression in forwarding from #6115
Although removing the authentication header is good defense in depth,
for forwarding mechanisms that use the raw request, we never add it
back. This caused perf standby tests to throw errors. Instead, once
we're past the point at which we would do any raw forwarding, but before
routing the request, remove the header.
To speed this up, a flag is set in the logical.Request to indicate where
the token is sourced from. That way we don't iterate through maps
unnecessarily.
* Merge entities during unseal only on the primary
* Add another guard check
* Add perf standby to the check
* Make primary to not differ from case-insensitivity status w.r.t secondaries
* Ensure mutual exclusivity between loading and invalidations
* Both primary and secondaries won't persist during startup and invalidations
* Allow primary to persist when loading case sensitively
* Using core.perfStandby
* Add a tweak in core for testing
* Address review feedback
* update memdb but not storage in secondaries
* Wire all the things directly do mergeEntity
* Fix persist behavior
* Address review feedback
* Two things:
* Change how we populate and clear leader UUID. This fixes a case where
if a standby disconnects from an active node and reconnects, without the
active node restarting, the UUID doesn't change so triggers on a new
active node don't get run.
* Add a bunch of test helpers and minor updates to things.
This lets other parts of Vault that can't depend on the vault package
take advantage of the subview functionality.
This also allows getting rid of BarrierStorage and vault.Entry, two
totally redundant abstractions.
* Add helper for checking if an error is a fatal error
The double-double negative was really confusing, and this pattern is used a few places in Vault. This negates the double negative, making the devx a bit easier to follow.
* Check return value of UnsealWithStoredKeys in sys/init
* Return proper error types when attempting unseal with stored key
Prior to this commit, "nil" could have meant unsupported auto-unseal, a transient error, or success. This updates the function to return the correct error type, signaling to the caller whether they should retry or fail.
* Continuously attempt to unseal if sealed keys are supported
This fixes a bug that occurs on bootstrapping an initial cluster. Given a collection of Vault nodes and an initialized storage backend, they will all go into standby waiting for initialization. After one node is initialized, the other nodes had no mechanism by which they "re-check" to see if unseal keys are present. This adds a goroutine to the server command which continually waits for unseal keys to exist. It exits in the following conditions:
- the node is unsealed
- the node does not support stored keys
- a fatal error occurs (as defined by Vault)
- the server is shutting down
In all other situations, the routine wakes up at the specified interval and attempts to unseal with the stored keys.
* Upgrade to new Cloud KMS client libraries
We recently released the new Cloud KMS client libraries which use GRPC
instead of HTTP. They are faster and look nicer (</opinion>), but more
importantly they drastically simplify a lot of the logic around client
creation, encryption, and decryption. In particular, we can drop all the
logic around looking up credentials and base64-encoding/decoding.
Tested on a brand new cluster (no pre-existing unseal keys) and against
a cluster with stored keys from a previous version of Vault to ensure no
regressions.
* Use the default scopes the client requests
The client already does the right thing here, so we don't need to
surface it, especially since we aren't allowing users to configure it.
* fix cubbyhole deletion
* Fix error handling
* Move the cubbyhole tidy logic to token store and track the revocation count
* Move fetching of cubby keys before the tidy loop
* Fix context getting cancelled
* Test the cubbyhole cleanup logic
* Add progress counter for cubbyhole cleanup
* Minor polish
* Use map instead of slice for faster computation
* Add test for cubbyhole deletion
* Add a log statement for deletion
* Add SHA1 hashed tokens into the mix
The result will still pass gofmtcheck and won't trigger additional
changes if someone isn't using goimports, but it will avoid the
piecemeal imports changes we've been seeing.
This changes the behavior of the GCPCKMS auto-unsealer setup to attempt
encryption instead of a key lookup. Key lookups are a different API
method not covered by roles/cloudkms.cryptoKeyEncrypterDecrypter. This
means users must grant an extended scope to their service account
(granting the ability to read key data) which only seems to be used to
validate the existence of the key.
Worse, the only roles that include this permission are overly verbose
(e.g. roles/viewer which gives readonly access to everything in the
project and roles/cloudkms.admin which gives full control over all key
operations). This leaves the user stuck between choosing to create a
custom IAM role (which isn't fun) or grant overly broad permissions.
By changing to an encrypt call, we get better verification of the unseal
permissions and users can reduce scope to a single role.
* Refactor mount tune to support upsert options values and unset options.
* Do not allow unsetting options map
* add secret tune version regression test
* Only accept valid options version
* s/meVersion/optVersion/
We're having issues with leases in the GCS backend storage being
corrupted and failing MAC checking. When that happens, we need to know
the lease ID so we can address the corruption by hand and take
appropriate action.
This will hopefully prevent any instances of incomplete data being sent
to GSS
* Use common abstraction for entity deletion
* Update group memberships before deleting entity
* Added test
* Fix return statements
* Update comment
* Cleanup member entity IDs while loading groups
* Added test to ensure that upgrade happens properly
* Ensure that the group gets persisted if upgrade code modifies it
* Continue on plugin registration error in dev mode
* Continue only on unknown type error
* Continue only on unknown type error
* Print plugin registration error on exit
Co-Authored-By: calvn <cleung2010@gmail.com>
* Support registering plugin with name only
* Make RegisterPlugin backwards compatible
* Add CLI backwards compat command to plugin info and deregister
* Add server-side deprecation warnings if old read/dereg API endpoints are called
* Address feedback
The slice returned by `collectGroupsReverseDFS` is an updated copy of
the slice given to it when called. Appending `pGroups` to `groups`
therefore led to expontential memory usage as the slice was repeatedly
appended to itself.
Fixes#5605
* case insensitive identity names
* TestIdentityStore_GroupHierarchyCases
* address review feedback
* Use errwrap.Contains instead of errwrap.ContainsType
* Warn about duplicate names all the time to help fix them
* Address review feedback
* Support Authorization Bearer as token header
* add requestAuth test
* remove spew debug output in test
* Add Authorization in CORS Allowed headers
* use const where applicable
* use less allocations in bearer token checking
* address PR comments on tests and apply last commit
* reorder error checking in a TestHandler_requestAuth
* logical/aws: Harden WAL entry creation
If AWS IAM user creation failed in any way, the WAL corresponding to the
IAM user would get left around and Vault would try to roll it back.
However, because the user never existed, the rollback failed. Thus, the
WAL would essentially get "stuck" and Vault would continually attempt to
roll it back, failing every time. A similar situation could arise if the
IAM user that Vault created got deleted out of band, or if Vault deleted
it but was unable to write the lease revocation back to storage (e.g., a
storage failure).
This attempts to harden it in two ways. One is by deleting the WAL log
entry if the IAM user creation fails. However, the WAL deletion could
still fail, and this wouldn't help where the user is deleted out of
band, so second, consider the user rolled back if the user just doesn't
exist, under certain circumstances.
Fixes#5190
* Fix segfault in expiration unit tests
TestExpiration_Tidy was passing in a leaseEntry that had a nil Secret,
which then caused a segfault as the changes to revokeEntry didn't check
whether Secret was nil; this is probably unlikely to occur in real life,
but good to be extra cautious.
* Fix potential segfault
Missed the else...
* Respond to PR feedback
* Fix for using ExplicitMaxTTL in auth method plugins.
* Reverted pb.go files for readability of PR.
* Fixed indenting of comment.
* Reverted unintended change by go test.
This works around a very, very common error where people write policies
to affect listing but forget the slash at the end. If there is no exact
rule with a slash at the end when doing a list, we look to see if there
is a rule without it, and if so, use those capabilities.
Fixes #mass-user-confusion
* Initial work on templating
* Add check for unbalanced closing in front
* Add missing templated assignment
* Add first cut of end-to-end test on templating.
* Make template errors be 403s and finish up testing
* Review feedback
* Merge Identity Entities if two claim the same alias
Past bugs/race conditions meant two entities could be created each
claiming the same alias. There are planned longer term fixes for this
(outside of the race condition being fixed in 0.10.4) that involve
changing the data model, but this is an immediate workaround that has
the same net effect: if two entities claim the same alias, assume they
were created due to this race condition and merge them.
In this situation, also automatically merge policies so we don't lose
e.g. RGPs.
* plumbing request context to expiration manager
* moar context
* address feedback
* only using active context for revoke prefix
* using active context for revoke commands
* cancel tidy on active context
* address feedback
* core: Cancel context before taking state lock
* Create active context outside of postUnseal
* Attempt to drain requests before canceling context
* fix test
* Add request timeouts in normal request path and to expirations
* Add ability to adjust default max request duration
* Some test fixes
* Ensure tests have defaults set for max request duration
* Add context cancel checking to inmem/file
* Fix tests
* Fix tests
* Set default max request duration to basically infinity for this release for BC
* Address feedback
* Tackle #4929 a different way
This turns c.sealed into an atomic, which allows us to call sealInternal
without a lock. By doing so we can better control lock grabbing when a
condition causing the standby loop to get out of active happens. This
encapsulates that logic into two distinct pieces (although they could
be combined into one), and makes lock guarding more understandable.
* Re-add context canceling to the non-HA version of sealInternal
* Return explicitly after stopCh triggered
1) In backends, ensure they are now using TokenPolicies
2) Don't reassign auth.Policies until after expmgr registration as we
don't need them at that point
Fixes#4829
It's not obvious why this should be secret, and if it were considered
secret, when and what anything would ever be allowed to access it.
Likely the right way to tie secret values to particular
entities/aliases/groups would be to use the upcoming templated ACL
feature.
1) Disable MaxRetries in test cluster clients. We generally want to fail
as fast as possible in tests so adding unpredictable timing in doesn't
help things, especially if we're timing sensitive in the test.
2) EquivalentPolicies is supposed to return true if only one set
contains `default` and the other is empty, but if one set was nil
instead of simply a zero length slice it would always return false. This
means that renewing against, say, `userpass` when not actually
specifying any user policies would always fail.