* rename some seal migration tests
* add comments to seal migration tests
* add comments to seal migration tests
* rename sealhelper
* create TestSealMigration_ShamirToTestSeal
* fix typo in test suite
* create TestSealMigration_TransitToTestSeal
* Mark deprecated plugins as deprecated
* Add redaction capability to database plugins
* Add x509 client auth
* Update vendored files
* Add integration test for x509 client auth
* Remove redaction logic pending further discussion
* Update vendored files
* Minor updates from code review
* Updated docs with x509 client auth
* Roles are required
* Disable x509 test because it doesn't work in CircleCI
* Add timeouts for container lifetime
* Seal migration after unsealing
* Refactor migration fields into migrationInformation in core
* Perform seal migration as part of postUnseal
* Remove the sleep logic
* Use proper seal in the unseal function
* Fix migration from Auto to Shamir
* Fix the recovery config missing issue
* Address the non-ha migration case
* Fix the multi cluster case
* Avoid re-running seal migration
* Run the post migration code in new leaders
* Fix the issue of wrong recovery being set
* Address review feedback
* Add more complete testing coverage for seal migrations. (#8247)
* Add more complete testing coverage for seal migrations. Also remove VAULT_ACC gate from some tests that just depend on docker, cleanup dangling recovery config in storage after migration, and fix a call in adjustCoreForSealMigration that seems broken.
* Fix the issue of wrong recovery key being set
* Adapt tests to work with multiple cores.
* Add missing line to disable raft join.
Co-authored-by: Vishal Nayak <vishalnayak@users.noreply.github.com>
* Fix all known issues
* Remove warning
* Review feedback.
* Revert my previous change that broke raft tests. We'll need to come back and at least comment
this once we better understand why it's needed.
* Don't allow migration between same types for now
* Disable auto to auto tests for now since they use migration between same types, which is not allowed
* Update vault/core.go
Co-Authored-By: Brian Kassouf <briankassouf@users.noreply.github.com>
* Add migration logs
* Address review comments
* Add the recovery config check back
* Skip a few steps if migration is already done
* Return from waitForLeadership if migration fails
Co-authored-by: ncabatoff <nick.cabatoff@gmail.com>
Co-authored-by: Brian Kassouf <briankassouf@users.noreply.github.com>
* external_tests: ensure derived cores are stable before proceeding on tests
* testhelpers: add min duration tolerance when checking stability on derived core
* Raft retry join
* update
* Make retry join work with shamir seal
* Return upon context completion
* Update vault/raft.go
Co-Authored-By: Brian Kassouf <briankassouf@users.noreply.github.com>
* Address some review comments
* send leader information slice as a parameter
* Make retry join work properly with Shamir case. This commit has a blocking issue
* Fix join goroutine exiting before the job is done
* Polishing changes
* Don't return after a successful join during unseal
* Added config parsing test
* Add test and fix bugs
* minor changes
* Address review comments
* Fix build error
Co-authored-by: Brian Kassouf <briankassouf@users.noreply.github.com>
Shamir seals now come in two varieties: legacy and new-style. Legacy
Shamir is automatically converted to new-style when a rekey operation
is performed. All new Vault initializations using Shamir are new-style.
New-style Shamir writes an encrypted master key to storage, just like
AutoUnseal. The stored master key is encrypted using the shared key that
is split via Shamir's algorithm. Thus when unsealing, we take the key
fragments given, combine them into a Key-Encryption-Key, and use that
to decrypt the master key on disk. Then the master key is used to read
the keyring that decrypts the barrier.
* Initial work
* rework
* s/dr/recovery
* Add sys/raw support to recovery mode (#7577)
* Factor the raw paths out so they can be run with a SystemBackend.
* Add handleLogicalRecovery which is like handleLogical but is only
sufficient for use with the sys-raw endpoint in recovery mode. No
authentication is done yet.
* Integrate with recovery-mode. We now handle unauthenticated sys/raw
requests, albeit on path v1/raw instead of v1/sys/raw.
* Use sys/raw instead of raw during recovery.
* Don't bother persisting the recovery token. Authenticate sys/raw
requests with it.
* RecoveryMode: Support generate-root for autounseals (#7591)
* Recovery: Abstract config creation and log settings
* Recovery mode integration test. (#7600)
* Recovery: Touch up (#7607)
* Recovery: Touch up
* revert the raw backend creation changes
* Added recovery operation token prefix
* Move RawBackend to its own file
* Update API path and hit it using CLI flag on generate-root
* Fix a panic triggered when handling a request that yields a nil response. (#7618)
* Improve integ test to actually make changes while in recovery mode and
verify they're still there after coming back in regular mode.
* Refuse to allow a second recovery token to be generated.
* Resize raft cluster to size 1 and start as leader (#7626)
* RecoveryMode: Setup raft cluster post unseal (#7635)
* Setup raft cluster post unseal in recovery mode
* Remove marking as unsealed as it's not needed
* Address review comments
* Accept only one seal config in recovery mode as there is no scope for migration
* sys: add host-info endpoint, add client API method
* remove old commented handler
* add http tests, fix bugs
* query all partitions for disk usage
* fix Timestamp decoding
* add comments for clarification
* don't append a nil entry on disk usage query error
* remove HostInfo from the sdk api
We can use Logical().Read(...) to query this endpoint since the payload is contained within the data object. All warnings are preserved under Secret.Warnings.
* ensure that we're testing failure case against a standby node
* add and use TestWaitStandby to ensure core is on standby
* remove TestWaitStandby
* respond with local-only error
* move HostInfo into its own helper package
* fix imports; use new no-forward handler
* add cpu times to collection
* emit clearer multierrors/warnings by collection type
* add comments on HostInfo fields
Generalization of the PhysicalFactory notion introduced by Raft, so it can be used by other storage backends in tests. These are the OSS changes needed for my rework of the ent integ tests and cluster helpers.
There are a few different things happening in this change. First, some code that previously lived in enterprise has moved here: this includes some helper code for manipulating clusters and for building storage backends. Second, the existing cluster-building code using inmem storage has been generalized to allow various storage backends. Third, support for creating two-cluster DR setups has been added. Finally, there are tweaks to handle edge cases that result in intermittent failures, or to eliminate sleeps in favour of polling to detect state changes.
Also: generalize TestClusterOptions.PhysicalFactory so it can be used either
as a per-core factory (for raft) or a per-cluster factory (for other
storage backends).
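A rough sketch of the per-core versus per-cluster distinction, using simplified stand-in types. Backend, PhysicalFactory, perCoreFactory and perClusterFactory below are hypothetical and do not match the real TestClusterOptions.PhysicalFactory signature; the point is only that one factory type can serve both uses.

```go
package main

import "fmt"

// Backend is a stand-in for a storage backend (hypothetical type).
type Backend struct{ Name string }

// PhysicalFactory is a simplified, hypothetical version of the factory
// notion: given a core index, return the backend that core should use.
type PhysicalFactory func(coreIdx int) *Backend

// perCoreFactory builds a fresh backend for every core, as raft needs.
func perCoreFactory() PhysicalFactory {
	return func(coreIdx int) *Backend {
		return &Backend{Name: fmt.Sprintf("raft-%d", coreIdx)}
	}
}

// perClusterFactory hands the same shared backend to every core, as a
// non-raft storage backend would.
func perClusterFactory(shared *Backend) PhysicalFactory {
	return func(int) *Backend { return shared }
}

func main() {
	raft := perCoreFactory()
	fmt.Println(raft(0).Name, raft(1).Name)

	shared := perClusterFactory(&Backend{Name: "inmem"})
	fmt.Println(shared(0) == shared(1))
}
```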
We're waiting to see standbys receive WALs but aren't generating traffic, so the condition is never satisfied. Fixed by continuously updating a KV value. It's a little weird to do so in each of the goroutines, but there's no harm and it's simplest.
Various improvements to testhelpers.
* WaitForActiveNodeAndPerfStandbys is used to make sure a cluster is fully ready, i.e. both its active node and perf standbys are in a good state.
* WaitForReplicationStatus is like WaitForReplicationState but uses the API, part of a general
effort to move us away from interacting with Core directly in these tests.
* WaitForPerfReplicationWorking is similar to some code that exists in ent already: it writes to the primary and waits to see that appear on the secondary.