Seal keys can be rotated. When this happens, the barrier and recovery
keys should be re-encrypted with the new seal key. This change
automatically re-encrypts the barrier and recovery keys with the latest
seal key on the active node during the 'postUnseal' phase.
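A minimal sketch of the rewrap step, with a simplified Seal interface standing in for Vault's actual seal types (all names here are illustrative):

```go
package sealrewrap

import (
	"context"
	"fmt"
)

// Seal is a simplified stand-in for Vault's seal abstraction: Encrypt
// always uses the seal's latest key, identified by KeyID.
type Seal interface {
	KeyID() string
	Encrypt(ctx context.Context, plaintext []byte) ([]byte, error)
	Decrypt(ctx context.Context, ciphertext []byte) ([]byte, error)
}

// rewrapStoredKey re-encrypts one stored key (barrier or recovery) with
// the seal's latest key. The active node would call this for both keys
// during postUnseal, then persist the new ciphertexts.
func rewrapStoredKey(ctx context.Context, seal Seal, ciphertext []byte, wrappedWith string) ([]byte, error) {
	if wrappedWith == seal.KeyID() {
		return ciphertext, nil // already wrapped with the latest seal key
	}
	plaintext, err := seal.Decrypt(ctx, ciphertext)
	if err != nil {
		return nil, fmt.Errorf("decrypting stored key: %w", err)
	}
	return seal.Encrypt(ctx, plaintext)
}
```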
* sys: add host-info endpoint, add client API method
* remove old commented handler
* add http tests, fix bugs
* query all partitions for disk usage
* fix Timestamp decoding
* add comments for clarification
* don't append a nil entry on disk usage query error
* remove HostInfo from the sdk api
We can use Logical().Read(...) to query this endpoint, since the payload is contained within the data object. All warnings are preserved under Secret.Warnings.
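For example, with the Go API client (a sketch; the exact keys under Data follow the endpoint's response and aren't enumerated here):

```go
package main

import (
	"fmt"
	"log"

	"github.com/hashicorp/vault/api"
)

func main() {
	client, err := api.NewClient(api.DefaultConfig())
	if err != nil {
		log.Fatal(err)
	}
	// host-info is returned under the standard data object, so a plain
	// logical read works; per-collection failures come back as warnings.
	secret, err := client.Logical().Read("sys/host-info")
	if err != nil {
		log.Fatal(err)
	}
	for _, w := range secret.Warnings {
		log.Printf("warning: %s", w)
	}
	fmt.Printf("%v\n", secret.Data)
}
```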
* ensure that we're testing failure case against a standby node
* add and use TestWaitStandby to ensure core is on standby
* remove TestWaitStandby
* respond with local-only error
* move HostInfo into its own helper package
* fix imports; use new no-forward handler
* add cpu times to collection
* emit clearer multierrors/warnings by collection type
* add comments on HostInfo fields
The OIDC Discovery standard requires the response_types_supported field
to be returned in the .well-known/openid-configuration response.
Also, the AWS IAM OIDC consumer won't accept Vault as an identity
provider without this field.
Based on examples in the OIDC Core documentation, it appears Vault
supports only the `id_token` flow, so that is the only value that
makes sense to set in this field. See:
https://openid.net/specs/openid-connect-core-1_0.html#AuthorizationExamples
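A sketch of the discovery document shape with the new field (field names follow the OIDC Discovery spec; this is not necessarily Vault's internal struct):

```go
package oidc

// discoveryResponse models the subset of .well-known/openid-configuration
// relevant here. Since Vault only supports the id_token flow,
// ResponseTypesSupported is always []string{"id_token"}.
type discoveryResponse struct {
	Issuer                 string   `json:"issuer"`
	AuthorizationEndpoint  string   `json:"authorization_endpoint"`
	JWKSURI                string   `json:"jwks_uri"`
	ResponseTypesSupported []string `json:"response_types_supported"`
	// ... remaining discovery fields elided ...
}
```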
* sys/pprof: add pprof routes to the system backend
* sys/pprof: add pprof paths to handler with local-only check
* fix trailing slash on pprof index endpoint
* use new no-forward handler on pprof
* go mod tidy
* add pprof external tests
* disallow streaming requests to exceed DefaultMaxRequestDuration
* add max request duration test
This allows logical operations (along with a non-nil response writer) to
process http handler funcs within the operation function while keeping
the auth and audit checks that the logical request flow provides.
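A sketch of how such an operation function might delegate to a stock handler through the stashed writer (simplified types; Vault's real ones live in sdk/logical):

```go
package pprofop

import (
	"context"
	"net/http"
	"net/http/pprof"
)

// Request is a simplified stand-in for logical.Request: when the router
// dispatches one of the pprof paths, it stashes the live response writer
// here, after auth and audit checks have already run.
type Request struct {
	ResponseWriter http.ResponseWriter
	HTTPRequest    *http.Request
}

// handlePprofIndex streams the pprof index straight to the client from
// inside the operation function.
func handlePprofIndex(ctx context.Context, req *Request) error {
	if req.ResponseWriter == nil || req.HTTPRequest == nil {
		return nil // nothing to stream to
	}
	pprof.Index(req.ResponseWriter, req.HTTPRequest)
	return nil
}
```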
* Move SudoPrivilege out of SystemView
We only use this in the token store, and it doesn't work for anything
that isn't the token store or the system mount, so we should stop
exposing something that doesn't work.
* Reconcile extended system view with sdk/logical a bit and add an explanation for why SudoPrivilege isn't moved over
Generalization of the PhysicalFactory notion introduced by Raft, so it can be used by other storage backends in tests. These are the OSS changes needed for my rework of the ent integ tests and cluster helpers.
* Fix create token sudo non-root namespace check
* Moved path trimming to SudoPrivilege
* Changed to tokenCtx instead of request ctx
* Use root context for AllowOperation; details in comment
* also flush nilNamespace when a namespace is flushed
* adds test cases with nilNamespace.ID
* adds a test case
* adds a test for oidcCache.Flush
* fixed a typo in an error message
There are a few different things happening in this change. First, some code that previously lived in enterprise has moved here: this includes some helper code for manipulating clusters and for building storage backends. Second, the existing cluster-building code using inmem storage has been generalized to allow various storage backends. Third, support was added for creating two-cluster DR setups. Finally, there are tweaks to handle edge cases that result in intermittent failures, or to eliminate sleeps in favour of polling to detect state changes.
Also: generalize TestClusterOptions.PhysicalFactory so it can be used either
as a per-core factory (for raft) or a per-cluster factory (for other
storage backends).
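A plausible shape for the generalized factory (names approximate; the real definitions live in the test helpers):

```go
package testhelpers

import (
	"testing"

	hclog "github.com/hashicorp/go-hclog"
	"github.com/hashicorp/vault/sdk/physical"
)

// PhysicalBackendBundle carries whatever a core needs from its storage.
type PhysicalBackendBundle struct {
	Backend   physical.Backend
	HABackend physical.HABackend
}

// PhysicalFactory is invoked once per core. A raft factory returns a
// distinct bundle for each coreIdx; a shared-storage factory can build
// the backend on coreIdx 0 and hand the same bundle to every other core.
type TestClusterOptions struct {
	PhysicalFactory func(t *testing.T, coreIdx int, logger hclog.Logger) *PhysicalBackendBundle
}
```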
* Add a maximum to the amount of random characters requested at any given time
* Readability changes
* Removing sys/tools/random from the default policy
* Setting the maxBytes value as const
* Declaring maxBytes in the package to use it everywhere
* Using maxBytes in the error message
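Roughly, the shape those last few items describe (the exact limit here is illustrative, not necessarily the one Vault chose):

```go
package tools

import "fmt"

// maxBytes caps a single sys/tools/random request. Declared at package
// level so the handler and its error message share one value.
const maxBytes = 128 * 1024

func checkRandomBytes(n int) error {
	if n > maxBytes {
		return fmt.Errorf("requested number of bytes (%d) exceeds maximum (%d)", n, maxBytes)
	}
	return nil
}
```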
This eliminates a bunch of noise at the end of a test's logs, making it
easier to diagnose test failures.
Add TestCluster.Logger. This is intended to be used when test helpers
want to log something. If TestClusterOptions.Logger is non-nil, it will
be used as the cluster logger, otherwise we create one based on test
name. In the absence of CoreConfig.Logger, this logger is also used as
the parent of each core's log. This makes it easy when running parallel
tests to identify which log message came from which test.
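The selection logic, sketched with illustrative names:

```go
package testhelpers

import (
	"testing"

	hclog "github.com/hashicorp/go-hclog"
)

// clusterLogger picks the cluster-wide logger: an explicitly provided
// one wins; otherwise create one named after the test, so interleaved
// output from parallel tests is attributable.
func clusterLogger(t *testing.T, provided hclog.Logger) hclog.Logger {
	if provided != nil {
		return provided
	}
	return hclog.New(&hclog.LoggerOptions{Name: t.Name()})
}
```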
* flush identity/oidc cache by namespace
* separates and unit tests the logic that looks for a namespace id within a namespace key
* applies pr feedback
* renames nskeyContainsID to isNamespacedKey
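A sketch of what the extracted check might look like, assuming cache keys embed the namespace ID as a delimited segment (the real key format may differ):

```go
package oidc

import "strings"

// isNamespacedKey reports whether the cache key belongs to the given
// namespace, i.e. whether the namespace ID appears as one of the key's
// delimited segments.
func isNamespacedKey(key, nsID string) bool {
	for _, segment := range strings.Split(key, ":") {
		if segment == nsID {
			return true
		}
	}
	return false
}
```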
We already have a separate log line if rollback fails. Always noting
when a rollback is occurring really fills up the logs, and it usually
isn't useful during incidents.
Wrapping validation was deferring the audit-log function before actually
checking whether we were sealed or on standby, and without having
grabbed the read lock.
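The fix, sketched with simplified core state (names illustrative):

```go
package wrapping

import (
	"context"
	"errors"
	"sync"
)

type core struct {
	stateLock sync.RWMutex
	sealed    bool
	standby   bool
}

func (c *core) auditWrappingRequest(ctx context.Context) { /* ... */ }

// validateWrappingToken grabs the read lock and checks sealed/standby
// state *before* deferring the audit-log call, so nothing is audited
// for a request this node cannot actually serve.
func (c *core) validateWrappingToken(ctx context.Context) error {
	c.stateLock.RLock()
	defer c.stateLock.RUnlock()
	if c.sealed {
		return errors.New("vault is sealed")
	}
	if c.standby {
		return errors.New("node is on standby")
	}
	defer c.auditWrappingRequest(ctx)
	// ... actual wrapping-token validation ...
	return nil
}
```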
* upgrade aws roles
* test upgrade aws roles
* Initialize aws credential backend at mount time
* add a TODO
* create end-to-end test for builtin/credential/aws
* fix bug in initializer
* improve comments
* add Initialize() to logical.Backend
* use Initialize() in Core.enableCredentialInternal()
* use InitializeRequest to call Initialize() (see the sketch after this list)
* improve unit testing for framework.Backend
* call logical.Backend.Initialize() from all of the places that it needs to be called.
* implement backend.proto changes for logical.Backend.Initialize()
* persist current role storage version when upgrading aws roles
* format comments correctly
* improve comments
* use postUnseal funcs to initialize backends
* simplify test suite
* improve test suite
* simplify logic in aws role upgrade
* simplify aws credential initialization logic
* simplify logic in aws role upgrade
* use the core's activeContext for initialization
* refactor builtin/plugin/Backend
* use a goroutine to upgrade the aws roles
* misc improvements and cleanup
* do not run AWS role upgrade on DR Secondary
* always call logical.Backend.Initialize() when loading a plugin.
* improve comments
* on standbys and DR secondaries we do not want to run any kind of upgrade logic
* fix awsVersion struct
* clarify aws version upgrade
* make the upgrade logic for aws auth more explicit
* aws upgrade is now called from a switch
* fix fallthrough bug
* simplify logic
* simplify logic
* rename things
* introduce currentAwsVersion const to track aws version
* improve comments
* rearrange things once more
* conglomerate things into one function
* stub out aws auth initialize e2e test
* improve aws auth initialize e2e test
* finish aws auth initialize e2e test
* tinker with aws auth initialize e2e test
* tinker with aws auth initialize e2e test
* tinker with aws auth initialize e2e test
* fix typo in test suite
* simplify logic a tad
* rearrange assignment
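A sketch of wiring a backend into the new Initialize() hook, as referenced above (the InitializeRequest shape mirrors sdk/logical; upgradeRoles is a hypothetical helper):

```go
package awsauth

import (
	"context"

	"github.com/hashicorp/vault/sdk/logical"
)

type backend struct{}

// Initialize runs via logical.Backend.Initialize() when the mount is
// set up during postUnseal, which makes it the natural place to kick
// off the role storage-format upgrade. The real backend implements the
// rest of logical.Backend as well.
func (b *backend) Initialize(ctx context.Context, req *logical.InitializeRequest) error {
	// Upgrade in the background so unseal isn't blocked; the real code
	// also skips this entirely on standbys and DR secondaries.
	go b.upgradeRoles(ctx, req.Storage)
	return nil
}

func (b *backend) upgradeRoles(ctx context.Context, s logical.Storage) {
	// ... rewrite each role in the current format, then persist the
	// current storage version so the upgrade doesn't run again ...
}
```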
* Fix a few lifecycle related issues in #7025 (#7075)
* Fix panic when plugin fails to load
* Fix various read only storage errors
A mistake we've seen multiple times in our own plugins, and now in the
GCP plugin, is that control flow (how the code is structured, helper
functions, etc.) can obscure whether an error came from storage or some
other Vault-core location (in which case it likely needs to be a 5XX
message) or from user input (thus 4XX). Error handling for functions
therefore often ends up treating all errors as either user-related or
internal.
When the error is logical.ErrReadOnly, treating it as a user error skips
the check that triggers forwarding, instead returning a read-only view
error to the user.
While it's obviously more correct to fix that code, the issue isn't
always immediately apparent to reviewers or fixers, and fixing it once
it's found requires both someone to hit the problem and report it (thus
exposing bugs to users) and selective, targeted refactoring that only
helps that one specific case.
If instead we check whether the logical.Response is an error and, if so,
whether it contains the error value, we work around all of these cases
automatically. It feels hacky since it's a coding mistake, but it's one
we've made multiple times ourselves, and avoiding bugs altogether is
better for our users.
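A simplified version of that check (Vault core's actual forwarding logic is more involved):

```go
package forwarding

import (
	"strings"

	"github.com/hashicorp/errwrap"
	"github.com/hashicorp/vault/sdk/logical"
)

// needsForward reports whether a backend result is really a read-only
// storage error in disguise, whether it surfaced as a plain err or was
// folded into an error response by the plugin's own error handling.
func needsForward(resp *logical.Response, err error) bool {
	if err != nil && errwrap.Contains(err, logical.ErrReadOnly.Error()) {
		return true
	}
	if resp != nil && resp.IsError() && resp.Error() != nil &&
		strings.Contains(resp.Error().Error(), logical.ErrReadOnly.Error()) {
		return true
	}
	return false
}
```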
* storage/raft: When restoring a snapshot, preseal first
* best-effort: allow standbys to apply the restoreOp before sealing the active node
* Don't cache the raft tls key
* Update physical/raft/raft.go
* Move pending raft peers to core
* Fix race on close bool
* Extend the leaderlease time for tests
* Update raft deps
* Fix audit hashing
* Fix race with auditing