Shamir seals now come in two varieties: legacy and new-style. Legacy
Shamir is automatically converted to new-style when a rekey operation
is performed. All new Vault initializations using Shamir are new-style.
New-style Shamir writes an encrypted master key to storage, just like
AutoUnseal. The stored master key is encrypted using the shared key that
is split via Shamir's algorithm. Thus when unsealing, we take the key
fragments given, combine them into a Key-Encryption-Key, and use that
to decrypt the master key on disk. Then the master key is used to read
the keyring that decrypts the barrier.
* cli: initial work on debug; server-status target
* debug: add metrics capture target (#7376)
* check against DR secondary
* debug: add compression
* refactor check into preflight func
* debug: set short test time on tests, fix exit code bug
* debug: use temp dir for output on tests
* debug: use mholt/archiver for compression
* first pass on adding pprof
* use logger for output
* refactor polling target capture logic
* debug: poll and collect replication status
* debug: poll and collect host-info; rename output files and collection refactor
* fix comments
* add archive test; fix bugs found
* rename flag name to singular target
* add target output test; scaffold other tests cases
* debug/test: add pprof and index file tests
* debug/test: add min timing check tests
* debug: fix index gen race and collection goroutine race
* debug: extend archive tests, handle race between program exit and polling goroutines
* update docstring
* debug: correctly add to pollingWg
* debug: add config target support
* debug: don't wait on interrupt shutdown; add file exists unit tests
* move pprof bits into its goroutine
* debug: skip empty metrics and some pprof file creation if permission denied, add matching unit test
* address comments and feedback
* Vault debug using run.Group (#7658)
* debug: switch to use oklog/run.Group
* debug: use context to cancel requests and interrupt rungroups.
* debug: trigger the first interval properly
* debug: metrics collection should use metrics interval
* debug: add missing continue on metrics error
* debug: remove the use of buffered chan to trigger first interval
* debug: don't shadow BaseCommand's client, properly block on interval capture failures
* debug: actually use c.cachedClient everywhere
* go mod vendor
* debug: run all pprof in goroutines; bump pprof timings in tests to reduce flakiness
* debug: update help text
* Initial work
* rework
* s/dr/recovery
* Add sys/raw support to recovery mode (#7577)
* Factor the raw paths out so they can be run with a SystemBackend.
# Conflicts:
# vault/logical_system.go
* Add handleLogicalRecovery which is like handleLogical but is only
sufficient for use with the sys-raw endpoint in recovery mode. No
authentication is done yet.
* Integrate with recovery-mode. We now handle unauthenticated sys/raw
requests, albeit on path v1/raw instead v1/sys/raw.
* Use sys/raw instead raw during recovery.
* Don't bother persisting the recovery token. Authenticate sys/raw
requests with it.
* RecoveryMode: Support generate-root for autounseals (#7591)
* Recovery: Abstract config creation and log settings
* Recovery mode integration test. (#7600)
* Recovery: Touch up (#7607)
* Recovery: Touch up
* revert the raw backend creation changes
* Added recovery operation token prefix
* Move RawBackend to its own file
* Update API path and hit it using CLI flag on generate-root
* Fix a panic triggered when handling a request that yields a nil response. (#7618)
* Improve integ test to actually make changes while in recovery mode and
verify they're still there after coming back in regular mode.
* Refuse to allow a second recovery token to be generated.
* Resize raft cluster to size 1 and start as leader (#7626)
* RecoveryMode: Setup raft cluster post unseal (#7635)
* Setup raft cluster post unseal in recovery mode
* Remove marking as unsealed as its not needed
* Address review comments
* Accept only one seal config in recovery mode as there is no scope for migration
* implement SSRF protection header
* add test for SSRF protection header
* cleanup
* refactor
* implement SSRF header on a per-listener basis
* cleanup
* cleanup
* creat unit test for agent SSRF
* improve unit test for agent SSRF
* add VaultRequest SSRF header to CLI
* fix unit test
* cleanup
* improve test suite
* simplify check for Vault-Request header
* add constant for Vault-Request header
* improve test suite
* change 'config' to 'agentConfig'
* Revert "change 'config' to 'agentConfig'"
This reverts commit 14ee72d21fff8027966ee3c89dd3ac41d849206f.
* do not remove header from request
* change header name to X-Vault-Request
* simplify http.Handler logic
* cleanup
* simplify http.Handler logic
* use stdlib errors package
* command/server: fix TestLoadConfigFile_json2 test, fix hcl tags
Fixes test to call the equality check, and add missing values to the expected object. Fixes hcl tags in the Telemetry structs.
* fix PrometheusRetentionTime tag
When starting a vault dev server the token helper is invoked to store
the dev root token.
This option gives the user the ability to not store the token.
Storing the token can be undesirable in certain circumstances
(e.g. running local tests) as the user's existing vault token is
clobbered without warning.
Fixes#1861
* Read config before creating logger when booting vault server
* Allow for specifying log output in JSON format in a config file, via a 'log_level' flag
* Create parser for log format flag
* Allow for specifying log format in a config file, via a 'log_format' flag. Also, get rid of 'log_json' flag.
* Add 'log-format' command line flag
* Update documentation to include description of log_format setting
* Tweak comment for VAULT_LOG_FORMAT environment variable
* add test for ParseEnvLogFormat()
* clarify how log format is set
* fix typos in documentation
* Implemented token backend support for identity
* Fixed tests
* Refactored a few checks for the token entity overwrite. Fixed tests.
* Moved entity alias check up so that the entity and entity alias is only created when it has been specified in allowed_entity_aliases list
* go mod vendor
* Added glob pattern
* Optimized allowed entity alias check
* Added test for asterisk only
* Changed to glob pattern anywhere
* Changed response code in case of failure. Changed globbing pattern check. Added docs.
* Added missing token role get parameter. Added more samples
* Fixed failing tests
* Corrected some cosmetical review points
* Changed response code for invalid provided entity alias
* Fixed minor things
* Fixed failing test
* Work on raft backend
* Add logstore locally
* Add encryptor and unsealable interfaces
* Add clustering support to raft
* Remove client and handler
* Bootstrap raft on init
* Cleanup raft logic a bit
* More raft work
* Work on TLS config
* More work on bootstrapping
* Fix build
* More work on bootstrapping
* More bootstrapping work
* fix build
* Remove consul dep
* Fix build
* merged oss/master into raft-storage
* Work on bootstrapping
* Get bootstrapping to work
* Clean up FMS and node-id
* Update local node ID logic
* Cleanup node-id change
* Work on snapshotting
* Raft: Add remove peer API (#906)
* Add remove peer API
* Add some comments
* Fix existing snapshotting (#909)
* Raft get peers API (#912)
* Read raft configuration
* address review feedback
* Use the Leadership Transfer API to step-down the active node (#918)
* Raft join and unseal using Shamir keys (#917)
* Raft join using shamir
* Store AEAD instead of master key
* Split the raft join process to answer the challenge after a successful unseal
* get the follower to standby state
* Make unseal work
* minor changes
* Some input checks
* reuse the shamir seal access instead of new default seal access
* refactor joinRaftSendAnswer function
* Synchronously send answer in auto-unseal case
* Address review feedback
* Raft snapshots (#910)
* Fix existing snapshotting
* implement the noop snapshotting
* Add comments and switch log libraries
* add some snapshot tests
* add snapshot test file
* add TODO
* More work on raft snapshotting
* progress on the ConfigStore strategy
* Don't use two buckets
* Update the snapshot store logic to hide the file logic
* Add more backend tests
* Cleanup code a bit
* [WIP] Raft recovery (#938)
* Add recovery functionality
* remove fmt.Printfs
* Fix a few fsm bugs
* Add max size value for raft backend (#942)
* Add max size value for raft backend
* Include physical.ErrValueTooLarge in the message
* Raft snapshot Take/Restore API (#926)
* Inital work on raft snapshot APIs
* Always redirect snapshot install/download requests
* More work on the snapshot APIs
* Cleanup code a bit
* On restore handle special cases
* Use the seal to encrypt the sha sum file
* Add sealer mechanism and fix some bugs
* Call restore while state lock is held
* Send restore cb trigger through raft log
* Make error messages nicer
* Add test helpers
* Add snapshot test
* Add shamir unseal test
* Add more raft snapshot API tests
* Fix locking
* Change working to initalize
* Add underlying raw object to test cluster core
* Move leaderUUID to core
* Add raft TLS rotation logic (#950)
* Add TLS rotation logic
* Cleanup logic a bit
* Add/Remove from follower state on add/remove peer
* add comments
* Update more comments
* Update request_forwarding_service.proto
* Make sure we populate all nodes in the followerstate obj
* Update times
* Apply review feedback
* Add more raft config setting (#947)
* Add performance config setting
* Add more config options and fix tests
* Test Raft Recovery (#944)
* Test raft recovery
* Leave out a node during recovery
* remove unused struct
* Update physical/raft/snapshot_test.go
* Update physical/raft/snapshot_test.go
* fix vendoring
* Switch to new raft interface
* Remove unused files
* Switch a gogo -> proto instance
* Remove unneeded vault dep in go.sum
* Update helper/testhelpers/testhelpers.go
Co-Authored-By: Calvin Leung Huang <cleung2010@gmail.com>
* Update vault/cluster/cluster.go
* track active key within the keyring itself (#6915)
* track active key within the keyring itself
* lookup and store using the active key ID
* update docstring
* minor refactor
* Small text fixes (#6912)
* Update physical/raft/raft.go
Co-Authored-By: Calvin Leung Huang <cleung2010@gmail.com>
* review feedback
* Move raft logical system into separate file
* Update help text a bit
* Enforce cluster addr is set and use it for raft bootstrapping
* Fix tests
* fix http test panic
* Pull in latest raft-snapshot library
* Add comment
* Audit listing with format json returns json, not a string
Fixes#6775
* list, kv list and namespace list with format json returns json, not a string
* Changed audit list return code to 2 which aligns with other list commands return codes
* http timeout fields are configurable
* move return statement for server config tests outside of range loop
* adds documentation for configurable listener http_* values
* fixed some formatting for the docs markdown
* adding auto auth for cert auth methods
* Adding Docs for Cert Auto-auth method
* fixes errors in docs
* wrong documentation title
* repariting a few typos in the docs and being very clear about where the certificates should be configured
* clarifying the name parameter documentation
* Fixes gofmt issues in command/agent.go
* Fix typo in docs
* Add nil check to Config for cert auto-auth and add test with and without a specified name
* Simplify Run(): the function that was being sent over a channel doesn't
need to close over anything except latestToken, and we don't need to
create a new one each iteration. Instead just pass the relevant items,
namely the token and sink to work on.
* Disallow the following config combinations:
1. auto_auth.method.wrap_ttl > 0 and multiple file sinks
2. auto_auth.method.wrap_ttl > 0 and single file sink with wrap_ttl > 0
3. auto_auth.method.wrap_ttl > 0 and cache.use_auto_auth_token = true
* Expose errors that occur when APIProxy is forwarding request to Vault.
* Fix merge issues.
If you were migrating to Shamir but didn't specify a Shamir block
migration would fail. Being explicit is nice but it's also not really
obvious since you don't need the block normally.
* Since we want to use the Agent listener for #6384, move listener config
from top-level 'cache' block to new top-level 'listeners' block.
* Make agent config allow cache and listener blocks without auto-auth
configured.
Append call in form of `append(s)` has no effect,
it just returns `s`. Sometimes such invocation is a sign
of a programming error, so it's better to remove these.
Signed-off-by: Iskander Sharipov <quasilyte@gmail.com>
* Listener refactoring and file system permissions
* added listenerutil and move some common code there
* Added test for verifying socket file permissions
* Change default port of agent to 8200
* address review feedback
* Address review feedback
* Read socket options from listener config
* Added warning when init command uses auto unseal and -key-shares and -key-threshold was set
* Reworked
* Update command/operator_init.go
Co-Authored-By: michelvocks <michelvocks@gmail.com>
* agent/caching: proxy redirect and non-json responses; update tests
* agent/caching: do not wrap error responses as internal errors, simply proxy them back as-is
* minor refactoring of APIProxy.Send logic
* add test case to ensure error response is not wrapped
* Add ability to migrate autoseal to autoseal
This adds the ability to migrate from shamir to autoseal, autoseal to
shamir, or autoseal to autoseal, by allowing multiple seal stanzas. A
disabled stanza will be used as the config being migrated from; this can
also be used to provide an unwrap seal on ent over multiple unseals.
A new test is added to ensure that autoseal to autoseal works as
expected.
* Fix test
* Provide default shamir info if not given in config
* Linting feedback
* Remove context var that isn't used
* Don't run auto unseal watcher when in migration, and move SetCores to SetSealsForMigration func
* Slight logic cleanup
* Fix test build and fix bug
* Updates
* remove GetRecoveryKey function
* Implemented a warning when tls_cipher_suites includes only cipher suites which are not supprted by the HTTP/2 spec
* Added test for cipher suites
* Added hard fail on startup when all defined cipher suites are blacklisted. Added warning when some ciphers are blacklisted.
* Replaced hard failure with warning. Removed bad cipher util function and replaced it by external library.
* Added missing dependency. Fixed renaming of package name.
* Adding Transit Autoseal
* adding tests
* adding more tests
* updating seal info
* send a value to test and set current key id
* updating message
* cleanup
* Adding tls config, addressing some feedback
* adding tls testing
* renaming config fields for tls
* agent/caching: enable caching of leases generated by autoauth token
* add test for auth/token/create path
* update error message log
* Some minor updates
* add sleep timer for renewal logic to process
* Properly compute auto-auth token
* Use inmem sink to track auto-auth token
* update debug statement
* Only add inmem sink if using auto-auth token is allowed
* vault-agent-cache: squashed 250+ commits
* Add proper token revocation validations to the tests
* Add more test cases
* Avoid leaking by not closing request/response bodies; add comments
* Fix revoke orphan use case; update tests
* Add CLI test for making request over unix socket
* agent/cache: remove namespace-related tests
* Strip-off the auto-auth token from the lookup response
* Output listener details along with configuration
* Add scheme to API address output
* leasecache: use IndexNameLease for prefix lease revocations
* Make CLI accept the fully qualified unix address
* export VAULT_AGENT_ADDR=unix://path/to/socket
* unix:/ to unix://
* initial commit for prometheus and sys/metrics support
* Throw an error if prometheusRetentionTime is 0,add prometheus in devmode
* return when format=prometheus is used and prom is disable
* parse prometheus_retention_time from string instead of int
* Initialize config.Telemetry if nil
* address PR issues
* add sys/metrics framework.Path in a factory
* Apply requiredMountTable entries's MountConfig to existing core table
* address pr comments
* enable prometheus sink by default
* Move Metric-related code in a separate metricsutil helper
* Add helper for checking if an error is a fatal error
The double-double negative was really confusing, and this pattern is used a few places in Vault. This negates the double negative, making the devx a bit easier to follow.
* Check return value of UnsealWithStoredKeys in sys/init
* Return proper error types when attempting unseal with stored key
Prior to this commit, "nil" could have meant unsupported auto-unseal, a transient error, or success. This updates the function to return the correct error type, signaling to the caller whether they should retry or fail.
* Continuously attempt to unseal if sealed keys are supported
This fixes a bug that occurs on bootstrapping an initial cluster. Given a collection of Vault nodes and an initialized storage backend, they will all go into standby waiting for initialization. After one node is initialized, the other nodes had no mechanism by which they "re-check" to see if unseal keys are present. This adds a goroutine to the server command which continually waits for unseal keys to exist. It exits in the following conditions:
- the node is unsealed
- the node does not support stored keys
- a fatal error occurs (as defined by Vault)
- the server is shutting down
In all other situations, the routine wakes up at the specified interval and attempts to unseal with the stored keys.
This changes (*Config).Merge to merge all fields of a Config.
Previously, when merging Configs, some configuration fields were
ignored and completely lost, including APIAddr, ClusterAddr, and
a couple boolean fields. This only occurs when using multiple config
files and does not affect single config files (even when loading from
a directory -- Merge is only called after a second file is loaded).
- Fix APIAddr not being merged.
- Fix ClusterAddr not being merged.
- Fix DisablePrintableCheck not being merged.
- Fix DisableClustering not being merged. The DisableClusteringRaw
value is also preserved so that it can be used in overrides for
storage fields.
- Use merged top-level config as storage field overrides.
- Update config dir test fixtures to set some fields missed by
(*Config).Merge previously.
The result will still pass gofmtcheck and won't trigger additional
changes if someone isn't using goimports, but it will avoid the
piecemeal imports changes we've been seeing.
* re-add plugin name for backwards compat
* add plugin name to table for backwards compat
* conditionally set the plugin name
* Update command/auth_list.go
Co-Authored-By: tyrannosaurus-becks <beccapetrin@posteo.net>
* Update command/secrets_list.go
Co-Authored-By: tyrannosaurus-becks <beccapetrin@posteo.net>
* update auth and secrets list commands
* add comments regarding deprecation
* Add support for custom JWT path in Agent: kubernetes auth
- add support for "token_path" configuration
- add a reader for mocking in tests
* add documentation for token_path
* Refactor mount tune to support upsert options values and unset options.
* Do not allow unsetting options map
* add secret tune version regression test
* Only accept valid options version
* s/meVersion/optVersion/
* Continue on plugin registration error in dev mode
* Continue only on unknown type error
* Continue only on unknown type error
* Print plugin registration error on exit
Co-Authored-By: calvn <cleung2010@gmail.com>
* Support registering plugin with name only
* Make RegisterPlugin backwards compatible
* Add CLI backwards compat command to plugin info and deregister
* Add server-side deprecation warnings if old read/dereg API endpoints are called
* Address feedback
Running 'vault auth' with no parameters was panicking:
panic: assignment to entry in nil map
github.com/hashicorp/vault/command/login.go:255 +0xdee
Now it will show help.
The addition of CheckMigration to the server startup process means
that physical backends in this test need to be able to respond to Get() without error.
We support this in the API as of 0.10.2 so read should support it too.
Trivially tested with some log info:
`core: data: data="map[string]interface {}{"zip":[]string{"zap", "zap2"}}"`
This allows it to authenticate once, then exit once all sinks have
reported success. Useful for things like an init container vs. a
sidecard container.
Also adds command-level testing of it.
* Add request timeouts in normal request path and to expirations
* Add ability to adjust default max request duration
* Some test fixes
* Ensure tests have defaults set for max request duration
* Add context cancel checking to inmem/file
* Fix tests
* Fix tests
* Set default max request duration to basically infinity for this release for BC
* Address feedback
* Tackle #4929 a different way
This turns c.sealed into an atomic, which allows us to call sealInternal
without a lock. By doing so we can better control lock grabbing when a
condition causing the standby loop to get out of active happens. This
encapsulates that logic into two distinct pieces (although they could
be combined into one), and makes lock guarding more understandable.
* Re-add context canceling to the non-HA version of sealInternal
* Return explicitly after stopCh triggered
Making this configurable is useful for windows users which may not be
using the default `ssh` executable. It also means that users can point to a
specify SSH executable if multiple are available.
* Add 'plugin list' command
* Add 'plugin register' command
* Add 'plugin deregister' command
* Use a shared plugin helper
* Add 'plugin read' command
* Rename to plugin info
* Add base plugin for help text
* Fix arg ordering
* Add docs
* Rearrange to alphabetize
* Fix arg ordering in example
* Don't use "sudo" in command description
* Add description flag to secrets and auth tune subcommands
* Allow empty description to be provided in secret and auth mount tune
* Use flagNameDescription
This change makes it so that if a lease is revoked through user action,
we set the expiration time to now and update pending, just as we do with
tokens. This allows the normal retry logic to apply in these cases as
well, instead of just erroring out immediately. The idea being that once
you tell Vault to revoke something it should keep doing its darndest to
actually make that happen.
* Allow lease_duration to be pulled out with -field
This also provides an easy way to verify that when -field is used we
don't string format the value.
This also changes the human string helper to accept more than one type
of incoming int.
* Address review feedback
* Allow max request size to be user-specified
This turned out to be way more impactful than I'd expected because I
felt like the right granularity was per-listener, since an org may want
to treat external clients differently from internal clients. It's pretty
straightforward though.
This also introduces actually using request contexts for values, which
so far we have not done (using our own logical.Request struct instead),
but this allows non-logical methods to still get this benefit.
* Switch to ioutil.ReadAll()
In current Vault server EncodedToken will always be populated regardless
of type (root, DR), so prioritize that, and properly refer to it as
Encoded Token instead of Root Token.
Additionally refer to the nonce as the Operation nonce instead of the
Root generation operation nonce since it's used for both strategies.
* Add an idle timeout for the server
Because tidy operations can be long-running, this also changes all tidy
operations to behave the same operationally (kick off the process, get a
warning back, log errors to server log) and makes them all run in a
goroutine.
This could mean a sort of hard stop if Vault gets sealed because the
function won't have the read lock. This should generally be okay
(running tidy again should pick back up where it left off), but future
work could use cleanup funcs to trigger the functions to stop.
* Fix up tidy test
* Add deadline to cluster connections and an idle timeout to the cluster server, plus add readheader/read timeout to api server
* Add `kv rollback`
Like `kv patch` this is more of a helper than anything else; it provides
a single command to fetch the current version (for CAS), read the
version you want to roll back to, and set it as the new version (using
CAS for safety).
This can be used when errors are happening early on to avoid them being
swallowed by logGate.
This also does a bit of cleanup of format env var checking --
helper/logging internally looks for this so it was totally unnecessary
since moving to hclog.
* Allow vault ssh to accept ssh commands in any ssh compatible format
Previously vault ssh required ssh commands to be in the format
`username@hostname <flags> command`. While this works just fine for human
users this breaks a lot of automation workflows and is not compatible
with the options that the ssh client supports.
Motivation
We currently run ansible which uses vault ssh to connect to hosts.
Ansible generates ssh commands with the format `ssh <flags> -o User=username hostname
command`. While this is a valid ssh command it currently breaks with
vault because vault expects the format to be `username@hostname`. To work
around this we currently use a wrapper script to parse the correct username being set
by ansible and translate this into a vault ssh compatible `username@hostname` format
Changes
* You can now specify arguments in any order that ssh client allows. All
arguments are passed directly to the ssh command and the format isn't
modified in any way.
* The username and port are parsed from the specified ssh command. It
will accept all of the options supported by the ssh command and also
will properly prefer `-p` and `user@` if both options are specified.
* The ssh port is only added from the vault credentials if it hasn't
been specified on the command line
* This changes the way policies are reported in audit logs.
Previously, only policies tied to tokens would be reported. This could
make it difficult to perform after-the-fact analysis based on both the
initial response entry and further requests. Now, the full set of
applicable policies from both the token and any derived policies from
Identity are reported.
To keep things consistent, token authentications now also return the
full set of policies in api.Secret.Auth responses, so this both makes it
easier for users to understand their actual full set, and it matches
what the audit logs now report.
* Fix writing to KVv2 root via `kv put`
The check that adds the API path wasn't taking into account the root,
e.g. if it's mounted at `kv`, `kv` and `kv/` would end up creating an
extra copy of the mount path in front, leading to paths like
`kv/data/kv`.
* Output warnings if they come back and fix a panic in metadata_get
* Also add to metadata put/delete
Don't set a default value for the UserKnownHostsFile flag.
Only append `-o UserKnownHostsFile` to the ssh command if it
has been specified by the user or vault ssh has set it based on another
flag (such as flagHostKeyMountPoint)
Fixes https://github.com/hashicorp/vault/issues/4672
This is implementing the same fix that was added for the CA mode for vault
ssh in https://github.com/hashicorp/vault/pull/3922
Using the IP address caused `Host` entries in the ssh_config to not
match anymore meaning you would need to hardcode all of your IP
addresses in your ssh config instead of using DNS to connect to hosts