open-vault

Commit Graph

Author	SHA1	Message	Date
miagilepner	e873932bce	VAULT-8436 remove <-time.After statements in for loops (#18818 ) * replace time.After with ticker in loops * add semgrep rule * update to use timers * remove stop	2023-02-06 17:49:01 +01:00
Josh Black	c9763996d4	Enable undo logs by default (#18692 ) * Enable undo logs by default * add consul test * update go.mod/sum * add a better non-existent key	2023-01-17 13:38:18 -08:00
Tom Proctor	dc85e37cf4	storage/raft: Add retry_join_as_non_voter config option (#18030 )	2022-11-18 17:58:16 +00:00
Josh Black	ad1503ebcd	disable undo logs by default for 1.12.0 (#17453 )	2022-10-07 08:47:40 -07:00
Josh Black	c45c6e51c0	only enable undo logs if all cluster members support it (#17378 )	2022-10-06 11:24:16 -07:00
Nick Cabatoff	559754d580	Break grabLockOrStop into two pieces to facilitate investigating deadlocks (#17187 ) Break grabLockOrStop into two pieces to facilitate investigating deadlocks. Without this change, the "grab" goroutine looks the same regardless of who was calling grabLockOrStop, so there's no way to identify one of the deadlock parties.	2022-09-20 11:03:16 -04:00
Nick Cabatoff	3075c5bd65	Do not attempt to write a new TLS keyring at startup if raft is already setup (#17079 )	2022-09-09 12:19:57 -04:00
Nick Cabatoff	5db952eada	autopilot: assume nodes we haven't received heartbeats from are running the same version as we are (#17019 ) OSS parts of ent PR #3172: assume nodes we haven't received heartbeats from are running the same version as we are. Failing to provide a version/upgrade_version will result in Autopilot (on ent) demoting those unversioned nodes to non-voters until we receive a heartbeat from them.	2022-09-06 14:49:04 -04:00
Scott Miller	3bd38fd5dc	OSS portion of wrapper-v2 (#16811 ) * OSS portion of wrapper-v2 * Prefetch barrier type to avoid encountering an error in the simple BarrierType() getter * Rename the OveriddenType to WrapperType and use it for the barrier type prefetch * Fix unit test	2022-08-23 15:37:16 -04:00
Mike Palmiotto	cd1157a905	Vault 7338/fix retry join (#16550 ) * storage/raft: Fix cluster init with retry_join Commit 8db66f4853abce3f432adcf1724b1f237b275415 introduced an error wherein a join() would return nil (no error) with no information on its channel if a joining node had been initialized. This was not handled properly by the caller and resulted in a canceled `retry_join`. Fix this by handling the `nil` channel respone by treating it as an error and allowing the existing mechanics to work as intended. * storage/raft: Improve retry_join go test * storage/raft: Make VerifyRaftPeers pollable * storage/raft: Add changelog entry for retry_join fix * storage/raft: Add description to VerifyRaftPeers	2022-08-03 20:44:57 -05:00
Mike Palmiotto	42900b554b	storage/raft: Make raftInfo atomic (#16565 ) * storage/raft: Make raftInfo atomic This fixes some racy behavior discovered in parallel testing. Change the core struct member to an atomic and update references throughout.	2022-08-03 18:40:49 -04:00
Mike Palmiotto	439e35f50f	Vault 6773/raft rejoin nonvoter (#16324 ) * raft: Ensure init before setting suffrage As reported in https://hashicorp.atlassian.net/browse/VAULT-6773: The /sys/storage/raft/join endpoint is intended to be unauthenticated. We rely on the seal to manage trust. It’s possible to use multiple join requests to switch nodes from voter to non-voter. The screenshot shows a 3 node cluster where vault_2 is the leader, and vault_3 and vault_4 are followers with non-voters set to false. sent two requests to the raft join endpoint to have vault_3 and vault_4 join the cluster with non_voters:true. This commit fixes the issue by delaying the call to SetDesiredSuffrage until after the initialization check, preventing unauthenticated mangling of voter status. Tested locally using https://github.com/hashicorp/vault-tools/blob/main/users/ncabatoff/cluster/raft.sh and the reproducer outlined in VAULT-6773. * raft: Return join err on failure This is necessary to correctly distinguish errors returned from the Join workflow. Previously, errors were being masked as timeouts. * raft: Default autopilot parameters in teststorage Change some defaults so we don't have to pass in parameters or set them in the originating tests. These storage types are only used in two places: 1) Raft HA testing 2) Seal migration testing Both consumers have been tested and pass with this change. * changelog: Unauthn voter status change bugfix	2022-07-18 14:37:12 -04:00
Chris Capurso	9501d44ed5	Add endpoints to provide ability to modify logging verbosity (#16111 ) * add func to set level for specific logger * add endpoints to modify log level * initialize base logger with IndependentLevels * test to ensure other loggers remain unchanged * add DELETE loggers endpoints to revert back to config * add API docs page * add changelog entry * remove extraneous line * add log level field to Core struct * add godoc for getLogLevel * add some loggers to c.allLoggers	2022-06-27 11:39:53 -04:00
Josh Black	416504d8c3	Add autopilot automated upgrades and redundancy zones (#15521 )	2022-05-20 16:49:11 -04:00
Violet Hynes	6d4497bcbf	VAULT-4306 Ensure /raft/bootstrap/challenge call ignores erroneous namespaces set (#15519 ) * VAULT-4306 Ensure /raft/bootstrap/challenge call ignores erroneous namespaces set * VAULT-4306 Add changelog * VAULT-4306 Update changelog/15519.txt Co-authored-by: Nick Cabatoff <ncabatoff@hashicorp.com> Co-authored-by: Nick Cabatoff <ncabatoff@hashicorp.com>	2022-05-19 16:27:51 -04:00
Chris Capurso	cc531c793d	fix raft tls key rotation panic when rotation time in past (#15156 ) * fix raft tls key rotation panic when rotation time in past * add changelog entry * push out next raft TLS rotation time in case close to elapsing * consolidate tls key rotation duration calculation * reduce raft getNextRotationTime padding to 10 seconds * move tls rotation ticker reset to where its duration is calculated	2022-04-25 21:48:34 -04:00
hghaf099	aafb5d6427	VAULT-4240 time.After() in a select statement can lead to memory leak (#14814 ) * VAULT-4240 time.After() in a select statement can lead to memory leak * CL	2022-04-01 10:17:11 -04:00
Pratyoy Mukhopadhyay	69c22b8078	Fix raft paralle retry bug (#14303 )	2022-02-28 10:38:34 -08:00
Nick Cabatoff	400996ef0d	Parallel retry join (#13606 )	2022-01-17 10:33:03 -05:00
Jim Kalafut	22c4ae5933	Rename master key to root key (#13324 ) * See what it looks like to replace "master key" with "root key". There are two places that would require more challenging code changes: the storage path `core/master`, and its contents (the JSON-serialized EncodedKeyringtructure.) * Restore accidentally deleted line * Add changelog * Update root->recovery * Fix test Co-authored-by: Nick Cabatoff <ncabatoff@hashicorp.com>	2021-12-06 17:12:20 -08:00
Daniel Kimsey	b4b61efc75	Auto-join support for IPv6 discovery (#12366 ) * Auto-join support for IPv6 discovery The go-discover library returns IP addresses and not URLs. It just so happens net.URL parses "127.0.0.1", which isn't a valid URL. Instead, we construct the URL ourselves. Being careful to check if it's an ipv6 address and making sure it's in explicit form if so. Fixes #12323 * feedback: addrs & ipv6 test Rename addrs to clusterIPs to improve clarity and intent Tighten up our IPv6 address detection to be more correct and to ensure it's actually in implicit form	2021-09-07 11:55:07 -07:00
Jeff Mitchell	f7147025dd	Migrate to sdk/internalshared libs in go-secure-stdlib (#12090 ) * Swap sdk/helper libs to go-secure-stdlib * Migrate to go-secure-stdlib reloadutil * Migrate to go-secure-stdlib kv-builder * Migrate to go-secure-stdlib gatedwriter	2021-07-15 20:17:31 -04:00
Vishal Nayak	eecb39a57f	OSS parts of Autopilot in DR secondaries (#12014 )	2021-07-08 12:30:01 -04:00
Nick Cabatoff	01f96f18ce	VAULT-2439: OSS parts of #1889 (raft licensing init) (#11665 )	2021-05-19 16:07:58 -04:00
Brian Kassouf	f498d0d389	Reload raft TLS keys on active startup (#11660 )	2021-05-19 10:03:32 -07:00
Lars Lehtonen	53dd619d2f	vault: deprecate errwrap.Wrapf() (#11577 )	2021-05-11 13:12:54 -04:00
Josh Black	06809930a3	Add HTTP response headers for hostname and raft node ID (if applicable) (#11289 )	2021-04-20 15:25:04 -07:00
Vishal Nayak	4666f40925	Support autopilot when raft is for HA only (#11260 )	2021-04-12 09:33:21 -04:00
Nick Cabatoff	44c00cd54f	Fix: leader_tls_servername raft option only worked when used with mTLS and/or an explicit CA cert. (#11252 )	2021-04-06 09:16:54 -04:00
Nick Cabatoff	41d9030fbb	Disable autopilot in raft-ha mode. (#11181 ) * Disable autopilot in raft-ha mode. * Also don't run autopilot on DR secondaries.	2021-03-23 14:13:44 -07:00
Vishal Nayak	3e55e79a3f	Autopilot: Server Stabilization, State and Dead Server Cleanup (#10856 ) * k8s doc: update for 0.9.1 and 0.8.0 releases (#10825) * k8s doc: update for 0.9.1 and 0.8.0 releases * Update website/content/docs/platform/k8s/helm/configuration.mdx Co-authored-by: Theron Voran <tvoran@users.noreply.github.com> Co-authored-by: Theron Voran <tvoran@users.noreply.github.com> * Autopilot initial commit * Move autopilot related backend implementations to its own file * Abstract promoter creation * Add nil check for health * Add server state oss no-ops * Config ext stub for oss * Make way for non-voters * s/health/state * s/ReadReplica/NonVoter * Add synopsis and description * Remove struct tags from AutopilotConfig * Use var for config storage path * Handle nin-config when reading * Enable testing autopilot by using inmem cluster * First passing test * Only report the server as known if it is present in raft config * Autopilot defaults to on for all existing and new clusters * Add locking to some functions * Persist initial config * Clarify the command usage doc * Add health metric for each node * Fix audit logging issue * Don't set DisablePerformanceStandby to true in test * Use node id label for health metric * Log updates to autopilot config * Less aggressively consume config loading failures * Return a mutable config * Return early from known servers if raft config is unable to be pulled * Update metrics name * Reduce log level for potentially noisy log * Add knob to disable autopilot * Don't persist if default config is in use * Autopilot: Dead server cleanup (#10857) * Dead server cleanup * Initialize channel in any case * Fix a bunch of tests * Fix panic * Add follower locking in heartbeat tracker * Add LastContactFailureThreshold to config * Add log when marking node as dead * Update follower state locking in heartbeat tracker * Avoid follower states being nil * Pull test to its own file * Add execution status to state response * Optionally enable autopilot in some tests * Updates * Added API function to fetch autopilot configuration * Add test for default autopilot configuration * Configuration tests * Add State API test * Update test * Added TestClusterOptions.PhysicalFactoryConfig * Update locking * Adjust locking in heartbeat tracker * s/last_contact_failure_threshold/left_server_last_contact_threshold * Add disabling autopilot as a core config option * Disable autopilot in some tests * s/left_server_last_contact_threshold/dead_server_last_contact_threshold * Set the lastheartbeat of followers to now when setting up active node * Don't use config defaults from CLI command * Remove config file support * Remove HCL test as well * Persist only supplied config; merge supplied config with default to operate * Use pointer to structs for storing follower information * Test update * Retrieve non voter status from configbucket and set it up when a node comes up * Manage desired suffrage * Consider bucket being created already * Move desired suffrage to its own entry * s/DesiredSuffrageKey/LocalNodeConfigKey * s/witnessSuffrage/recordSuffrage * Fix test compilation * Handle local node config post a snapshot install * Commit to storage first; then record suffrage in fsm * No need of local node config being nili case, post snapshot restore * Reconcile autopilot config when a new leader takes over duty * Grab fsm lock when recording suffrage * s/Suffrage/DesiredSuffrage in FollowerState * Instantiate autopilot only in leader * Default to old ways in more scenarios * Make API gracefully handle 404 * Address some feedback * Make IsDead an atomic.Value * Simplify follower hearbeat tracking * Use uber.atomic * Don't have multiple causes for having autopilot disabled * Don't remove node from follower states if we fail to remove the dead server * Autopilot server removals map (#11019) * Don't remove node from follower states if we fail to remove the dead server * Use map to track dead server removals * Use lock and map * Use delegate lock * Adjust when to remove entry from map * Only hold the lock while accessing map * Fix race * Don't set default min_quorum * Fix test * Ensure follower states is not nil before starting autopilot * Fix race Co-authored-by: Jason O'Donnell <2160810+jasonodonnell@users.noreply.github.com> Co-authored-by: Theron Voran <tvoran@users.noreply.github.com>	2021-03-03 13:59:50 -05:00
Vishal Nayak	53cb1deb38	Revert "Read-replica instead of non-voter (#10875 )" (#10890 ) This reverts commit fc745670cf34821f5834357d9caebc3351dbc1e7.	2021-02-10 16:41:58 -05:00
Vishal Nayak	a2394e7353	Read-replica instead of non-voter (#10875 )	2021-02-10 09:58:18 -05:00
Nick Cabatoff	8cbc63d572	Add configuration to specify a TLS ServerName to use in the TLS handshake when performing a raft join. (#10698 )	2021-01-19 17:54:28 -05:00
Nick Cabatoff	84d566db9e	Be consistent with how we report init status. (#10498 ) Also make half-joined raft peers consider storage to be initialized, whether or not they're sealed.	2020-12-08 13:55:34 -05:00
Aleksandr Bezobchuk	95bbd8d920	Merge PR #10192 : Auto-Join: Configurable Scheme & Port (and add k8s provider)	2020-10-23 16:13:09 -04:00
Aleksandr Bezobchuk	d37be9af6e	Merge PR #10095 : Integrated Storage Cloud Auto-Join	2020-10-13 16:26:39 -04:00
Brian Kassouf	fd72d92434	raft: Fix some snapshot restore issues (#9533 ) * raft: Remove double read lock * Reload TLS keyring after reloading the barrier keys	2020-07-21 10:59:07 -07:00
ncabatoff	d2436a9c56	Make standbyStopCh atomic to avoid data races (#9539 )	2020-07-21 08:34:07 -04:00
Calvin Leung Huang	c45bdca0b3	raft: add support for using backend for ha_storage (#9193 ) * raft: initial work on raft ha storage support * add note on join * add todo note * raft: add support for bootstrapping and joining existing nodes * raft: gate bootstrap join by reading leader api address from storage * raft: properly check for raft-only for certain conditionals * raft: add bootstrap to api and cli * raft: fix bootstrap cli command * raft: add test for setting up new cluster with raft HA * raft: extend TestRaft_HA_NewCluster to include inmem and consul backends * raft: add test for updating an existing cluster to use raft HA * raft: remove debug log lines, clean up verifyRaftPeers * raft: minor cleanup * raft: minor cleanup * Update physical/raft/raft.go Co-authored-by: Brian Kassouf <briankassouf@users.noreply.github.com> * Update vault/ha.go Co-authored-by: Brian Kassouf <briankassouf@users.noreply.github.com> * Update vault/ha.go Co-authored-by: Brian Kassouf <briankassouf@users.noreply.github.com> * Update vault/logical_system_raft.go Co-authored-by: Brian Kassouf <briankassouf@users.noreply.github.com> * Update vault/raft.go Co-authored-by: Brian Kassouf <briankassouf@users.noreply.github.com> * Update vault/raft.go Co-authored-by: Brian Kassouf <briankassouf@users.noreply.github.com> * address feedback comments * address feedback comments * raft: refactor tls keyring logic * address feedback comments * Update vault/raft.go Co-authored-by: Alexander Bezobchuk <alexanderbez@users.noreply.github.com> * Update vault/raft.go Co-authored-by: Alexander Bezobchuk <alexanderbez@users.noreply.github.com> * address feedback comments * testing: fix import ordering * raft: rename var, cleanup comment line * docs: remove ha_storage restriction note on raft * docs: more raft HA interaction updates with migration and recovery mode * docs: update the raft join command * raft: update comments * raft: add missing isRaftHAOnly check for clearing out state set earlier * raft: update a few ha_storage config checks * Update command/operator_raft_bootstrap.go Co-authored-by: Vishal Nayak <vishalnayak@users.noreply.github.com> * raft: address feedback comments * raft: fix panic when checking for config.HAStorage.Type * Update vault/raft.go Co-authored-by: Alexander Bezobchuk <alexanderbez@users.noreply.github.com> * Update website/pages/docs/commands/operator/raft.mdx Co-authored-by: Alexander Bezobchuk <alexanderbez@users.noreply.github.com> * raft: remove bootstrap cli command * Update vault/raft.go Co-authored-by: Brian Kassouf <briankassouf@users.noreply.github.com> * Update vault/raft.go Co-authored-by: Brian Kassouf <briankassouf@users.noreply.github.com> * raft: address review feedback * raft: revert vendored sdk * raft: don't send applied index and node ID info if we're HA-only Co-authored-by: Brian Kassouf <briankassouf@users.noreply.github.com> Co-authored-by: Alexander Bezobchuk <alexanderbez@users.noreply.github.com> Co-authored-by: Vishal Nayak <vishalnayak@users.noreply.github.com>	2020-06-23 12:04:13 -07:00
Brian Kassouf	c8dde052f2	storage/raft: Advertise the configured cluster address (#9008 ) * storage/raft: Advertise the configured cluster address * Don't allow raft to start with unspecified IP * Fix concurrent map write panic * Add test file * changelog++ * changelog++ * changelog++ * Update tcp_layer.go * Update tcp_layer.go * Only set the adverise addr if set	2020-05-18 18:22:25 -07:00
Brian Kassouf	1bb0bd489d	storage/raft: Add committed and applied indexes to the status output (#9011 ) * storage/raft: Add committed and applied indexes to the status output * Update api vendor * changelog++ * Update http/sys_leader.go Co-authored-by: Jim Kalafut <jkalafut@hashicorp.com> Co-authored-by: Jim Kalafut <jkalafut@hashicorp.com>	2020-05-18 16:07:27 -07:00
Brian Kassouf	05eea911bd	storage/raft: Refresh TLS keyring on snapshot restore (#8546 )	2020-03-13 13:39:14 -07:00
Jim Kalafut	f17fc4e5c1	Run goimports (#8251 )	2020-01-27 21:11:00 -08:00
Vishal Nayak	8891f2ba88	Raft retry join (#7856 ) * Raft retry join * update * Make retry join work with shamir seal * Return upon context completion * Update vault/raft.go Co-Authored-By: Brian Kassouf <briankassouf@users.noreply.github.com> * Address some review comments * send leader information slice as a parameter * Make retry join work properly with Shamir case. This commit has a blocking issue * Fix join goroutine exiting before the job is done * Polishing changes * Don't return after a successful join during unseal * Added config parsing test * Add test and fix bugs * minor changes * Address review comments * Fix build error Co-authored-by: Brian Kassouf <briankassouf@users.noreply.github.com>	2020-01-13 17:02:16 -08:00
Jeff Mitchell	a0694943cc	Migrate built in auto seal to go-kms-wrapping (#8118 )	2020-01-10 20:39:52 -05:00
Lexman	c86fe212c0	oss changes for entropy augmentation feature (#7670 ) * oss changes for entropy augmentation feature * fix oss command/server/config tests * update go.sum * fix logical_system and http/ tests * adds vendored files * removes unused variable	2019-10-17 10:33:00 -07:00
Brian Kassouf	024c29c36a	OSS portions of raft non-voters (#7634 ) * OSS portions of raft non-voters * add file * Update vault/raft.go Co-Authored-By: Vishal Nayak <vishalnayak@users.noreply.github.com>	2019-10-11 11:56:59 -07:00
isbric	e6e20e9eb3	Correct spelling of error message (#7630 )	2019-10-11 11:14:41 -04:00
ncabatoff	ed147b7ae7	Make clusterListener an atomic.Value to avoid races with getGRPCDialer. (#7408 )	2019-09-03 11:59:56 -04:00

55 Commits