* Login MFA
* ENT OSS segragation (#14088)
* Delete method id if not used in an MFA enforcement config (#14063)
* Delete an MFA methodID only if it is not used by an MFA enforcement config
* Fixing a bug: mfa/validate is an unauthenticated path, and goes through the handleLoginRequest path
* adding use_passcode field to DUO config (#14059)
* add changelog
* preventing replay attack on MFA passcodes (#14056)
* preventing replay attack on MFA passcodes
* using %w instead of %s for error
* Improve CLI command for login mfa (#14106)
CLI prints a warning message indicating the login request needs to get validated
* adding the validity period of a passcode to error messages (#14115)
* PR feedback
* duo to handle preventing passcode reuse
Co-authored-by: hghaf099 <83242695+hghaf099@users.noreply.github.com>
Co-authored-by: hamid ghaf <hamid@hashicorp.com>
* feat: DB plugin multiplexing (#13734)
* WIP: start from main and get a plugin runner from core
* move MultiplexedClient map to plugin catalog
- call sys.NewPluginClient from PluginFactory
- updates to getPluginClient
- thread through isMetadataMode
* use go-plugin ClientProtocol interface
- call sys.NewPluginClient from dbplugin.NewPluginClient
* move PluginSets to dbplugin package
- export dbplugin HandshakeConfig
- small refactor of PluginCatalog.getPluginClient
* add removeMultiplexedClient; clean up on Close()
- call client.Kill from plugin catalog
- set rpcClient when muxed client exists
* add ID to dbplugin.DatabasePluginClient struct
* only create one plugin process per plugin type
* update NewPluginClient to return connection ID to sdk
- wrap grpc.ClientConn so we can inject the ID into context
- get ID from context on grpc server
* add v6 multiplexing protocol version
* WIP: backwards compat for db plugins
* Ensure locking on plugin catalog access
- Create public GetPluginClient method for plugin catalog
- rename postgres db plugin
* use the New constructor for db plugins
* grpc server: use write lock for Close and rlock for CRUD
* cleanup MultiplexedClients on Close
* remove TODO
* fix multiplexing regression with grpc server connection
* cleanup grpc server instances on close
* embed ClientProtocol in Multiplexer interface
* use PluginClientConfig arg to make NewPluginClient plugin type agnostic
* create a new plugin process for non-muxed plugins
* feat: plugin multiplexing: handle plugin client cleanup (#13896)
* use closure for plugin client cleanup
* log and return errors; add comments
* move rpcClient wrapping to core for ID injection
* refactor core plugin client and sdk
* remove unused ID method
* refactor and only wrap clientConn on multiplexed plugins
* rename structs and do not export types
* Slight refactor of system view interface
* Revert "Slight refactor of system view interface"
This reverts commit 73d420e5cd2f0415e000c5a9284ea72a58016dd6.
* Revert "Revert "Slight refactor of system view interface""
This reverts commit f75527008a1db06d04a23e04c3059674be8adb5f.
* only provide pluginRunner arg to the internal newPluginClient method
* embed ClientProtocol in pluginClient and name logger
* Add back MLock support
* remove enableMlock arg from setupPluginCatalog
* rename plugin util interface to PluginClient
Co-authored-by: Brian Kassouf <bkassouf@hashicorp.com>
* feature: multiplexing: fix unit tests (#14007)
* fix grpc_server tests and add coverage
* update run_config tests
* add happy path test case for grpc_server ID from context
* update test helpers
* feat: multiplexing: handle v5 plugin compiled with new sdk
* add mux supported flag and increase test coverage
* set multiplexingSupport field in plugin server
* remove multiplexingSupport field in sdk
* revert postgres to non-multiplexed
* add comments on grpc server fields
* use pointer receiver on grpc server methods
* add changelog
* use pointer for grpcserver instance
* Use a gRPC server to determine if a plugin should be multiplexed
* Apply suggestions from code review
Co-authored-by: Brian Kassouf <briankassouf@users.noreply.github.com>
* add lock to removePluginClient
* add multiplexingSupport field to externalPlugin struct
* do not send nil to grpc MultiplexingSupport
* check err before logging
* handle locking scenario for cleanupFunc
* allow ServeConfigMultiplex to dispense v5 plugin
* reposition structs, add err check and comments
* add comment on locking for cleanupExternalPlugin
Co-authored-by: Brian Kassouf <bkassouf@hashicorp.com>
Co-authored-by: Brian Kassouf <briankassouf@users.noreply.github.com>
* build out zombie lease system
* add typo for CI
* undo test CI commit
* time equality test isn't working on CI, so let's see what this does...
* add unrecoverable proto error, make proto, go mod vendor
* zombify leases if unrecoverable error, tests
* test fix: somehow pointer in pointer rx is null after pointer rx called
* tweaks based on roy feedback
* improve zombie errors
* update which errors are unrecoverable
* combine zombie logic
* keep subset of zombie lease in memory
* k8s doc: update for 0.9.1 and 0.8.0 releases (#10825)
* k8s doc: update for 0.9.1 and 0.8.0 releases
* Update website/content/docs/platform/k8s/helm/configuration.mdx
Co-authored-by: Theron Voran <tvoran@users.noreply.github.com>
Co-authored-by: Theron Voran <tvoran@users.noreply.github.com>
* Autopilot initial commit
* Move autopilot related backend implementations to its own file
* Abstract promoter creation
* Add nil check for health
* Add server state oss no-ops
* Config ext stub for oss
* Make way for non-voters
* s/health/state
* s/ReadReplica/NonVoter
* Add synopsis and description
* Remove struct tags from AutopilotConfig
* Use var for config storage path
* Handle nin-config when reading
* Enable testing autopilot by using inmem cluster
* First passing test
* Only report the server as known if it is present in raft config
* Autopilot defaults to on for all existing and new clusters
* Add locking to some functions
* Persist initial config
* Clarify the command usage doc
* Add health metric for each node
* Fix audit logging issue
* Don't set DisablePerformanceStandby to true in test
* Use node id label for health metric
* Log updates to autopilot config
* Less aggressively consume config loading failures
* Return a mutable config
* Return early from known servers if raft config is unable to be pulled
* Update metrics name
* Reduce log level for potentially noisy log
* Add knob to disable autopilot
* Don't persist if default config is in use
* Autopilot: Dead server cleanup (#10857)
* Dead server cleanup
* Initialize channel in any case
* Fix a bunch of tests
* Fix panic
* Add follower locking in heartbeat tracker
* Add LastContactFailureThreshold to config
* Add log when marking node as dead
* Update follower state locking in heartbeat tracker
* Avoid follower states being nil
* Pull test to its own file
* Add execution status to state response
* Optionally enable autopilot in some tests
* Updates
* Added API function to fetch autopilot configuration
* Add test for default autopilot configuration
* Configuration tests
* Add State API test
* Update test
* Added TestClusterOptions.PhysicalFactoryConfig
* Update locking
* Adjust locking in heartbeat tracker
* s/last_contact_failure_threshold/left_server_last_contact_threshold
* Add disabling autopilot as a core config option
* Disable autopilot in some tests
* s/left_server_last_contact_threshold/dead_server_last_contact_threshold
* Set the lastheartbeat of followers to now when setting up active node
* Don't use config defaults from CLI command
* Remove config file support
* Remove HCL test as well
* Persist only supplied config; merge supplied config with default to operate
* Use pointer to structs for storing follower information
* Test update
* Retrieve non voter status from configbucket and set it up when a node comes up
* Manage desired suffrage
* Consider bucket being created already
* Move desired suffrage to its own entry
* s/DesiredSuffrageKey/LocalNodeConfigKey
* s/witnessSuffrage/recordSuffrage
* Fix test compilation
* Handle local node config post a snapshot install
* Commit to storage first; then record suffrage in fsm
* No need of local node config being nili case, post snapshot restore
* Reconcile autopilot config when a new leader takes over duty
* Grab fsm lock when recording suffrage
* s/Suffrage/DesiredSuffrage in FollowerState
* Instantiate autopilot only in leader
* Default to old ways in more scenarios
* Make API gracefully handle 404
* Address some feedback
* Make IsDead an atomic.Value
* Simplify follower hearbeat tracking
* Use uber.atomic
* Don't have multiple causes for having autopilot disabled
* Don't remove node from follower states if we fail to remove the dead server
* Autopilot server removals map (#11019)
* Don't remove node from follower states if we fail to remove the dead server
* Use map to track dead server removals
* Use lock and map
* Use delegate lock
* Adjust when to remove entry from map
* Only hold the lock while accessing map
* Fix race
* Don't set default min_quorum
* Fix test
* Ensure follower states is not nil before starting autopilot
* Fix race
Co-authored-by: Jason O'Donnell <2160810+jasonodonnell@users.noreply.github.com>
Co-authored-by: Theron Voran <tvoran@users.noreply.github.com>
* Work on raft backend
* Add logstore locally
* Add encryptor and unsealable interfaces
* Add clustering support to raft
* Remove client and handler
* Bootstrap raft on init
* Cleanup raft logic a bit
* More raft work
* Work on TLS config
* More work on bootstrapping
* Fix build
* More work on bootstrapping
* More bootstrapping work
* fix build
* Remove consul dep
* Fix build
* merged oss/master into raft-storage
* Work on bootstrapping
* Get bootstrapping to work
* Clean up FMS and node-id
* Update local node ID logic
* Cleanup node-id change
* Work on snapshotting
* Raft: Add remove peer API (#906)
* Add remove peer API
* Add some comments
* Fix existing snapshotting (#909)
* Raft get peers API (#912)
* Read raft configuration
* address review feedback
* Use the Leadership Transfer API to step-down the active node (#918)
* Raft join and unseal using Shamir keys (#917)
* Raft join using shamir
* Store AEAD instead of master key
* Split the raft join process to answer the challenge after a successful unseal
* get the follower to standby state
* Make unseal work
* minor changes
* Some input checks
* reuse the shamir seal access instead of new default seal access
* refactor joinRaftSendAnswer function
* Synchronously send answer in auto-unseal case
* Address review feedback
* Raft snapshots (#910)
* Fix existing snapshotting
* implement the noop snapshotting
* Add comments and switch log libraries
* add some snapshot tests
* add snapshot test file
* add TODO
* More work on raft snapshotting
* progress on the ConfigStore strategy
* Don't use two buckets
* Update the snapshot store logic to hide the file logic
* Add more backend tests
* Cleanup code a bit
* [WIP] Raft recovery (#938)
* Add recovery functionality
* remove fmt.Printfs
* Fix a few fsm bugs
* Add max size value for raft backend (#942)
* Add max size value for raft backend
* Include physical.ErrValueTooLarge in the message
* Raft snapshot Take/Restore API (#926)
* Inital work on raft snapshot APIs
* Always redirect snapshot install/download requests
* More work on the snapshot APIs
* Cleanup code a bit
* On restore handle special cases
* Use the seal to encrypt the sha sum file
* Add sealer mechanism and fix some bugs
* Call restore while state lock is held
* Send restore cb trigger through raft log
* Make error messages nicer
* Add test helpers
* Add snapshot test
* Add shamir unseal test
* Add more raft snapshot API tests
* Fix locking
* Change working to initalize
* Add underlying raw object to test cluster core
* Move leaderUUID to core
* Add raft TLS rotation logic (#950)
* Add TLS rotation logic
* Cleanup logic a bit
* Add/Remove from follower state on add/remove peer
* add comments
* Update more comments
* Update request_forwarding_service.proto
* Make sure we populate all nodes in the followerstate obj
* Update times
* Apply review feedback
* Add more raft config setting (#947)
* Add performance config setting
* Add more config options and fix tests
* Test Raft Recovery (#944)
* Test raft recovery
* Leave out a node during recovery
* remove unused struct
* Update physical/raft/snapshot_test.go
* Update physical/raft/snapshot_test.go
* fix vendoring
* Switch to new raft interface
* Remove unused files
* Switch a gogo -> proto instance
* Remove unneeded vault dep in go.sum
* Update helper/testhelpers/testhelpers.go
Co-Authored-By: Calvin Leung Huang <cleung2010@gmail.com>
* Update vault/cluster/cluster.go
* track active key within the keyring itself (#6915)
* track active key within the keyring itself
* lookup and store using the active key ID
* update docstring
* minor refactor
* Small text fixes (#6912)
* Update physical/raft/raft.go
Co-Authored-By: Calvin Leung Huang <cleung2010@gmail.com>
* review feedback
* Move raft logical system into separate file
* Update help text a bit
* Enforce cluster addr is set and use it for raft bootstrapping
* Fix tests
* fix http test panic
* Pull in latest raft-snapshot library
* Add comment