* Boost max_operations to the greater of that specified or absoluteMinOperations
* Forward rotation config requests to the primary
* Reject rotation configs outside the min/max range
* Minor wording fix
* Make sure we sanitize the rotation config on each clone
* Add regression test for missing rotation config
* use Equals
* simplify
Co-authored-by: Scott G. Miller <smiller@hashicorp.com>
* k8s doc: update for 0.9.1 and 0.8.0 releases (#10825)
* k8s doc: update for 0.9.1 and 0.8.0 releases
* Update website/content/docs/platform/k8s/helm/configuration.mdx
Co-authored-by: Theron Voran <tvoran@users.noreply.github.com>
Co-authored-by: Theron Voran <tvoran@users.noreply.github.com>
* Autopilot initial commit
* Move autopilot related backend implementations to its own file
* Abstract promoter creation
* Add nil check for health
* Add server state oss no-ops
* Config ext stub for oss
* Make way for non-voters
* s/health/state
* s/ReadReplica/NonVoter
* Add synopsis and description
* Remove struct tags from AutopilotConfig
* Use var for config storage path
* Handle nin-config when reading
* Enable testing autopilot by using inmem cluster
* First passing test
* Only report the server as known if it is present in raft config
* Autopilot defaults to on for all existing and new clusters
* Add locking to some functions
* Persist initial config
* Clarify the command usage doc
* Add health metric for each node
* Fix audit logging issue
* Don't set DisablePerformanceStandby to true in test
* Use node id label for health metric
* Log updates to autopilot config
* Less aggressively consume config loading failures
* Return a mutable config
* Return early from known servers if raft config is unable to be pulled
* Update metrics name
* Reduce log level for potentially noisy log
* Add knob to disable autopilot
* Don't persist if default config is in use
* Autopilot: Dead server cleanup (#10857)
* Dead server cleanup
* Initialize channel in any case
* Fix a bunch of tests
* Fix panic
* Add follower locking in heartbeat tracker
* Add LastContactFailureThreshold to config
* Add log when marking node as dead
* Update follower state locking in heartbeat tracker
* Avoid follower states being nil
* Pull test to its own file
* Add execution status to state response
* Optionally enable autopilot in some tests
* Updates
* Added API function to fetch autopilot configuration
* Add test for default autopilot configuration
* Configuration tests
* Add State API test
* Update test
* Added TestClusterOptions.PhysicalFactoryConfig
* Update locking
* Adjust locking in heartbeat tracker
* s/last_contact_failure_threshold/left_server_last_contact_threshold
* Add disabling autopilot as a core config option
* Disable autopilot in some tests
* s/left_server_last_contact_threshold/dead_server_last_contact_threshold
* Set the lastheartbeat of followers to now when setting up active node
* Don't use config defaults from CLI command
* Remove config file support
* Remove HCL test as well
* Persist only supplied config; merge supplied config with default to operate
* Use pointer to structs for storing follower information
* Test update
* Retrieve non voter status from configbucket and set it up when a node comes up
* Manage desired suffrage
* Consider bucket being created already
* Move desired suffrage to its own entry
* s/DesiredSuffrageKey/LocalNodeConfigKey
* s/witnessSuffrage/recordSuffrage
* Fix test compilation
* Handle local node config post a snapshot install
* Commit to storage first; then record suffrage in fsm
* No need of local node config being nili case, post snapshot restore
* Reconcile autopilot config when a new leader takes over duty
* Grab fsm lock when recording suffrage
* s/Suffrage/DesiredSuffrage in FollowerState
* Instantiate autopilot only in leader
* Default to old ways in more scenarios
* Make API gracefully handle 404
* Address some feedback
* Make IsDead an atomic.Value
* Simplify follower hearbeat tracking
* Use uber.atomic
* Don't have multiple causes for having autopilot disabled
* Don't remove node from follower states if we fail to remove the dead server
* Autopilot server removals map (#11019)
* Don't remove node from follower states if we fail to remove the dead server
* Use map to track dead server removals
* Use lock and map
* Use delegate lock
* Adjust when to remove entry from map
* Only hold the lock while accessing map
* Fix race
* Don't set default min_quorum
* Fix test
* Ensure follower states is not nil before starting autopilot
* Fix race
Co-authored-by: Jason O'Donnell <2160810+jasonodonnell@users.noreply.github.com>
Co-authored-by: Theron Voran <tvoran@users.noreply.github.com>
* sketch out partial month activity log client API
* unit test partialMonthClientCount
* cleanup api
* add api doc, fix test, update api nomenclature to match existing
* cleanup
* add PR changelog file
* integration test for API
* report entities and tokens separately
* upgrade vault dependency set
* etcd and grpc issues:
* better for tests
* testing
* all upgrades for hashicorp deps
* kubernetes plugin upgrade seems to work
* kubernetes plugin upgrade seems to work
* etcd and a bunch of other stuff
* all vulnerable packages upgraded
* k8s is broken in linux env but not locally
* test fixes
* fix testing
* fix etcd and grpc
* fix etcd and grpc
* use master branch of go-testing-interface
* roll back etcd upgrade
* have to fix grpc since other vendors pull in grpc 1.35.0 but we cant due to etcd
* rolling back in the replace directives
* a few more testing dependencies to clean up
* fix go mod vendor
* basic pool and start testing
* refactor a bit for testing
* workFunc, start/stop safety, testing
* cleanup function for worker quit, more tests
* redo public/private members
* improve tests, export types, switch uuid package
* fix loop capture bug, cleanup
* cleanup tests
* update worker pool file name, other improvements
* add job manager prototype
* remove remnants
* add functions to wait for job manager and worker pool to stop, other fixes
* test job manager functionality, fix bugs
* encapsulate how jobs are distributed to workers
* make worker job channel read only
* add job interface, more testing, fixes
* set name for dispatcher
* fix test races
* wire up expiration manager most of the way
* dispatcher and job manager constructors don't return errors
* logger now dependency injected
* make some members private, test fcn to get worker pool size
* make GetNumWorkers public
* Update helper/fairshare/jobmanager_test.go
Co-authored-by: Brian Kassouf <briankassouf@users.noreply.github.com>
* update fairsharing usage, add tests
* make workerpool private
* remove custom worker names
* concurrency improvements
* remove worker pool cleanup function
* remove cleanup func from job manager, remove non blocking stop from fairshare
* update job manager for new constructor
* stop job manager when expiration manager stopped
* unset env var after test
* stop fairshare when started in tests
* stop leaking job manager goroutine
* prototype channel for waking up to assign work
* fix typo/bug and add tests
* improve job manager wake up, fix test typo
* put channel drain back
* better start/pause test for job manager
* comment cleanup
* degrade possible noisy log
* remove closure, clean up context
* improve revocation context timer
* test: reduce number of revocation workers during many tests
* Update vault/expiration.go
Co-authored-by: Brian Kassouf <briankassouf@users.noreply.github.com>
* feedback tweaks
Co-authored-by: Brian Kassouf <briankassouf@users.noreply.github.com>
Co-authored-by: Brian Kassouf <briankassouf@users.noreply.github.com>
* Updates identity/group to allow updating a group by name (#10223)
* Now that lookup by name is outside handleGroupUpdateCommon, do not
use the second name lookup as the object to update.
* Added changelog.
Co-authored-by: dr-db <25711615+dr-db@users.noreply.github.com>
* Add NIST guidance on rotating keys used for AES-GCM encryption
* Capture more places barrier encryption is used
* spacing issue
* Probabilistically track an estimated encryption count by key term
* Un-reorder imports
* wip
* get rid of sampling