Commit Graph

34 Commits

Author SHA1 Message Date
Pratyoy Mukhopadhyay fa29e780f0
[NO-TICKET] Upgrade protoc-gen-go to 1.26, upgrade protoc to 3.17.3 (#12171)
* [NO-TICKET] Set protoc-gen-go to 1.23, upgrade protoc to 3.17.3

* [NO-TICKET] Upgrade version of protoc-gen-go to 1.26
2021-07-28 14:51:36 -07:00
swayne275 335e4c3711
Introduce Logical Unrecoverable Error, Use it in Expiration Manager (#11477)
* build out zombie lease system

* add typo for CI

* undo test CI commit

* time equality test isn't working on CI, so let's see what this does...

* add unrecoverable proto error, make proto, go mod vendor

* zombify leases if unrecoverable error, tests

* test fix: somehow pointer in pointer rx is null after pointer rx called

* tweaks based on roy feedback

* improve zombie errors

* update which errors are unrecoverable

* combine zombie logic

* keep subset of zombie lease in memory
2021-05-03 17:56:06 -06:00
Vishal Nayak 3e55e79a3f
Autopilot: Server Stabilization, State and Dead Server Cleanup (#10856)
* k8s doc: update for 0.9.1 and 0.8.0 releases (#10825)

* k8s doc: update for 0.9.1 and 0.8.0 releases

* Update website/content/docs/platform/k8s/helm/configuration.mdx

Co-authored-by: Theron Voran <tvoran@users.noreply.github.com>

Co-authored-by: Theron Voran <tvoran@users.noreply.github.com>

* Autopilot initial commit

* Move autopilot related backend implementations to its own file

* Abstract promoter creation

* Add nil check for health

* Add server state oss no-ops

* Config ext stub for oss

* Make way for non-voters

* s/health/state

* s/ReadReplica/NonVoter

* Add synopsis and description

* Remove struct tags from AutopilotConfig

* Use var for config storage path

* Handle nin-config when reading

* Enable testing autopilot by using inmem cluster

* First passing test

* Only report the server as known if it is present in raft config

* Autopilot defaults to on for all existing and new clusters

* Add locking to some functions

* Persist initial config

* Clarify the command usage doc

* Add health metric for each node

* Fix audit logging issue

* Don't set DisablePerformanceStandby to true in test

* Use node id label for health metric

* Log updates to autopilot config

* Less aggressively consume config loading failures

* Return a mutable config

* Return early from known servers if raft config is unable to be pulled

* Update metrics name

* Reduce log level for potentially noisy log

* Add knob to disable autopilot

* Don't persist if default config is in use

* Autopilot: Dead server cleanup (#10857)

* Dead server cleanup

* Initialize channel in any case

* Fix a bunch of tests

* Fix panic

* Add follower locking in heartbeat tracker

* Add LastContactFailureThreshold to config

* Add log when marking node as dead

* Update follower state locking in heartbeat tracker

* Avoid follower states being nil

* Pull test to its own file

* Add execution status to state response

* Optionally enable autopilot in some tests

* Updates

* Added API function to fetch autopilot configuration

* Add test for default autopilot configuration

* Configuration tests

* Add State API test

* Update test

* Added TestClusterOptions.PhysicalFactoryConfig

* Update locking

* Adjust locking in heartbeat tracker

* s/last_contact_failure_threshold/left_server_last_contact_threshold

* Add disabling autopilot as a core config option

* Disable autopilot in some tests

* s/left_server_last_contact_threshold/dead_server_last_contact_threshold

* Set the lastheartbeat of followers to now when setting up active node

* Don't use config defaults from CLI command

* Remove config file support

* Remove HCL test as well

* Persist only supplied config; merge supplied config with default to operate

* Use pointer to structs for storing follower information

* Test update

* Retrieve non voter status from configbucket and set it up when a node comes up

* Manage desired suffrage

* Consider bucket being created already

* Move desired suffrage to its own entry

* s/DesiredSuffrageKey/LocalNodeConfigKey

* s/witnessSuffrage/recordSuffrage

* Fix test compilation

* Handle local node config post a snapshot install

* Commit to storage first; then record suffrage in fsm

* No need of local node config being nili case, post snapshot restore

* Reconcile autopilot config when a new leader takes over duty

* Grab fsm lock when recording suffrage

* s/Suffrage/DesiredSuffrage in FollowerState

* Instantiate autopilot only in leader

* Default to old ways in more scenarios

* Make API gracefully handle 404

* Address some feedback

* Make IsDead an atomic.Value

* Simplify follower hearbeat tracking

* Use uber.atomic

* Don't have multiple causes for having autopilot disabled

* Don't remove node from follower states if we fail to remove the dead server

* Autopilot server removals map (#11019)

* Don't remove node from follower states if we fail to remove the dead server

* Use map to track dead server removals

* Use lock and map

* Use delegate lock

* Adjust when to remove entry from map

* Only hold the lock while accessing map

* Fix race

* Don't set default min_quorum

* Fix test

* Ensure follower states is not nil before starting autopilot

* Fix race

Co-authored-by: Jason O'Donnell <2160810+jasonodonnell@users.noreply.github.com>
Co-authored-by: Theron Voran <tvoran@users.noreply.github.com>
2021-03-03 13:59:50 -05:00
Brian Kassouf 3f30fc5f4e
Port changes from enterprise lease fix (#10020) 2020-09-22 14:47:13 -07:00
Josh Black 38fc012817
Backport the pieces of the replication API changes (#9425) 2020-07-09 15:11:37 -07:00
Scott Miller e92f8f5a81
Revert global plugin reload commits (#9344)
* Revert "Some of the OSS changes were clobbered when merging with quotas out of, master (#9343)"

This reverts commit 8719a9b7c4d6ca7afb2e0a85e7c570cc17081f41.

* Revert "OSS side of Global Plugin Reload (#9340)"

This reverts commit f98afb998ae50346849050e882b6be50807983ad.
2020-06-29 17:36:22 -05:00
Scott Miller a83fe0fc6d
OSS side of Global Plugin Reload (#9340)
* OSS side of Global Plugin Reload

* changelog++
2020-06-29 16:23:28 -05:00
Jeff Mitchell b4f5d38916
Update to latest go-kms-wrapping and fix protos/etcd (#8996) 2020-05-14 18:45:10 -04:00
Jeff Mitchell d810758ca2
Rerun proto gen as some got gen'd with old proto version (#7090) 2019-07-09 01:02:20 +02:00
Brian Kassouf ed14061578
Raft Storage Backend (#6888)
* Work on raft backend

* Add logstore locally

* Add encryptor and unsealable interfaces

* Add clustering support to raft

* Remove client and handler

* Bootstrap raft on init

* Cleanup raft logic a bit

* More raft work

* Work on TLS config

* More work on bootstrapping

* Fix build

* More work on bootstrapping

* More bootstrapping work

* fix build

* Remove consul dep

* Fix build

* merged oss/master into raft-storage

* Work on bootstrapping

* Get bootstrapping to work

* Clean up FMS and node-id

* Update local node ID logic

* Cleanup node-id change

* Work on snapshotting

* Raft: Add remove peer API (#906)

* Add remove peer API

* Add some comments

* Fix existing snapshotting (#909)

* Raft get peers API (#912)

* Read raft configuration

* address review feedback

* Use the Leadership Transfer API to step-down the active node (#918)

* Raft join and unseal using Shamir keys (#917)

* Raft join using shamir

* Store AEAD instead of master key

* Split the raft join process to answer the challenge after a successful unseal

* get the follower to standby state

* Make unseal work

* minor changes

* Some input checks

* reuse the shamir seal access instead of new default seal access

* refactor joinRaftSendAnswer function

* Synchronously send answer in auto-unseal case

* Address review feedback

* Raft snapshots (#910)

* Fix existing snapshotting

* implement the noop snapshotting

* Add comments and switch log libraries

* add some snapshot tests

* add snapshot test file

* add TODO

* More work on raft snapshotting

* progress on the ConfigStore strategy

* Don't use two buckets

* Update the snapshot store logic to hide the file logic

* Add more backend tests

* Cleanup code a bit

* [WIP] Raft recovery (#938)

* Add recovery functionality

* remove fmt.Printfs

* Fix a few fsm bugs

* Add max size value for raft backend (#942)

* Add max size value for raft backend

* Include physical.ErrValueTooLarge in the message

* Raft snapshot Take/Restore API  (#926)

* Inital work on raft snapshot APIs

* Always redirect snapshot install/download requests

* More work on the snapshot APIs

* Cleanup code a bit

* On restore handle special cases

* Use the seal to encrypt the sha sum file

* Add sealer mechanism and fix some bugs

* Call restore while state lock is held

* Send restore cb trigger through raft log

* Make error messages nicer

* Add test helpers

* Add snapshot test

* Add shamir unseal test

* Add more raft snapshot API tests

* Fix locking

* Change working to initalize

* Add underlying raw object to test cluster core

* Move leaderUUID to core

* Add raft TLS rotation logic (#950)

* Add TLS rotation logic

* Cleanup logic a bit

* Add/Remove from follower state on add/remove peer

* add comments

* Update more comments

* Update request_forwarding_service.proto

* Make sure we populate all nodes in the followerstate obj

* Update times

* Apply review feedback

* Add more raft config setting (#947)

* Add performance config setting

* Add more config options and fix tests

* Test Raft Recovery (#944)

* Test raft recovery

* Leave out a node during recovery

* remove unused struct

* Update physical/raft/snapshot_test.go

* Update physical/raft/snapshot_test.go

* fix vendoring

* Switch to new raft interface

* Remove unused files

* Switch a gogo -> proto instance

* Remove unneeded vault dep in go.sum

* Update helper/testhelpers/testhelpers.go

Co-Authored-By: Calvin Leung Huang <cleung2010@gmail.com>

* Update vault/cluster/cluster.go

* track active key within the keyring itself (#6915)

* track active key within the keyring itself

* lookup and store using the active key ID

* update docstring

* minor refactor

* Small text fixes (#6912)

* Update physical/raft/raft.go

Co-Authored-By: Calvin Leung Huang <cleung2010@gmail.com>

* review feedback

* Move raft logical system into separate file

* Update help text a bit

* Enforce cluster addr is set and use it for raft bootstrapping

* Fix tests

* fix http test panic

* Pull in latest raft-snapshot library

* Add comment
2019-06-20 12:14:58 -07:00
Clint b55303eddb
Combined Database Backend: Static Accounts (#6834)
* Add priority queue to sdk

* fix issue of storing pointers and now copy

* update to use copy structure

* Remove file, put Item struct def. into other file

* add link

* clean up docs

* refactor internal data structure to hide heap method implementations. Other cleanup after feedback

* rename PushItem and PopItem to just Push/Pop, after encapsulating the heap methods

* updates after feedback

* refactoring/renaming

* guard against pushing a nil item

* minor updates after feedback

* Add SetCredentials, GenerateCredentials gRPC methods to combined database backend gPRC

* Initial Combined database backend implementation of static accounts and automatic rotation

* vendor updates

* initial implementation of static accounts with Combined database backend, starting with PostgreSQL implementation

* add lock and setup of rotation queue

* vendor the queue

* rebase on new method signature of queue

* remove mongo tests for now

* update default role sql

* gofmt after rebase

* cleanup after rebasing to remove checks for ErrNotFound error

* rebase cdcr-priority-queue

* vendor dependencies with 'go mod vendor'

* website database docs for Static Role support

* document the rotate-role API endpoint

* postgres specific static role docs

* use constants for paths

* updates from review

* remove dead code

* combine and clarify error message for older plugins

* Update builtin/logical/database/backend.go

Co-Authored-By: Jim Kalafut <jim@kalafut.net>

* cleanups from feedback

* code and comment cleanups

* move db.RLock higher to protect db.GenerateCredentials call

* Return output with WALID if we failed to delete the WAL

* Update builtin/logical/database/path_creds_create.go

Co-Authored-By: Jim Kalafut <jim@kalafut.net>

* updates after running 'make fmt'

* update after running 'make proto'

* Update builtin/logical/database/path_roles.go

Co-Authored-By: Brian Kassouf <briankassouf@users.noreply.github.com>

* Update builtin/logical/database/path_roles.go

Co-Authored-By: Brian Kassouf <briankassouf@users.noreply.github.com>

* update comment and remove and rearrange some dead code

* Update website/source/api/secret/databases/index.html.md

Co-Authored-By: Jim Kalafut <jim@kalafut.net>

* cleanups after review

* Update sdk/database/dbplugin/grpc_transport.go

Co-Authored-By: Brian Kassouf <briankassouf@users.noreply.github.com>

* code cleanup after feedback

* remove PasswordLastSet; it's not used

* document GenerateCredentials and SetCredentials

* Update builtin/logical/database/path_rotate_credentials.go

Co-Authored-By: Brian Kassouf <briankassouf@users.noreply.github.com>

* wrap pop and popbykey in backend methods to protect against nil cred rotation queue

* use strings.HasPrefix instead of direct equality check for path

* Forgot to commit this

* updates after feedback

* re-purpose an outdated test to now check that static and dynamic roles cannot share a name

* check for unique name across dynamic and static roles

* refactor loadStaticWALs to return a map of name/setCredentialsWAL struct to consolidate where we're calling set credentials

* remove commented out code

* refactor to have loadstaticwals filter out wals for roles that no longer exist

* return error if nil input given

* add nil check for input into setStaticAccount

* Update builtin/logical/database/path_roles.go

Co-Authored-By: Brian Kassouf <briankassouf@users.noreply.github.com>

* add constant for queue tick time in seconds, used for comparrison in updates

* Update builtin/logical/database/path_roles.go

Co-Authored-By: Jim Kalafut <jim@kalafut.net>

* code cleanup after review

* remove misplaced code comment

* remove commented out code

* create a queue in the Factory method, even if it's never used

* update path_roles to use a common set of fields, with specific overrides for dynamic/static roles by type

* document new method

* move rotation things into a specific file

* rename test file and consolidate some static account tests

* Update builtin/logical/database/path_roles.go

Co-Authored-By: Brian Kassouf <briankassouf@users.noreply.github.com>

* Update builtin/logical/database/rotation.go

Co-Authored-By: Brian Kassouf <briankassouf@users.noreply.github.com>

* Update builtin/logical/database/rotation.go

Co-Authored-By: Brian Kassouf <briankassouf@users.noreply.github.com>

* Update builtin/logical/database/rotation.go

Co-Authored-By: Brian Kassouf <briankassouf@users.noreply.github.com>

* Update builtin/logical/database/rotation.go

Co-Authored-By: Brian Kassouf <briankassouf@users.noreply.github.com>

* Update builtin/logical/database/rotation.go

Co-Authored-By: Brian Kassouf <briankassouf@users.noreply.github.com>

* update code comments, method names, and move more methods into rotation.go

* update comments to be capitalized

* remove the item from the queue before we try to destroy it

* findStaticWAL returns an error

* use lowercase keys when encoding WAL entries

* small cleanups

* remove vestigial static account check

* remove redundant DeleteWAL call in populate queue

* if we error on loading role, push back to queue with 10 second backoff

* poll in initqueue to make sure the backend is setup and can write/delete data

* add revoke_user_on_delete flag to allow users to opt-in to revoking the static database user on delete of the Vault role. Default false

* add code comments on read-only loop

* code comment updates

* re-push if error returned from find static wal

* add locksutil and acquire locks when pop'ing from the queue

* grab exclusive locks for updating static roles

* Add SetCredentials and GenerateCredentials stubs to mockPlugin

* add a switch in initQueue to listen for cancelation

* remove guard on zero time, it should have no affect

* create a new context in Factory to pass on and use for closing the backend queue

* restore master copy of vendor dir
2019-06-19 14:45:39 -05:00
Jeff Mitchell ec3ea45858
Update grpc and protos (#6725)
gRPC updated to 1.20.1 which fixes a couple of important bugs.

Updates protos as well.
2019-05-13 12:09:30 -04:00
Jeff Mitchell 1b5155080b Update protobufs, sdk changes 2019-04-12 23:13:14 -04:00
Jeff Mitchell 5f249d4005
Add allowed_response_headers (#6115) 2019-02-05 16:02:15 -05:00
Jeff Mitchell bbc1d53a5d Revert "Refactor common token fields and operations into a helper (#5953)"
This reverts commit 66c226c593bb1cd48cfd8364ac8510cb42b7d67a.
2019-02-01 11:23:40 -05:00
Jeff Mitchell 85a560abba
Refactor common token fields and operations into a helper (#5953) 2019-01-30 16:23:28 -05:00
Jim Kalafut d0e2badbae Run goimports across the repository (#6010)
The result will still pass gofmtcheck and won't trigger additional
changes if someone isn't using goimports, but it will avoid the
piecemeal imports changes we've been seeing.
2019-01-08 16:48:57 -08:00
Jeff Mitchell 33065a60db Fix compilation/protobuf 2018-09-22 17:58:39 -04:00
Jeff Mitchell 919b968c27
The big one (#5346) 2018-09-17 23:03:00 -04:00
Jeff Mitchell 28890ee198 Make proto
It appears the only thing that actually change is that the tag got
`proto3` values added.
2018-07-10 20:49:48 -04:00
Chris Hoffman 5344b7c5ae
adding option go_package to protos (#4687)
* adding option go_package to protos

* switching proto output dir to relative paths
2018-06-04 10:19:26 -04:00
Brian Kassouf 893d874291 Update proto files (#4651) 2018-05-29 18:23:51 -04:00
Jeff Mitchell c0ed57feae
Revert "proto changes (#4503)" (#4504)
This reverts commit 14594bd76e04ff09c442738800be5fdebc45512f.
2018-05-03 15:38:53 -04:00
Vishal Nayak 7549ea0d12
proto changes (#4503) 2018-05-03 15:23:14 -04:00
Becca Petrin d51acbde68
New proto version (#4501) 2018-05-03 10:19:39 -07:00
Jeff Mitchell a71c74aa3f
Add replication state to EchoReply (#3810) 2018-01-17 22:17:47 -05:00
Jeff Mitchell a80791ce7e Update proto 2017-10-23 17:45:05 -04:00
Jeff Mitchell 47dae8ffc7 Sync 2017-10-23 14:59:37 -04:00
Jeff Mitchell bbe27aaedf Add heartbeating and cluster address sharing to request forwarding (#2762) 2017-05-24 15:06:56 -04:00
Jeff Mitchell 0d4e7fba69 Remove non-gRPC request forwarding 2017-05-24 09:34:59 -04:00
Jeff Mitchell 6c02e9357a Update protos 2017-02-02 16:20:32 -05:00
Jeff Mitchell 336dfed5c3 Rename gRPC request forwarding method 2017-01-06 17:08:43 -05:00
Jeff Mitchell ee29b329fb Bump proto files after update 2016-11-17 10:06:26 -05:00
Jeff Mitchell b7acf5b5ab Rename proto service stuff and change log levels for some messages 2016-08-19 11:49:25 -04:00