Commit graph

671 commits

Author SHA1 Message Date
Vishal Nayak 9839e76192
Remove unneeded fields from state output (#11073) 2021-03-10 12:08:12 -05:00
Vishal Nayak e5b6ec4d05
Reset IsDead upon each heartbeat (#11049) 2021-03-05 19:50:36 -05:00
Vishal Nayak 910b45413b
Handle error (#11039) 2021-03-03 15:55:50 -05:00
Vishal Nayak 3e55e79a3f
Autopilot: Server Stabilization, State and Dead Server Cleanup (#10856)
* k8s doc: update for 0.9.1 and 0.8.0 releases (#10825)

* k8s doc: update for 0.9.1 and 0.8.0 releases

* Update website/content/docs/platform/k8s/helm/configuration.mdx

Co-authored-by: Theron Voran <tvoran@users.noreply.github.com>

Co-authored-by: Theron Voran <tvoran@users.noreply.github.com>

* Autopilot initial commit

* Move autopilot related backend implementations to its own file

* Abstract promoter creation

* Add nil check for health

* Add server state oss no-ops

* Config ext stub for oss

* Make way for non-voters

* s/health/state

* s/ReadReplica/NonVoter

* Add synopsis and description

* Remove struct tags from AutopilotConfig

* Use var for config storage path

* Handle nil-config when reading

* Enable testing autopilot by using inmem cluster

* First passing test

* Only report the server as known if it is present in raft config

* Autopilot defaults to on for all existing and new clusters

* Add locking to some functions

* Persist initial config

* Clarify the command usage doc

* Add health metric for each node

* Fix audit logging issue

* Don't set DisablePerformanceStandby to true in test

* Use node id label for health metric

* Log updates to autopilot config

* Less aggressively consume config loading failures

* Return a mutable config

* Return early from known servers if raft config is unable to be pulled

* Update metrics name

* Reduce log level for potentially noisy log

* Add knob to disable autopilot

* Don't persist if default config is in use

* Autopilot: Dead server cleanup (#10857)

* Dead server cleanup

* Initialize channel in any case

* Fix a bunch of tests

* Fix panic

* Add follower locking in heartbeat tracker

* Add LastContactFailureThreshold to config

* Add log when marking node as dead

* Update follower state locking in heartbeat tracker

* Avoid follower states being nil

* Pull test to its own file

* Add execution status to state response

* Optionally enable autopilot in some tests

* Updates

* Added API function to fetch autopilot configuration

* Add test for default autopilot configuration

* Configuration tests

* Add State API test

* Update test

* Added TestClusterOptions.PhysicalFactoryConfig

* Update locking

* Adjust locking in heartbeat tracker

* s/last_contact_failure_threshold/left_server_last_contact_threshold

* Add disabling autopilot as a core config option

* Disable autopilot in some tests

* s/left_server_last_contact_threshold/dead_server_last_contact_threshold

* Set the lastheartbeat of followers to now when setting up active node

* Don't use config defaults from CLI command

* Remove config file support

* Remove HCL test as well

* Persist only supplied config; merge supplied config with default to operate

* Use pointer to structs for storing follower information

* Test update

* Retrieve non voter status from configbucket and set it up when a node comes up

* Manage desired suffrage

* Consider bucket being created already

* Move desired suffrage to its own entry

* s/DesiredSuffrageKey/LocalNodeConfigKey

* s/witnessSuffrage/recordSuffrage

* Fix test compilation

* Handle local node config post a snapshot install

* Commit to storage first; then record suffrage in fsm

* No need of local node config being nil case, post snapshot restore

* Reconcile autopilot config when a new leader takes over duty

* Grab fsm lock when recording suffrage

* s/Suffrage/DesiredSuffrage in FollowerState

* Instantiate autopilot only in leader

* Default to old ways in more scenarios

* Make API gracefully handle 404

* Address some feedback

* Make IsDead an atomic.Value

* Simplify follower heartbeat tracking

* Use uber.atomic

* Don't have multiple causes for having autopilot disabled

* Don't remove node from follower states if we fail to remove the dead server

* Autopilot server removals map (#11019)

* Don't remove node from follower states if we fail to remove the dead server

* Use map to track dead server removals

* Use lock and map

* Use delegate lock

* Adjust when to remove entry from map

* Only hold the lock while accessing map

* Fix race

* Don't set default min_quorum

* Fix test

* Ensure follower states is not nil before starting autopilot

* Fix race

Co-authored-by: Jason O'Donnell <2160810+jasonodonnell@users.noreply.github.com>
Co-authored-by: Theron Voran <tvoran@users.noreply.github.com>
2021-03-03 13:59:50 -05:00
Nick Cabatoff c1ddfbb538
OSS parts of the new client controlled consistency feature (#10974) 2021-02-24 06:58:10 -05:00
Vishal Nayak 53cb1deb38
Revert "Read-replica instead of non-voter (#10875)" (#10890)
This reverts commit fc745670cf34821f5834357d9caebc3351dbc1e7.
2021-02-10 16:41:58 -05:00
Vishal Nayak a2394e7353
Read-replica instead of non-voter (#10875) 2021-02-10 09:58:18 -05:00
Brian Kassouf 64c4850b45
storage/raft: Fix panic when no Join TLS info is being used (#10801) 2021-01-27 17:21:45 -08:00
Nick Cabatoff 8cbc63d572
Add configuration to specify a TLS ServerName to use in the TLS handshake when performing a raft join. (#10698) 2021-01-19 17:54:28 -05:00
Eugene R 331529fc94
Aerospike storage backend (#10131)
* add an Aerospike storage backend

* go mod vendor

* add Aerospike storage configuration docs

* review fixes

* bump aerospike client to v3.1.1

* rename the defaultHostname variable

* relocate the docs page
2021-01-12 15:26:07 -08:00
Brian Kassouf 264f12e72a
Telemetry consul transaction (#10486)
* Update consul.go

* Update telemetry.mdx
2020-12-03 11:19:08 -08:00
Brian Kassouf 81a86f48e8
Backport some OSS changes (#10267)
* Backport some OSS changes

* go mod vendor
2020-10-29 16:47:34 -07:00
Jonas-Taha El Sesiy b7cf4a05ff
Add support for Managed Identity auth for physical/Azure (#10189)
* Add support for Managed Identity auth for physical/Azure

Obtain OAuth token from IMDS to allow for access to Azure Blob with
short-lived dynamic credentials

Fix #7322

* add tests & update docs/dependencies
2020-10-28 15:04:26 -07:00
Vishal Nayak f832d3da66
OCI: Don't store region in the backend struct (#10248)
* OCI: Don't store region in the backend struct

* Update physical/oci/oci.go

Co-authored-by: Calvin Leung Huang <cleung2010@gmail.com>

Co-authored-by: Calvin Leung Huang <cleung2010@gmail.com>
2020-10-26 19:41:51 -04:00
Jason Witkowski ebfaa551eb
Add ability to specify region for OCI Storage Backend (#9302)
* Add ability to specify region for OCI Storage Backend

* Fix capitalization in Vault documentation

Co-authored-by: Josh Black <raskchanky@users.noreply.github.com>
Co-authored-by: Vishal Nayak <vishalnayak@users.noreply.github.com>
2020-10-26 18:28:32 -04:00
Aleksandr Bezobchuk 95bbd8d920
Merge PR #10192: Auto-Join: Configurable Scheme & Port (and add k8s provider) 2020-10-23 16:13:09 -04:00
Nick Cabatoff 66274607b7
OSS changes for enterprise automated snapshots (#10160) 2020-10-16 14:57:11 -04:00
Aleksandr Bezobchuk d37be9af6e
Merge PR #10095: Integrated Storage Cloud Auto-Join 2020-10-13 16:26:39 -04:00
Jonas-Taha El Sesiy 9b599c8162
Migrate to azure-storage-blob-go (#9577)
The azure sdk for go is maintenance-only for storage, see https://github.com/Azure/azure-sdk-for-go/tree/master/storage\#azure-storage-sdk-for-go-preview
Migrate to new azure-storage-blob-go SDK
Minor test improvements

Fix #9661
2020-10-05 14:37:13 -07:00
Theron Voran 52581cd472
Add logging during awskms auto-unseal (#9794)
Adds debug and warn logging around AWS credential chain generation,
specifically to help users debugging auto-unseal problems on AWS, by
logging which role is being used in the case of a webidentity token.

Adds a deferred call to flush the log output as well, to ensure logs
are output in the event of an initialization failure.
2020-09-28 14:06:49 -07:00
Scott Miller a8cbda1713
Add retry to TestPostgresqlBackend (#10032) 2020-09-24 16:19:11 -05:00
ncabatoff 3fb20ed529
Logging tweaks, mostly for tests (#10028)
Fix some places where raft wasn't hooking into the core logger as it should.
Revisited the code that was setting the log level to Error during cleanup: it's normal for there to be a bunch of errors then, which makes it harder to see what went wrong up to the point where the test was deemed to have failed.  So now, instead of setting log level to Error, we actually stop logging altogether.  This only applies if the test didn't pass in its own logger during cluster creation, but we should be moving away from that anyway.
2020-09-23 13:40:00 -04:00
Brian Kassouf 3f30fc5f4e
Port changes from enterprise lease fix (#10020) 2020-09-22 14:47:13 -07:00
ncabatoff b615da43d7
Run CI tests in docker instead of a machine. (#8948) 2020-09-15 10:01:26 -04:00
ncabatoff b2908d1744
Avoid O(n^2) lookup to remove duplicate subfolders in list output. (#9694) 2020-08-31 09:23:34 -04:00
Brian Kassouf fd72d92434
raft: Fix some snapshot restore issues (#9533)
* raft: Remove double read lock

* Reload TLS keyring after reloading the barrier keys
2020-07-21 10:59:07 -07:00
Mike Jarmy a3ab902e18
set path properly in NewRaftBackend() (#9128)
* set path properly in NewRaftBackend()

* get rid of storeLatestState
2020-07-21 12:48:24 -04:00
Calvin Leung Huang 9b59917ee5
raft/test: fix snapshot peers race test by using committed and applied indexes (#9459) 2020-07-13 10:02:51 -07:00
Josh Black 38fc012817
Backport the pieces of the replication API changes (#9425) 2020-07-09 15:11:37 -07:00
Seth Vargo d00adf89c9
physical/spanner: use separate client for updating locks (#9423)
* physical/spanner: use separate client for updating locks

We believe this mitigates an issue where a large influx of requests
cause the leader to be unable to update the lock table (since it cannot
grab a client from the pool or the client has no more open connections),
which causes cascading failure.
2020-07-09 13:52:29 -07:00
Seth Vargo 95a2d61651
physical/gcs: use separate client for updating locks (#9424)
* physical/gcs: use separate client for updating locks

* Address review comments

Co-authored-by: Calvin Leung Huang <cleung2010@gmail.com>
2020-07-09 13:40:50 -07:00
Alexander Bezobchuk bd96d1dae1
physical/raft: Add nil check to shutdown (#9322) 2020-06-25 17:51:13 -07:00
Calvin Leung Huang c45bdca0b3
raft: add support for using backend for ha_storage (#9193)
* raft: initial work on raft ha storage support

* add note on join

* add todo note

* raft: add support for bootstrapping and joining existing nodes

* raft: gate bootstrap join by reading leader api address from storage

* raft: properly check for raft-only for certain conditionals

* raft: add bootstrap to api and cli

* raft: fix bootstrap cli command

* raft: add test for setting up new cluster with raft HA

* raft: extend TestRaft_HA_NewCluster to include inmem and consul backends

* raft: add test for updating an existing cluster to use raft HA

* raft: remove debug log lines, clean up verifyRaftPeers

* raft: minor cleanup

* raft: minor cleanup

* Update physical/raft/raft.go

Co-authored-by: Brian Kassouf <briankassouf@users.noreply.github.com>

* Update vault/ha.go

Co-authored-by: Brian Kassouf <briankassouf@users.noreply.github.com>

* Update vault/ha.go

Co-authored-by: Brian Kassouf <briankassouf@users.noreply.github.com>

* Update vault/logical_system_raft.go

Co-authored-by: Brian Kassouf <briankassouf@users.noreply.github.com>

* Update vault/raft.go

Co-authored-by: Brian Kassouf <briankassouf@users.noreply.github.com>

* Update vault/raft.go

Co-authored-by: Brian Kassouf <briankassouf@users.noreply.github.com>

* address feedback comments

* address feedback comments

* raft: refactor tls keyring logic

* address feedback comments

* Update vault/raft.go

Co-authored-by: Alexander Bezobchuk <alexanderbez@users.noreply.github.com>

* Update vault/raft.go

Co-authored-by: Alexander Bezobchuk <alexanderbez@users.noreply.github.com>

* address feedback comments

* testing: fix import ordering

* raft: rename var, cleanup comment line

* docs: remove ha_storage restriction note on raft

* docs: more raft HA interaction updates with migration and recovery mode

* docs: update the raft join command

* raft: update comments

* raft: add missing isRaftHAOnly check for clearing out state set earlier

* raft: update a few ha_storage config checks

* Update command/operator_raft_bootstrap.go

Co-authored-by: Vishal Nayak <vishalnayak@users.noreply.github.com>

* raft: address feedback comments

* raft: fix panic when checking for config.HAStorage.Type

* Update vault/raft.go

Co-authored-by: Alexander Bezobchuk <alexanderbez@users.noreply.github.com>

* Update website/pages/docs/commands/operator/raft.mdx

Co-authored-by: Alexander Bezobchuk <alexanderbez@users.noreply.github.com>

* raft: remove bootstrap cli command

* Update vault/raft.go

Co-authored-by: Brian Kassouf <briankassouf@users.noreply.github.com>

* Update vault/raft.go

Co-authored-by: Brian Kassouf <briankassouf@users.noreply.github.com>

* raft: address review feedback

* raft: revert vendored sdk

* raft: don't send applied index and node ID info if we're HA-only

Co-authored-by: Brian Kassouf <briankassouf@users.noreply.github.com>
Co-authored-by: Alexander Bezobchuk <alexanderbez@users.noreply.github.com>
Co-authored-by: Vishal Nayak <vishalnayak@users.noreply.github.com>
2020-06-23 12:04:13 -07:00
Brian Kassouf 09593283b8
Improve the performance of snapshot installs by using rename (#9247)
* initial work on improving snapshot performance

* Work on snapshots

* rename a few functions

* Cleanup the snapshot file

* vendor the safeio library

* Add a test

* Add more tests

* Some review comments

* Fix comment

* Update physical/raft/snapshot.go

Co-authored-by: Alexander Bezobchuk <alexanderbez@users.noreply.github.com>

* Update physical/raft/snapshot.go

Co-authored-by: Alexander Bezobchuk <alexanderbez@users.noreply.github.com>

* Review feedback

Co-authored-by: Alexander Bezobchuk <alexanderbez@users.noreply.github.com>
2020-06-23 11:08:30 -07:00
Michael Golowka 993291da07
Validate physical CockroachDB table config value before using it (#9191)
* Validate table name (and database if specified) prior to using it in SQL
2020-06-12 11:09:38 -06:00
Michael Golowka 8d022cbe9c
Validate physical MySQL database and table config values before using them (#9189)
* Validate database & table names prior to using it in SQL
2020-06-12 11:08:56 -06:00
Brian Kassouf 3b4ba9d1fb
Upgrade raft library (#9170)
* Upgrade raft library

* Update vendor

* Update physical/raft/snapshot_test.go

Co-authored-by: Calvin Leung Huang <cleung2010@gmail.com>

* Update physical/raft/snapshot_test.go

Co-authored-by: Calvin Leung Huang <cleung2010@gmail.com>
2020-06-08 16:34:20 -07:00
Alexander Bezobchuk 9dd67cbeb6
Merge PR #9027: Integrated Storage (Raft): Add Support for max_entry_size Config 2020-06-01 10:17:24 -04:00
ncabatoff 8870b2e51c
Add mongodbatlas static roles support (#8987)
* Refactor PG container creation.
* Rework rotation tests to use shorter sleeps.
* Refactor rotation tests.
* Add a static role rotation test for MongoDB Atlas.
2020-05-29 14:21:23 -04:00
Lauren Voswinkel 8fc08f8c91
Require TLS or plaintext flagging in MySQL configuration (#9012)
* Adds a safety switch to configuration files.

This requires a user to either use TLS, or acknowledge that they are sending
credentials over plaintext.

* Warn if plaintext credentials will be passed

* Add true/false support to the plaintext transmission ack

* Updated website docs and ensured ToLower is used for true comparison
2020-05-21 09:09:37 -07:00
Brian Kassouf c8dde052f2
storage/raft: Advertise the configured cluster address (#9008)
* storage/raft: Advertise the configured cluster address

* Don't allow raft to start with unspecified IP

* Fix concurrent map write panic

* Add test file

* changelog++

* changelog++

* changelog++

* Update tcp_layer.go

* Update tcp_layer.go

* Only set the advertise addr if set
2020-05-18 18:22:25 -07:00
Brian Kassouf 1bb0bd489d
storage/raft: Add committed and applied indexes to the status output (#9011)
* storage/raft: Add committed and applied indexes to the status output

* Update api vendor

* changelog++

* Update http/sys_leader.go

Co-authored-by: Jim Kalafut <jkalafut@hashicorp.com>

Co-authored-by: Jim Kalafut <jkalafut@hashicorp.com>
2020-05-18 16:07:27 -07:00
Jeff Mitchell 6cb26312af
Fix code copied from gogo using a different proto import (#9009) 2020-05-15 13:45:22 -07:00
Jeff Mitchell b4f5d38916
Update to latest go-kms-wrapping and fix protos/etcd (#8996) 2020-05-14 18:45:10 -04:00
Mike Jarmy 8f0183a981
move reusable storage test to avoid creating import cycle (#8999) 2020-05-14 13:53:56 -04:00
Mike Jarmy 724af764bb
Test reusable storage (#8983)
* stub out reusable storage test

* implement reusable inmem test

* work on reusable raft test

* stub out simple raft test

* switch to reusable raft storage

* cleanup tests

* cleanup tests

* refactor tests

* verify raft configuration

* cleanup tests

* stub out reuseStorage

* use common base address across clusters

* attempt to reuse raft cluster

* tinker with test

* fix typo

* start debugging

* debug raft configuration

* add BaseClusterListenPort to TestCluster options

* use BaseClusterListenPort in test

* raft join works now

* misc cleanup of raft tests

* use configurable base port for raft test

* clean up raft tests

* add parallelized tests for all backends

* clean up reusable storage tests

* remove debugging code from startClusterListener()

* improve comments in testhelpers

* improve comments in teststorage

* improve comments and test logging

* fix typo in vault/testing

* fix typo in comments

* remove debugging code

* make number of cores parameterizable in test
2020-05-14 08:31:02 -04:00
Calvin Leung Huang e7af25b969
raft: use file paths for TLS info in the retry_join block (#8894)
* raft: use file paths for TLS info in the retry_join stanza

* raft: maintain backward compat for existing tls params

* docs: update raft docs with new file-based TLS params

* Update godoc comment, fix docs
2020-05-06 18:26:08 -07:00
Brian Kassouf d979279015
storage/raft: Fix memory allocation issue and Metadata tracking issues with snapshots (#8793)
* storage/raft: Split snapshot restore disk write into batches

* Work on snapshot consistency

* make sure tests send a snapshot

* Fix comment

* Don't remove metrics

* Fix comment
2020-04-23 11:11:08 -07:00
Calvin Leung Huang 2659c34910
raft: check for nil on concrete type in SetupCluster (#8784)
* raft: check for nil on concrete type in SetupCluster

* raft: move check to its own func

* raft: func cleanup

* raft: disallow disable_clustering = true when raft storage is used

* docs: update disable_clustering to mention new behavior
2020-04-21 13:45:07 -07:00
Vishal Nayak 387677e251
Raft recovery peers non voter (#8681)
* Disallow non-voter setting from peers.json

* Fix bug that would make the actual fix a no-op

* Change order of evaluation

* Error out instead of resetting the value

* Update physical/raft/raft.go

Co-Authored-By: Calvin Leung Huang <cleung2010@gmail.com>

* Print node ID

Co-authored-by: Calvin Leung Huang <cleung2010@gmail.com>
2020-04-03 19:13:51 -04:00
Brian Kassouf 1ede406559
storage/raft: Buffer leader notify channel more aggressively (#8650) 2020-03-31 17:42:48 -07:00
Brian Kassouf 4aa582a6f4
storage/raft: Fix leadership deadlock (#8547)
* storage/raft: Fix leadership deadlock

* Update comment
2020-03-13 15:03:58 -07:00
Seth Vargo 737fdc34e7
Fix Spanner metrics (#8541)
* Fix metric for Get

This was incorrectly being reported as List

* Measure transactions
2020-03-13 12:35:56 +01:00
Gerardo Di Giacomo 8573eefe90
enabling TLS 1.3 support for TCP listeners (#8305)
* adding support for TLS 1.3 for TCP listeners

* removed test as CI uses go 1.12

* removed Cassandra support, added deprecation notice

* re-added TestTCPListener_tls13
2020-02-15 11:40:18 -08:00
Clint d3cda0fe2c
Guard against using Raft as a separate HA Storage (#8239)
* Guard against using Raft as a separate HA Storage

* Document that Raft cannot be used as a separate ha_storage backend at this time

* remove duplicate imports from updating with master
2020-02-14 14:25:53 -06:00
Clint 58381ca0e7
MySQL HA: Return an error if we fail to get a lock on standby (#8229)
* return an error if we fail to get a lock on standby

* Add regression test

* minor refactoring to remove a race condition in the test
2020-02-05 14:08:48 -06:00
Vitaly Velikodny 230c53d18b
Clean AlibabaCloud physical backend code (#8186) 2020-01-30 12:08:24 -08:00
Jim Kalafut f17fc4e5c1
Run goimports (#8251) 2020-01-27 21:11:00 -08:00
Becca Petrin 759f9b38f7
Observer pattern for service registration interface (#8123)
* use observer pattern for service discovery

* update perf standby method

* fix test

* revert usersTags to being called serviceTags

* use previous consul code

* vault isn't a performance standby before starting

* log err

* changes from feedback

* add Run method to interface

* changes from feedback

* fix core test

* update example
2020-01-24 09:42:03 -08:00
Brian Kassouf f32a86ee7a
Create network layer abstraction to allow in-memory cluster traffic (#8173) 2020-01-16 23:03:02 -08:00
Vishal Nayak 8891f2ba88
Raft retry join (#7856)
* Raft retry join

* update

* Make retry join work with shamir seal

* Return upon context completion

* Update vault/raft.go

Co-Authored-By: Brian Kassouf <briankassouf@users.noreply.github.com>

* Address some review comments

* send leader information slice as a parameter

* Make retry join work properly with Shamir case. This commit has a blocking issue

* Fix join goroutine exiting before the job is done

* Polishing changes

* Don't return after a successful join during unseal

* Added config parsing test

* Add test and fix bugs

* minor changes

* Address review comments

* Fix build error

Co-authored-by: Brian Kassouf <briankassouf@users.noreply.github.com>
2020-01-13 17:02:16 -08:00
Jeff Mitchell a0694943cc
Migrate built in auto seal to go-kms-wrapping (#8118) 2020-01-10 20:39:52 -05:00
Becca Petrin a94f2d3e6f
Replace deprecated AWS client instantiations (#8060)
* replace deprecated aws client instantiation

* fix imports
2020-01-09 14:58:33 -08:00
Mike Jarmy e42bc0ffc0
Introduce optional service_registration stanza (#7887)
* move ServiceDiscovery into methods

* add ServiceDiscoveryFactory

* add serviceDiscovery field to vault.Core

* refactor ConsulServiceDiscovery into separate struct

* cleanup

* revert accidental change to go.mod

* cleanup

* get rid of un-needed struct tags in vault.CoreConfig

* add service_discovery parser

* add ServiceDiscovery to config

* cleanup

* cleanup

* add test for ConfigServiceDiscovery to Core

* unit testing for config service_discovery stanza

* cleanup

* get rid of un-needed redirect_addr stuff in service_discovery stanza

* improve test suite

* cleanup

* clean up test a bit

* create docs for service_discovery

* check if service_discovery is configured, but storage does not support HA

* tinker with test

* tinker with test

* tweak docs

* move ServiceDiscovery into its own package

* tweak a variable name

* fix comment

* rename service_discovery to service_registration

* tweak service_registration config

* Revert "tweak service_registration config"

This reverts commit 5509920a8ab4c5a216468f262fc07c98121dce35.

* simplify naming

* refactor into ./serviceregistration/consul
2019-12-06 09:46:39 -05:00
Jim Kalafut 5d5c1374d2
Fix S3 configurable path handling (#7966)
Also remove some incorrect skipping of the S3 test.

Fixes #7362
2019-12-04 06:18:45 -08:00
Colton J. McCurdy e4c909fa3e
physical/postgresql: add ability to prefer VAULT_PG_CONNECTION_URL envar over config file (#7937)
* physical/postgresql: add ability to use CONNECTION_URL environment variable instead of requiring it to be configured in the Vault config file.

Signed-off-by: Colton McCurdy <mccurdyc22@gmail.com>

* storage/postgresql: update configuration documentation for postgresql storage backend to include connection_url configuration via the PG_CONNECTION_URL environment variable

Signed-off-by: Colton McCurdy <mccurdyc22@gmail.com>

* physical/postgresql: add a configuration file and tests for getting the connection_url from the config file or environment

Signed-off-by: Colton McCurdy <mccurdyc22@gmail.com>

* physical/postgresql: update postgresql backend to pull the required connection_url from the PG_CONNECTION_URL environment variable if it exists, otherwise, fallback to using the config file

Signed-off-by: Colton McCurdy <mccurdyc22@gmail.com>

* physical/postgresql: remove configure*.go files and prefer the postgresql*.go files

Signed-off-by: Colton McCurdy <mccurdyc22@gmail.com>

* physical/postgresql: move and simplify connectionURL function

Signed-off-by: Colton McCurdy <mccurdyc22@gmail.com>

* physical/postgresql: update connectionURL test to use an unordered map instead of slice to avoid test flakiness

Signed-off-by: Colton McCurdy <mccurdyc22@gmail.com>

* physical/postgresql: update config env to be prefixed with VAULT_ - VAULT_PG_CONNECTION_URL

Signed-off-by: Colton McCurdy <mccurdyc22@gmail.com>

* docs/web: update postgresql backend docs to use updated, VAULT_ prefixed config env

Signed-off-by: Colton McCurdy <mccurdyc22@gmail.com>
2019-12-03 15:48:38 -06:00
Clint 72944892c2
physical/postgresql: Refactor test code to avoid panic if tests ran multiple times (#7939)
* refactor test code to avoid panic if tests ran multiple times

* cleanup: don't actually send just close

* move comment to a better location

* move error check to a more obvious spot

* Revert "move error check to a more obvious spot"

Reverting because methods like this should only be called on the main
goroutine running the test:

- https://golang.org/pkg/testing/#T

This reverts commit db7641948317785bff15b3d9dbe6fb18a2d19c2c.
2019-12-02 15:05:02 -06:00
Brian Kassouf dbc0d75ad8
Update raft_test.go 2019-11-21 11:12:28 -08:00
Jim Kalafut 59e526614d
Run go fmt (#7823) 2019-11-07 08:54:34 -08:00
Daniel Lohse de2d3073d7
Allow Raft storage to be configured via env variables (#7745)
* Fix unordered imports

* Allow Raft node ID to be set via the environment variable `VAULT_RAFT_NODE_ID`

* Allow Raft path to be set via the environment variable `VAULT_RAFT_PATH`

* Prioritize the environment when fetching the Raft configuration values

Values in environment variables should override the config as per the
documentation as well as common sense.
2019-10-28 09:43:12 -07:00
Jeff Mitchell 0c88218dd4
Port some changes that got out of sync 2019-10-28 11:38:14 -04:00
Lexman c86fe212c0
oss changes for entropy augmentation feature (#7670)
* oss changes for entropy augmentation feature

* fix oss command/server/config tests

* update go.sum

* fix logical_system and http/ tests

* adds vendored files

* removes unused variable
2019-10-17 10:33:00 -07:00
Vishal Nayak 0d077d7945
Recovery Mode (#7559)
* Initial work

* rework

* s/dr/recovery

* Add sys/raw support to recovery mode (#7577)

* Factor the raw paths out so they can be run with a SystemBackend.

# Conflicts:
#	vault/logical_system.go

* Add handleLogicalRecovery which is like handleLogical but is only
sufficient for use with the sys-raw endpoint in recovery mode.  No
authentication is done yet.

* Integrate with recovery-mode.  We now handle unauthenticated sys/raw
requests, albeit on path v1/raw instead of v1/sys/raw.

* Use sys/raw instead of raw during recovery.

* Don't bother persisting the recovery token.  Authenticate sys/raw
requests with it.

* RecoveryMode: Support generate-root for autounseals (#7591)

* Recovery: Abstract config creation and log settings

* Recovery mode integration test. (#7600)

* Recovery: Touch up (#7607)

* Recovery: Touch up

* revert the raw backend creation changes

* Added recovery operation token prefix

* Move RawBackend to its own file

* Update API path and hit it using CLI flag on generate-root

* Fix a panic triggered when handling a request that yields a nil response. (#7618)

* Improve integ test to actually make changes while in recovery mode and
verify they're still there after coming back in regular mode.

* Refuse to allow a second recovery token to be generated.

* Resize raft cluster to size 1 and start as leader (#7626)

* RecoveryMode: Setup raft cluster post unseal (#7635)

* Setup raft cluster post unseal in recovery mode

* Remove marking as unsealed as it's not needed

* Address review comments

* Accept only one seal config in recovery mode as there is no scope for migration
2019-10-15 00:55:31 -04:00
Brian Kassouf 1167fad704
Improve raft write performance by utilizing FSM Batching (#7527)
* Start benchmark work

* Add batching FSM function

* dedupe some code

* Update dependency on chunking FSM

* fix raft external tests

* fix go.mod

* Add batching test

* uncomment test

* update raft deps

* update vendor

* Update physical/raft/fsm.go

Co-Authored-By: Michel Vocks <michelvocks@gmail.com>

* Update physical/raft/fsm.go
2019-10-14 09:25:07 -06:00
ncabatoff cbde4d4357
Use a much longer delay before giving up and failing the test. (#7646)
On circleci we've seen the test run very slowly so it's often failing here.
2019-10-14 11:06:02 -04:00
Brian Kassouf 024c29c36a
OSS portions of raft non-voters (#7634)
* OSS portions of raft non-voters

* add file

* Update vault/raft.go

Co-Authored-By: Vishal Nayak <vishalnayak@users.noreply.github.com>
2019-10-11 11:56:59 -07:00
James Stoker 49c9352f75
Add config parameter to Azure storage backend to allow specifying the ARM endpoint to support Azure Stack. (#7567) 2019-10-08 08:51:36 -07:00
Calvin Leung Huang 297e7bca38
raft: fix TestRaft_Snapshot_Take_Restore (#7443) 2019-09-06 15:20:08 -07:00
Brian Kassouf c2905773e4
Add download headers to snapshot take API (#7369)
* Add download headers to snapshot take API

* Add content type
2019-09-06 10:34:36 -07:00
Vu Pham a47b2faf34
Added OCI Object Storage Plugin (#6985) 2019-09-04 11:33:16 -07:00
mhobbs-ibm cd2356fe7c
plugging file descriptor leak in couchdb backend (#7345) 2019-08-21 12:23:20 -07:00
Jack Kleeman 1977305ffa
Store less data in Cassandra prefix buckets (#7199)
* Store less data in Cassandra prefix buckets

The Cassandra physical backend relies on storing data for sys/foo/bar
under sys, sys/foo, and sys/foo/bar. This is necessary so that we
can list the sys bucket, get a list of all child keys, and then trim
this down to find child 'folders' eg foo. Right now however, we store
the full value of every storage entry in all three buckets. This is
unnecessary as the value will only ever be read out in the leaf bucket
ie sys/foo/bar. We use the intermediary buckets simply for listing keys.

We have seen some issues around compaction where certain buckets,
particularly intermediary buckets that are exclusively for listing,
get really clogged up with data to the point of not being listable.
Buckets like sys/expire/id are huge, combining lease expiry data for
all auth methods, and need to be listed for vault to successfully
become leader. This PR tries to cut down on the amount of data stored
in intermediary buckets.

* Avoid goroutine leak by buffering results channel up to the bucket count
2019-08-19 11:50:00 -07:00
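
The bucket layout described in the Cassandra commit above can be sketched in Go as follows. This is a hedged illustration under stated assumptions, not the backend's implementation: the names `bucketsFor`, `row`, and `put` are hypothetical, an in-memory map stands in for the Cassandra table, and only the three buckets named in the commit message are modelled.

```go
package main

import (
	"fmt"
	"strings"
)

// bucketsFor returns every prefix bucket of key, ending with the leaf bucket.
// For "sys/foo/bar" it yields "sys", "sys/foo", "sys/foo/bar".
func bucketsFor(key string) []string {
	parts := strings.Split(key, "/")
	out := make([]string, 0, len(parts))
	for i := range parts {
		out = append(out, strings.Join(parts[:i+1], "/"))
	}
	return out
}

// row identifies one stored record: a key filed under a bucket.
// It is a stand-in for a (bucket, key) row in the real table.
type row struct{ bucket, key string }

// put records key in every bucket so that it can be listed, but keeps the
// value only in the leaf bucket, which is the only place it is read from.
func put(store map[row][]byte, key string, value []byte) {
	buckets := bucketsFor(key)
	for i, b := range buckets {
		if i == len(buckets)-1 {
			store[row{b, key}] = value
		} else {
			store[row{b, key}] = nil // key only, kept for listing
		}
	}
}

func main() {
	store := map[row][]byte{}
	put(store, "sys/foo/bar", []byte("lease-data"))
	for _, b := range bucketsFor("sys/foo/bar") {
		fmt.Printf("bucket %-12s holds %d value bytes\n", b, len(store[row{b, "sys/foo/bar"}]))
	}
}
```

Storing an empty value in the intermediary buckets keeps them usable for listing while shrinking the rows that the commit message reports as a compaction problem.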
Brian Kassouf bdfa2c7828
Add additional raft chunk test (#7192)
* Add an end-to-end raft chunk test

* Apply suggestions from code review

Co-Authored-By: Jim Kalafut <jkalafut@hashicorp.com>
2019-07-29 14:11:46 -07:00
Brian Kassouf b83aaf7331
storage/raft: Support storage migration to raft storage (#7207)
* Support raft in the migration command

* Add comments
2019-07-29 13:05:43 -07:00
Brian Kassouf a77995cdb1 Ensure raft configuration properly lists the leader (#7188) 2019-07-25 08:41:14 -04:00
Jeff Mitchell 0425db59ab
Raft chunk snapshotting (#7185)
Support chunking, including snapshot handling
2019-07-24 20:44:13 -04:00
Brian Kassouf 965066161a
Revert "Fix the config output (#7113)" (#7184)
This reverts commit 2f7cfc9aae911c8860db37e556363fbfb1567075.
2019-07-24 10:23:30 -07:00
Michel Vocks 524d101008 Added s3 storage path parameter (#7157) 2019-07-24 12:48:26 -04:00
ncabatoff 35b944460d
TestCassandraBackend times out pretty often on the ent build. Bump (#7130)
its timeout from 5s to 15s in the hopes that helps.  The theory is that
since I haven't seen this on the OSS side, it's failing because the ent
side is heavier in terms of test load and thus the tests face more
resource contention.
2019-07-23 13:26:36 -04:00
Jeff Mitchell fd376b4bdf Use ChunkingConfigurationStore for raft 2019-07-23 10:59:21 -04:00
Vishal Nayak 0010d79498 Fix the config output (#7113) 2019-07-22 12:59:46 -04:00
Jeff Mitchell 3b22ab2486 Add chunking support to raft 2019-07-22 12:17:58 -04:00
Christian Muehlhaeuser e6febc5839 Fixed a bunch of typos (#7146) 2019-07-18 21:10:15 -04:00
Jeff Mitchell 7b8c0b58f1
Call goimports as well as gofmt when doing a make fmt (#7148)
Closes #7147
2019-07-18 21:04:56 -04:00
Jeff Mitchell d810758ca2
Rerun proto gen as some got gen'd with old proto version (#7090) 2019-07-09 01:02:20 +02:00
Brian Kassouf 4d7d0d729a
storage/raft: When restoring a snapshot preseal first (#7011)
* storage/raft: When restoring a snapshot preseal first

* best-effort allow standbys to apply the restoreOp before sealing active node

* Don't cache the raft tls key

* Update physical/raft/raft.go

* Move pending raft peers to core

* Fix race on close bool

* Extend the leaderlease time for tests

* Update raft deps

* Fix audit hashing

* Fix race with auditing
2019-07-03 13:56:30 -07:00
Garrett T 8fc4a63796 Set MaxIdleConns to reduce connection churn (postgresql physical) (#6967)
* Set MaxIdleConns to reduce connection churn (postgresql physical)

* Make new "max_idle_connections" config option for physical postgresql

* Add docs for "max_idle_connections" for postgresql storage

* Add minimum version to docs for max_idle_connections
2019-07-02 15:03:56 -07:00
Vishal Nayak 4484de3ea6
Fix raft config response (#6975) 2019-06-27 17:39:52 -04:00
Brian Kassouf 62e14c280d
storage/raft: fix races in tests (#6996)
* storage/raft: fix races in tests

* Fix another test race
2019-06-27 10:00:03 -07:00
Brian Kassouf 5d0c68ca74
Fix 32-bit builds (#6948) 2019-06-21 09:52:02 -06:00
Vishal Nayak 53035ce390
Raft CLI (#6893)
* raft cli

* Reuse the command's client

* Better response handling

* minor touchups
2019-06-20 21:32:00 -04:00
Jeff Mitchell 07dcdc8b79 Sync 2019-06-20 20:55:10 -04:00
Brian Kassouf ed14061578
Raft Storage Backend (#6888)
* Work on raft backend

* Add logstore locally

* Add encryptor and unsealable interfaces

* Add clustering support to raft

* Remove client and handler

* Bootstrap raft on init

* Cleanup raft logic a bit

* More raft work

* Work on TLS config

* More work on bootstrapping

* Fix build

* More work on bootstrapping

* More bootstrapping work

* fix build

* Remove consul dep

* Fix build

* merged oss/master into raft-storage

* Work on bootstrapping

* Get bootstrapping to work

* Clean up FSM and node-id

* Update local node ID logic

* Cleanup node-id change

* Work on snapshotting

* Raft: Add remove peer API (#906)

* Add remove peer API

* Add some comments

* Fix existing snapshotting (#909)

* Raft get peers API (#912)

* Read raft configuration

* address review feedback

* Use the Leadership Transfer API to step-down the active node (#918)

* Raft join and unseal using Shamir keys (#917)

* Raft join using shamir

* Store AEAD instead of master key

* Split the raft join process to answer the challenge after a successful unseal

* get the follower to standby state

* Make unseal work

* minor changes

* Some input checks

* reuse the shamir seal access instead of new default seal access

* refactor joinRaftSendAnswer function

* Synchronously send answer in auto-unseal case

* Address review feedback

* Raft snapshots (#910)

* Fix existing snapshotting

* implement the noop snapshotting

* Add comments and switch log libraries

* add some snapshot tests

* add snapshot test file

* add TODO

* More work on raft snapshotting

* progress on the ConfigStore strategy

* Don't use two buckets

* Update the snapshot store logic to hide the file logic

* Add more backend tests

* Cleanup code a bit

* [WIP] Raft recovery (#938)

* Add recovery functionality

* remove fmt.Printfs

* Fix a few fsm bugs

* Add max size value for raft backend (#942)

* Add max size value for raft backend

* Include physical.ErrValueTooLarge in the message

* Raft snapshot Take/Restore API  (#926)

* Initial work on raft snapshot APIs

* Always redirect snapshot install/download requests

* More work on the snapshot APIs

* Cleanup code a bit

* On restore handle special cases

* Use the seal to encrypt the sha sum file

* Add sealer mechanism and fix some bugs

* Call restore while state lock is held

* Send restore cb trigger through raft log

* Make error messages nicer

* Add test helpers

* Add snapshot test

* Add shamir unseal test

* Add more raft snapshot API tests

* Fix locking

* Change working to initialize

* Add underlying raw object to test cluster core

* Move leaderUUID to core

* Add raft TLS rotation logic (#950)

* Add TLS rotation logic

* Cleanup logic a bit

* Add/Remove from follower state on add/remove peer

* add comments

* Update more comments

* Update request_forwarding_service.proto

* Make sure we populate all nodes in the followerstate obj

* Update times

* Apply review feedback

* Add more raft config setting (#947)

* Add performance config setting

* Add more config options and fix tests

* Test Raft Recovery (#944)

* Test raft recovery

* Leave out a node during recovery

* remove unused struct

* Update physical/raft/snapshot_test.go

* Update physical/raft/snapshot_test.go

* fix vendoring

* Switch to new raft interface

* Remove unused files

* Switch a gogo -> proto instance

* Remove unneeded vault dep in go.sum

* Update helper/testhelpers/testhelpers.go

Co-Authored-By: Calvin Leung Huang <cleung2010@gmail.com>

* Update vault/cluster/cluster.go

* track active key within the keyring itself (#6915)

* track active key within the keyring itself

* lookup and store using the active key ID

* update docstring

* minor refactor

* Small text fixes (#6912)

* Update physical/raft/raft.go

Co-Authored-By: Calvin Leung Huang <cleung2010@gmail.com>

* review feedback

* Move raft logical system into separate file

* Update help text a bit

* Enforce cluster addr is set and use it for raft bootstrapping

* Fix tests

* fix http test panic

* Pull in latest raft-snapshot library

* Add comment
2019-06-20 12:14:58 -07:00
ncabatoff 2cf5009ff5
Fix deadlock in TestPostgreSQLBackend (#6884)
Make lock2's retryInterval smaller so it grabs the lock as soon as lock1's renewer fails to renew in time.  Fix the logic to test if lock1's leader channel gets closed: we don't need a goroutine, and
the logic was broken in that if we timed out we'd never write to the blocking channel we then try to read from.  Moreover the timeout was wrong.
2019-06-14 12:59:24 -04:00
Mark Gritter d4d1ae55a0
Be more permissive in what URLs to allow; added test for unix socket. (#6859) 2019-06-10 13:59:13 -05:00
Justin Weissig ef17e7cbaa Update comment spelling in postgresql.go (#6817)
Fixed minor comment: PostgresSQL/PostgreSQL
2019-06-04 02:29:36 -05:00
Jim Kalafut 8a0d423ed8
Fix gofmt (#6764) 2019-05-20 15:15:05 -07:00
ncabatoff 64adec23cd
Since the two branches of the UNION produce disjoint sets, do a UNION ALL (#6546)
This obviates the need for a sort/eliminate duplicates step, giving a performance boost in some key distributions with large datasets.
2019-05-10 13:26:01 -04:00
bjorndolk e8f10814b6 Added HA backend for postgres based on dynamodb model (#5731)
Add optional HA support for postgres backend if Postgres version >= 9.5.
2019-05-10 12:48:42 -04:00
Mark Gritter 5d4fc08fc4
Determine scheme from configured address. (#6689)
* Determine scheme from configured address.

* Added test; use parsed version of address for TLS config.
2019-05-08 11:48:31 -05:00
Scott Stevenson f145baa581 Set chunkSize when creating a new GCS backend (#6655)
Adds a small step to TestBackend to prevent regression.
2019-05-03 13:08:29 -07:00
Jeff Mitchell e8a9d47aca
Port over some SP v2 bits (#6516)
* Port over some SP v2 bits

Specifically:

* Add too-large handling to Physical (Consul only for now)
* Contextify some identity funcs
* Update SP protos

* Add size limiting to inmem storage
2019-05-01 13:47:41 -04:00
EdwinRobbins de885660e8 DynamoDB: Make Unlock key delete conditional on being old leader's (#6637) 2019-04-30 17:44:47 -07:00
Jim Kalafut fa2accd000
Exit DynamoDB tryToLock when stop channel is closed (#6640)
* Exit DynamoDB tryToLock when stop channel is closed

If the stop channel is closed (e.g. an error is returned which triggers
close(stop) in Lock), this loop will spin and use 100% CPU.

* Ensure ticker is stopped
2019-04-29 22:51:48 -07:00
Becca Petrin 96374f27c7
Merge pull request #6356 from kedarkale27/master
Update mssql.go
2019-04-23 09:06:14 -07:00
ncabatoff 06574da57a
Merge multiple functions for creating consul containers into one. (#6612)
Merge both functions for creating mongodb containers into one.
Add retries to docker container cleanups.
Require $VAULT_ACC be set to enable AWS tests.
2019-04-22 12:26:10 -04:00
Kedar Kale 435d19bc25 Update mssql.go
Go was not able to execute the USE database line. Fixed the issue by using the database where it is called.
2019-04-17 13:41:13 +05:30
Jeff Mitchell 24b92d6e29 Move physical/file to sdk 2019-04-15 14:51:33 -04:00
Jeff Mitchell be78ad2366 Remove dependency on Consul lib dir 2019-04-13 01:35:48 -04:00
Jeff Mitchell 80c303ac83 Move ldaputil and tlsutil over to sdk 2019-04-12 18:26:54 -04:00
Jeff Mitchell 371db36ede Move useragent to sdk 2019-04-12 18:17:49 -04:00
Jeff Mitchell 8bcb533a1b
Create sdk/ and api/ submodules (#6583) 2019-04-12 17:54:35 -04:00
Maarten van der Hoef d98b477a7e Spelling fix (#6523) 2019-04-03 06:57:32 -07:00
Jim Kalafut a939613f78 Reset retry count after a successful lock check (#6522) 2019-04-02 17:08:49 -07:00
Mahmoud Abdelsalam 6479a73f3a Fix dynamodb HA lock race (#6512)
* Fix DynamoDB HA race issue

* Add test for race condition (which fails on the released DynamoDB code)
2019-04-02 16:19:21 -07:00
Kedar Kale 601d839c77 Update mssql_test.go
adding schema env variable
2019-04-02 15:12:30 +05:30
Naoki Ainoya f5345bc711 use option.WithCredentialsFile() instead of deprecated method (#6500) 2019-03-29 10:58:26 -04:00
Kedar Kale b2c51de7ab Update mssql_test.go
added a new line at the end of the code
2019-03-22 12:34:49 +05:30
Kedar Kale b076f25a8e Update mssql_test.go
updated the test case, schema is now written in double quotes
2019-03-22 12:30:44 +05:30
T.K 453f1ac109 changed misspelled english words (#6432) 2019-03-19 09:32:45 -04:00
Lokesh Jain fd994cfceb
Added unit test case 2019-03-13 09:40:09 +05:30
Kedar Kale f930705ed2 Update mssql.go
query will run on the database passed as the parameter instead of the master database
2019-03-06 15:45:41 +05:30
Jeff Mitchell 718ae5a010 Minor syncs 2019-03-04 13:35:22 -05:00
Clint Shryock 7e1a4885e1
fix typo in code comment 2019-02-28 10:26:40 -06:00
Andrey Kuzmin b496fea4ad Etcd timeouts (#6285)
* Configurable lock and request etcd timeouts.

If the etcd cluster runs on slow servers, request timeouts may need to be much greater than the hardcoded default values.
In such a setup, a greater lock timeout may also be needed.

* Configurable lock and request etcd timeouts.

Docs.

* Use user friendly timeout syntax.

This allows specifying more readable time values.
2019-02-27 18:34:29 -08:00
Jeff Mitchell 82a85aa8c8 Make fmt 2019-02-08 09:12:55 -05:00
Lexman c51de76dd1
adds port parameter (#6145) 2019-02-05 15:40:00 -08:00
Jeff Mitchell 5f249d4005
Add allowed_response_headers (#6115) 2019-02-05 16:02:15 -05:00
Jeff Mitchell bbc1d53a5d Revert "Refactor common token fields and operations into a helper (#5953)"
This reverts commit 66c226c593bb1cd48cfd8364ac8510cb42b7d67a.
2019-02-01 11:23:40 -05:00
Jeff Mitchell 85a560abba
Refactor common token fields and operations into a helper (#5953) 2019-01-30 16:23:28 -05:00
Gordon Shankman cd2f7bbde8 Adding support for SSE in the S3 storage backend. (#5996) 2019-01-26 16:48:08 -05:00
Jeff Mitchell 94e56d964f Fix build 2019-01-23 16:52:51 -05:00
Jeff Mitchell 3f1a7d4fdd
Update to latest etcd and use the new repository packages (#6087)
This will be necessary for go mod work

Additionally, the srv api has changed. This adapts to it.
2019-01-23 14:35:03 -05:00
Becca Petrin aac271ed7f swap the forked aliyun sdk for the original (#6024) 2019-01-23 11:24:51 -05:00
Jim Kalafut f5b19f9ce8
Fix physical/postgresql to return the full entry key (#6044)
Fixes #6041
2019-01-15 12:06:01 -08:00
Jim Kalafut d0e2badbae Run goimports across the repository (#6010)
The result will still pass gofmtcheck and won't trigger additional
changes if someone isn't using goimports, but it will avoid the
piecemeal imports changes we've been seeing.
2019-01-08 16:48:57 -08:00
Julien Blache 91d432fc85 FoundationDB backend TLS support and housekeeping (#5800)
* Fix typo in documentation

* Update fdb-go-install.sh for new release tags

* Exclude FoundationDB bindings from vendoring, delete vendored copy

FoundationDB bindings are tightly coupled to the server version and
client library version used in a specific deployment. Bindings need
to be installed using the fdb-go-install.sh script, as documented in
the foundationdb backend documentation.

* Add TLS support to FoundationDB backend

TLS support appeared in FoundationDB 5.2.4, raising the minimum API version
for TLS-aware FoundationDB code to 520.

* Update documentation for FoundationDB TLS support
2019-01-08 09:01:44 -08:00
Jeff Mitchell c8a029210b
Don't read AWS env vars (#5974)
* Don't read AWS env vars

Let AWS SDK env cred chain provider do it for us

Fixes #5965
2019-01-04 15:03:57 -05:00
bjorndolk 4dbe61568a Physical postgres docker test (#5926)
* Docker support for postgres backend testing

* Bug in handling of postgres connection url for non docker testing

* Test should fail if it cannot retrieve pg version

* internal helperfunctions pascalCasing
2019-01-03 09:25:32 -05:00
Brian Kassouf 9d74c933cf
Update encoding.go 2018-11-19 15:24:41 -08:00