Commit Graph

512 Commits

Author SHA1 Message Date
Brian Kassouf f32a86ee7a
Create network layer abstraction to allow in-memory cluster traffic (#8173) 2020-01-16 23:03:02 -08:00
Vishal Nayak 8891f2ba88 Raft retry join (#7856)
* Raft retry join

* update

* Make retry join work with shamir seal

* Return upon context completion

* Update vault/raft.go

Co-Authored-By: Brian Kassouf <briankassouf@users.noreply.github.com>

* Address some review comments

* send leader information slice as a parameter

* Make retry join work properly with Shamir case. This commit has a blocking issue

* Fix join goroutine exiting before the job is done

* Polishing changes

* Don't return after a successful join during unseal

* Added config parsing test

* Add test and fix bugs

* minor changes

* Address review comments

* Fix build error

Co-authored-by: Brian Kassouf <briankassouf@users.noreply.github.com>
2020-01-13 17:02:16 -08:00
Jeff Mitchell a0694943cc
Migrate built in auto seal to go-kms-wrapping (#8118) 2020-01-10 20:39:52 -05:00
Becca Petrin a94f2d3e6f
Replace deprecated AWS client instantiations (#8060)
* replace deprecated aws client instantiation

* fix imports
2020-01-09 14:58:33 -08:00
Mike Jarmy e42bc0ffc0
Introduce optional service_registration stanza (#7887)
* move ServiceDiscovery into methods

* add ServiceDiscoveryFactory

* add serviceDiscovery field to vault.Core

* refactor ConsulServiceDiscovery into separate struct

* cleanup

* revert accidental change to go.mod

* cleanup

* get rid of un-needed struct tags in vault.CoreConfig

* add service_discovery parser

* add ServiceDiscovery to config

* cleanup

* cleanup

* add test for ConfigServiceDiscovery to Core

* unit testing for config service_discovery stanza

* cleanup

* get rid of un-needed redirect_addr stuff in service_discovery stanza

* improve test suite

* cleanup

* clean up test a bit

* create docs for service_discovery

* check if service_discovery is configured, but storage does not support HA

* tinker with test

* tinker with test

* tweak docs

* move ServiceDiscovery into its own package

* tweak a variable name

* fix comment

* rename service_discovery to service_registration

* tweak service_registration config

* Revert "tweak service_registration config"

This reverts commit 5509920a8ab4c5a216468f262fc07c98121dce35.

* simplify naming

* refactor into ./serviceregistration/consul
2019-12-06 09:46:39 -05:00
Jim Kalafut 5d5c1374d2
Fix S3 configurable path handling (#7966)
Also remove some incorrect skipping of the S3 test.

Fixes #7362
2019-12-04 06:18:45 -08:00
Colton J. McCurdy e4c909fa3e physical/posgresql: add ability to prefer VAULT_PG_CONNECTION_URL envar over config file (#7937)
* physical/posgresql: add ability to use CONNECTION_URL environment variable instead of requiring it to be configured in the Vault config file.

Signed-off-by: Colton McCurdy <mccurdyc22@gmail.com>

* storage/postgresql: update configuration documentation for postgresql storage backend to include connection_url configuration via the PG_CONNECTION_URL environment variable

Signed-off-by: Colton McCurdy <mccurdyc22@gmail.com>

* physical/postgresql: add a configuration file and tests for getting the connection_url from the config file or environment

Signed-off-by: Colton McCurdy <mccurdyc22@gmail.com>

* physical/postgresql: update postgresql backend to pull the required connection_url from the PG_CONNECTION_URL environment variable if it exists, otherwise, fallback to using the config file

Signed-off-by: Colton McCurdy <mccurdyc22@gmail.com>

* physical/postgresql: remove configure*.go files and prefer the postgresql*.go files

Signed-off-by: Colton McCurdy <mccurdyc22@gmail.com>

* physical/postgresql: move and simplify connectionURL function

Signed-off-by: Colton McCurdy <mccurdyc22@gmail.com>

* physical/postgresql: update connectionURL test to use an unordered map instead of slice to avoid test flakiness

Signed-off-by: Colton McCurdy <mccurdyc22@gmail.com>

* physical/postgresql: update config env to be prefixed with VAULT_ - VAULT_PG_CONNECTION_URL

Signed-off-by: Colton McCurdy <mccurdyc22@gmail.com>

* docs/web: update postgresql backend docs to use updated, VAULT_ prefixed config env

Signed-off-by: Colton McCurdy <mccurdyc22@gmail.com>
2019-12-03 15:48:38 -06:00
Clint 72944892c2
physical/postgresql: Refactor test code to avoid panic if tests ran multiple times (#7939)
* refactor test code to avoid panic if tests ran multiple times

* cleanup: don't actually send just close

* move comment to a better location

* move error check to a more obvious spot

* Revert "move error check to a more obvious spot"

Reverting because methods like this should only be called on the main
goroutine running the test:

- https://golang.org/pkg/testing/#T

This reverts commit db7641948317785bff15b3d9dbe6fb18a2d19c2c.
2019-12-02 15:05:02 -06:00
Brian Kassouf dbc0d75ad8
Update raft_test.go 2019-11-21 11:12:28 -08:00
Jim Kalafut 59e526614d
Run go fmt (#7823) 2019-11-07 08:54:34 -08:00
Daniel Lohse de2d3073d7 Allow Raft storage to be configured via env variables (#7745)
* Fix unordered imports

* Allow Raft node ID to be set via the environment variable `VAULT_RAFT_NODE_ID`

* Allow Raft path to be set via the environment variable `VAULT_RAFT_PATH`

* Prioritize the environment when fetching the Raft configuration values

Values in environment variables should override the config as per the
documentation as well as common sense.
2019-10-28 09:43:12 -07:00
Jeff Mitchell 0c88218dd4 Port some changes that got out of sync 2019-10-28 11:38:14 -04:00
Lexman c86fe212c0
oss changes for entropy augmentation feature (#7670)
* oss changes for entropy augmentation feature

* fix oss command/server/config tests

* update go.sum

* fix logical_system and http/ tests

* adds vendored files

* removes unused variable
2019-10-17 10:33:00 -07:00
Vishal Nayak 0d077d7945
Recovery Mode (#7559)
* Initial work

* rework

* s/dr/recovery

* Add sys/raw support to recovery mode (#7577)

* Factor the raw paths out so they can be run with a SystemBackend.

# Conflicts:
#	vault/logical_system.go

* Add handleLogicalRecovery which is like handleLogical but is only
sufficient for use with the sys-raw endpoint in recovery mode.  No
authentication is done yet.

* Integrate with recovery-mode.  We now handle unauthenticated sys/raw
requests, albeit on path v1/raw instead v1/sys/raw.

* Use sys/raw instead raw during recovery.

* Don't bother persisting the recovery token.  Authenticate sys/raw
requests with it.

* RecoveryMode: Support generate-root for autounseals (#7591)

* Recovery: Abstract config creation and log settings

* Recovery mode integration test. (#7600)

* Recovery: Touch up (#7607)

* Recovery: Touch up

* revert the raw backend creation changes

* Added recovery operation token prefix

* Move RawBackend to its own file

* Update API path and hit it using CLI flag on generate-root

* Fix a panic triggered when handling a request that yields a nil response. (#7618)

* Improve integ test to actually make changes while in recovery mode and
verify they're still there after coming back in regular mode.

* Refuse to allow a second recovery token to be generated.

* Resize raft cluster to size 1 and start as leader (#7626)

* RecoveryMode: Setup raft cluster post unseal (#7635)

* Setup raft cluster post unseal in recovery mode

* Remove marking as unsealed as its not needed

* Address review comments

* Accept only one seal config in recovery mode as there is no scope for migration
2019-10-15 00:55:31 -04:00
Brian Kassouf 1167fad704
Improve raft write performance by utilizing FSM Batching (#7527)
* Start benchmark work

* Add batching FSM function

* dedupe some code

* Update dependency on chunking FSM

* fix raft external tests

* fix go.mod

* Add batching test

* uncomment test

* update raft deps

* update vendor

* Update physical/raft/fsm.go

Co-Authored-By: Michel Vocks <michelvocks@gmail.com>

* Update physical/raft/fsm.go
2019-10-14 09:25:07 -06:00
ncabatoff cbde4d4357
Use a much longer delay before giving up and failing the test. (#7646)
On circleci we've seen the test run very slowly so it's often failing here.
2019-10-14 11:06:02 -04:00
Brian Kassouf 024c29c36a
OSS portions of raft non-voters (#7634)
* OSS portions of raft non-voters

* add file

* Update vault/raft.go

Co-Authored-By: Vishal Nayak <vishalnayak@users.noreply.github.com>
2019-10-11 11:56:59 -07:00
James Stoker 49c9352f75 Add config parameter to Azure storage backend to allow specifying the ARM endpoint to support Azure Stack. (#7567) 2019-10-08 08:51:36 -07:00
Calvin Leung Huang 297e7bca38
raft: fix TestRaft_Snapshot_Take_Restore (#7443) 2019-09-06 15:20:08 -07:00
Brian Kassouf c2905773e4
Add download headers to snapshot take API (#7369)
* Add download headers to snapshot take API

* Add content type
2019-09-06 10:34:36 -07:00
Vu Pham a47b2faf34 Added OCI Object Storage Plugin (#6985) 2019-09-04 11:33:16 -07:00
mhobbs-ibm cd2356fe7c plugging file descriptor leak in couchdb backend (#7345) 2019-08-21 12:23:20 -07:00
Jack Kleeman 1977305ffa Store less data in Cassandra prefix buckets (#7199)
* Store less data in Cassandra prefix buckets

The Cassandra physical backend relies on storing data for sys/foo/bar
under sys, sys/foo, and sys/foo/bar. This is necessary so that we
can list the sys bucket, get a list of all child keys, and then trim
this down to find child 'folders' eg food. Right now however, we store
the full value of every storage entry in all three buckets. This is
unnecessary as the value will only ever be read out in the leaf bucket
ie sys/foo/bar. We use the intermediary buckets simply for listing keys.

We have seen some issues around compaction where certain buckets,
particularly intermediary buckets that are exclusively for listing,
get really clogged up with data to the point of not being listable.
Buckets like sys/expire/id are huge, combining lease expiry data for
all auth methods, and need to be listed for vault to successfully
become leader. This PR tries to cut down on the amount of data stored
in intermediary buckets.

* Avoid goroutine leak by buffering results channel up to the bucket count
2019-08-19 11:50:00 -07:00
Brian Kassouf bdfa2c7828
Add additional raft chunk test (#7192)
* Add an end-to-end raft chunk test

* Apply suggestions from code review

Co-Authored-By: Jim Kalafut <jkalafut@hashicorp.com>
2019-07-29 14:11:46 -07:00
Brian Kassouf b83aaf7331
storage/raft: Support storage migration to raft storage (#7207)
* Support raft in the migration command

* Add comments
2019-07-29 13:05:43 -07:00
Brian Kassouf a77995cdb1 Ensure raft configuration properly lists the leader (#7188) 2019-07-25 08:41:14 -04:00
Jeff Mitchell 0425db59ab
Raft chunk snapshotting (#7185)
Support chunking, including snapshot handling
2019-07-24 20:44:13 -04:00
Brian Kassouf 965066161a
Revert "Fix the config output (#7113)" (#7184)
This reverts commit 2f7cfc9aae911c8860db37e556363fbfb1567075.
2019-07-24 10:23:30 -07:00
Michel Vocks 524d101008 Added s3 storage path parameter (#7157) 2019-07-24 12:48:26 -04:00
ncabatoff 35b944460d
TestCassandraBackend times out pretty often on the ent build. Bump (#7130)
its timeout from 5s to 15s in the hopes that helps.  The theory is that
since I haven't seen this on the OSS side, it's failing because the ent
side is heavier in terms of test load and thus the tests face more
resource contention.
2019-07-23 13:26:36 -04:00
Jeff Mitchell fd376b4bdf Use ChunkingConfigurationStore for raft 2019-07-23 10:59:21 -04:00
Vishal Nayak 0010d79498 Fix the config output (#7113) 2019-07-22 12:59:46 -04:00
Jeff Mitchell 3b22ab2486 Add chunking support to raft 2019-07-22 12:17:58 -04:00
Christian Muehlhaeuser e6febc5839 Fixed a bunch of typos (#7146) 2019-07-18 21:10:15 -04:00
Jeff Mitchell 7b8c0b58f1
Call goimports as well as gofmt when doing a `make fmt` (#7148)
Closes #7147
2019-07-18 21:04:56 -04:00
Jeff Mitchell d810758ca2
Rerun proto gen as some got gen'd with old proto version (#7090) 2019-07-09 01:02:20 +02:00
Brian Kassouf 4d7d0d729a
storage/raft: When restoring a snapshot preseal first (#7011)
* storage/raft: When restoring a snapshot preseal first

* best-effort allow standbys to apply the restoreOp before sealing active node

* Don't cache the raft tls key

* Update physical/raft/raft.go

* Move pending raft peers to core

* Fix race on close bool

* Extend the leaderlease time for tests

* Update raft deps

* Fix audit hashing

* Fix race with auditing
2019-07-03 13:56:30 -07:00
Garrett T 8fc4a63796 Set MaxIdleConns to reduce connection churn (postgresql physical) (#6967)
* Set MaxIdleConns to reduce connection churn (postgresql physical)

* Make new  "max_idle_connection" config option for physical postgresql

* Add docs for "max_idle_connections" for postgresql storage

* Add minimum version to docs for max_idle_connections
2019-07-02 15:03:56 -07:00
Vishal Nayak 4484de3ea6
Fix raft config response (#6975) 2019-06-27 17:39:52 -04:00
Brian Kassouf 62e14c280d
storage/raft: fix races in tests (#6996)
* storage/raft: fix races in tests

* Fix another test race
2019-06-27 10:00:03 -07:00
Brian Kassouf 5d0c68ca74
Fix 32-bit builds (#6948) 2019-06-21 09:52:02 -06:00
Vishal Nayak 53035ce390
Raft CLI (#6893)
* raft cli

* Reuse the command's client

* Better response handling

* minor touchups
2019-06-20 21:32:00 -04:00
Jeff Mitchell 07dcdc8b79 Sync 2019-06-20 20:55:10 -04:00
Brian Kassouf ed14061578
Raft Storage Backend (#6888)
* Work on raft backend

* Add logstore locally

* Add encryptor and unsealable interfaces

* Add clustering support to raft

* Remove client and handler

* Bootstrap raft on init

* Cleanup raft logic a bit

* More raft work

* Work on TLS config

* More work on bootstrapping

* Fix build

* More work on bootstrapping

* More bootstrapping work

* fix build

* Remove consul dep

* Fix build

* merged oss/master into raft-storage

* Work on bootstrapping

* Get bootstrapping to work

* Clean up FMS and node-id

* Update local node ID logic

* Cleanup node-id change

* Work on snapshotting

* Raft: Add remove peer API (#906)

* Add remove peer API

* Add some comments

* Fix existing snapshotting (#909)

* Raft get peers API (#912)

* Read raft configuration

* address review feedback

* Use the Leadership Transfer API to step-down the active node (#918)

* Raft join and unseal using Shamir keys (#917)

* Raft join using shamir

* Store AEAD instead of master key

* Split the raft join process to answer the challenge after a successful unseal

* get the follower to standby state

* Make unseal work

* minor changes

* Some input checks

* reuse the shamir seal access instead of new default seal access

* refactor joinRaftSendAnswer function

* Synchronously send answer in auto-unseal case

* Address review feedback

* Raft snapshots (#910)

* Fix existing snapshotting

* implement the noop snapshotting

* Add comments and switch log libraries

* add some snapshot tests

* add snapshot test file

* add TODO

* More work on raft snapshotting

* progress on the ConfigStore strategy

* Don't use two buckets

* Update the snapshot store logic to hide the file logic

* Add more backend tests

* Cleanup code a bit

* [WIP] Raft recovery (#938)

* Add recovery functionality

* remove fmt.Printfs

* Fix a few fsm bugs

* Add max size value for raft backend (#942)

* Add max size value for raft backend

* Include physical.ErrValueTooLarge in the message

* Raft snapshot Take/Restore API  (#926)

* Inital work on raft snapshot APIs

* Always redirect snapshot install/download requests

* More work on the snapshot APIs

* Cleanup code a bit

* On restore handle special cases

* Use the seal to encrypt the sha sum file

* Add sealer mechanism and fix some bugs

* Call restore while state lock is held

* Send restore cb trigger through raft log

* Make error messages nicer

* Add test helpers

* Add snapshot test

* Add shamir unseal test

* Add more raft snapshot API tests

* Fix locking

* Change working to initalize

* Add underlying raw object to test cluster core

* Move leaderUUID to core

* Add raft TLS rotation logic (#950)

* Add TLS rotation logic

* Cleanup logic a bit

* Add/Remove from follower state on add/remove peer

* add comments

* Update more comments

* Update request_forwarding_service.proto

* Make sure we populate all nodes in the followerstate obj

* Update times

* Apply review feedback

* Add more raft config setting (#947)

* Add performance config setting

* Add more config options and fix tests

* Test Raft Recovery (#944)

* Test raft recovery

* Leave out a node during recovery

* remove unused struct

* Update physical/raft/snapshot_test.go

* Update physical/raft/snapshot_test.go

* fix vendoring

* Switch to new raft interface

* Remove unused files

* Switch a gogo -> proto instance

* Remove unneeded vault dep in go.sum

* Update helper/testhelpers/testhelpers.go

Co-Authored-By: Calvin Leung Huang <cleung2010@gmail.com>

* Update vault/cluster/cluster.go

* track active key within the keyring itself (#6915)

* track active key within the keyring itself

* lookup and store using the active key ID

* update docstring

* minor refactor

* Small text fixes (#6912)

* Update physical/raft/raft.go

Co-Authored-By: Calvin Leung Huang <cleung2010@gmail.com>

* review feedback

* Move raft logical system into separate file

* Update help text a bit

* Enforce cluster addr is set and use it for raft bootstrapping

* Fix tests

* fix http test panic

* Pull in latest raft-snapshot library

* Add comment
2019-06-20 12:14:58 -07:00
ncabatoff 2cf5009ff5
Fix deadlock in TestPostgreSQLBackend (#6884)
Make lock2's retryInterval smaller so it grabs the lock as soon as lock1's renewer fails to renew in time.  Fix the logic to test if lock1's leader channel gets closed: we don't need a goroutine, and
the logic was broken in that if we timed out we'd never write to the blocking channel we then try to read from.  Moreover the timeout was wrong.
2019-06-14 12:59:24 -04:00
Mark Gritter d4d1ae55a0
Be more permissive in what URLs to allow; added test for unix socket. (#6859) 2019-06-10 13:59:13 -05:00
Justin Weissig ef17e7cbaa Update comment spelling in postgresql.go (#6817)
Fixed minor comment: PostgresSQL/PostgreSQL
2019-06-04 02:29:36 -05:00
Jim Kalafut 8a0d423ed8
Fix gofmt (#6764) 2019-05-20 15:15:05 -07:00
ncabatoff 64adec23cd
Since the two branches of the UNION produce disjoint sets, do a UNION ALL (#6546)
This obviates the need for a sort/eliminate duplicates step, giving a performance boost in some key distributions with large datasets.
2019-05-10 13:26:01 -04:00
bjorndolk e8f10814b6 Added HA backend for postgres based on dynamodb model (#5731)
Add optional HA support for postgres backend if Postgres version >= 9.5.
2019-05-10 12:48:42 -04:00