Commit Graph

504 Commits

Author SHA1 Message Date
Brian Kassouf c8dde052f2
storage/raft: Advertise the configured cluster address (#9008)
* storage/raft: Advertise the configured cluster address

* Don't allow raft to start with unspecified IP

* Fix concurrent map write panic

* Add test file

* changelog++

* changelog++

* changelog++

* Update tcp_layer.go

* Update tcp_layer.go

* Only set the adverise addr if set
2020-05-18 18:22:25 -07:00
Calvin Leung Huang 8cefbca1c9
Refactor service registration (#8976)
* serivceregistration: refactor service registration logic to run later

* move state check to the internal func

* sr/kubernetes: update setInitialStateInternal godoc

* sr/kubernetes: remove return in setInitialState

* core/test: fix mockServiceRegistration

* address review feedback
2020-05-15 11:06:58 -07:00
Jeff Mitchell 1d3d89e2aa
Create configutil and move some common config and setup functions there (#8362) 2020-05-14 09:19:27 -04:00
Mark Gritter bd766d7bae
Metrics wrapper that adds the cluster name as a label. (#8961) 2020-05-12 21:00:59 -05:00
Dustin Decker 08571a0ac3
Add identity num_entities gauge metric (#8816)
Signed-off-by: Dustin Decker <dustindecker@protonmail.com>
2020-04-23 19:29:42 -05:00
Calvin Leung Huang df23b481a6
core: change rawConfig to be atomic.Value (#8755)
This avoids SetConfig from having to grab a write lock which is called on a SIGHUP, and may block, along with a long-running requests that has a read lock held, any other operation that requires a state lock.
2020-04-16 16:34:46 -07:00
ncabatoff 5fe1ab766b
Add option to detect deadlocks in Core.stateLock using build tag `deadlock` (#8524) 2020-03-10 16:01:20 -04:00
ncabatoff e5721310ac
Add persistent feature flags to be used on enterprise non-primaries. (#8391) 2020-02-19 18:06:53 -05:00
Jeff Mitchell 844b2c3a5d Bump API/SDK and adapt to move from SDK stuff 2020-02-15 14:58:05 -05:00
Joe Dollard 8f74b4d2b6
provide vault server flag to exit on core shutdown (#7561)
* provide vault server flag to exit on core shutdown

* Update command/server.go

Co-Authored-By: Jeff Mitchell <jeffrey.mitchell@gmail.com>

Co-authored-by: Jeff Mitchell <jeffrey.mitchell@gmail.com>
Co-authored-by: Brian Kassouf <briankassouf@users.noreply.github.com>
2020-02-14 18:07:31 -08:00
Brian Kassouf 3bbd9dc34c
Port some changes back to OSS (#8359) 2020-02-14 16:39:13 -08:00
Vishal Nayak c87d34d1a4
Seal migration with Raft (#8103)
* Seal migration after unsealing

* Refactor migration fields migrationInformation in core

* Perform seal migration as part of postUnseal

* Remove the sleep logic

* Use proper seal in the unseal function

* Fix migration from Auto to Shamir

* Fix the recovery config missing issue

* Address the non-ha migration case

* Fix the multi cluster case

* Avoid re-running seal migration

* Run the post migration code in new leaders

* Fix the issue of wrong recovery being set

* Address review feedback

* Add more complete testing coverage for seal migrations.   (#8247)

* Add more complete testing coverage for seal migrations.  Also remove VAULT_ACC gate from some tests that just depend on docker, cleanup dangling recovery config in storage after migration, and fix a call in adjustCoreForSealMigration that seems broken.

* Fix the issue of wrong recovery key being set

* Adapt tests to work with multiple cores.

* Add missing line to disable raft join.

Co-authored-by: Vishal Nayak <vishalnayak@users.noreply.github.com>

* Fix all known issues

* Remove warning

* Review feedback.

* Revert my previous change that broke raft tests.  We'll need to come back and at least comment
this once we better understand why it's needed.

* Don't allow migration between same types for now

* Disable auto to auto tests for now since it uses migration between same types which is not allowed

* Update vault/core.go

Co-Authored-By: Brian Kassouf <briankassouf@users.noreply.github.com>

* Add migration logs

* Address review comments

* Add the recovery config check back

* Skip a few steps if migration is already done

* Return from waitForLeadership if migration fails

Co-authored-by: ncabatoff <nick.cabatoff@gmail.com>
Co-authored-by: Brian Kassouf <briankassouf@users.noreply.github.com>
2020-02-13 16:27:31 -05:00
Jeff Mitchell 86327b8010 Bump api/sdk and fix imports 2020-02-13 10:41:16 -05:00
ncabatoff a0ac4bdd2a
Audit generate root requests and responses. (#8301) 2020-02-06 11:56:37 -05:00
Becca Petrin 759f9b38f7
Observer pattern for service registration interface (#8123)
* use observer pattern for service discovery

* update perf standby method

* fix test

* revert usersTags to being called serviceTags

* use previous consul code

* vault isnt a performance standby before starting

* log err

* changes from feedback

* add Run method to interface

* changes from feedback

* fix core test

* update example
2020-01-24 09:42:03 -08:00
Vishal Nayak fb93b0bf9c Use Shamir as KeK when migrating from auto-seal to shamir (#8172)
* Use Shamir as KeK when migrating from auto-seal to shamir

* Use the correct number of shares/threshold for the migrated seal.

* Fix log message

* Add WaitForActiveNode to test

* Make test fail

* Minor updates

* Test with more shares and a threshold

* Add seal/unseal step to the test

* Update the logic that prepares seal migration (#8187)

* Update the logic that preps seal migration

* Add test and update recovery logic

Co-authored-by: ncabatoff <nick.cabatoff@gmail.com>
Co-authored-by: Brian Kassouf <briankassouf@users.noreply.github.com>
2020-01-21 09:24:33 -08:00
Brian Kassouf f32a86ee7a
Create network layer abstraction to allow in-memory cluster traffic (#8173) 2020-01-16 23:03:02 -08:00
Vishal Nayak 8891f2ba88 Raft retry join (#7856)
* Raft retry join

* update

* Make retry join work with shamir seal

* Return upon context completion

* Update vault/raft.go

Co-Authored-By: Brian Kassouf <briankassouf@users.noreply.github.com>

* Address some review comments

* send leader information slice as a parameter

* Make retry join work properly with Shamir case. This commit has a blocking issue

* Fix join goroutine exiting before the job is done

* Polishing changes

* Don't return after a successful join during unseal

* Added config parsing test

* Add test and fix bugs

* minor changes

* Address review comments

* Fix build error

Co-authored-by: Brian Kassouf <briankassouf@users.noreply.github.com>
2020-01-13 17:02:16 -08:00
Jeff Mitchell a0694943cc
Migrate built in auto seal to go-kms-wrapping (#8118) 2020-01-10 20:39:52 -05:00
Mike Jarmy e42bc0ffc0
Introduce optional service_registration stanza (#7887)
* move ServiceDiscovery into methods

* add ServiceDiscoveryFactory

* add serviceDiscovery field to vault.Core

* refactor ConsulServiceDiscovery into separate struct

* cleanup

* revert accidental change to go.mod

* cleanup

* get rid of un-needed struct tags in vault.CoreConfig

* add service_discovery parser

* add ServiceDiscovery to config

* cleanup

* cleanup

* add test for ConfigServiceDiscovery to Core

* unit testing for config service_discovery stanza

* cleanup

* get rid of un-needed redirect_addr stuff in service_discovery stanza

* improve test suite

* cleanup

* clean up test a bit

* create docs for service_discovery

* check if service_discovery is configured, but storage does not support HA

* tinker with test

* tinker with test

* tweak docs

* move ServiceDiscovery into its own package

* tweak a variable name

* fix comment

* rename service_discovery to service_registration

* tweak service_registration config

* Revert "tweak service_registration config"

This reverts commit 5509920a8ab4c5a216468f262fc07c98121dce35.

* simplify naming

* refactor into ./serviceregistration/consul
2019-12-06 09:46:39 -05:00
Jeff Mitchell 9b5392bc8f
Fix cluster cipher test (#7900)
Go 1.13 flipped TLS 1.3 to opt-out instead of opt-in, and its TLS 1.3
support does not allow configuring cipher suites. Simply remove the
affected test; it's not relevant going forward and there's ample
evidence it works properly prior to Go 1.13.
2019-11-18 23:04:49 -05:00
Jeff Mitchell 4e1470f483
Handpick cluster cipher suites when they're not user-set (#7487)
* Handpick cluster cipher suites when they're not user-set

There is an undocumented way for users to choose cluster cipher suites
but for the most part this is to paper over the fact that there are
undesirable suites in TLS 1.2.

If not explicitly set, have the set of cipher suites for the cluster
port come from a hand-picked list; either the allowed TLS 1.3 set (for
forwards compatibility) or the three identical ones for TLS 1.2.

The 1.2 suites have been supported in Go until at least as far back as
Go 1.9 from two years ago. As a result in cases where no specific suites
have been chosen this _ought_ to have no compatibility issues.

Also includes a useful test script.
2019-10-28 12:51:45 -04:00
Brian Kassouf a20e73c2da
Port filtered paths changes back to OSS (#7741)
* Port filtered paths changes back to OSS

* Fix build
2019-10-27 13:30:38 -07:00
Vishal Nayak 23b0fb62de Abstract generate-root authentication into the strategy interface (#7698)
* Abstract generate-root authentication into the strategy interface

* Generate root strategy ncabatoff (#7700)

* Adapt to new shamir-as-kek reality.

* Don't try to verify the master key when we might still be sealed (in
recovery mode).  Instead, verify it in the authenticate methods.
2019-10-23 09:52:28 -07:00
ncabatoff 1c98152fa0
Shamir seals now come in two varieties: legacy and new-style. (#7694)
Shamir seals now come in two varieties: legacy and new-style. Legacy
Shamir is automatically converted to new-style when a rekey operation
is performed. All new Vault initializations using Shamir are new-style.

New-style Shamir writes an encrypted master key to storage, just like
AutoUnseal. The stored master key is encrypted using the shared key that
is split via Shamir's algorithm. Thus when unsealing, we take the key
fragments given, combine them into a Key-Encryption-Key, and use that
to decrypt the master key on disk. Then the master key is used to read
the keyring that decrypts the barrier.
2019-10-18 14:46:00 -04:00
Lexman c86fe212c0
oss changes for entropy augmentation feature (#7670)
* oss changes for entropy augmentation feature

* fix oss command/server/config tests

* update go.sum

* fix logical_system and http/ tests

* adds vendored files

* removes unused variable
2019-10-17 10:33:00 -07:00
Michael Gaffney 4f7bd872ed
core: Log if an error is returned from postSealMigration (#7675) 2019-10-16 14:00:00 -04:00
Michael Gaffney 24f663403e
core: add postSealMigration method (#7579)
* core: add postSealMigration method

The postSealMigration method is called at the end of the postUnseal
method if a seal migration has occurred. This starts a seal rewrap
process in the enterprise version of. It is a no-op in the OSS version.
2019-10-16 12:52:37 -04:00
Michael Gaffney 6ae92ede11
core: add hook for initializing seals for migration (#7666)
* core: add hook for initializing seals for migration

Needed in enterprise version.
2019-10-15 15:48:23 -04:00
Vishal Nayak 0d077d7945
Recovery Mode (#7559)
* Initial work

* rework

* s/dr/recovery

* Add sys/raw support to recovery mode (#7577)

* Factor the raw paths out so they can be run with a SystemBackend.

# Conflicts:
#	vault/logical_system.go

* Add handleLogicalRecovery which is like handleLogical but is only
sufficient for use with the sys-raw endpoint in recovery mode.  No
authentication is done yet.

* Integrate with recovery-mode.  We now handle unauthenticated sys/raw
requests, albeit on path v1/raw instead v1/sys/raw.

* Use sys/raw instead raw during recovery.

* Don't bother persisting the recovery token.  Authenticate sys/raw
requests with it.

* RecoveryMode: Support generate-root for autounseals (#7591)

* Recovery: Abstract config creation and log settings

* Recovery mode integration test. (#7600)

* Recovery: Touch up (#7607)

* Recovery: Touch up

* revert the raw backend creation changes

* Added recovery operation token prefix

* Move RawBackend to its own file

* Update API path and hit it using CLI flag on generate-root

* Fix a panic triggered when handling a request that yields a nil response. (#7618)

* Improve integ test to actually make changes while in recovery mode and
verify they're still there after coming back in regular mode.

* Refuse to allow a second recovery token to be generated.

* Resize raft cluster to size 1 and start as leader (#7626)

* RecoveryMode: Setup raft cluster post unseal (#7635)

* Setup raft cluster post unseal in recovery mode

* Remove marking as unsealed as its not needed

* Address review comments

* Accept only one seal config in recovery mode as there is no scope for migration
2019-10-15 00:55:31 -04:00
Brian Kassouf 024c29c36a
OSS portions of raft non-voters (#7634)
* OSS portions of raft non-voters

* add file

* Update vault/raft.go

Co-Authored-By: Vishal Nayak <vishalnayak@users.noreply.github.com>
2019-10-11 11:56:59 -07:00
Calvin Leung Huang d8875b1991
sys/config: config state endpoint (#7424)
* sys/config: initial work on adding config state endpoint

* server/config: add tests, fix Sanitized method

* thread config through NewTestCluster's config to avoid panic on dev modes

* properly guard endpoint against request forwarding

* add http tests, guard against panics on nil RawConfig

* ensure non-nil rawConfig on NewTestCluster cores

* update non-forwarding logic

* fix imports; use no-forward handler

* add missing config test fixture; update gitignore

* return sanitized config as a map

* fix test, use deep.Equal to check for equality

* fix http test

* minor comment fix

* config: change Sanitized to return snake-cased keys, update tests

* core: hold rlock when reading config; add docstring

* update docstring
2019-10-08 10:57:15 -07:00
Michel Vocks a7a6dd55a5
Add config parameter to allow unauthenticated metrics access (#7550)
* Implement config parameter to allow unathenticated metricss access

* Add unit test for unauthenticated metrics access parameter

* go mod tidy
2019-10-04 09:29:51 +02:00
Michael Gaffney 6b1e1909e9
core: re-encrypt barrier and recovery keys if the unseal key is updated (#7493)
Seal keys can be rotated. When this happens, the barrier and recovery
keys should be re-encrypted with the new seal key. This change
automatically re-encrypts the barrier and recovery keys with the latest
seal key on the active node during the 'postUnseal' phase.
2019-10-03 16:40:18 -04:00
Matthew Irish 6e4cc02f4d
expose 'storage_type' on the sys/seal-status endpoint (#7486)
* expose 'storage_type' on the sys/seal-status endpoint

* add comments

* Update vault/core.go

Co-Authored-By: Brian Kassouf <briankassouf@users.noreply.github.com>
2019-09-18 14:07:18 -05:00
ncabatoff ed147b7ae7
Make clusterListener an atomic.Value to avoid races with getGRPCDialer. (#7408) 2019-09-03 11:59:56 -04:00
Jeff Mitchell 19f669a52f Sync some changes that weren't properly pulled over 2019-07-22 13:11:00 -04:00
Brian Kassouf 4d7d0d729a
storage/raft: When restoring a snapshot preseal first (#7011)
* storage/raft: When restoring a snapshot preseal first

* best-effort allow standbys to apply the restoreOp before sealing active node

* Don't cache the raft tls key

* Update physical/raft/raft.go

* Move pending raft peers to core

* Fix race on close bool

* Extend the leaderlease time for tests

* Update raft deps

* Fix audit hashing

* Fix race with auditing
2019-07-03 13:56:30 -07:00
Jeff Mitchell a3fc497fec
Fix batch token test (#7047)
At the level of role config it doesn't mean anything to use
default-service or default-batch; that's for mount tuning. So disallow
it in tokenutil. This also fixes the fact that the switch statement
wasn't right.
2019-07-02 22:16:43 -04:00
Brian Kassouf 62e14c280d
storage/raft: fix races in tests (#6996)
* storage/raft: fix races in tests

* Fix another test race
2019-06-27 10:00:03 -07:00
dr-db 720db0ffbc Fix the doc string for IsFatalError (#7000) 2019-06-27 12:50:47 +02:00
Vishal Nayak 53035ce390
Raft CLI (#6893)
* raft cli

* Reuse the command's client

* Better response handling

* minor touchups
2019-06-20 21:32:00 -04:00
Jeff Mitchell 07dcdc8b79 Sync 2019-06-20 20:55:10 -04:00
Brian Kassouf ed14061578
Raft Storage Backend (#6888)
* Work on raft backend

* Add logstore locally

* Add encryptor and unsealable interfaces

* Add clustering support to raft

* Remove client and handler

* Bootstrap raft on init

* Cleanup raft logic a bit

* More raft work

* Work on TLS config

* More work on bootstrapping

* Fix build

* More work on bootstrapping

* More bootstrapping work

* fix build

* Remove consul dep

* Fix build

* merged oss/master into raft-storage

* Work on bootstrapping

* Get bootstrapping to work

* Clean up FMS and node-id

* Update local node ID logic

* Cleanup node-id change

* Work on snapshotting

* Raft: Add remove peer API (#906)

* Add remove peer API

* Add some comments

* Fix existing snapshotting (#909)

* Raft get peers API (#912)

* Read raft configuration

* address review feedback

* Use the Leadership Transfer API to step-down the active node (#918)

* Raft join and unseal using Shamir keys (#917)

* Raft join using shamir

* Store AEAD instead of master key

* Split the raft join process to answer the challenge after a successful unseal

* get the follower to standby state

* Make unseal work

* minor changes

* Some input checks

* reuse the shamir seal access instead of new default seal access

* refactor joinRaftSendAnswer function

* Synchronously send answer in auto-unseal case

* Address review feedback

* Raft snapshots (#910)

* Fix existing snapshotting

* implement the noop snapshotting

* Add comments and switch log libraries

* add some snapshot tests

* add snapshot test file

* add TODO

* More work on raft snapshotting

* progress on the ConfigStore strategy

* Don't use two buckets

* Update the snapshot store logic to hide the file logic

* Add more backend tests

* Cleanup code a bit

* [WIP] Raft recovery (#938)

* Add recovery functionality

* remove fmt.Printfs

* Fix a few fsm bugs

* Add max size value for raft backend (#942)

* Add max size value for raft backend

* Include physical.ErrValueTooLarge in the message

* Raft snapshot Take/Restore API  (#926)

* Inital work on raft snapshot APIs

* Always redirect snapshot install/download requests

* More work on the snapshot APIs

* Cleanup code a bit

* On restore handle special cases

* Use the seal to encrypt the sha sum file

* Add sealer mechanism and fix some bugs

* Call restore while state lock is held

* Send restore cb trigger through raft log

* Make error messages nicer

* Add test helpers

* Add snapshot test

* Add shamir unseal test

* Add more raft snapshot API tests

* Fix locking

* Change working to initalize

* Add underlying raw object to test cluster core

* Move leaderUUID to core

* Add raft TLS rotation logic (#950)

* Add TLS rotation logic

* Cleanup logic a bit

* Add/Remove from follower state on add/remove peer

* add comments

* Update more comments

* Update request_forwarding_service.proto

* Make sure we populate all nodes in the followerstate obj

* Update times

* Apply review feedback

* Add more raft config setting (#947)

* Add performance config setting

* Add more config options and fix tests

* Test Raft Recovery (#944)

* Test raft recovery

* Leave out a node during recovery

* remove unused struct

* Update physical/raft/snapshot_test.go

* Update physical/raft/snapshot_test.go

* fix vendoring

* Switch to new raft interface

* Remove unused files

* Switch a gogo -> proto instance

* Remove unneeded vault dep in go.sum

* Update helper/testhelpers/testhelpers.go

Co-Authored-By: Calvin Leung Huang <cleung2010@gmail.com>

* Update vault/cluster/cluster.go

* track active key within the keyring itself (#6915)

* track active key within the keyring itself

* lookup and store using the active key ID

* update docstring

* minor refactor

* Small text fixes (#6912)

* Update physical/raft/raft.go

Co-Authored-By: Calvin Leung Huang <cleung2010@gmail.com>

* review feedback

* Move raft logical system into separate file

* Update help text a bit

* Enforce cluster addr is set and use it for raft bootstrapping

* Fix tests

* fix http test panic

* Pull in latest raft-snapshot library

* Add comment
2019-06-20 12:14:58 -07:00
Jeff Mitchell 3f1c510bc9
Fix a deadlock if a panic happens during request handling (#6920)
* Fix a deadlock if a panic happens during request handling

During request handling, if a panic is created, deferred functions are
run but otherwise execution stops. #5889 changed some locks to
non-defers but had the side effect of causing the read lock to not be
released if the request panicked. This fixes that and addresses a few
other potential places where things could go wrong:

1) In sealInitCommon we always now defer a function that unlocks the
read lock if it hasn't been unlocked already
2) In StepDown we defer the RUnlock but we also had two error cases that
were calling it manually. These are unlikely to be hit but if they were
I believe would cause a panic.

* Add panic recovery test
2019-06-19 09:40:57 -04:00
ncabatoff cc41e608dc
Add core.coreNumber field, used to differentiate multiple cores/clusters when running tests (#6855)
This is not used or exposed in prod.

Remove some test-specific code from the cluster-building helpers. The corresponding additions go on the ent side.
2019-06-10 14:07:16 -04:00
Jeff Mitchell c02abb969c
Attempt to grab read statelock in emit metrics to remove some raciness (#6829) 2019-06-05 12:26:29 -04:00
ncabatoff ad28263b69
Allow plugins to submit audit requests/responses via extended SystemView (#6777)
Move audit.LogInput to sdk/logical.  Allow the Data values in audited
logical.Request and Response to implement OptMarshaler, in which case
we delegate hashing/serializing responsibility to them.  Add new
ClientCertificateSerialNumber audit request field.

SystemView can now be cast to ExtendedSystemView to expose the Auditor
interface, which allows submitting requests and responses to the audit
broker.
2019-05-22 18:52:53 -04:00
Brian Kassouf 43783a5dca
Move cluster logic out of vault package (#6601)
* Move cluster logic out of vault package

* Dedup heartbeat and fix tests

* Fix test
2019-04-17 13:50:31 -07:00
Jeff Mitchell 9ebc57581d
Switch to go modules (#6585)
* Switch to go modules

* Make fmt
2019-04-13 03:44:06 -04:00
Jeff Mitchell 80c303ac83 Move ldaputil and tlsutil over to sdk 2019-04-12 18:26:54 -04:00
Jeff Mitchell 8bcb533a1b
Create sdk/ and api/ submodules (#6583) 2019-04-12 17:54:35 -04:00
T.K 453f1ac109 changed misspelled english words (#6432) 2019-03-19 09:32:45 -04:00
ncabatoff cd747c9318
Add code for writing and reading request counters to storage. (#5918)
Increment a counter whenever a request is received. 
The in-memory counter is persisted to counters/requests/YYYY/MM.
When the month wraps around, we reset the in-memory counter to
zero.
Add an endpoint for querying the request counters across all time.
2019-03-05 14:55:07 -05:00
Jeff Mitchell a83ed04730 Add ability to migrate autoseal to autoseal (#5930)
* Add ability to migrate autoseal to autoseal

This adds the ability to migrate from shamir to autoseal, autoseal to
shamir, or autoseal to autoseal, by allowing multiple seal stanzas. A
disabled stanza will be used as the config being migrated from; this can
also be used to provide an unwrap seal on ent over multiple unseals.

A new test is added to ensure that autoseal to autoseal works as
expected.

* Fix test

* Provide default shamir info if not given in config

* Linting feedback

* Remove context var that isn't used

* Don't run auto unseal watcher when in migration, and move SetCores to SetSealsForMigration func

* Slight logic cleanup

* Fix test build and fix bug

* Updates

* remove GetRecoveryKey function
2019-03-04 14:11:56 -08:00
Jeff Mitchell a3a2a3cd04 A few more syncs 2019-03-04 13:53:15 -05:00
Brian Kassouf efe5671f36 make fmt 2019-02-20 12:12:21 -08:00
Brian Kassouf f5b5fbb392
Refactor the cluster listener (#6232)
* Port over OSS cluster port refactor components

* Start forwarding

* Cleanup a bit

* Fix copy error

* Return error from perf standby creation

* Add some more comments

* Fix copy/paste error
2019-02-14 18:14:56 -08:00
Martin 9044173d6e Prometheus support on v1/sys/metrics endpoint (#5308)
* initial commit for prometheus and sys/metrics support

* Throw an error if prometheusRetentionTime is 0,add prometheus in devmode

* return when format=prometheus is used and prom is disable

* parse prometheus_retention_time from string instead of int

* Initialize config.Telemetry if nil

* address PR issues

* add sys/metrics framework.Path in a factory

* Apply requiredMountTable entries's MountConfig to existing core table

* address pr comments

* enable prometheus sink by default

* Move Metric-related code in a separate metricsutil helper
2019-02-14 12:46:59 -08:00
Vishal Nayak b4ba344782
Merge entities during unseal only on the primary (#6075)
* Merge entities during unseal only on the primary

* Add another guard check

* Add perf standby to the check

* Make primary to not differ from case-insensitivity status w.r.t secondaries

* Ensure mutual exclusivity between loading and invalidations

* Both primary and secondaries won't persist during startup and invalidations

* Allow primary to persist when loading case sensitively

* Using core.perfStandby

* Add a tweak in core for testing

* Address review feedback

* update memdb but not storage in secondaries

* Wire all the things directly do mergeEntity

* Fix persist behavior

* Address review feedback
2019-02-08 16:32:06 -05:00
Jeff Mitchell 9ef0680e7f
Fix leader info repopulation (#6167)
* Two things:

* Change how we populate and clear leader UUID. This fixes a case where
if a standby disconnects from an active node and reconnects, without the
active node restarting, the UUID doesn't change so triggers on a new
active node don't get run.

* Add a bunch of test helpers and minor updates to things.
2019-02-05 21:01:18 -05:00
Jeff Mitchell 1f57e3674a Move a common block up a level 2019-01-24 18:29:22 -05:00
Seth Vargo 98ad431d6d Continuously attempt to unseal if sealed keys are supported (#6039)
* Add helper for checking if an error is a fatal error

The double-double negative was really confusing, and this pattern is used a few places in Vault. This negates the double negative, making the devx a bit easier to follow.

* Check return value of UnsealWithStoredKeys in sys/init

* Return proper error types when attempting unseal with stored key

Prior to this commit, "nil" could have meant unsupported auto-unseal, a transient error, or success. This updates the function to return the correct error type, signaling to the caller whether they should retry or fail.

* Continuously attempt to unseal if sealed keys are supported

This fixes a bug that occurs on bootstrapping an initial cluster. Given a collection of Vault nodes and an initialized storage backend, they will all go into standby waiting for initialization. After one node is initialized, the other nodes had no mechanism by which they "re-check" to see if unseal keys are present. This adds a goroutine to the server command which continually waits for unseal keys to exist. It exits in the following conditions:

- the node is unsealed
- the node does not support stored keys
- a fatal error occurs (as defined by Vault)
- the server is shutting down

In all other situations, the routine wakes up at the specified interval and attempts to unseal with the stored keys.
2019-01-23 16:34:34 -05:00
Jim Kalafut d0e2badbae Run goimports across the repository (#6010)
The result will still pass gofmtcheck and won't trigger additional
changes if someone isn't using goimports, but it will avoid the
piecemeal imports changes we've been seeing.
2019-01-08 16:48:57 -08:00
Brian Kassouf 75e25711a0
Default seal type to Shamir on older seal configs (#5956) 2018-12-13 16:44:56 -08:00
Calvin Leung Huang e71017e5a9 Set request token entry within fetchACLTokenEntryAndEntity (#5880) 2018-12-03 11:57:53 -05:00
Brian Kassouf 33776b89c2
Wrap storage calls with encoding checks (#5819)
* Add encoding backend

* More work on encoding checks

* Update error message

* Update physical/encoding.go

* Disable key checks if configured
2018-11-19 13:13:16 -08:00
Becca Petrin 7bd22e6779
Run all builtins as plugins (#5536) 2018-11-06 17:21:24 -08:00
Jeff Mitchell 41649c1511 Clean up stored barrier keys after migration to shamir (#5671) 2018-11-05 14:06:39 -05:00
Jeff Mitchell f8ec4d59b8 Remove disableIndexing 2018-10-23 16:05:45 -04:00
Jeff Mitchell 8a274fba51 Add disable indexing to core object 2018-10-23 15:04:36 -04:00
Jeff Mitchell a979f49cd7 Add disable-indexing 2018-10-23 15:03:17 -04:00
Jeff Mitchell 82992d6097
Seal migration (OSS) (#781) 2018-10-22 23:34:02 -07:00
Jeff Mitchell 224fbd4a88 Merge branch 'master-oss' into 1.0-beta-oss 2018-10-16 10:08:03 -04:00
Jeff Mitchell 04e3f9b0f3
Add LastWAL in leader/health output (#5523) 2018-10-16 09:38:44 -04:00
Jeff Mitchell a64fc7d7cb
Batch tokens (#755) 2018-10-15 12:56:24 -04:00
Jim Kalafut 123e34f4a7
Don't copy HA lock file during migration (#5503) 2018-10-12 09:29:15 -07:00
Jeff Mitchell 919b968c27
The big one (#5346) 2018-09-17 23:03:00 -04:00
Jeff Mitchell c28ed23972
Allow most parts of Vault's logging to have its level changed on-the-fly (#5280)
* Allow most parts of Vault's logging to have its level changed on-the-fly

* Use a const for not set
2018-09-05 15:52:54 -04:00
Brian Shumate 45f1ca162f Log 'marked as sealed' at INFO instead (#5260) 2018-09-04 10:53:40 -07:00
Brian Kassouf c603a8b811
Add performance standby status to status output (#5192)
* Add performance standby status to status output

* Update ha.go
2018-08-27 10:01:07 -07:00
Brian Kassouf b7e33f1d2e
Port some HA changes (#5186) 2018-08-25 14:41:55 -07:00
Jeff Mitchell 362a92945e Don't resetnamed 2018-08-23 15:04:18 -04:00
Jeff Mitchell 71d92ef093 ACL Templating (#4994)
* Initial work on templating

* Add check for unbalanced closing in front

* Add missing templated assignment

* Add first cut of end-to-end test on templating.

* Make template errors be 403s and finish up testing

* Review feedback
2018-08-15 11:42:56 -07:00
Vishal Nayak 303b2f97ff
don't ignore errors (#5087) 2018-08-10 22:32:10 -04:00
Vishal Nayak 68d27c7c38
Adjust ACL error checking (#5086)
* Adjust acl error checking

* address review feedback

* unlock before checks
2018-08-10 21:05:10 -04:00
Jeff Mitchell 4798af88f9
Fix some cases where unauthorized calls could return 400 (#5083) 2018-08-10 08:59:58 -05:00
Chris Hoffman d8b1d19ed6
Plumbing request context through to expiration manager (#5021)
* plumbing request context to expiration manager

* moar context

* address feedback

* only using active context for revoke prefix

* using active context for revoke commands

* cancel tidy on active context

* address feedback
2018-08-01 21:39:39 -04:00
Brian Kassouf bb076f161d
core: Cancel context before taking state lock (#5020)
* core: Cancel context before taking state lock

* Create active context outside of postUnseal

* Attempt to drain requests before canceling context

* fix test
2018-08-01 12:07:37 -07:00
Chris Hoffman 51f8cd9668
do not grab statelock when requested not to (#5015) 2018-07-30 14:54:54 -04:00
Jeff Mitchell 0e659ca6be Simplify sealInternal 2018-07-24 23:26:28 -04:00
Jeff Mitchell 4261618d10 Add request timeouts in normal request path and to expirations (#4971)
* Add request timeouts in normal request path and to expirations

* Add ability to adjust default max request duration

* Some test fixes

* Ensure tests have defaults set for max request duration

* Add context cancel checking to inmem/file

* Fix tests

* Fix tests

* Set default max request duration to basically infinity for this release for BC

* Address feedback
2018-07-24 14:50:49 -07:00
Jeff Mitchell 9687ccc8fa Tackle #4929 a different way (#4932)
* Tackle #4929 a different way

This turns c.sealed into an atomic, which allows us to call sealInternal
without a lock. By doing so we can better control lock grabbing when a
condition causing the standby loop to get out of active happens. This
encapsulates that logic into two distinct pieces (although they could
be combined into one), and makes lock guarding more understandable.

* Re-add context canceling to the non-HA version of sealInternal

* Return explicitly after stopCh triggered
2018-07-24 13:57:25 -07:00
Vishal Nayak 0d8f424ab4
disallow token use if entity is invalid (#4791) 2018-06-19 12:57:19 -04:00
Jeff Mitchell 5d44c54947
Changes the way policies are reported in audit logs (#4747)
* This changes the way policies are reported in audit logs.

Previously, only policies tied to tokens would be reported. This could
make it difficult to perform after-the-fact analysis based on both the
initial response entry and further requests. Now, the full set of
applicable policies from both the token and any derived policies from
Identity are reported.

To keep things consistent, token authentications now also return the
full set of policies in api.Secret.Auth responses, so this both makes it
easier for users to understand their actual full set, and it matches
what the audit logs now report.
2018-06-14 09:49:33 -04:00
Jeff Mitchell d3cbde6ae2 Fix nil pointer 2018-06-09 16:57:57 -04:00
Jeff Mitchell 8916f6b625
Some atomic cleanup (#4732)
Taking inspiration from
https://github.com/golang/go/issues/17604#issuecomment-256384471
suggests that taking the address of a stack variable for use in atomics
works (at least, the race detector doesn't complain) but is doing it
wrong.

The only other change is a change in Leader() detecting if HA is enabled
to fast-path out. This value never changes after NewCore, so we don't
need to grab the read lock to check it.
2018-06-09 15:35:22 -04:00
Brian Kassouf 05b46def62
Cleanup core.go and break out HA functionality (#4661) 2018-05-30 17:30:50 -07:00
Jeff Mitchell 7e7163f826 Factor out a bunch of shared code 2018-05-21 17:46:32 -04:00
Jeff Mitchell a9d8be3c4d WIP 2018-05-19 21:31:45 -04:00
Calvin Leung Huang dd7520459e
Token revocation refactor (#4512)
* Hand off lease expiration to expiration manager via timers

* Use sync.Map as the cache to track token deletion state

* Add CreateOrFetchRevocationLeaseByToken to hand off token revocation to exp manager

* Update revoke and revoke-self handlers

* Fix tests

* revokeSalted: Move token entry deletion into the deferred func

* Fix test race

* Add blocking lease revocation test

* Remove test log

* Add HandlerFunc on NoopBackend, adjust locks, and add test

* Add sleep to allow for revocations to settle

* Various updates

* Rename some functions and variables to be more clear
* Change step-down and seal to use expmgr for revoke functionality like
during request handling
* Attempt to WAL the token as being invalid as soon as possible so that
further usage will fail even if revocation does not fully complete

* Address feedback

* Return invalid lease on negative TTL

* Revert "Return invalid lease on negative TTL"

This reverts commit a39597ecdc23cf7fc69fe003eef9f10d533551d8.

* Extend sleep on tests
2018-05-10 15:50:02 -04:00
Brian Kassouf 55997b6bf0
physical/cache: Add a list of prefixes to not cache (#4515)
* physical/cache: Add a list of prefixes to not cache

* Rename the pathmanager

* Move cache back to the beggining of postUnseal

* Fix comment
2018-05-10 10:29:26 -07:00
Jeff Mitchell 61f2144adf Add identity store nil checks 2018-04-24 23:10:22 -04:00
Jeff Mitchell d54191adff Use permission denied for entity disabling 2018-04-23 16:50:04 -04:00
Jeff Mitchell 78b09d77c0 Use ResetNamed for sealunwrapper logger 2018-04-23 13:42:02 -04:00
Becca Petrin dbd1f119ed Use a sub-logger for the transactional cache (#4428) 2018-04-23 12:08:12 -04:00
Jeff Mitchell 6d95b4d266
Add the ability to restrict token usage by IP. Add to token roles. (#4412)
Fixes #815
2018-04-21 10:49:16 -04:00
Jeff Mitchell e69845b458
Defer setting views read/write until the end of postUnseal (#4392)
A few notes:

* We exert positive control over singletons and they usually need to
perform some (known, validated) writes, so this excludes singletons --
they are simply limited to the end of the mount function as before.

* I'm not sure how to test this _specifically_; I've done some testing
of e.g. sealing vault and unsealing and ensuring that I can write to a
KV mount. I think this is tested by every dev server though, since for a
dev server Vault is inited, the default mounts are mounted, then it's
sealed, then it's unsealed for the user, so it already goes through this
code path. The mere fact that you can write to secret/ on a dev server
means it was successfully set read-write.
2018-04-19 13:29:43 -04:00
Jeff Mitchell 530121c655
Add ability to disable an entity (#4353) 2018-04-13 21:49:40 -04:00
Vishal Nayak 28e3eb9e2c
Errwrap everywhere (#4252)
* package api

* package builtin/credential

* package builtin/logical

* package command

* package helper

* package http and logical

* package physical

* package shamir

* package vault

* package vault

* address feedback

* more fixes
2018-04-05 11:49:21 -04:00
Chris Hoffman e293fe84c3 OSS: Adding UI handlers and configurable headers (#390)
* adding UI handlers and UI header configuration

* forcing specific static headers

* properly getting UI config value from config/environment

* fixing formatting in stub UI text

* use http.Header

* case-insensitive X-Vault header check

* fixing var name

* wrap both stubbed and real UI in header handler

* adding test for >1 keys
2018-04-03 09:34:01 -05:00
Becca Petrin 03cf302e9a Move to "github.com/hashicorp/go-hclog" (#4227)
* logbridge with hclog and identical output

* Initial search & replace

This compiles, but there is a fair amount of TODO
and commented out code, especially around the
plugin logclient/logserver code.

* strip logbridge

* fix majority of tests

* update logxi aliases

* WIP fixing tests

* more test fixes

* Update test to hclog

* Fix format

* Rename hclog -> log

* WIP making hclog and logxi love each other

* update logger_test.go

* clean up merged comments

* Replace RawLogger interface with a Logger

* Add some logger names

* Replace Trace with Debug

* update builtin logical logging patterns

* Fix build errors

* More log updates

* update log approach in command and builtin

* More log updates

* update helper, http, and logical directories

* Update loggers

* Log updates

* Update logging

* Update logging

* Update logging

* Update logging

* update logging in physical

* prefixing and lowercase

* Update logging

* Move phyisical logging name to server command

* Fix som tests

* address jims feedback so far

* incorporate brians feedback so far

* strip comments

* move vault.go to logging package

* update Debug to Trace

* Update go-plugin deps

* Update logging based on review comments

* Updates from review

* Unvendor logxi

* Remove null_logger.go
2018-04-02 17:46:59 -07:00
Josh Soref 73b1fde82f Spelling (#4119) 2018-03-20 14:54:10 -04:00
Brian Kassouf 34d958968d Revert stopoptions (#4088)
* Use an atomic value instead to communicate whether to give up HA lock

* Remove now-unneeded StopOptions struct

* Add a channel based mutex acquisition function to avoid a deadlock

* Make periodic leader refresh only spawn a single waiting goroutine and
extend logic to the key upgrade check
2018-03-06 21:35:58 -05:00
Jeff Mitchell 6b777dc63e
Rearrange shutdown logic to remove some lock switching (#4083)
* Rearrange shutdown logic to remove some lock switching and to allow
giving up the HA lock to be skipped.
2018-03-06 18:06:09 -05:00
Calvin Leung Huang e2fb199ce5
Non-HMAC audit values (#4033)
* Add non-hmac request keys

* Update comment

* Initial audit request keys implementation

* Add audit_non_hmac_response_keys

* Move where req.NonHMACKeys gets set

* Minor refactor

* Add params to auth tune endpoints

* Sync cache on loadCredentials

* Explicitly unset req.NonHMACKeys

* Do not error if entry is nil

* Add tests

* docs: Add params to api sections

* Refactor audit.Backend and Formatter interfaces, update audit broker methods

* Add audit_broker.go

* Fix method call params in audit backends

* Remove fields from logical.Request and logical.Response, pass keys via LogInput

* Use data.GetOk to allow unsetting existing values

* Remove debug lines

* Add test for unsetting values

* Address review feedback

* Initialize values in FormatRequest and FormatResponse using input values

* Update docs

* Use strutil.StrListContains

* Use strutil.StrListContains
2018-03-02 12:18:39 -05:00
Jeff Mitchell f1bd0cbe74
Use atomic values in seal to avoid some data races (#4040) 2018-02-23 17:18:48 -05:00
Jeff Mitchell d4a431b298
Move local cluster parameters to atomic values to fix some potential data races (#4036) 2018-02-23 14:47:07 -05:00
Jeff Mitchell 96ea0620fd
Add sealunwrapper to ease OSS downgrades (#3936) 2018-02-09 16:37:40 -05:00
Jeff Mitchell 60e2209532
Remove core restriction in cache and turn it into an active/standby restriction instead (#3849) 2018-01-25 22:21:51 -05:00
Jeff Mitchell ad9da2e0b8
Fix intermittent panic by storing a reference to the grpc server (#3842)
* Fix intermittent panic by storing a reference to the grpc server and
using that to ensure it will never be nil.

* Just get rid of c.rpcServer
2018-01-24 20:23:08 -05:00
Jeff Mitchell 8e8675053b Sync some bits over 2018-01-22 21:44:49 -05:00
Jeff Mitchell 7d6fed2e86
Use a separate var for active node replication state (#3819) 2018-01-19 19:24:04 -05:00
Jeff Mitchell 69aead14f0 A bit more context plumbing 2018-01-19 04:11:59 -05:00
Jeff Mitchell 33b68ebf3d Remove context from a few extraneous places 2018-01-19 03:44:06 -05:00
Jeff Mitchell 0f7e3bb79b Add context to performPolicyChecks 2018-01-19 02:43:39 -05:00
Brian Kassouf 2f19de0305 Add context to storage backends and wire it through a lot of places (#3817) 2018-01-19 01:44:44 -05:00
Calvin Leung Huang bffaf9b0c3 Remove vlogger, add log.Logger to PolicyStore struct (#3813) 2018-01-18 01:01:51 -05:00
Jeff Mitchell 5371b0489f Rename var from last commit 2018-01-17 23:08:35 -05:00
Jeff Mitchell 307a177540 Make heartbeatInterval a package var to allow tests to modify it 2018-01-17 23:05:11 -05:00
Jeff Mitchell 88313f0398 Update replication state logic.
Fixes #3727
2018-01-16 13:51:55 -05:00
Brian Kassouf 1c190d4bda
Pass context to backends (#3750)
* Start work on passing context to backends

* More work on passing context

* Unindent logical system

* Unindent token store

* Unindent passthrough

* Unindent cubbyhole

* Fix tests

* use requestContext in rollback and expiration managers
2018-01-08 10:31:38 -08:00
Jeff Mitchell 6681329041 Update redirect address error to be more clear 2018-01-04 10:45:40 -05:00
Jeff Mitchell 1fe494e8e1
Port IsDRSecondary over and enable returning it via sys_health (#3749) 2018-01-03 15:07:13 -05:00
Jeff Mitchell 92aaebd20e Bust recovery cache after unsealed is set true to force migration 2017-12-19 16:22:09 -05:00
Chris Hoffman f966d20225
Adding ability to cache core values, cache transaction improvements (#3640) 2017-12-06 12:25:17 -05:00
Jeff Mitchell 20320cd71e
Move location of quit channel closing in exp manager (#3638)
* Move location of quit channel closing in exp manager

If it happens after stopping timers any timers firing before all timers
are stopped will still run the revocation function. With plugin
auto-crash-recovery this could end up instantiating a plugin that could
then try to unwrap a token from a nil token store.

This also plumbs in core so that we can grab a read lock during the
operation and check standby/sealed status before running it (after
grabbing the lock).

* Use context instead of checking core values directly

* Use official Go context in a few key places
2017-12-01 17:08:38 -05:00
Calvin Leung Huang 2f9ff5c062 Acquire state lock at the start of UnsealWithRecoveryKeys (#3579)
* Acquire state lock from the start of UnsealWithRecoveryKeys, add metrics

* Change metrics variable
2017-11-13 22:05:22 -05:00
Calvin Leung Huang d9eaacf5de
Barrier unseal using recovery keys (#3541)
* Barrier unseal using recovery keys

* Remove tests
2017-11-07 15:15:39 -05:00
Vishal Nayak e4e4a7ba67
Capabilities responds considering policies on entities and groups (#3522)
* Capabilities endpoint will now return considering policies on entities and groups

* refactor the policy derivation into a separate function

* Docs: Update docs to reflect the change in capabilities endpoint
2017-11-03 11:20:10 -04:00
Vishal Nayak 7bae606662
External identity groups (#3447)
* external identity groups

* add local LDAP groups as well to group aliases

* add group aliases for okta credential backend

* Fix panic in tests

* fix build failure

* remove duplicated struct tag

* add test steps to test out removal of group member during renewals

* Add comment for having a prefix check in router

* fix tests

* s/parent_id/canonical_id

* s/parent/canonical in comments and errors
2017-11-02 16:05:48 -04:00
Jeff Mitchell 1ba4e828b4
Ensure revocation happens before seal/step-down since token store isn't (#3500)
available after when using single-use tokens.

Fixes #3497
2017-11-02 08:47:02 -05:00
Jeff Mitchell aaa1094240 Remove extra curly brace 2017-10-23 16:06:27 -04:00
Jeff Mitchell b7ffe7ec99 Sync over 2017-10-23 16:03:36 -04:00
Vishal Nayak ea1f552d48 set entity id on the token (#3468) 2017-10-18 13:23:05 -04:00
Vishal Nayak f7ed6732a5 Porting identity store (#3419)
* porting identity to OSS

* changes that glue things together

* add testing bits

* wrapped entity id

* fix mount error

* some more changes to core

* fix storagepacker tests

* fix some more tests

* fix mount tests

* fix http mount tests

* audit changes for identity

* remove upgrade structs on the oss side

* added go-memdb to vendor
2017-10-11 10:21:20 -07:00
Chris Hoffman 1029ad3b33 Rename "generic" secret backend to "kv" (#3292) 2017-09-15 09:02:29 -04:00
Chris Hoffman 9d73c81f38 Disable the `sys/raw` endpoint by default (#3329)
* disable raw endpoint by default

* adding docs

* config option raw -> raw_storage_endpoint

* docs updates

* adding listing on raw endpoint

* reworking tests for enabled raw endpoints

* root protecting base raw endpoint
2017-09-15 00:21:35 -04:00
Brian Kassouf a8d9426d9f Update locking components from DR replication changes (#3283)
* Update locking components from DR replication changes

* Fix plugin backend test

* Add a comment about needing the statelock:
2017-09-04 19:38:37 -04:00
Calvin Leung Huang a581e96b78 Lazy-load plugin mounts (#3255)
* Lazy load plugins to avoid setup-unwrap cycle

* Remove commented blocks

* Refactor NewTestCluster, use single core cluster on basic plugin tests

* Set c.pluginDirectory in TestAddTestPlugin for setupPluginCatalog to work properly

* Add special path to mock plugin

* Move ensureCoresSealed to vault/testing.go

* Use same method for EnsureCoresSealed and Cleanup

* Bump ensureCoresSealed timeout to 60s

* Correctly handle nil opts on NewTestCluster

* Add metadata flag to APIClientMeta, use meta-enabled plugin when mounting to bootstrap

* Check metadata flag directly on the plugin process

* Plumb isMetadataMode down to PluginRunner

* Add NOOP shims when running in metadata mode

* Remove unused flag from the APIMetadata object

* Remove setupSecretPlugins and setupCredentialPlugins functions

* Move when we setup rollback manager to after the plugins are initialized

* Fix tests

* Fix merge issue

* start rollback manager after the credential setup

* Add guards against running certain client and server functions while in metadata mode

* Call initialize once a plugin is loaded on the fly

* Add more tests, update basic secret/auth plugin tests to trigger lazy loading

* Skip mount if plugin removed from catalog

* Fixup

* Remove commented line on LookupPlugin

* Fail on mount operation if plugin is re-added to catalog and mount is on existing path

* Check type and special paths on startBackend

* Fix merge conflicts

* Refactor PluginRunner run methods to use runCommon, fix TestSystemBackend_Plugin_auth
2017-09-01 01:02:03 -04:00