2410 Commits

Author SHA1 Message Date
Cory Bolar 6c726bde67
Only fail auth plugin backend mounting for custom plugins (#11690)
Checking if a plugin is a builtin by comparing its type to those in the builtin
registry allows for a custom plugin loaded with the same name to be considered a
builtin during error handling of the mounting process.  This can cause the vault
installation to brick itself because it cannot be unsealed to register a new
sha256 or file path for a previously loaded custom plugin.  Improve this logic
by checking the plugin catalog rather than the builtin registry.

Fixes #11687
2021-05-27 13:37:45 -05:00
Scott Miller 46da8ebae6
Diagnose Disk usage checks (#11673)
* Disk usage checks

* Move disk free earlier

* Move logic to helpers

* Bring over test logic from the ulimit PR

* imports

* Report error

* Get unit tests working
2021-05-27 10:17:52 -07:00
Hridoy Roy b6e1d62d33
Modularize Run Command (#11573)
* initial refactoring of unseal step in run

* remove waitgroup

* remove waitgroup

* backup work

* backup

* backup

* completely modularize run and move into diagnose

* add diagnose errors for incorrect number of unseal keys

* comment tests back in

* backup

* first subspan

* finished subspanning but running into error with timeouts

* remove runtime checks

* meeting updates

* remove telemetry block

* roy comment

* subspans for seal finalization and wrapping diagnose latency checks

* fix storage latency test errors

* review comments

* use random uuid for latency checks instead of static id
2021-05-25 15:23:20 -07:00
Scott Miller e305f78e9c
Add a Success helper to set successful spans' messages (#11621) 2021-05-21 19:21:23 -07:00
Scott Miller 941d01eee3
Initial Diagnose CLI output (#11583)
* Create helpers which integrate with OpenTelemetry for diagnose collection

* Go mod vendor

* Comments

* Update vault/diagnose/helpers.go

Co-authored-by: swayne275 <swayne275@gmail.com>

* Add unit test/example

* tweak output

* More comments

* add spot check concept

* Get unit tests working on Result structs

* wip

* Fix unit test

* Get unit tests working, and make diagnose sessions local rather than global

* Comments

* Last comments

* No need for init

* :|

* Fix helpers_test

* wip

* wip

* wip

* Revendor otel

* Fix merge related problems

* imports

* Fix unit tests

Co-authored-by: swayne275 <swayne275@gmail.com>
2021-05-21 19:21:11 -07:00
Nick Cabatoff 2adef1f878
OSS parts of #1891 (sys/health license addition) (#11676) 2021-05-20 13:32:15 -04:00
Nick Cabatoff 01f96f18ce
VAULT-2439: OSS parts of #1889 (raft licensing init) (#11665) 2021-05-19 16:07:58 -04:00
Brian Kassouf eabfda0c79
Make sure we are checking the final peerset (#11664) 2021-05-19 12:01:52 -07:00
Brian Kassouf f498d0d389
Reload raft TLS keys on active startup (#11660) 2021-05-19 10:03:32 -07:00
Scott Miller 6b8d7fe2e6
Patch expiration fix over from ENT (#11650)
* Patch expiration fix over from ENT

* Rename changelog
2021-05-18 14:55:38 -07:00
Nick Cabatoff e212ec5d8e
OSS parts of ent PR #1857: license autoloading init changes. (#11623) 2021-05-17 14:10:26 -04:00
Scott Miller 7b05fcf9bf
Don't backoff if a listener error was a timeout (#11594) 2021-05-13 17:02:25 -05:00
Jason O'Donnell 502cf3b212
core: updates to password policy generator (#11596)
* core: fix bug in password policies not using namespaces

* Add changelog
2021-05-13 09:55:46 -04:00
Josh Black 641a81b74b
Provide a new API endpoint for retrieving signed licenses (#11543) 2021-05-12 12:19:25 -07:00
Scott Miller 9dbf1a7dba
Add infrastructure and helpers for skipping diagnose checks (#11593)
* Add infrastructure for skipping tests

* Add infrastructure for skipping tests

* Set it

* Update vault/diagnose/helpers.go

Co-authored-by: swayne275 <swayne275@gmail.com>

* Implement type alias for test functions

Co-authored-by: swayne275 <swayne275@gmail.com>
2021-05-12 12:54:40 -05:00
Scott Miller f0c3192f20
Add an exponential backoff to TCP listeners to avoid fast loops in error scenarios (#11588)
* Add an exponential backoff to TCP listeners to avoid fast loops in error scenarios

* reset loop delay

* changelog
2021-05-12 10:47:38 -05:00
swayne275 0229787a26
Vault 2303: Count irrevocable leases in quotas/metrics and other improvements (#11542)
* shrink generic error message

* move zombie loading to updatePendingInternal from loadEntryInternal

* probably the right metric/lease behavior for irrevocable leases...

* comment improvements

* test total lease count with valid and irrevocable leases
2021-05-11 14:04:06 -06:00
Lars Lehtonen 53dd619d2f
vault: deprecate errwrap.Wrapf() (#11577) 2021-05-11 13:12:54 -04:00
swayne275 f467d86805
don't panic if no data, improve log (#11581) 2021-05-11 10:41:57 -06:00
Scott Miller 40e0806f29
Add timeout helper (#11554)
* Add timeout helper

* cleanup-imports

* didn't need mainsection

Co-authored-by: HridoyRoy <roy@hashicorp.com>
2021-05-10 17:55:43 -07:00
swayne275 ee53fc5afb
Fix activity log fragment race test (#11565)
* grab reference to done channel while locked

* add clarifying comment

* lock and grab channel reference again for test races
2021-05-07 17:02:11 -06:00
Nick Cabatoff 53c7d1de7d
config for autoloading license (oss parts) 2021-05-07 08:55:41 -04:00
Nick Cabatoff 0926e302c5
Fix some panics resulting from calling ExpirationManager.Stop directly. (#11547) 2021-05-06 12:39:01 -04:00
Nick Cabatoff 663ad150a7
Make TestActivityLog_MultipleFragmentsAndSegments timeout on its own (#11490)
* The main driver for this change was to make the read from a.newFragmentCh timeout quickly rather than waiting for the test timeout (much longer).  While testing the change I observed a panic during shutdown, but it was swallowed and moreover there was no stack trace so it wasn't obvious.  I'm hoping we can get rid of the recover, so I fixed the issue in the activitylog tests that needed it.
2021-05-06 10:19:53 -04:00
Scott Miller 4fc6e8b366
Fix barrier key autorotation config edge cases (#11541)
* Add an Int64 type

* Use the new Int64 type so that even 32 bit builds can specify max_operations above 2^31

* Missed a spot

* go mod vendor

* fix cast

* changelog

* Update unit test to ensure this works on both 32 and 64-bit archs
2021-05-05 14:39:04 -05:00
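
Note on #11541: the 32-bit problem described above comes from Go's int being only 32 bits wide on 32-bit architectures, so a value above 2^31 - 1 cannot be parsed into it. The sketch below shows the general remedy of parsing into an explicit int64. The function names are hypothetical, and parseAs32Bit exists only to reproduce the failure on a 64-bit machine; neither is taken from the commit.

```go
package main

import (
	"fmt"
	"strconv"
)

// parseMaxOperations parses a decimal string into an int64, independent of
// the platform's native int width. The name is hypothetical.
func parseMaxOperations(s string) (int64, error) {
	return strconv.ParseInt(s, 10, 64)
}

// parseAs32Bit mimics parsing into a 32-bit integer so the range error can
// be observed on any architecture.
func parseAs32Bit(s string) (int64, error) {
	return strconv.ParseInt(s, 10, 32)
}

func main() {
	const input = "3865470566" // larger than 2^31 - 1

	if _, err := parseAs32Bit(input); err != nil {
		fmt.Println("32-bit parse failed:", err)
	}

	v, err := parseMaxOperations(input)
	fmt.Println(v, err)
}
```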
swayne275 56c57ffddd
change zombie terminology to irrevocable (#11525) 2021-05-04 15:48:02 -06:00
swayne275 335e4c3711
Introduce Logical Unrecoverable Error, Use it in Expiration Manager (#11477)
* build out zombie lease system

* add typo for CI

* undo test CI commit

* time equality test isn't working on CI, so let's see what this does...

* add unrecoverable proto error, make proto, go mod vendor

* zombify leases if unrecoverable error, tests

* test fix: somehow pointer in pointer rx is null after pointer rx called

* tweaks based on roy feedback

* improve zombie errors

* update which errors are unrecoverable

* combine zombie logic

* keep subset of zombie lease in memory
2021-05-03 17:56:06 -06:00
Vishal Nayak 1e61f799ca
Use correct mount accessor when refreshing external group memberships (#11506)
* Use correct mount accessor when refreshing external group memberships

* Add CL

* Handle the renew case properly
2021-05-03 08:23:59 -04:00
Hridoy Roy 28aa9b9175
Diagnoses Consul Direct Access Check (#11505)
* Create helpers which integrate with OpenTelemetry for diagnose collection

* Go mod vendor

* consul tls checks

* draft for storage end to end check

* Comments

* Update vault/diagnose/helpers.go

Co-authored-by: swayne275 <swayne275@gmail.com>

* Add unit test/example

* tweak output

* More comments

* add spot check concept

* Get unit tests working on Result structs

* Fix unit test

* Get unit tests working, and make diagnose sessions local rather than global

* Comments

* Last comments

* No need for init

* :|

* Fix helpers_test

* cleaned up chan logic. Tests next.

* fix tests

* remove a comment

* tests

* remove a comment

* run direct access checks in diagnose command

* review comments

Co-authored-by: Scott G. Miller <smiller@hashicorp.com>
Co-authored-by: swayne275 <swayne275@gmail.com>
2021-05-02 16:21:06 -07:00
Hridoy Roy 5b63066c2c
Diagnose V0: Storage End to End Checks (#11468)
* Create helpers which integrate with OpenTelemetry for diagnose collection

* Go mod vendor

* consul tls checks

* draft for storage end to end check

* Comments

* Update vault/diagnose/helpers.go

Co-authored-by: swayne275 <swayne275@gmail.com>

* Add unit test/example

* tweak output

* More comments

* add spot check concept

* Get unit tests working on Result structs

* Fix unit test

* Get unit tests working, and make diagnose sessions local rather than global

* Comments

* Last comments

* No need for init

* :|

* Fix helpers_test

* cleaned up chan logic. Tests next.

* fix tests

* remove a comment

* tests

* remove a comment

* cosmetic changes

Co-authored-by: Scott G. Miller <smiller@hashicorp.com>
Co-authored-by: swayne275 <swayne275@gmail.com>
2021-05-02 13:33:13 -07:00
Scott Miller fcb9bab51b
Fix rotate/config unit test for 32 bit yet again (#11491) 2021-04-30 10:31:11 -05:00
Scott Miller 85fbd45e1c
Create helpers which integrate with OpenTelemetry for diagnose collection (#11454)
* Create helpers which integrate with OpenTelemetry for diagnose collection

* Go mod vendor

* Comments

* Update vault/diagnose/helpers.go

Co-authored-by: swayne275 <swayne275@gmail.com>

* Add unit test/example

* tweak output

* More comments

* add spot check concept

* Get unit tests working on Result structs

* Fix unit test

* Get unit tests working, and make diagnose sessions local rather than global

* Comments

* Last comments

* No need for init

* :|

* Fix helpers_test

Co-authored-by: swayne275 <swayne275@gmail.com>
2021-04-29 13:32:41 -05:00
swayne275 8db70c9e58
Expiration Manager: Handle Presumed Irrevocable Leases Separately (#11452)
* build out zombie lease system

* add typo for CI

* undo test CI commit

* time equality test isn't working on CI, so let's see what this does...

* protect against nil pointer receiver calls
2021-04-29 09:12:02 -06:00
Josh Black ec105f288f
Switch to shared raft-boltdb library and add metrics (#11269) 2021-04-26 16:01:26 -07:00
Nick Cabatoff e3a88c59e9
Add metrics for perf standby and replication node type. (#11472) 2021-04-26 13:54:19 -04:00
Nick Cabatoff 5a4151873c
Cast constant to int64 so it doesn't overflow. (#11466) 2021-04-26 10:04:39 -04:00
Hridoy Roy 22cab6185d
[VAULT-1441] Fix race that allowed remounting on path used by another mount (#11453)
* remount concurrent test fix

* changelog

* Update changelog/11453.txt

Co-authored-by: Nick Cabatoff <ncabatoff@hashicorp.com>

Co-authored-by: Nick Cabatoff <ncabatoff@hashicorp.com>
2021-04-24 09:05:41 -07:00
Angel Garbarino 2e35e9578c
UI/obscure secret on input (#11284)
* new font and add as font-family to be used in masked-input

* clean up logic

* refactor for displayOnly

* start cert masking

* work on certificates

* upload cert work

* fix global styling

* fix styling for class no longer used

* make mask by default and remove option

* glimmerize start and certificate on LDAP a file field

* glimmerize actions

* first part of glimmerizing text-file still need to do some clean up

* not doing awesome over here

* getting ready to un-glimmer

* unglimmerize

* remove placeholder based on conversations with design

* clean up text-file

* cleanup

* fix class bindings

* handle class binding

* set up for test

* fix elementId

* track down index

* update masked-input test

* add more to the masked-input test

* test-file test

* fix broken test

* clear old style

* clean up

* remove pgp key masked font, this really needs to be refactored to text-file component

* changelog

* cover other certificate view

* add allowCopy

* address some pr styling comments

* improve test coverage

* fix some issues

* add attr.options.masked
2021-04-22 08:58:37 -06:00
Josh Black 06809930a3
Add HTTP response headers for hostname and raft node ID (if applicable) (#11289) 2021-04-20 15:25:04 -07:00
Nick Cabatoff b7ed30afed
Poll until token disappears from storage. (#11424) 2021-04-20 14:55:10 -04:00
Nick Cabatoff 75ec35421d
Refactor TestTokenStore_CubbyholeTidy so that it can be tested on ent with a non-root namespace. (#11422) 2021-04-20 14:49:25 -04:00
Brian Kassouf c400dfdd3d
core/token: fix panic looking up invalid batch tokens (#11415) 2021-04-19 18:26:51 -07:00
Nick Cabatoff 474c4e8134
Make cubbyhole revocation/tidying compatible with cubbys in namespaces. (#11408) 2021-04-19 17:28:04 -04:00
Nick Cabatoff a8023e0fdb
Add support for unauthenticated pprof access on a per-listener basis,… (#11324)
* Add support for unauthenticated pprof access on a per-listener basis, as we do for metrics.

* Add missing pprof sub-targets like 'allocs' and 'block'.  Capture the goroutine subtarget a second time in text form.  This is mostly a convenience, but also I think the pprof format might be a bit lossy?
2021-04-19 14:30:59 -04:00
Nick Cabatoff 541ae8636c
On lease deletion, also delete non-orphan batch token parent index (#11377) 2021-04-16 17:03:22 -04:00
Nick Cabatoff b07a10331f
Add metrics for requests forwarded by standbys. (#11366) 2021-04-16 14:02:20 -04:00
Nick Cabatoff 242d258e94
Fix goroutine leak caused by updating rate quotas (#11371)
Make sure that when we modify a rate quota, we stop the existing goroutine before starting the new one.
2021-04-16 14:00:01 -04:00
Nick Cabatoff 33dd025278
When a standby does a ForwardRequest, it's not using the request context, and thus not getting timed out properly when it takes too long. (#11322)
The rpcClientConnContext is still used to terminate gRPC internal/dialer-related goroutines, but the actual RPC is now timed out when the request times out, e.g. due to the default max request duration.  This mirrors what we do with the parallel forwarding code in ENT.
2021-04-15 10:23:26 -04:00
Hridoy Roy 996f114ad7
TLS Diagnose Formatting Fixes (#11342)
* diagnose formatting fixes

* diagnose formatting fixes
2021-04-12 10:55:33 -07:00
Hridoy Roy fde9f2f71d
Add More TLS Tests and Verification of TLS Root Certificate (#11300)
* tls tests and root verification

* make the certificate verification check correct for non root CA case

* add expiry test

* addressed comments but struggling with the bug in parsing CAs and intermediates from a single file

* final checks on tls and listener

* cleanup
2021-04-12 08:39:40 -07:00
Vishal Nayak 4666f40925
Support autopilot when raft is for HA only (#11260) 2021-04-12 09:33:21 -04:00
Brian Kassouf 303c2aee7c
Run a more strict formatter over the code (#11312)
* Update tooling

* Run gofumpt

* go mod vendor
2021-04-08 09:43:39 -07:00
Brian Kassouf de0253056c
Fix a few static analysis findings (#11307) 2021-04-07 16:48:40 -07:00
Nick Cabatoff c2673ee86a
Move SanitizedConfig back to a shared-ent file. (#11291) 2021-04-07 10:25:05 -04:00
Hridoy Roy 049f2513e6
Initial Diagnose Command for TLS and Listener [VAULT-1896, VAULT-1899] (#11249)
* sanity checks for tls config in diagnose

* backup

* backup

* backup

* added necessary tests

* remove comment

* remove parallels causing test flakiness

* comments

* small fix

* separate out config hcl test case into new hcl file

* newline

* addressed comments

* addressed comments

* addressed comments

* addressed comments

* addressed comments

* reload funcs should be allowed to be nil
2021-04-06 16:40:43 -07:00
Nick Cabatoff 44c00cd54f
Fix: leader_tls_servername raft option only worked when used with mTLS and/or an explicit CA cert. (#11252) 2021-04-06 09:16:54 -04:00
Nick Cabatoff df7404e67e
Fix deadlock in inmemlayer (#11225)
Also tweak autopilot test timings to adapt to things running faster.
2021-03-31 13:40:26 -04:00
swayne275 532093d065
fix end date calculation (#11232) 2021-03-30 09:58:45 -06:00
Scott Miller c44f18e3e4
Use pointers to 64-bit atomics to enforce alignment in AES-GCM Barrier (#11211)
* Align atomics in barrier

* Use the pointer solution
2021-03-26 14:51:56 -05:00
Nick Cabatoff 41d9030fbb
Disable autopilot in raft-ha mode. (#11181)
* Disable autopilot in raft-ha mode.

* Also don't run autopilot on DR secondaries.
2021-03-23 14:13:44 -07:00
Brian Kassouf d01a068929
Remove retry from new raft test (#11158) 2021-03-19 12:41:57 -07:00
Nick Cabatoff b3af58d758
Expose snapshot_interval tunable instead of setting it in prod code for the sake of a test. (#11160) 2021-03-19 15:41:42 -04:00
Brian Kassouf 28aba513f2
storage/raft: Ensure peers are informed of their correct suffrage when added with AutoPilot (#11155)
* storage/raft: Ensure peers are informed of their correct suffrage when added with AutoPilot

* Add test ensuring peer sets are equivalent
2021-03-19 11:53:50 -07:00
Scott Miller 535bcf289e
Fix handling of minimum operations, and forward rotate/config requests to Primary (#11116)
* Boost max_operations to the greater of that specified or absoluteMinOperations

* Forward rotation config requests to the primary

* Reject rotation configs outside the min/max range

* Minor wording fix
2021-03-18 15:08:47 -05:00
Nick Cabatoff 411495514c
Add a test for server stabilization (#11128) 2021-03-17 17:23:13 -04:00
Vishal Nayak 9839e76192
Remove unneeded fields from state output (#11073) 2021-03-10 12:08:12 -05:00
Brian Kassouf aa00b53ba1
Make sure we sanitize the rotation config on each clone (#11050)
* Make sure we sanitize the rotation config on each clone

* Add regression test for missing rotation config

* use Equals

* simplify

Co-authored-by: Scott G. Miller <smiller@hashicorp.com>
2021-03-08 10:59:21 -06:00
Vishal Nayak 3e55e79a3f
Autopilot: Server Stabilization, State and Dead Server Cleanup (#10856)
* k8s doc: update for 0.9.1 and 0.8.0 releases (#10825)

* k8s doc: update for 0.9.1 and 0.8.0 releases

* Update website/content/docs/platform/k8s/helm/configuration.mdx

Co-authored-by: Theron Voran <tvoran@users.noreply.github.com>

Co-authored-by: Theron Voran <tvoran@users.noreply.github.com>

* Autopilot initial commit

* Move autopilot related backend implementations to its own file

* Abstract promoter creation

* Add nil check for health

* Add server state oss no-ops

* Config ext stub for oss

* Make way for non-voters

* s/health/state

* s/ReadReplica/NonVoter

* Add synopsis and description

* Remove struct tags from AutopilotConfig

* Use var for config storage path

* Handle nil-config when reading

* Enable testing autopilot by using inmem cluster

* First passing test

* Only report the server as known if it is present in raft config

* Autopilot defaults to on for all existing and new clusters

* Add locking to some functions

* Persist initial config

* Clarify the command usage doc

* Add health metric for each node

* Fix audit logging issue

* Don't set DisablePerformanceStandby to true in test

* Use node id label for health metric

* Log updates to autopilot config

* Less aggressively consume config loading failures

* Return a mutable config

* Return early from known servers if raft config is unable to be pulled

* Update metrics name

* Reduce log level for potentially noisy log

* Add knob to disable autopilot

* Don't persist if default config is in use

* Autopilot: Dead server cleanup (#10857)

* Dead server cleanup

* Initialize channel in any case

* Fix a bunch of tests

* Fix panic

* Add follower locking in heartbeat tracker

* Add LastContactFailureThreshold to config

* Add log when marking node as dead

* Update follower state locking in heartbeat tracker

* Avoid follower states being nil

* Pull test to its own file

* Add execution status to state response

* Optionally enable autopilot in some tests

* Updates

* Added API function to fetch autopilot configuration

* Add test for default autopilot configuration

* Configuration tests

* Add State API test

* Update test

* Added TestClusterOptions.PhysicalFactoryConfig

* Update locking

* Adjust locking in heartbeat tracker

* s/last_contact_failure_threshold/left_server_last_contact_threshold

* Add disabling autopilot as a core config option

* Disable autopilot in some tests

* s/left_server_last_contact_threshold/dead_server_last_contact_threshold

* Set the lastheartbeat of followers to now when setting up active node

* Don't use config defaults from CLI command

* Remove config file support

* Remove HCL test as well

* Persist only supplied config; merge supplied config with default to operate

* Use pointer to structs for storing follower information

* Test update

* Retrieve non voter status from configbucket and set it up when a node comes up

* Manage desired suffrage

* Consider bucket being created already

* Move desired suffrage to its own entry

* s/DesiredSuffrageKey/LocalNodeConfigKey

* s/witnessSuffrage/recordSuffrage

* Fix test compilation

* Handle local node config post a snapshot install

* Commit to storage first; then record suffrage in fsm

* No need of local node config being nili case, post snapshot restore

* Reconcile autopilot config when a new leader takes over duty

* Grab fsm lock when recording suffrage

* s/Suffrage/DesiredSuffrage in FollowerState

* Instantiate autopilot only in leader

* Default to old ways in more scenarios

* Make API gracefully handle 404

* Address some feedback

* Make IsDead an atomic.Value

* Simplify follower heartbeat tracking

* Use uber.atomic

* Don't have multiple causes for having autopilot disabled

* Don't remove node from follower states if we fail to remove the dead server

* Autopilot server removals map (#11019)

* Don't remove node from follower states if we fail to remove the dead server

* Use map to track dead server removals

* Use lock and map

* Use delegate lock

* Adjust when to remove entry from map

* Only hold the lock while accessing map

* Fix race

* Don't set default min_quorum

* Fix test

* Ensure follower states is not nil before starting autopilot

* Fix race

Co-authored-by: Jason O'Donnell <2160810+jasonodonnell@users.noreply.github.com>
Co-authored-by: Theron Voran <tvoran@users.noreply.github.com>
2021-03-03 13:59:50 -05:00
swayne275 d74f82346b
Add Partial Month Client Count API for Activity Log (#11022)
* sketch out partial month activity log client API

* unit test partialMonthClientCount

* cleanup api

* add api doc, fix test, update api nomenclature to match existing

* cleanup

* add PR changelog file

* integration test for API

* report entities and tokens separately
2021-03-01 16:15:59 -07:00
Scott Miller 08d8f65e01
Take the state lock in checkBarrierRotate, and don't save on seal (#11028)
* Use the state lock, and don't bother a last minute check on seal

* defer
2021-03-01 16:32:17 -06:00
Brian Kassouf cbb8b21520 Fix test build 2021-03-01 12:29:12 -08:00
Brian Kassouf 1bc410783d OSS/ENT Drift 2021-03-01 10:51:04 -08:00
Brian Kassouf a112161f60
expiration: Add a few metrics to measure revoke queue lengths (#10955)
* expiration: Add a few metrics to measure revoke queue lengths

* Update the metric names

* Add appropriate cluster labels

* Add metrics to docs

* Update jobmanager.go
2021-02-26 16:00:39 -08:00
Scott Miller a7b372b447
Two minor changes not reflected OSS side (#11020) 2021-02-26 14:23:56 -06:00
Scott Miller b13b27f37e
OSS side barrier encryption tracking and automatic rotation (#11007)
* Automatic barrier key rotation, OSS portion

* Fix build issues

* Vendored version

* Add missing encs field, not sure where this got lost.
2021-02-25 14:27:25 -06:00
Nick Cabatoff c1ddfbb538
OSS parts of the new client controlled consistency feature (#10974) 2021-02-24 06:58:10 -05:00
swayne275 38a647c6e5
remove noisy log, simplify job interface (#10975) 2021-02-22 15:00:24 -07:00
Brian Kassouf 34a7fc0286
replication: Don't write request counters on DR Secondary nodes (#10936) 2021-02-22 09:04:41 -08:00
Brian Kassouf 0ad63e5a20
core/expiration: Add backoff jitter to the expiration retries (#10937) 2021-02-18 20:20:01 -08:00
Hridoy Roy 4a96126d5a
Revert "Vault Dependency Upgrades [VAULT-871] (#10903)" (#10939)
This reverts commit eb74ca61fc4dcb7038f39defb127d5d639ba0ca1.
2021-02-18 15:40:18 -05:00
Hridoy Roy a26d1300e8
Vault Dependency Upgrades [VAULT-871] (#10903)
* upgrade vault dependency set

* etcd and grpc issues:

* better for tests

* testing

* all upgrades for hashicorp deps

* kubernetes plugin upgrade seems to work

* kubernetes plugin upgrade seems to work

* etcd and a bunch of other stuff

* all vulnerable packages upgraded

* k8s is broken in linux env but not locally

* test fixes

* fix testing

* fix etcd and grpc

* fix etcd and grpc

* use master branch of go-testing-interface

* roll back etcd upgrade

* have to fix grpc since other vendors pull in grpc 1.35.0 but we can't due to etcd

* rolling back in the replace directives

* a few more testing dependencies to clean up

* fix go mod vendor
2021-02-18 12:31:57 -08:00
swayne275 e4119a6a8a
Vault-1403 Switch Expiration Manager to use Fairsharing Backpressure (#1709) (#10932)
* basic pool and start testing

* refactor a bit for testing

* workFunc, start/stop safety, testing

* cleanup function for worker quit, more tests

* redo public/private members

* improve tests, export types, switch uuid package

* fix loop capture bug, cleanup

* cleanup tests

* update worker pool file name, other improvements

* add job manager prototype

* remove remnants

* add functions to wait for job manager and worker pool to stop, other fixes

* test job manager functionality, fix bugs

* encapsulate how jobs are distributed to workers

* make worker job channel read only

* add job interface, more testing, fixes

* set name for dispatcher

* fix test races

* wire up expiration manager most of the way

* dispatcher and job manager constructors don't return errors

* logger now dependency injected

* make some members private, test fcn to get worker pool size

* make GetNumWorkers public

* Update helper/fairshare/jobmanager_test.go

Co-authored-by: Brian Kassouf <briankassouf@users.noreply.github.com>

* update fairsharing usage, add tests

* make workerpool private

* remove custom worker names

* concurrency improvements

* remove worker pool cleanup function

* remove cleanup func from job manager, remove non blocking stop from fairshare

* update job manager for new constructor

* stop job manager when expiration manager stopped

* unset env var after test

* stop fairshare when started in tests

* stop leaking job manager goroutine

* prototype channel for waking up to assign work

* fix typo/bug and add tests

* improve job manager wake up, fix test typo

* put channel drain back

* better start/pause test for job manager

* comment cleanup

* degrade possible noisy log

* remove closure, clean up context

* improve revocation context timer

* test: reduce number of revocation workers during many tests

* Update vault/expiration.go

Co-authored-by: Brian Kassouf <briankassouf@users.noreply.github.com>

* feedback tweaks

Co-authored-by: Brian Kassouf <briankassouf@users.noreply.github.com>

Co-authored-by: Brian Kassouf <briankassouf@users.noreply.github.com>
2021-02-17 14:30:27 -08:00
swayne275 6e1b183f79
Shutdown Test Cores when Tests Complete (#10912)
* Shutdown Test Cores when Tests Complete

* go mod vendor
2021-02-12 13:04:48 -07:00
Jim Kalafut 42bae71806
Improve error messages (#10843)
- Fix: "bytes" should be less than %!s(int=131072) message
- Also add a missing openapi type that was throwing warnings
2021-02-11 19:51:12 -08:00
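
Note on #10843: text of the form %!s(int=131072) is what Go's fmt package prints when an integer is formatted with the %s verb. The sketch below reproduces that symptom and the usual fix of using %d. The function names and the exact message wording are assumptions; the original call site is not shown in this log.

```go
package main

import "fmt"

// brokenLimitMessage reproduces the symptom: the limit is formatted with %s.
// The parameter is an interface{} so the mismatch only shows up at run time.
func brokenLimitMessage(limit interface{}) string {
	return fmt.Sprintf("%q should be less than %s", "bytes", limit)
}

// limitMessage uses %d, the verb intended for integers.
func limitMessage(limit int) string {
	return fmt.Sprintf("%q should be less than %d", "bytes", limit)
}

func main() {
	fmt.Println(brokenLimitMessage(131072)) // "bytes" should be less than %!s(int=131072)
	fmt.Println(limitMessage(131072))       // "bytes" should be less than 131072
}
```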
Michael Golowka 108d4c6a68
MySQL - Add username customization (#10834) 2021-02-11 14:08:32 -07:00
Vishal Nayak 53cb1deb38
Revert "Read-replica instead of non-voter (#10875)" (#10890)
This reverts commit fc745670cf34821f5834357d9caebc3351dbc1e7.
2021-02-10 16:41:58 -05:00
Mark Gritter 85c1ae1002
Fix error in log; add additional log on conflicting quotas. (#10888) 2021-02-10 12:24:35 -06:00
Ian Ferguson 865df63c76
Correct lock acquisition order in the `pathEntityMergeID` identity to fix deadlock condition (#10877) 2021-02-10 11:05:16 -05:00
Vishal Nayak a2394e7353
Read-replica instead of non-voter (#10875) 2021-02-10 09:58:18 -05:00
Mark Gritter c5fd996a36
Fix flaky ActivityLog unit test (#10860)
* Wait for initial retention run to finish before adding segments.
2021-02-09 16:34:49 -06:00
Vishal Nayak 8613ba88a6
Fix quota enforcing old path issue (#10689)
* Fix db indexing issue

* Add CL update
2021-02-09 05:46:09 -05:00
Mark Gritter d0994340fb
Fill in missing lease ID deterministically. Generate a UUID on creation. (#10855) 2021-02-08 13:46:59 -06:00
Nick Cabatoff 75c955b3c3
Apply OSS part of ENT change re waitForReplicationState. (#10837) 2021-02-04 09:10:35 -05:00
Mark Gritter 3ec15c4927
Fix use of identity/group endpoint to edit group by name (#10812)
* Updates identity/group to allow updating a group by name (#10223)
* Now that lookup by name is outside handleGroupUpdateCommon, do not
use the second name lookup as the object to update.
* Added changelog.

Co-authored-by: dr-db <25711615+dr-db@users.noreply.github.com>
2021-01-29 16:50:08 -06:00
Mark Gritter ce858de180
Fix for test failing on January 29th: advance months using timeutil, not AddDate. (#10808) 2021-01-29 11:48:22 -06:00
Hridoy Roy 537189cab8
make token create case insensitive [VAULT-1021] (#10743)
* make token create case insensitive

* changelog

* comment update
2021-01-27 09:56:54 -08:00
Aleksandr Bezobchuk 2ec8f9a222
metrics: activity log (#10514)
* core: add vault.identity.entity.active.monthly log
* Fixed end-of-month metrics and unit test.
* Added metric covering month-to-date (not broken down by namespace.)
* Updated documentation
* Added changelog.

Co-authored-by: mgritter <mgritter@hashicorp.com>
2021-01-26 16:37:07 -06:00
Vishal Nayak fcbbc5f7d8
Remove peer DR op token check only on secondaries (#10765) 2021-01-25 17:35:58 -05:00
Vishal Nayak 904bacd55e
Fix remove peers check (#10758) 2021-01-25 14:20:46 -05:00
Vishal Nayak c74c381fb1
Move the declaration to a OSS build tag file to not have it collide w… (#10750)
* Move the declaration to a OSS build tag file to not have it collide with ent declarations

* Add comment

* Remove comment to trigger ci
2021-01-25 09:35:19 -05:00
Vishal Nayak 8ebf0ae794
Fix build (#10749) 2021-01-22 16:40:22 -05:00
Vishal Nayak 5d270db1df
Add list peers to DR secondaries (#10746) 2021-01-22 11:50:59 -05:00
Mark Gritter fd55aa8378
Implement sys/seal-status and sys/leader in system backend (#10725)
* Implement sys/seal-status and sys/leader as normal API calls
(so that they can be used in namespaces.)
* Added changelog.
2021-01-20 14:04:24 -06:00
Nick Cabatoff 8cbc63d572
Add configuration to specify a TLS ServerName to use in the TLS handshake when performing a raft join. (#10698) 2021-01-19 17:54:28 -05:00
Nick Cabatoff c2bdeb9e7d
Minimal change to ensure that the bulky leaseEntry isn't kept in memory. (#10726) 2021-01-19 17:51:41 -05:00
Hridoy Roy 0becd555cf
Protect part of emitMetrics from panic behavior during post-seal (#10708)
* vault/core_metrics.go

* changelog

* comments
2021-01-19 14:06:50 -08:00
Scott Miller 77d27cb968
Add NIST guidance on rotating keys used for AES-GCM encryption (#10612)
* Add NIST guidance on rotating keys used for AES-GCM encryption

* Capture more places barrier encryption is used

* spacing issue

* Probabilistically track an estimated encryption count by key term

* Un-reorder imports

* wip

* get rid of sampling
2021-01-07 15:37:37 -06:00
Scott Miller c3e0d06216
Make the error response to the sys/internal/ui/mounts with no client token consistent (#10650)
* Make the error response to the sys/internal/ui/mounts with no client token consistent

* changelog

* Don't test against an empty mount path

* One other spot

* Instead, do all token checks first and early out before even looking for the mount
2021-01-07 11:46:08 -06:00
Lauren Voswinkel 7189a67a33
Adding snowflake as a bundled database secrets plugin (#10603)
* Adding snowflake as a bundled database secrets plugin

* Add snowflake-database-plugin to expected bundled plugins

* Add snowflake plugin name to the mockBuiltinRegistry
2021-01-07 09:30:24 -08:00
Mark Gritter d076d95d37
Feature flags API (#10613)
* Added sys/internal/ui/feature-flags endpoint.
* Added documentation for new API endpoint.
* Added integration test.
Co-authored-by: swayne275 <swayne@hashicorp.com>
2021-01-06 16:05:00 -06:00
Nick Cabatoff e856174d15
Fix test for expiring root tokens creating non-expiring root tokens (#10632)
Test was failing (once we specified the expected error to check) because when we create a token via the TokenStore, without registering the lease in the expiration manager, lookupInternal will see that there is an expiring token with no lease and delete it immediately, yielding the "no parent found" error.
2021-01-04 09:48:22 -05:00
swayne275 a961bdc318
Fix setting Activity Log enable flag through the API (#10594)
* fix setting enable, update tests

* improve wording

* fix typo - left the testing enabled set in originally

* improve warning handling

* move from nested if to switch - TIL
2020-12-18 11:20:32 -07:00
Mark Gritter 8c67bed7ae
Send a test message before committing a new audit device. (#10520)
* Send a test message before committing a new audit device.
Also, lower timeout on connection attempts in socket device.
* added changelog
* go mod vendor (picked up some unrelated changes.)
* Skip audit device check in integration test.
Co-authored-by: swayne275 <swayne@hashicorp.com>
2020-12-16 16:00:32 -06:00
Aleksandr Bezobchuk ae6267cc9b
core: add warning when disabling activity (#10485) 2020-12-15 14:11:28 -05:00
Michel Vocks 191aa65bc3
Fix UI custom header values (#10511)
* Fix UI custom header values

* Fix changelog entry

* Introduce param for multi values

* Fix multivalue

* multivalue should be bool

* Sort imports

* Fix conflict

* Remove changelog entry

* Revert entry delete
2020-12-15 15:58:03 +01:00
swayne275 cdf933adf1
say how many leases there are when threshold exceeded (#10567) 2020-12-14 16:00:19 -07:00
Aleksandr Bezobchuk 3bce568535
rate limit: fix initialize defaults (#10536) 2020-12-14 14:55:52 -05:00
Brian Kassouf 275ca323e8
core: Record the time a node became active (#10489)
* core: Record the time a node became active

* Update vault/core.go

Co-authored-by: Nick Cabatoff <ncabatoff@hashicorp.com>

* Add omitempty field

* Update vendor

* Added CL entry and fixed test

* Fix test

* Fix command package tests

Co-authored-by: Nick Cabatoff <ncabatoff@hashicorp.com>
2020-12-11 16:50:19 -08:00
Calvin Leung Huang f137c945d7
ha: update godoc on grabLockOrStop (#10547) 2020-12-11 16:04:00 -08:00
Seth Bunce a6a437a1ce
fix deadlock on core state lock (#10456)
* fix race that can cause deadlock on core state lock

The bug is in the grabLockOrStop function. For specific concurrent
executions the grabLockOrStop function can return stopped=true when
the lock is still held. A comment in grabLockOrStop indicates that the
function is only used when the stateLock is held, but grabLockOrStop is
being used to acquire the stateLock. If there are concurrent goroutines
using grabLockOrStop then some concurrent executions result in
stopped=true being returned when the lock is acquired.

The fix is to add a lock and some state around which the parent and
child goroutine in the grabLockOrStop function can coordinate so that
the different concurrent executions can be handled.

This change includes a non-deterministic unit test which reliably
reproduces the problem before the fix.

* use rand instead of time for random test stopCh close

Using time.Now().UnixNano()%2 ends up being system dependent because
different operating systems and hardware have different clock
resolution. A lower resolution will return the same unix time for a
longer period of time.

It is better to avoid this issue by using a random number generator.
This change uses the rand package default random number generator. It's
generally good to avoid using the default random number generator,
because it creates extra lock contention. For a test it should be fine.
2020-12-10 06:50:11 -05:00
Nick Cabatoff b425be1a93
Fix race with test that mutates KeyRotateGracePeriod: make the global be a Core field instead. (#10512) 2020-12-08 13:57:44 -05:00
Nick Cabatoff 84d566db9e
Be consistent with how we report init status. (#10498)
Also make half-joined raft peers consider storage to be initialized, whether or not they're sealed.
2020-12-08 13:55:34 -05:00
Hridoy Roy 0ada870a52
Only use entropy augmentation for root token creation [VAULT-670] (#10487)
* Only use entropy augmentation for root token creation

* changelog

* change wording of changelog entry
2020-12-04 09:44:04 -08:00
swayne275 88eaf5f4c3
Fix Racy Activity Log Tests (#10484)
* fix racy activity log tests and move testing utilities elsewhere

* remove TODO

* move SetEnable out of activity log

* clarify not waiting on waitgroup

* remove todo
2020-12-02 13:48:13 -07:00
Michael Golowka cc7efd393d
MySQL - Fix username generation length bug (#10433) 2020-12-01 15:24:51 -07:00
Scott Miller 32cb144d0d
Update HCL dependency to fix ParseACLPolicy error on invalid syntax (#10156) 2020-11-30 09:17:33 -06:00
Nick Cabatoff 818f8aeff2
Fix test failures of the form "bad start timestamp. expected: 1606313752 got: 1606313753". Also re-enable a test that probably shouldn't be skipped, and delete a test that's meant for ent (see also https://github.com/hashicorp/vault-enterprise/pull/1613) (#10452) 2020-11-25 13:49:47 -05:00
Nick Cabatoff d8e7d2e2b8
Use a lock to address race. (#10447) 2020-11-25 12:24:28 -05:00
Nick Cabatoff 6faef07fd5
Factor out the consul-using sealmigration tests to their own package, so that the remaining tests can run in the CI job that doesn't need docker. (#10342)
Remove the file-storage-backed tests: they don't add anything, and they don't represent a viable cluster storage solution that can be used in prod.
2020-11-20 07:53:31 -05:00
Mark Gritter 0bc1226084
Disable test that fails on OSS. (#10401) 2020-11-16 12:23:21 -06:00
Hridoy Roy 6261afb343
Port: Telemetry For Lease Expiration Times (#10375)
* port lease metrics

* go mod vendor

* caught a bug
2020-11-13 10:26:58 -08:00
Hridoy Roy c5aa1c715f
reverting the tests to not use metrics when unnecessary (#10350) 2020-11-11 15:35:09 -08:00
Mark Gritter 52ddad2bc5
Wait for asynchronous deletion to finish before restarting. (#10345) 2020-11-06 18:21:04 -06:00
Scott Miller d53e26fb7a
Backport last quota fix changes to OSS (#10335)
* Backport last quota fix changes to OSS

* Get all unit tests

* dupe test
2020-11-06 09:46:31 -06:00
Vishal Nayak 36a5bd946a
Fix quota conflict error (#10285)
Co-authored-by: Scott Miller <smiller@hashicorp.com>
2020-11-05 10:18:07 -06:00
Mark Gritter 91ca298a14
Move "counters" path to the logical system's local path list. (#10314) 2020-11-02 21:59:55 -06:00
swayne275 dffd85e09a
Backport invalidation changes (#10292)
* merge activity log invalidation work from vault-enterprise PR 1546

* skip failing test due to enabled config on oss

Co-authored-by: Mark Gritter <mgritter@hashicorp.com>
2020-10-30 18:11:12 -06:00
Mark Gritter 7f01a58aee
Reintroduce a feature flag to disable the activity log entirely. (#10288)
* Reintroduce a feature flag to disable the activity log entirely.
* Add log message when disabled.
2020-10-30 18:27:35 -05:00
Brian Kassouf 8af08c3221
Add an env var to enable a permit pool that limits lease expirations (#10268)
* Add a flag to enable a permit pool to gate lease expiration

* Use the env var to get the size

* Add logs and metrics to help debug this

Co-authored-by: Hridoy Roy <roy@hashicorp.com>
2020-10-30 14:45:44 -07:00
Brian Kassouf 81a86f48e8
Backport some OSS changes (#10267)
* Backport some OSS changes

* go mod vendor
2020-10-29 16:47:34 -07:00
Vishal Nayak 90a9528610
added test for concurrency call of remount handler and proposed fix for logic to avoid duplication of mount names (#10264)
Co-authored-by: bruj0 <ramakandra@gmail.com>
2020-10-29 14:39:41 -04:00
Vishal Nayak 30fe58a458
Fix remount tests (#10265) 2020-10-29 14:31:58 -04:00
Matt Greenfield 2f369730e0
Validate to/from parameters when remounting a backend (#9890)
Vault uses http.ServeMux which issues an HTTP 301 redirect if the
request path contains a double slash (`//`). Additionally, vault
handles all paths to ensure that the path only contains printable
characters. Therefore use the same validation on the to/from parameters
for remounting.

Not doing this can result in a Vault mount that was originally mounted
at `pki/foo` being remounted at `pki/foo//bar`, resulting in mounts
that cannot be accessed.

Co-authored-by: Vishal Nayak <vishalnayak@users.noreply.github.com>
2020-10-29 14:06:07 -04:00
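The validation described in the commit message above can be sketched as follows. This is a minimal illustration, not the code from the commit: the function name `validateMountPath` and its error messages are hypothetical, and the sketch encodes only the two rules the message names (no `//` in the path, printable characters only).

```go
package main

import (
	"errors"
	"strings"
	"unicode"
)

// validateMountPath is a hypothetical helper that rejects remount targets
// which could never be reached through the HTTP layer.
func validateMountPath(path string) error {
	// http.ServeMux redirects request paths that contain "//", so a mount
	// at such a path would be inaccessible.
	if strings.Contains(path, "//") {
		return errors.New("path contains a double slash")
	}
	// Apply the same printable-characters rule used for request paths.
	for _, r := range path {
		if !unicode.IsPrint(r) {
			return errors.New("path contains a non-printable character")
		}
	}
	return nil
}
```

With this check applied to both the `from` and `to` parameters, a remount of `pki/foo` to `pki/foo//bar` would be rejected before any mount table change is made.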
Hridoy Roy fc94e16805
port external test fix (#10263) 2020-10-29 10:55:26 -07:00
Hridoy Roy f8a248ce48
Port: change leader status metric name to active (#10245)
* change active node metric name

* comment to see if commit is fine

Co-authored-by: Hridoy Roy <hridoyroy@Hridoys-MacBook-Pro.local>
2020-10-29 10:30:45 -07:00
Hridoy Roy 0259be04e0
Port: Add metrics to report mount table sizes for auth and logical [Vault 671] (#10201)
* first commit

* update

* removed some ent features from backport

* final refactor

* backport patch

Co-authored-by: Hridoy Roy <hridoyroy@Hridoys-MacBook-Pro.local>
Co-authored-by: Hridoy Roy <hridoyroy@Hridoys-MBP.hitronhub.home>
2020-10-27 08:24:43 -07:00
Jeff Mitchell 3b93a18ef2
Consolidate locking for sys/health (#9876)
* Consolidate locking for sys/health

This avoids a second state lock read-lock on every sys/health hit

* Address review feedback

Co-authored-by: Vishal Nayak <vishalnayakv@gmail.com>
Co-authored-by: Vishal Nayak <vishalnayak@users.noreply.github.com>
2020-10-26 16:47:54 -04:00
Calvin Leung Huang ca8435bf4e
auth: store period value on tokens created via login (#7885)
* auth: store period value on tokens created via login

* test: reduce potential flakiness due to ttl check

* test: govet on package declaration

* changelog++

* Temporarily remove CL entry

* Add back the CL entry

Co-authored-by: Vishal Nayak <vishalnayakv@gmail.com>
2020-10-26 16:25:56 -04:00
Clint 95810d1360
Return logical.StatusBadRequest on requests with missing token (#8457)
* Add test for 400 status on missing token

* Return logical.StatusBadRequest on missing token

* remove commented out code

Co-authored-by: Vishal Nayak <vishalnayak@users.noreply.github.com>
2020-10-26 16:17:25 -04:00
Jeff Mitchell a07b6ba1d2
Add omitempty's to MountEntry and MountConfig (#7154)
Co-authored-by: Jim Kalafut <jkalafut@hashicorp.com>
Co-authored-by: Vishal Nayak <vishalnayak@users.noreply.github.com>
2020-10-26 15:34:55 -04:00
Aleksandr Bezobchuk 95bbd8d920
Merge PR #10192: Auto-Join: Configurable Scheme & Port (and add k8s provider) 2020-10-23 16:13:09 -04:00
Nick Cabatoff 0d6a929a4c
Same seal migration oss (#10224)
* Refactoring and test improvements.

* Support migrating from a given type of autoseal to that same type but with different parameters.
2020-10-23 14:16:04 -04:00
Michael Golowka bd79fbafb3
Add couchbase, elasticsearch, and mongodbatlas back (#10222)
Updated the `Serve` function so these can be added back into Vault
2020-10-22 17:20:17 -06:00
Michael Golowka e6c8ee24ea
DBPW - Enables AutoMTLS for DB plugins (#10220)
This also temporarily disables couchbase, elasticsearch, and
mongodbatlas because the `Serve` function needs to change signatures
and those plugins are vendored in from external repos, causing problems
when building.
2020-10-22 15:43:19 -06:00
Aleksandr Bezobchuk 0d6a0ec589
Merge PR #10010: Rate Limit Quotas: Allow Exempt Paths to be Configurable 2020-10-16 14:58:19 -04:00
Nick Cabatoff 66274607b7
OSS changes for enterprise automated snapshots (#10160) 2020-10-16 14:57:11 -04:00
Hridoy Roy bd2dc7734c
Backport leader status telemetry [VAULT-672] (#10147)
* backport VAULT-672

* backport VAULT-672

* go mod tidy

* go mod tidy

* add back indirect import

* replace go mod and go sum with master version

* go mod vendor

* more go mod vendor

Co-authored-by: Hridoy Roy <hridoyroy@Hridoys-MBP.hitronhub.home>
Co-authored-by: Hridoy Roy <hridoyroy@Hridoys-MacBook-Pro.local>
2020-10-15 14:15:58 -07:00
Michael Golowka 1888323243
DBPW - Copy `newdbplugin` package to `dbplugin/v5` (#10151)
This is part 1 of 4 for renaming the `newdbplugin` package. This copies the existing package to the new location but keeps the current one in place so we can migrate the existing references over more easily.
2020-10-15 13:20:12 -06:00
Brian Kassouf 84dbca38a1
Revert "Migrate internalshared out (#9727)" (#10141)
This reverts commit ee6391b691ac12ab6ca13c3912404f1d3a842bd6.
2020-10-13 16:38:21 -07:00
Aleksandr Bezobchuk d37be9af6e
Merge PR #10095: Integrated Storage Cloud Auto-Join 2020-10-13 16:26:39 -04:00
Connor 8b1a3301f0
Add telemetry for LRU cache (#10079)
Vault creates an LRU cache that is used when interacting with the
physical backend. Add telemetry when the cache is hit, missed, written
to and deleted from. Use the MetricSink from ClusterMetrics
2020-10-13 10:11:54 -05:00
Jeff Mitchell e6881c8147
Migrate internalshared out (#9727)
* Migrate internalshared out

* fix merge issue

* fix merge issue

* go mod vendor

Co-authored-by: Brian Kassouf <bkassouf@hashicorp.com>
2020-10-12 11:56:24 -07:00
Lauren Voswinkel 55761c1d43
Migrate MySQL db to v5 database engine (#10110)
* Migrate MySQL db to v5 database engine
2020-10-09 10:32:38 -07:00
Brian Kassouf cb37fda0a7
Expiration: Fix lease counting logic (#10106) 2020-10-07 17:27:45 -07:00
Ivan Lemeshev c84be1c874
Fix panic in RekeyVerifyRestart (#9930) (#10099) 2020-10-07 11:06:17 -07:00
Ivan Lemeshev 8a2842b467
Fix panic in handleAuditNonLogical if vault is sealed (#9310) (#10103) 2020-10-07 08:30:36 -04:00
Aleksandr Bezobchuk a3cfa7c447
Merge PR #10059: Port OSS changes from #1497 2020-10-01 15:15:20 -04:00
Brian Kassouf b0d3d9bf49
Update lease timer logic (#10030) 2020-09-23 11:46:22 -07:00
ncabatoff 3fb20ed529
Logging tweaks, mostly for tests (#10028)
Fix some places where raft wasn't hooking into the core logger as it should.
Revisited the code that was setting the log level to Error during cleanup: it's normal for there to be a bunch of errors then, which makes it harder to see what went wrong up to the point where the test was deemed to have failed.  So now, instead of setting log level to Error, we actually stop logging altogether.  This only applies if the test didn't pass in its own logger during cluster creation, but we should be moving away from that anyway.
2020-09-23 13:40:00 -04:00
Brian Kassouf 3f30fc5f4e
Port changes from enterprise lease fix (#10020) 2020-09-22 14:47:13 -07:00
Mark Gritter 38ae7efca3
Switch to PerfStandby() where possible. (#9993) 2020-09-22 14:48:30 -05:00
Michael Golowka 60e0cbbc37
[DBPW 4/X] Update DB engine to support v4 and v5 interfaces with password policies (#9878) 2020-09-18 15:10:54 -06:00
ncabatoff 0f77d0e282
Move the code that creates Consul containers out of teststorage. This allows importers of teststorage that don't need consul to run as a non-docker test. (#9975) 2020-09-17 15:44:29 -04:00
ncabatoff 2cb1e03c1b
Fix flaky TestCluster_ForwardRequest tests. (#9973)
We now wait to see that clients are aware of a new leader before we hit them, and we wait to see that the leader has setup the forwarding handler.
2020-09-16 15:31:06 -04:00
Mark Gritter 587ed7d499
Disable usage metrics on performance standby nodes. (#9966) 2020-09-15 17:12:28 -05:00
ncabatoff b615da43d7
Run CI tests in docker instead of a machine. (#8948) 2020-09-15 10:01:26 -04:00
Mark Gritter 4633f5a8fc
Disable flaky test case. (#9926) 2020-09-10 17:54:31 -05:00
swayne275 22f7a0ebef
VAULT-417: check expired context in entity API (#1445) (#9925) 2020-09-10 16:31:32 -06:00
Mark Gritter 1b2c20e07c
Merge activity log work to date on enterprise back into oss. (#9900)
* Added stub class for activity logging. (#1435)
* Define activity fragments and starter methods for manipulating them. (#1441)
2020-09-08 14:22:09 -05:00
Mark Gritter 3690774f7b
Fix crash when KV store has a zero-length key. (#9881)
* Fix crash when KV store has a zero-length key.
* Add PR to changelog.
2020-09-02 17:43:44 -05:00
ncabatoff 30eba1eed1
Update retryablehttp to fix a data race (#9551) 2020-08-31 11:10:52 -04:00
Alexander Bezobchuk f873863263
Merge PR #9667: Rate Limit Backoff 2020-08-16 22:09:18 -04:00
ncabatoff 4134ef2e98
Ensure that perf standbys can perform seal migrations. (#9690) 2020-08-10 08:35:57 -04:00
Rodrigo D. L d0df8bfa21
adding new config flag disable_sentinel_trace (#9696) 2020-08-10 06:23:44 -04:00
Tom Proctor 4ca978598f
Bundle couchbase database plugin with vault (#9664) 2020-08-07 11:01:04 +01:00
ncabatoff b6fd378ee8
Make manualStepDownCh a 1-buffered channel to ensure StepDown actually steps down in tests. (#9622) 2020-07-31 10:01:51 -04:00
ncabatoff 1154b36b56
Log sanitized config at startup and when it changes. (#9637)
Co-authored-by: Aleksandr Bezobchuk <aleks.bezobchuk@gmail.com>
2020-07-30 13:15:00 -04:00
Alexander Bezobchuk 1e262e5648
Merge PR #9581: Rate Limit Quota Headers 2020-07-29 15:15:05 -04:00
Scott Miller 16ca436dd8
Increase expiration timeouts on leases to avoid races in NoopBackend (#9600)
* Increase expiration timeouts on leases to avoid races in NoopBackend

* Set timeouts depending on whether they are relevant to the test: 1s for irrelevant, back to 20ms if they are

* revert one more
2020-07-29 10:13:47 -05:00
ncabatoff 003bccd16e
Eliminate global that caused race tests to fail in ent with an internal config setting. (#9604) 2020-07-27 16:10:26 -04:00
ncabatoff ee6e2344dd
Fix a race caused by assignment to core.metricSink (#9560) 2020-07-22 13:52:10 -04:00
ncabatoff 7484fd7c72
Handle "invalid request" failures stemming from a kvv2 upgrade. (#9550) 2020-07-22 10:56:00 -04:00
ncabatoff 3fbc0f35c2
Make runTransit tolerate a non-core-0 leader. (#9548) 2020-07-21 15:50:01 -04:00
Brian Kassouf fd72d92434
raft: Fix some snapshot restore issues (#9533)
* raft: Remove double read lock

* Reload TLS keyring after reloading the barrier keys
2020-07-21 10:59:07 -07:00
ncabatoff d2436a9c56
Make standbyStopCh atomic to avoid data races (#9539) 2020-07-21 08:34:07 -04:00
ncabatoff d777708fde
Improve logging, and add polling to the post-stepdown leader check. (#9530) 2020-07-20 12:44:23 -04:00
ncabatoff a31fd27069
Use the accessor method so state lock is used to check perf standby status. (#9496) 2020-07-20 10:34:16 -04:00
ncabatoff 3ddc837ce3
Make sure cluster is stopped before wiping storage. (#9526) 2020-07-20 09:32:38 -04:00
Vishal Nayak 14779e9f9d
Remove prefix for exception paths; add sys/internal/ui/mounts (#9512) 2020-07-17 10:55:38 -04:00
Mike Jarmy 93ff4c098c
Add a lock to seal migration (#9485)
* add a lock to seal migration

* switch to CompareAndSwapInt32

* switch to uber go-atomic
2020-07-16 15:14:29 -04:00
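The intermediate step named in the commit above, guarding an operation with `CompareAndSwapInt32`, might look like the sketch below. It uses only the standard library `sync/atomic` package; the `migrationGuard` type and its method names are hypothetical, and the final version of the commit switched to the uber go-atomic package, which this sketch does not use.

```go
package main

import "sync/atomic"

// migrationGuard (hypothetical) lets at most one migration run at a time
// without blocking callers that lose the race.
type migrationGuard struct {
	inProgress int32 // 0 = idle, 1 = a migration is running
}

// tryStart reports whether the caller won the right to run the migration.
func (g *migrationGuard) tryStart() bool {
	return atomic.CompareAndSwapInt32(&g.inProgress, 0, 1)
}

// finish marks the migration as complete so a later one may start.
func (g *migrationGuard) finish() {
	atomic.StoreInt32(&g.inProgress, 0)
}
```

Unlike a mutex, a failed `tryStart` returns immediately, which suits a case where a second concurrent migration attempt should be refused rather than queued.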
Alexander Bezobchuk ea13485c7d
Merge PR #9502: Resource Quotas: Remove 'burst' Param from Rate Limiter 2020-07-16 14:34:43 -04:00
Vishal Nayak 4e4cb5289a
Fetch quota by name before updating it (#9466)
* Fix quotas update

* Update doc
2020-07-15 13:25:00 -04:00
Mark Gritter c4dbbccef3
Remove namespace from mount_point label. (#9436)
* Remove namespace from mount_point label.
* Fix the other two places where vault.token.creation is emitted.
2020-07-14 14:28:11 -05:00
Calvin Leung Huang cf354f6438
quotas: fix data race that could occur if ApplyQuota was called durin… (#9458)
* quotas: fix data race that could occur if ApplyQuota was called during a db reset

* Abstract out the locking caller

* Remove unneeded lock

* Update

Co-authored-by: Vishal Nayak <vishalnayakv@gmail.com>
Co-authored-by: Vishal Nayak <vishalnayak@users.noreply.github.com>
2020-07-13 11:42:39 -07:00
ncabatoff c822a4bffb
Modifying base can break tests that share the same coreConfig passed to NewTestCluster. (#9451) 2020-07-10 13:11:18 -04:00
Josh Black 38fc012817
Backport the pieces of the replication API changes (#9425) 2020-07-09 15:11:37 -07:00
Alexander Bezobchuk f7c1907683
Merge PR #9437: tests: Port #1315 from enterprise 2020-07-09 17:16:31 -04:00
Brian Kassouf f8df68b673
seal: Fix issue migrating from Auto->Shamir and improve tests (#9430)
* Fix issue migrating from Auto->Shamir and improve tests

* Undo newline

* fix panic in test

* Fix test panic
2020-07-09 12:28:17 -07:00
Alexander Bezobchuk dfb28a8fcc
Merge PR #9390: http: revert resource quota changes 2020-07-07 00:05:28 -04:00
Calvin Leung Huang 67444d85b8
test/migration: ensure that leader client is used for storage read check (#9403) 2020-07-06 16:22:07 -07:00
Scott Miller dfbc60cc49
Add mount type to request earlier, so it is populated on audit entries even if the request fails authorization (#9389) 2020-07-06 11:17:41 -05:00
Alexander Bezobchuk f1534a0ed0
Add nil check for quota manager (#9379)
* Add nil check for quota manager

* Add missing nil checks
2020-07-01 18:14:33 -07:00
Alexander Bezobchuk fb9cd9db2a
Merge PR #9372: Fix Unauthenticated list 2020-07-01 16:02:14 -04:00
Mark Gritter 707fdea702
Don't return quota error on revoke. (#9374)
Changed log messages to be clearer about quota operations.
This should fix enterprise unit test failures.
2020-07-01 14:41:42 -05:00
Calvin Leung Huang 37c0e51142
logical/system: re-introduce ns-awareness in pathInternalUIMountsRead (#9373) 2020-07-01 12:37:11 -07:00
Scott Miller a6f62359a9
Don't setup plugin reload on perf standbys (#9352) 2020-06-30 17:32:06 -05:00
Mike Jarmy 4b2cdfee72
re-enable seal migration (#9351)
Co-authored-by: Vishal Nayak <vishalnayak@users.noreply.github.com>
2020-06-30 18:21:18 -04:00
Scott Miller ad292bec73
Fix wrong err return value in plugin reload status command (#9348)
* Fix wrong return value (discovered when merging to ENT)

* go.mod

* go mod vendor

* Add setup plugin reload hook

* All reloads return something now
2020-06-30 13:33:30 -05:00
Scott Miller 001ee861bd
Global Plugin Reload: OSS Changes Take II (#9347)
* Carefully move changes from the plugin-cluster-reload branch into this clean branch off master.

* Don't test this at this level, adequately covered in the api level tests

* Change PR link

* go.mod

* Vendoring

* Vendor api/sys_plugins.go
2020-06-30 10:26:52 -05:00
Scott Miller e92f8f5a81
Revert global plugin reload commits (#9344)
* Revert "Some of the OSS changes were clobbered when merging with quotas out of, master (#9343)"

This reverts commit 8719a9b7c4d6ca7afb2e0a85e7c570cc17081f41.

* Revert "OSS side of Global Plugin Reload (#9340)"

This reverts commit f98afb998ae50346849050e882b6be50807983ad.
2020-06-29 17:36:22 -05:00
Scott Miller cc51427584
Some of the OSS changes were clobbered when merging with quotas out of, master (#9343)
* OSS side of Global Plugin Reload
2020-06-29 16:58:51 -05:00
Scott Miller a83fe0fc6d
OSS side of Global Plugin Reload (#9340)
* OSS side of Global Plugin Reload

* changelog++
2020-06-29 16:23:28 -05:00
ncabatoff d42ee4f7ef
Ensure "initialized" service registration tag is also present whenever Vault is unsealed, on both Consul and K8s (#8990)
* Add the initialized tag to Consul registration for parity with k8s (and for easy automated testing).  Ensure that whenever we flag Vault as unsealed, we also flag it as initialized.

* Update API docs.

Co-authored-by: Jason O'Donnell <2160810+jasonodonnell@users.noreply.github.com>
2020-06-29 16:02:49 -04:00
Calvin Leung Huang babaa93a0f
monitor: watch for seal state during monitor request (#9341)
* monitor: watch for seal state during monitor request

* monitor: return error regardless of how upstream handles it
2020-06-29 12:58:41 -07:00
Mark Gritter 873acbefbd
Unit test that fails to demonstrate identity store problem. (#9339)
Fix test compilation error.
2020-06-29 13:33:25 -05:00
Vishal Nayak 6bd5674345
Reset quota manager during shutdown (#9331) 2020-06-29 13:23:10 -04:00
Vishal Nayak c6876fe00f
Resource Quotas: Rate Limiting (#9330) 2020-06-26 17:13:16 -04:00
Javier Ramos 16070564cb
Calculate percentage when displaying progress in tidy operation (#9233)
* Calculate percentage when displaying progress in tidy operation

* Update vault/token_store.go

Co-authored-by: Calvin Leung Huang <cleung2010@gmail.com>

* Update vault/token_store.go

Co-authored-by: Calvin Leung Huang <cleung2010@gmail.com>

Co-authored-by: Calvin Leung Huang <cleung2010@gmail.com>
2020-06-26 12:44:45 -07:00
Mark Gritter a5d2edece2
Enable identity entity collection. (#9324) 2020-06-25 18:54:38 -05:00
Scott Miller 57c6ae4233
Test for overflow of the capacity value (#9317) 2020-06-25 11:22:13 -05:00
Mark Gritter a54a3b6a66
Entity and alias counts (#9262)
* Added gauge collectors for entity counts.
* Entity and alias gauges.
* Locking around accessor to core.identityStore and core.mount.
2020-06-23 19:45:59 -05:00
Mark Gritter 97d415d024
Token gauge metrics implementation. (#9239)
* Token gauge metrics implementation.
* Enable gauges only when interval is nonzero.
* Added count by TTL
* Handle "in restore mode" error specifically.
* Refactored initialization code for gauge collection processes.
* Fixed for multiple namespaces.
* Ability to disable individual gauges with environment variable.
* changelog++
2020-06-23 18:36:24 -05:00
Calvin Leung Huang c45bdca0b3
raft: add support for using backend for ha_storage (#9193)
* raft: initial work on raft ha storage support

* add note on join

* add todo note

* raft: add support for bootstrapping and joining existing nodes

* raft: gate bootstrap join by reading leader api address from storage

* raft: properly check for raft-only for certain conditionals

* raft: add bootstrap to api and cli

* raft: fix bootstrap cli command

* raft: add test for setting up new cluster with raft HA

* raft: extend TestRaft_HA_NewCluster to include inmem and consul backends

* raft: add test for updating an existing cluster to use raft HA

* raft: remove debug log lines, clean up verifyRaftPeers

* raft: minor cleanup

* raft: minor cleanup

* Update physical/raft/raft.go

Co-authored-by: Brian Kassouf <briankassouf@users.noreply.github.com>

* Update vault/ha.go

Co-authored-by: Brian Kassouf <briankassouf@users.noreply.github.com>

* Update vault/ha.go

Co-authored-by: Brian Kassouf <briankassouf@users.noreply.github.com>

* Update vault/logical_system_raft.go

Co-authored-by: Brian Kassouf <briankassouf@users.noreply.github.com>

* Update vault/raft.go

Co-authored-by: Brian Kassouf <briankassouf@users.noreply.github.com>

* Update vault/raft.go

Co-authored-by: Brian Kassouf <briankassouf@users.noreply.github.com>

* address feedback comments

* address feedback comments

* raft: refactor tls keyring logic

* address feedback comments

* Update vault/raft.go

Co-authored-by: Alexander Bezobchuk <alexanderbez@users.noreply.github.com>

* Update vault/raft.go

Co-authored-by: Alexander Bezobchuk <alexanderbez@users.noreply.github.com>

* address feedback comments

* testing: fix import ordering

* raft: rename var, cleanup comment line

* docs: remove ha_storage restriction note on raft

* docs: more raft HA interaction updates with migration and recovery mode

* docs: update the raft join command

* raft: update comments

* raft: add missing isRaftHAOnly check for clearing out state set earlier

* raft: update a few ha_storage config checks

* Update command/operator_raft_bootstrap.go

Co-authored-by: Vishal Nayak <vishalnayak@users.noreply.github.com>

* raft: address feedback comments

* raft: fix panic when checking for config.HAStorage.Type

* Update vault/raft.go

Co-authored-by: Alexander Bezobchuk <alexanderbez@users.noreply.github.com>

* Update website/pages/docs/commands/operator/raft.mdx

Co-authored-by: Alexander Bezobchuk <alexanderbez@users.noreply.github.com>

* raft: remove bootstrap cli command

* Update vault/raft.go

Co-authored-by: Brian Kassouf <briankassouf@users.noreply.github.com>

* Update vault/raft.go

Co-authored-by: Brian Kassouf <briankassouf@users.noreply.github.com>

* raft: address review feedback

* raft: revert vendored sdk

* raft: don't send applied index and node ID info if we're HA-only

Co-authored-by: Brian Kassouf <briankassouf@users.noreply.github.com>
Co-authored-by: Alexander Bezobchuk <alexanderbez@users.noreply.github.com>
Co-authored-by: Vishal Nayak <vishalnayak@users.noreply.github.com>
2020-06-23 12:04:13 -07:00
Mark Gritter ce88903638
Add counter for creation of new entity. (#9263) 2020-06-19 18:24:05 -05:00
Mark Gritter 3e220f300a
Fix to failing test compilation. (#9272) 2020-06-19 15:19:21 -05:00
Mark Gritter b3c3635f49
Added gauges to count KV secrets. (#9250)
* Added gauges to count KV secrets.
* Use real KV implementation in test.
2020-06-19 14:01:35 -05:00
Mark Gritter 239b2375aa
Counter that increments on every secret engine lease creation. (#9244) 2020-06-18 15:36:21 -05:00
Mark Gritter cc6a64d4d8
Replaced ClusterMetricSink's cluster name with an atomic.Value. (#9252)
* Replaced ClusterMetricSink's cluster name with an atomic.Value.
This should permit go-race tests to pass which seal and unseal
the core.

* Replace metric sink before unseal to avoid data races.
2020-06-18 12:55:50 -05:00
Michael Golowka f77bcc53c4
Move sdk/helper/random -> helper/random (#9226)
* This package is new for 1.5 so this is not a breaking change.
* This is being moved because this code was originally intended to be used
within plugins, however the design of password policies has changed such
that this is no longer needed. Thus, this code doesn't need to be in the
public SDK.
2020-06-17 14:24:38 -06:00
Scott Miller 0b9a40a64e
Add a simple sealed gauge, updated when seal status changes (#9177)
* Add a simple unsealed gauge, updated when seal status changes
2020-06-17 09:50:28 -05:00
Mike Jarmy e608503139
Test Shamir-to-Transit and Transit-to-Shamir Seal Migration for post-1.4 Vault. (#9214)
* move adjustForSealMigration to vault package

* fix adjustForSealMigration

* begin working on new seal migration test

* create shamir seal migration test

* refactor testhelpers

* add VerifyRaftConfiguration to testhelpers

* stub out TestTransit

* Revert "refactor testhelpers"

This reverts commit 39593defd0d4c6fd79aedfd37df6298391abb9db.

* get shamir test working again

* stub out transit join

* work on transit join

* remove debug code

* initTransit now works with raft join

* runTransit works with inmem

* work on runTransit with raft

* runTransit works with raft

* cleanup tests

* TestSealMigration_TransitToShamir_Pre14

* TestSealMigration_ShamirToTransit_Pre14

* split for pre-1.4 testing

* add simple tests for transit and shamir

* fix typo in test suite

* debug wrapper type

* test debug

* test-debug

* refactor core migration

* Revert "refactor core migration"

This reverts commit a776452d32a9dca7a51e3df4a76b9234d8c0c7ce.

* begin refactor of adjustForSealMigration

* fix bug in adjustForSealMigration

* clean up tests

* clean up core refactoring

* fix bug in shamir->transit migration

* stub out test that brings individual nodes up and down

* refactor NewTestCluster

* pass listeners into newCore()

* simplify cluster address setup

* simplify extra test core setup

* refactor TestCluster for readability

* refactor TestCluster for readability

* refactor TestCluster for readability

* add shutdown func to TestCore

* add cleanup func to TestCore

* create RestartCore

* stub out TestSealMigration_ShamirToTransit_Post14

* refactor address handling in NewTestCluster

* fix listener setup in newCore()

* remove unnecessary lock from setSealsForMigration()

* rename sealmigration test package

* use ephemeral ports below 30000

* work on post-1.4 migration testing

* clean up pre-1.4 test

* TestSealMigration_ShamirToTransit_Post14 works for non-raft

* work on raft TestSealMigration_ShamirToTransit_Post14

* clean up test code

* refactor TestClusterCore

* clean up TestClusterCore

* stub out some temporary tests

* use HardcodedServerAddressProvider in seal migration tests

* work on raft for TestSealMigration_ShamirToTransit_Post14

* always use hardcoded raft address provider in seal migration tests

* debug TestSealMigration_ShamirToTransit_Post14

* fix bug in RestartCore

* remove debug code

* TestSealMigration_ShamirToTransit_Post14 works now

* clean up debug code

* clean up tests

* cleanup tests

* refactor test code

* stub out TestSealMigration_TransitToShamir_Post14

* set seals properly for transit->shamir migration

* migrateFromTransitToShamir_Post14 works for inmem

* migrateFromTransitToShamir_Post14 works for raft

* use base ports per-test

* fix seal verification test code

* simplify seal migration test suite

* simplify test suite

* cleanup test suite

* use explicit ports below 30000

* simplify use of numTestCores

* Update vault/external_tests/sealmigration/seal_migration_test.go

Co-authored-by: Calvin Leung Huang <cleung2010@gmail.com>

* Update vault/external_tests/sealmigration/seal_migration_test.go

Co-authored-by: Calvin Leung Huang <cleung2010@gmail.com>

* clean up imports

* rename to StartCore()

* Update vault/testing.go

Co-authored-by: Calvin Leung Huang <cleung2010@gmail.com>

* simplify test suite

* clean up tests

Co-authored-by: Calvin Leung Huang <cleung2010@gmail.com>
2020-06-16 14:12:22 -04:00
Mark Gritter 50b388a93c
Changes to expiration manager to walk tokens (#9182)
* Changes to expiration manager to walk tokens (including non-expiring ones.)
* Count by namespace in token manager.
* Keep a dictionary of policy lists and deduplicate based on it.
2020-06-15 18:54:36 -05:00
Austin Gebauer a826bd1d23
fix: configutil redeclared as imported package name (#9211) 2020-06-11 18:08:46 -07:00
Michael Golowka a89f09802d
Integrate password policies into RabbitMQ secret engine (#9143)
* Add password policies to RabbitMQ & update docs
* Also updates some parts of the password policies to aid/fix testing
2020-06-11 16:08:20 -06:00
Mike Jarmy 4303790aae
Test pre-1.4 seal migration (#9085)
* enable seal wrap in all seal migration tests

* move adjustForSealMigration to vault package

* fix adjustForSealMigration

* begin working on new seal migration test

* create shamir seal migration test

* refactor testhelpers

* add VerifyRaftConfiguration to testhelpers

* stub out TestTransit

* Revert "refactor testhelpers"

This reverts commit 39593defd0d4c6fd79aedfd37df6298391abb9db.

* get shamir test working again

* stub out transit join

* work on transit join

* Revert "move resuable storage test to avoid creating import cycle"

This reverts commit b3ff2317381a5af12a53117f87d1c6fbb093af6b.

* remove debug code

* initTransit now works with raft join

* runTransit works with inmem

* work on runTransit with raft

* runTransit works with raft

* get rid of disused test

* cleanup tests

* TestSealMigration_TransitToShamir_Pre14

* TestSealMigration_ShamirToTransit_Pre14

* split for pre-1.4 testing

* add simple tests for transit and shamir

* fix typo in test suite

* debug wrapper type

* test debug

* test-debug

* refactor core migration

* Revert "refactor core migration"

This reverts commit a776452d32a9dca7a51e3df4a76b9234d8c0c7ce.

* begin refactor of adjustForSealMigration

* fix bug in adjustForSealMigration

* clean up tests

* clean up core refactoring

* fix bug in shamir->transit migration

* remove unnecessary lock from setSealsForMigration()

* rename sealmigration test package

* use ephemeral ports below 30000

* simplify use of numTestCores
2020-06-11 15:07:59 -04:00
ncabatoff 2ec9049ef7
Add ssh signing algorithm as a role option. (#9096) 2020-06-11 08:10:13 -04:00
ncabatoff fdba917b66
Fix feature flag persistence: we shouldn't have excluded DR primaries, they too must write feature flags. DR secondaries might not need to depend on feature flags being there, but a DR primary could also be (or become) a perf primary. (#9148) 2020-06-04 13:00:33 -04:00
Mark Gritter 475fe0eede
Token creation counters (#9052)
* Add token creation counters.
* Created a utility to change TTL to bucket name.
* Add counter covering token creation for response wrapping.
* Fix namespace label, with a new utility function.
2020-06-02 13:40:54 -05:00
ncabatoff 8870b2e51c
Add mongodbatlas static roles support (#8987)
* Refactor PG container creation.
* Rework rotation tests to use shorter sleeps.
* Refactor rotation tests.
* Add a static role rotation test for MongoDB Atlas.
2020-05-29 14:21:23 -04:00
Scott Miller 12d704d97f
Provide token ttl and issue time in the audit log. (#9091)
* Populate a token_ttl and token_issue_time field on the Auth struct of audit log entries, and in the Auth portion of a response for login methods

* Revert go fmt, better zero checking

* Update unit tests

* changelog++
2020-05-29 12:30:47 -05:00