Commit graph

2228 commits

Author SHA1 Message Date
Hridoy Roy 65e3489c45
Diagnose resource creation checks (#11627)
* initial refactoring of unseal step in run

* remove waitgroup

* remove waitgroup

* backup work

* backup

* backup

* completely modularize run and move into diagnose

* add diagnose errors for incorrect number of unseal keys

* comment tests back in

* backup

* first subspan

* finished subspanning but running into error with timeouts

* remove runtime checks

* merge main branch

* meeting updates

* remove telemetry block

* roy comment

* subspans for seal finalization and wrapping diagnose latency checks

* backup while I fix something else

* fix storage latency test errors

* runtime checks

* diagnose with timeout on seal
2021-06-10 12:29:32 -07:00
swayne275 7c5c9c830e
fix irrevocable leases API deadlock on m.coreStateLock (#11815)
* fix m.coreStateLock deadlock for API handler code

* improve lock comment
2021-06-10 11:01:14 -06:00
Mark Gritter 3e8a4bad1a
Acquire a per-lock lease to make renew and revoke atomic wrt each other. (#11122)
* Acquire a per-lock lease to make renew and revoke atomic wrt each other.
This means we don't have to hold pendingLock during I/O.

* Attempted fix for deadlock in token revocation.

* Comment fix.

* Fix error checking in loadEntry.

* Add benchmark

* Add a few additional locking locations

* Improve benchmark slightly

* Update vault/expiration.go

Co-authored-by: swayne275 <swayne275@gmail.com>

* Update vault/expiration.go

Co-authored-by: swayne275 <swayne275@gmail.com>

* Add a lease lock into tidy

Co-authored-by: Scott Miller <smiller@hashicorp.com>
Co-authored-by: Brian Kassouf <bkassouf@hashicorp.com>
Co-authored-by: Brian Kassouf <briankassouf@users.noreply.github.com>
Co-authored-by: swayne275 <swayne275@gmail.com>
2021-06-09 17:28:41 -07:00
Nick Cabatoff c07711b488
Add some paths that use nonstandard auth. (#11804) 2021-06-09 17:04:06 -04:00
Nick Cabatoff abdf54c77e
Use the perfStandby field instead of the method because we don't want to try to grab the readlock when we're servicing a request (#11793) 2021-06-08 11:55:38 -04:00
Scott Miller b795643146
Third time trying to get the os_openbsd_arm.go right. (#11786)
* Fix the third dumb thing in this file

* ...
2021-06-07 17:19:47 -05:00
Scott Miller 69d0242db9
Add Advice and Advise functions for adding an advice section to any span. (#11760)
* wip

* wip

* Finish implementing advice handling and word wrapping

* Properly word wrap messages and warnings

* Remove debugging

* Remove debugging

* Remove unnecessary test

* unit test bug

* go vendor
2021-06-07 11:29:36 -05:00
Scott Miller c896c3a6aa
Fix SpotSkipped on openbsd/arm (#11781) 2021-06-07 11:29:15 -05:00
Hridoy Roy 1782b4e880
oss part of control groups upgrade (#11772)
* oss part of control groups upgrade

* changelog and docs

* formatting

* formatting
2021-06-07 09:15:35 -07:00
swayne275 62a92a77d7
bug fixes and improvements from ent namespace work (#11774) 2021-06-04 13:25:41 -06:00
Josh Black c8cfcd9514
OSS parts of sighup license reload (#11767) 2021-06-04 10:24:35 -07:00
Scott Miller fe8f0476fd
Fix build boolean logic causing os_common to be skipped wrongly (#11766) 2021-06-03 17:29:34 -05:00
Josh Black c6c0424a8e
OSS parts of sys/config/reload/license (#11695) 2021-06-03 10:30:30 -07:00
Scott Miller 9db384e0f7
Edge cases in the skipping gathering and output logic (#11752)
* Edge cases in the skipping gathering and output logic

* Fix unit test
2021-06-03 11:01:14 -05:00
Scott Miller d151b0b55b
Segment out disk checks to disable on openbsd/arm (#11749)
* Segment out disk checks to disable on openbsd/arm

Also add a spot skipped helper.

* Expected results may be fewer than actual because of variable length tests like disk usage

* Move to os_common and build on windows
2021-06-02 12:17:52 -05:00
swayne275 35aad1df4a
Vault 2256: fix lease count quotas causing panics on dr secondaries (#11742)
* lift relevant changes from ent to oss

* fix silent error bug in quotas
2021-06-02 10:12:05 -06:00
swayne275 9724f59180
Vault 1979: Query API for Irrevocable Leases (#11607)
* build out lease count (not fully working), start lease list

* build out irrevocable lease list

* bookkeeping

* test irrevocable lease counts for API/CLI

* fix listIrrevocableLeases, test listIrrevocableLeases, cleanup

* test expiration API limit

* namespace tweaks, test force flag on lease list

* integration test leases/count API, plenty of fixes and improvements

* test lease list API, fixes and improvements

* test force flag for irrevocable lease list API

* i guess this wasn't saved on the last refactor...

* fixes and improvements found during my review

* better test error msg

* Update vault/logical_system_paths.go

Co-authored-by: Brian Kassouf <briankassouf@users.noreply.github.com>

* Update vault/logical_system_paths.go

Co-authored-by: Brian Kassouf <briankassouf@users.noreply.github.com>

* return warning with data if more than default leases to list without force flag

* make api doc more generalized

* list leases in general, not by mount point

* change force flag to include_large_results

* sort leases by LeaseID for consistent API response

* switch from bool flag for API limit to string value

* sort first by leaseID, then stable sort by expiration

* move some utils to be in oss and ent

* improve sort efficiency for API response

Co-authored-by: Brian Kassouf <briankassouf@users.noreply.github.com>
2021-06-02 10:11:30 -06:00
Scott Miller 59bea5f6fd
Add ulimit check, and tidy unit test cases (#11678)
* Add ulimit check, and tidy unit test cases to avoid needing to have all results and perfect ordering

* Make order independent check recursive

* Fix unit tests

* Try a 5s request timeout
2021-06-01 12:43:51 -05:00
Cory Bolar 6c726bde67
Only fail auth plugin backend mounting for custom plugins (#11690)
Checking if a plugin is a builtin by comparing it's type to those in the builtin
registry allows for a custom plugin loaded with the same name to be considered a
builtin during error handling of the mounting process.  This can cause the vault
installation to brick itself because it cannot be unsealed to register a new
sha256 or file path for a previously loaded custom plugin.  Improve this logic
by checking the plugin catalog rather than the builtin registry.

Fixes #11687
2021-05-27 13:37:45 -05:00
Scott Miller 46da8ebae6
Diagnose Disk usage checks (#11673)
* Disk usage checks

* Move disk free earlier

* Move logic to helpers

* Bring over test logic from the ulimit PR

* imports

* Report error

* Get unit tests working
2021-05-27 10:17:52 -07:00
Hridoy Roy b6e1d62d33
Modularize Run Command (#11573)
* initial refactoring of unseal step in run

* remove waitgroup

* remove waitgroup

* backup work

* backup

* backup

* completely modularize run and move into diagnose

* add diagnose errors for incorrect number of unseal keys

* comment tests back in

* backup

* first subspan

* finished subspanning but running into error with timeouts

* remove runtime checks

* meeting updates

* remove telemetry block

* roy comment

* subspans for seal finalization and wrapping diagnose latency checks

* fix storage latency test errors

* review comments

* use random uuid for latency checks instead of static id
2021-05-25 15:23:20 -07:00
Scott Miller e305f78e9c
Add a Success helper to set successful spans' messages (#11621) 2021-05-21 19:21:23 -07:00
Scott Miller 941d01eee3
Initial Diagnose CLI output (#11583)
* Create helpers which integrate with OpenTelemetry for diagnose collection

* Go mod vendor

* Comments

* Update vault/diagnose/helpers.go

Co-authored-by: swayne275 <swayne275@gmail.com>

* Add unit test/example

* tweak output

* More comments

* add spot check concept

* Get unit tests working on Result structs

* wip

* Fix unit test

* Get unit tests working, and make diagnose sessions local rather than global

* Comments

* Last comments

* No need for init

* :|

* Fix helpers_test

* wip

* wip

* wip

* Revendor otel

* Fix merge related problems

* imports

* Fix unit tests

Co-authored-by: swayne275 <swayne275@gmail.com>
2021-05-21 19:21:11 -07:00
Nick Cabatoff 2adef1f878
OSS parts of #1891 (sys/health license addition) (#11676) 2021-05-20 13:32:15 -04:00
Nick Cabatoff 01f96f18ce
VAULT-2439: OSS parts of #1889 (raft licensing init) (#11665) 2021-05-19 16:07:58 -04:00
Brian Kassouf eabfda0c79
Make sure we are checking the final peerset (#11664) 2021-05-19 12:01:52 -07:00
Brian Kassouf f498d0d389
Reload raft TLS keys on active startup (#11660) 2021-05-19 10:03:32 -07:00
Scott Miller 6b8d7fe2e6
Patch expiration fix over from ENT (#11650)
* Patch expiration fix over from ENT

* Rename changelog
2021-05-18 14:55:38 -07:00
Nick Cabatoff e212ec5d8e
OSS parts of ent PR #1857: license autoloading init changes. (#11623) 2021-05-17 14:10:26 -04:00
Scott Miller 7b05fcf9bf
Don't backoff if a listener error was a timeout (#11594) 2021-05-13 17:02:25 -05:00
Jason O'Donnell 502cf3b212
core: updates to password policy generator (#11596)
* core: fix bug in password policies not using namespaces

* Add changelog
2021-05-13 09:55:46 -04:00
Josh Black 641a81b74b
Provide a new API endpoint for retrieving signed licenses (#11543) 2021-05-12 12:19:25 -07:00
Scott Miller 9dbf1a7dba
Add infrastructure and helpers for skipping diagnose checks (#11593)
* Add infrastructure for skipping tests

* Add infrastructure for skipping tests

* Set it

* Update vault/diagnose/helpers.go

Co-authored-by: swayne275 <swayne275@gmail.com>

* Implement type alias for test functions

Co-authored-by: swayne275 <swayne275@gmail.com>
2021-05-12 12:54:40 -05:00
Scott Miller f0c3192f20
Add an exponential backoff to TCP listeners to avoid fast loops in error scenarios (#11588)
* Add an exponential backoff to TCP listeners to avoid fast loops in error scenarios

* reset loop delay

* changelog
2021-05-12 10:47:38 -05:00
swayne275 0229787a26
Vault 2303: Count irrevocable leases in quotas/metrics and other improvements (#11542)
* shrink generic error message

* move zombie loading to updatePendingInternal from loadEntryInternal

* probably the right metric/lease behavior for irrevocable leases...

* comment improvements

* test total lease count with valid and irrevocable leases
2021-05-11 14:04:06 -06:00
Lars Lehtonen 53dd619d2f
vault: deprecate errwrap.Wrapf() (#11577) 2021-05-11 13:12:54 -04:00
swayne275 f467d86805
don't panic if no data, improve log (#11581) 2021-05-11 10:41:57 -06:00
Scott Miller 40e0806f29
Add timeout helper (#11554)
* Add timeout helper

* cleanup-imports

* didnt need mainsection

Co-authored-by: HridoyRoy <roy@hashicorp.com>
2021-05-10 17:55:43 -07:00
swayne275 ee53fc5afb
Fix activity log fragment race test (#11565)
* grab reference to done channel while locked

* add clarifying comment

* lock and grab channel reference again for test races
2021-05-07 17:02:11 -06:00
Nick Cabatoff 53c7d1de7d
config for autoloading license (oss parts) 2021-05-07 08:55:41 -04:00
Nick Cabatoff 0926e302c5
Fix some panics resulting from calling ExpirationManager.Stop directly. (#11547) 2021-05-06 12:39:01 -04:00
Nick Cabatoff 663ad150a7
Make TestActivityLog_MultipleFragmentsAndSegments timeout on its own (#11490)
* The main driver for this change was to make the read from a.newFragmentCh timeout quickly rather than waiting for the test timeout (much longer).  While testing the change I observed a panic during shutdown, but it was swallowed and moreover there was no stack trace so it wasn't obvious.  I'm hoping we can get rid of the recover, so I fixed the issue in the activitylog tests that needed it.
2021-05-06 10:19:53 -04:00
Scott Miller 4fc6e8b366
Fix barrier key autoration config edge cases (#11541)
* Add an Int64 type

* Use the new Int64 type so that even 32 bit builds can specify max_operations above 2^31

* Missed a spot

* go mod vendor

* fix cast

* changelog

* Update unit test to ensure this works on both 32 and 64-bit archs
2021-05-05 14:39:04 -05:00
swayne275 56c57ffddd
change zombie terminology to irrevocable (#11525) 2021-05-04 15:48:02 -06:00
swayne275 335e4c3711
Introduce Logical Unrecoverable Error, Use it in Expiration Manager (#11477)
* build out zombie lease system

* add typo for CI

* undo test CI commit

* time equality test isn't working on CI, so let's see what this does...

* add unrecoverable proto error, make proto, go mod vendor

* zombify leases if unrecoverable error, tests

* test fix: somehow pointer in pointer rx is null after pointer rx called

* tweaks based on roy feedback

* improve zombie errors

* update which errors are unrecoverable

* combine zombie logic

* keep subset of zombie lease in memory
2021-05-03 17:56:06 -06:00
Vishal Nayak 1e61f799ca
Use correct mount accessor when refreshing external group memberships (#11506)
* Use correct mount accessor when refreshing external group memberships

* Add CL

* Handle the renew case properly
2021-05-03 08:23:59 -04:00
Hridoy Roy 28aa9b9175
Diagnoses Consul Direct Access Check (#11505)
* Create helpers which integrate with OpenTelemetry for diagnose collection

* Go mod vendor

* consul tls checks

* draft for storage end to end check

* Comments

* Update vault/diagnose/helpers.go

Co-authored-by: swayne275 <swayne275@gmail.com>

* Add unit test/example

* tweak output

* More comments

* add spot check concept

* Get unit tests working on Result structs

* Fix unit test

* Get unit tests working, and make diagnose sessions local rather than global

* Comments

* Last comments

* No need for init

* :|

* Fix helpers_test

* cleaned up chan logic. Tests next.

* fix tests

* remove a comment

* tests

* remove a comment

* run direct access checks in diagnose command

* review comments

Co-authored-by: Scott G. Miller <smiller@hashicorp.com>
Co-authored-by: swayne275 <swayne275@gmail.com>
2021-05-02 16:21:06 -07:00
Hridoy Roy 5b63066c2c
Diagnose V0: Storage End to End Checks (#11468)
* Create helpers which integrate with OpenTelemetry for diagnose collection

* Go mod vendor

* consul tls checks

* draft for storage end to end check

* Comments

* Update vault/diagnose/helpers.go

Co-authored-by: swayne275 <swayne275@gmail.com>

* Add unit test/example

* tweak output

* More comments

* add spot check concept

* Get unit tests working on Result structs

* Fix unit test

* Get unit tests working, and make diagnose sessions local rather than global

* Comments

* Last comments

* No need for init

* :|

* Fix helpers_test

* cleaned up chan logic. Tests next.

* fix tests

* remove a comment

* tests

* remove a comment

* cosmetic changes

Co-authored-by: Scott G. Miller <smiller@hashicorp.com>
Co-authored-by: swayne275 <swayne275@gmail.com>
2021-05-02 13:33:13 -07:00
Scott Miller fcb9bab51b
Fix rotate/config unit test for 32 bit yet again (#11491) 2021-04-30 10:31:11 -05:00
Scott Miller 85fbd45e1c
Create helpers which integrate with OpenTelemetry for diagnose collection (#11454)
* Create helpers which integrate with OpenTelemetry for diagnose collection

* Go mod vendor

* Comments

* Update vault/diagnose/helpers.go

Co-authored-by: swayne275 <swayne275@gmail.com>

* Add unit test/example

* tweak output

* More comments

* add spot check concept

* Get unit tests working on Result structs

* Fix unit test

* Get unit tests working, and make diagnose sessions local rather than global

* Comments

* Last comments

* No need for init

* :|

* Fix helpers_test

Co-authored-by: swayne275 <swayne275@gmail.com>
2021-04-29 13:32:41 -05:00