Commit graph

46 commits

Author SHA1 Message Date
Alexander Scheel 544ae3461f
Allow ensuring ticker is stopped multiple times (#20509)
When executing multi-stage, multi-namespace tests, stopping the ticker
multiple times (via closing the StopTicker channel) results in a panic.

Store whether or not we've stopped it once, and do not close it again.

Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
2023-05-04 14:14:06 +00:00
Alexander Scheel f0fc48c788
Only log stopping rollback manager once (#20041)
When testing the Rollback Manager's one-time invocation in Enterprise,
it was noticed that due to the channel being closed, we'd always hit
this case and thus spam logs rather quickly with this message.

Switch to a boolean flip to log this once, as it is not executed in
parallel and thus doesn't need a sync.Once.

This only affected anyone calling the test core's
StopAutomaticRollbacks() helper.

Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
2023-04-10 13:24:03 -04:00
Alexander Scheel e95fadd8f0
Add mechanism to disable automatic rollbacks (#19748)
When testing the rollback mechanism, there's two categories of tests
typically written:

 1. Ones in which the rollback manager is entirely left alone, which
    usually are a bit slower and less predictable. However, it is still
    sufficient in many scenarios.
 2. Ones in which the rollback manager is explicitly probed by tests
    and "stepped" to achieve the next rollback.

Here, without a mechanism to fully disable the rollback manager's
periodic ticker (without affecting its ability to work!) we'll continue
to see races of the sort:

>     --- FAIL: TestRevocationQueue (50.95s)
>     panic: sync: WaitGroup is reused before previous Wait has returned [recovered]
>         panic: sync: WaitGroup is reused before previous Wait has returned

This allows us to disable the ticker, returning control to the test
suite entirely.

Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
2023-03-29 08:59:01 -04:00
Hamid Ghaf 27bb03bbc0
adding copyright header (#19555)
* adding copyright header

* fix fmt and a test
2023-03-15 09:00:52 -07:00
Nick Cabatoff 8e67651dcd
Fix a data race with rollbackPeriod. (#17387) 2022-10-13 09:59:07 -04:00
Nick Cabatoff 559754d580
Break grabLockOrStop into two pieces to facilitate investigating deadlocks (#17187)
Break grabLockOrStop into two pieces to facilitate investigating deadlocks.  Without this change, the "grab" goroutine looks the same regardless of who was calling grabLockOrStop, so there's no way to identify one of the deadlock parties.
2022-09-20 11:03:16 -04:00
Alexander Scheel a5fafd8163
Add ability to perform automatic tidy operations (#16900)
* Add ability to perform automatic tidy operations

This enables the PKI secrets engine to allow tidy to be started
periodically by the engine itself, avoiding the need for interaction.
This operation is disabled by default (to avoid load on clusters which
don't need tidy to be run) but can be enabled.

In particular, a default tidy configuration is written (via
/config/auto-tidy) which mirrors the options passed to /tidy. Two
additional parameters, enabled and interval, are accepted, allowing
auto-tidy to be enabled or disabled and controlling the interval
(between successful tidy runs) to attempt auto-tidy.

Notably, a manual execution of tidy will delay additional auto-tidy
operations. Status is reported via the existing /tidy-status endpoint.

Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>

* Add changelog entry

Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>

* Add documentation on auto-tidy

Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>

* Add tests for auto-tidy

Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>

* Prevent race during parallel testing

We modified the RollbackManager's execution window to allow more
faithful testing of the periodicFunc. However, the TestAutoRebuild and
the new TestAutoTidy would then race against each other for modifying
the period and creating their clusters (before resetting to the old
value).

This changeset adds a lock around this, preventing the races.

Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>

* Use tidyStatusLock to gate lastTidy time

This prevents a data race between the periodic func and the execution of
the running tidy.

Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>

* Add read lock around tidyStatus gauges

When reading from tidyStatus for computing gauges, since the underlying
values aren't atomics, we really should be gating these with a read lock
around the status access.

Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>

Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
2022-08-30 15:45:54 -04:00
Alexander Scheel cacb23bda6
Enable periodic, automatic rebuilding of CRLs (#16762)
* Allow automatic rebuilding of CRLs

When enabled, periodic rebuilding of CRLs will improve PKI mounts in two
way:

 1. Reduced load during periods of high (new) revocations, as the CRL
    isn't rebuilt after each revocation but instead on a fixed schedule.
 2. Ensuring the CRL is never stale as long as the cluster remains up,
    by checking for next CRL expiry and regenerating CRLs before that
    happens. This may increase cluster load when operators have large
    CRLs that they'd prefer to let go stale, rather than regenerating
    fresh copies.

In particular, we set a grace period before expiration of CRLs where,
when the periodic function triggers (about once a minute), we check
upcoming CRL expirations and check if we need to rebuild the CRLs.

Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>

* Add changelog entry

Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>

* Add documentation on periodic rebuilding

Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>

* Allow modification of rollback period for testing

When testing backends that use the periodic func, and specifically,
testing the behavior of that periodic func, waiting for the usual 1m
interval can lead to excessively long test execution. By switching to a
shorter period--strictly for testing--we can make these tests execute
faster.

Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>

* Add tests for auto-rebuilding of CRLs

Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>

* Remove non-updating getConfig variant

Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>

* Avoid double reload of config

Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>

Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
2022-08-23 13:27:15 -04:00
Eng Zer Jun 61262ad98e
refactor: replace strings.Replace with strings.ReplaceAll (#15392)
strings.ReplaceAll(s, old, new) is a wrapper function for
strings.Replace(s, old, new, -1). But strings.ReplaceAll is more
readable and removes the hardcoded -1.

Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>
2022-08-03 15:22:48 -04:00
Brian Kassouf 303c2aee7c
Run a more strict formatter over the code (#11312)
* Update tooling

* Run gofumpt

* go mod vendor
2021-04-08 09:43:39 -07:00
Christian Muehlhaeuser e6febc5839 Fixed a bunch of typos (#7146) 2019-07-18 21:10:15 -04:00
Jeff Mitchell 66aef1f996
Make rollback less spammy (#7136)
We already have a separate log line if rollback fails. It really fills
up logs to always note when rollback is occurring and it usually isn't
useful for incidents.
2019-07-18 21:05:21 -04:00
Jeff Mitchell ffce5ca702 Fix various read only storage errors
* Fix various read only storage errors

A mistake we've seen multiple times in our own plugins and that we've
seen in the GCP plugin now is that control flow (how the code is
structured, helper functions, etc.) can obfuscate whether an error came
from storage or some other Vault-core location (in which case likely it
needs to be a 5XX message) or because of user input (thus 4XX). Error
handling for functions therefore often ends up always treating errors as
either user related or internal.

When the error is logical.ErrReadOnly this means that treating errors as
user errors skips the check that triggers forwarding, instead returning
a read only view error to the user.

While it's obviously more correct to fix that code, it's not always
immediately apparent to reviewers or fixers what the issue is and fixing
it when it's found both requires someone to hit the problem and report
it (thus exposing bugs to users) and selective targeted refactoring that
only helps that one specific case.

If instead we check whether the logical.Response is an error and, if so,
whether it contains the error value, we work around this in all of these
cases automatically. It feels hacky since it's a coding mistake, but
it's one we've made too multiple times, and avoiding bugs altogether is
better for our users.
2019-07-05 18:13:49 -04:00
Jeff Mitchell 8bcb533a1b
Create sdk/ and api/ submodules (#6583) 2019-04-12 17:54:35 -04:00
Brian Kassouf 2af9d9de98
Fix a locking issue in the Rollback manager (#6426)
* Fix a locking issue in the Rollback manager

* Update rollback.go

* Update rollback.go

* move state creation

* Update vault/rollback.go

Co-Authored-By: briankassouf <briankassouf@users.noreply.github.com>

* Simplify logic by canceling the lock grab

* Use context instead of a chan

* Update vault/rollback.go
2019-03-18 11:12:48 -07:00
Jim Kalafut d0e2badbae Run goimports across the repository (#6010)
The result will still pass gofmtcheck and won't trigger additional
changes if someone isn't using goimports, but it will avoid the
piecemeal imports changes we've been seeing.
2019-01-08 16:48:57 -08:00
Jeff Mitchell 919b968c27
The big one (#5346) 2018-09-17 23:03:00 -04:00
Jeff Mitchell c28ed23972
Allow most parts of Vault's logging to have its level changed on-the-fly (#5280)
* Allow most parts of Vault's logging to have its level changed on-the-fly

* Use a const for not set
2018-09-05 15:52:54 -04:00
Jeff Mitchell 362a92945e Don't resetnamed 2018-08-23 15:04:18 -04:00
Brian Kassouf 3330a3954a
Rollback: Only grab statelock if we are running async (#5064) 2018-08-07 17:48:43 -07:00
Brian Kassouf 1781ade3a1
core: Read lock when doing a rollback (#5016) 2018-07-30 19:55:43 -06:00
Jeff Mitchell 4261618d10 Add request timeouts in normal request path and to expirations (#4971)
* Add request timeouts in normal request path and to expirations

* Add ability to adjust default max request duration

* Some test fixes

* Ensure tests have defaults set for max request duration

* Add context cancel checking to inmem/file

* Fix tests

* Fix tests

* Set default max request duration to basically infinity for this release for BC

* Address feedback
2018-07-24 14:50:49 -07:00
Becca Petrin 03cf302e9a Move to "github.com/hashicorp/go-hclog" (#4227)
* logbridge with hclog and identical output

* Initial search & replace

This compiles, but there is a fair amount of TODO
and commented out code, especially around the
plugin logclient/logserver code.

* strip logbridge

* fix majority of tests

* update logxi aliases

* WIP fixing tests

* more test fixes

* Update test to hclog

* Fix format

* Rename hclog -> log

* WIP making hclog and logxi love each other

* update logger_test.go

* clean up merged comments

* Replace RawLogger interface with a Logger

* Add some logger names

* Replace Trace with Debug

* update builtin logical logging patterns

* Fix build errors

* More log updates

* update log approach in command and builtin

* More log updates

* update helper, http, and logical directories

* Update loggers

* Log updates

* Update logging

* Update logging

* Update logging

* Update logging

* update logging in physical

* prefixing and lowercase

* Update logging

* Move phyisical logging name to server command

* Fix som tests

* address jims feedback so far

* incorporate brians feedback so far

* strip comments

* move vault.go to logging package

* update Debug to Trace

* Update go-plugin deps

* Update logging based on review comments

* Updates from review

* Unvendor logxi

* Remove null_logger.go
2018-04-02 17:46:59 -07:00
Jeff Mitchell fc6564e4ee
Don't run rollback and upgrade functionality if we are a replication secondary (#3900)
* Don't run rollback and upgrade functionality if we are a replication
secondary, but do if the mount is local.
2018-02-02 20:28:25 -05:00
Brian Kassouf 2f19de0305 Add context to storage backends and wire it through a lot of places (#3817) 2018-01-19 01:44:44 -05:00
Brian Kassouf 1c190d4bda
Pass context to backends (#3750)
* Start work on passing context to backends

* More work on passing context

* Unindent logical system

* Unindent token store

* Unindent passthrough

* Unindent cubbyhole

* Fix tests

* use requestContext in rollback and expiration managers
2018-01-08 10:31:38 -08:00
Chris Hoffman 6651f3aa33 checking for nil backends before attempting to rollback on a backend (#2997) 2017-07-12 13:12:55 -07:00
Jeff Mitchell 8091a10c38 Make rollback attempts trace level instead of debug level 2017-02-27 09:41:56 -05:00
Jeff Mitchell fffee5611a Rejig locks during unmount/remount. (#1855) 2016-09-13 11:50:14 -04:00
Jeff Mitchell 58b32e5432 Convert to logxi 2016-08-21 18:13:37 -04:00
Jeff Mitchell 80a688c059 Ensure mount/auth tables are not nil when triggering rollback
During setup or teardown there could be a race condition so check for it
to avoid a potential panic.
2016-07-18 22:02:39 -04:00
vishalnayak 1d94828e45 Re-enable rollback triggers for auth backends 2016-05-26 14:29:41 -04:00
Jeff Mitchell 417a56c42b Disable rollback on auth for now and add workaround for its auth/ adding to entry paths 2016-05-25 17:53:45 -04:00
vishalnayak 9aa8fb6cc1 Support periodic tidy callback and config endpoints. 2016-04-26 10:22:29 -04:00
Jeff Mitchell bece637eb7 Address feedback from review 2015-11-15 17:32:57 -05:00
Jeff Mitchell bc4c18a1cf Rearchitect MountTable locking and fix rollback.
The rollback manager was using a saved MountTable rather than the
current table, causing it to attempt to rollback unmounted mounts, and
never rollback new mounts.

In fixing this, it became clear that bad things could happen to the
mount table...the table itself could be locked, but the table pointer
(which is what the rollback manager needs) could be modified at any time
without locking. This commit therefore also returns locking to a mutex
outside the table instead of inside, and plumbs RLock/RUnlock through to
the various places that are reading the table but not holding a write
lock.

Both unit tests and race detection pass.

Fixes #771
2015-11-11 11:54:52 -05:00
Armon Dadgar 496ebe561c vault: cleanups for the audit log changes 2015-06-29 15:27:28 -07:00
Nate Brown c55f103c58 Adding error and remote_address to audit log lines 2015-06-18 17:17:18 -07:00
Armon Dadgar 512b3d7afd vault: Adding metrics profiling 2015-04-08 16:43:17 -07:00
Armon Dadgar d74c4c1c33 vault: Remove log about rollback 2015-04-03 17:11:24 -07:00
Armon Dadgar f231a6c67d vault: rollback supports joining an inflight operation 2015-04-01 22:12:03 -07:00
Mitchell Hashimoto 6f9d63dea5 vault: comment mounts mapping in rollback manager 2015-03-17 20:53:28 -05:00
Mitchell Hashimoto 05f86ca957 vault: put uint32 at top of struct to avoid alignment issues 2015-03-17 20:46:10 -05:00
Mitchell Hashimoto 97dab0c285 vault: ignore backends that don't support rollback 2015-03-17 20:39:45 -05:00
Mitchell Hashimoto e078b957d4 vault: start/stop rollback manager post/pre seal 2015-03-17 20:39:45 -05:00
Mitchell Hashimoto c7b9148841 vault: RollbackManager
There are some major TODO items here, and it isn't hooked into the core
yet, but the basic functionality is there.
2015-03-17 20:39:45 -05:00