open-vault

Commit Graph

Author	SHA1	Message	Date
Brian Kassouf	a112161f60	expiration: Add a few metrics to measure revoke queue lengths (#10955 ) * expiration: Add a few metrics to measure revoke queue lengths * Update the metric names * Add appropriate cluster labels * Add metrics to docs * Update jobmanager.go	2021-02-26 16:00:39 -08:00
swayne275	38a647c6e5	remove noisy log, simplify job interface (#10975 )	2021-02-22 15:00:24 -07:00
Brian Kassouf	0ad63e5a20	core/expiration: Add backoff jitter to the expiration retries (#10937 )	2021-02-18 20:20:01 -08:00
swayne275	e4119a6a8a	Vault-1403 Switch Expiration Manager to use Fairsharing Backpressure (#1709 ) (#10932 ) * basic pool and start testing * refactor a bit for testing * workFunc, start/stop safety, testing * cleanup function for worker quit, more tests * redo public/private members * improve tests, export types, switch uuid package * fix loop capture bug, cleanup * cleanup tests * update worker pool file name, other improvements * add job manager prototype * remove remnants * add functions to wait for job manager and worker pool to stop, other fixes * test job manager functionality, fix bugs * encapsulate how jobs are distributed to workers * make worker job channel read only * add job interface, more testing, fixes * set name for dispatcher * fix test races * wire up expiration manager most of the way * dispatcher and job manager constructors don't return errors * logger now dependency injected * make some members private, test fcn to get worker pool size * make GetNumWorkers public * Update helper/fairshare/jobmanager_test.go Co-authored-by: Brian Kassouf <briankassouf@users.noreply.github.com> * update fairsharing usage, add tests * make workerpool private * remove custom worker names * concurrency improvements * remove worker pool cleanup function * remove cleanup func from job manager, remove non blocking stop from fairshare * update job manager for new constructor * stop job manager when expiration manager stopped * unset env var after test * stop fairshare when started in tests * stop leaking job manager goroutine * prototype channel for waking up to assign work * fix typo/bug and add tests * improve job manager wake up, fix test typo * put channel drain back * better start/pause test for job manager * comment cleanup * degrade possible noisy log * remove closure, clean up context * improve revocation context timer * test: reduce number of revocation workers during many tests * Update vault/expiration.go Co-authored-by: Brian Kassouf <briankassouf@users.noreply.github.com> * feedback tweaks Co-authored-by: Brian Kassouf <briankassouf@users.noreply.github.com> Co-authored-by: Brian Kassouf <briankassouf@users.noreply.github.com>	2021-02-17 14:30:27 -08:00
Nick Cabatoff	c2bdeb9e7d	Minimal change to ensure that the bulky leaseEntry isn't kept in memory. (#10726 )	2021-01-19 17:51:41 -05:00
swayne275	cdf933adf1	say how many leases there are when threshold exceeded (#10567 )	2020-12-14 16:00:19 -07:00
Hridoy Roy	6261afb343	Port: Telemetry For Lease Expiration Times (#10375 ) * port lease metrics * go mod vendor * caught a bug	2020-11-13 10:26:58 -08:00
Brian Kassouf	8af08c3221	Add an env var to enable a permit pool that limits lease expirations (#10268 ) * Add a flag to enable a permit pool to gate lease expiration * Use the env var to get the size * Add logs and metris to help debug this Co-authored-by: Hridoy Roy <roy@hashicorp.com>	2020-10-30 14:45:44 -07:00
Brian Kassouf	cb37fda0a7	Expiration: Fix lease counting logic (#10106 )	2020-10-07 17:27:45 -07:00
Brian Kassouf	b0d3d9bf49	Update lease timer logic (#10030 )	2020-09-23 11:46:22 -07:00
Brian Kassouf	3f30fc5f4e	Port changes from enterprise lease fix (#10020 )	2020-09-22 14:47:13 -07:00
Mark Gritter	707fdea702	Don't return quota error on revoke. (#9374 ) Changed log messages to be clearer about quota operations. This should fix enterprise unit test failures.	2020-07-01 14:41:42 -05:00
Vishal Nayak	c6876fe00f	Resource Quotas: Rate Limiting (#9330 )	2020-06-26 17:13:16 -04:00
Mark Gritter	97d415d024	Token gauge metrics implementation. (#9239 ) * Token gauge metrics implementation. * Enable gauges only when interval is nonzero. * Added count by TTL * Yandle "in restore mode" error specifically. * Refactored initialization code for gauge collection processes. * Fixed for multiple namespaces. * Ability to disable individual gauges with environment variable. * changelog++	2020-06-23 18:36:24 -05:00
Mark Gritter	50b388a93c	Changes to expiration manager to walk tokens (#9182 ) * Changes to expiration manager to walk tokens (including non-expiring ones.) * Count by namespace in token manager. * Keep a dictionary of policy lists and deduplicate based on it.	2020-06-15 18:54:36 -05:00
Mark Gritter	71b3de0450	Switch expiration manager's pending map to a sync.Map. (#8589 )	2020-05-21 12:41:03 -05:00
ncabatoff	c6518cc3f0	Make sure if a user gets removed from all groups in the external system, Vault updates itself accordingly. This is CVE-2020-10660. (#8606 )	2020-03-23 18:00:26 -04:00
ncabatoff	5fe1ab766b	Add option to detect deadlocks in Core.stateLock using build tag `deadlock` (#8524 )	2020-03-10 16:01:20 -04:00
Calvin Leung Huang	bbaa7f8ea9	core: revoke the proper token on partial failures from token-related requests (#7835 ) * core: revoke the proper token on partial failures from token-related requests * move test to vault package, move test trigger to expiration manager * update logging messages for clarity * docstring fix	2019-11-08 13:14:03 -08:00
Calvin Leung Huang	dac03d44e6	port namespace lease revocation fix (#7836 )	2019-11-07 14:10:47 -08:00
Jeff Mitchell	44e899afd1	Don't allow registering a non-root zero TTL token lease (#7524 ) * Don't allow registering a non-root zero TTL token lease This is defense-in-depth in that such a token was not allowed to be used; however it's also a bug fix in that this would then cause no lease to be generated but the token entry to be written, meaning the token entry would stick around until it was attempted to be used or tidied (in both cases the internal lookup would see that this was invalid and do a revoke on the spot). * Fix tests * tidy	2019-11-05 16:11:13 -05:00
ncabatoff	e7fe4b6d92	Return a useful error on attempts to renew a token via sys/leases/renew (#7298 )	2019-10-02 10:55:20 -04:00
Jeff Mitchell	e8a9d47aca	Port over some SP v2 bits (#6516 ) * Port over some SP v2 bits Specifically: * Add too-large handling to Physical (Consul only for now) * Contextify some identity funcs * Update SP protos * Add size limiting to inmem storage	2019-05-01 13:47:41 -04:00
Jeff Mitchell	213b9fd1cf	Update to api 1.0.1 and sdk 0.1.8	2019-04-15 14:10:07 -04:00
Jeff Mitchell	9ebc57581d	Switch to go modules (#6585 ) * Switch to go modules * Make fmt	2019-04-13 03:44:06 -04:00
Jeff Mitchell	8bcb533a1b	Create sdk/ and api/ submodules (#6583 )	2019-04-12 17:54:35 -04:00
Brian Kassouf	7b910a093b	Handle ns lease and token renew/revoke via relative paths (#6236 ) (#6312 ) * Handle ns lease and token renew/revoke via relative paths * s/usin/using/ * add token and lease lookup paths; set ctx only on non-nil ns Addtionally, use client token's ns for auth/token/lookup if no token is provided	2019-02-28 16:02:25 -08:00
Jim Kalafut	2547d7fb6a	Simplify base62.Random (#5982 ) Also move existing base62 encode/decode operations to their only points of use.	2018-12-20 07:40:01 -08:00
Konstantinos Tsanaktsidis	f75e3603ba	Paper over GCS backend corruption issues (#5804 ) We're having issues with leases in the GCS backend storage being corrupted and failing MAC checking. When that happens, we need to know the lease ID so we can address the corruption by hand and take appropriate action. This will hopefully prevent any instances of incomplete data being sent to GSS	2018-11-16 08:07:06 -05:00
Jeff Mitchell	9f6dd376e2	Merge branch 'master-oss' into 1.0-beta-oss	2018-10-19 17:47:58 -04:00
Chris Hoffman	09a4c8214f	safely clean up loaded map (#5558 )	2018-10-19 15:21:42 -04:00
Jeff Mitchell	a64fc7d7cb	Batch tokens (#755 )	2018-10-15 12:56:24 -04:00
Vivek Lakshmanan	2c55777606	Fix expiration handling to not leak goroutines (#5506 ) * Fix expiration handling to not leak goroutines * Apply feedback	2018-10-12 19:02:59 -07:00
Calvin Leung Huang	b47e648ddf	Logger cleanup (#5480 )	2018-10-09 09:43:17 -07:00
Joel Thompson	73112c49fb	logical/aws: Harden WAL entry creation (#5202 ) * logical/aws: Harden WAL entry creation If AWS IAM user creation failed in any way, the WAL corresponding to the IAM user would get left around and Vault would try to roll it back. However, because the user never existed, the rollback failed. Thus, the WAL would essentially get "stuck" and Vault would continually attempt to roll it back, failing every time. A similar situation could arise if the IAM user that Vault created got deleted out of band, or if Vault deleted it but was unable to write the lease revocation back to storage (e.g., a storage failure). This attempts to harden it in two ways. One is by deleting the WAL log entry if the IAM user creation fails. However, the WAL deletion could still fail, and this wouldn't help where the user is deleted out of band, so second, consider the user rolled back if the user just doesn't exist, under certain circumstances. Fixes #5190 * Fix segfault in expiration unit tests TestExpiration_Tidy was passing in a leaseEntry that had a nil Secret, which then caused a segfault as the changes to revokeEntry didn't check whether Secret was nil; this is probably unlikely to occur in real life, but good to be extra cautious. * Fix potential segfault Missed the else... * Respond to PR feedback	2018-09-27 09:54:59 -05:00
Jeff Mitchell	919b968c27	The big one (#5346 )	2018-09-17 23:03:00 -04:00
Jeff Mitchell	c28ed23972	Allow most parts of Vault's logging to have its level changed on-the-fly (#5280 ) * Allow most parts of Vault's logging to have its level changed on-the-fly * Use a const for not set	2018-09-05 15:52:54 -04:00
Jeff Mitchell	362a92945e	Don't resetnamed	2018-08-23 15:04:18 -04:00
Jeff Mitchell	50197d5bfd	Only write valid group alias memberships into leases (#5164 )	2018-08-22 21:53:04 -04:00
Jeff Mitchell	21cd0dd71a	Use strings.Contains for error possibly coming from storage They may not well errwrap Fixes #5046	2018-08-17 16:06:47 -04:00
Chris Hoffman	d8b1d19ed6	Plumbing request context through to expiration manager (#5021 ) * plumbing request context to expiration manager * moar context * address feedback * only using active context for revoke prefix * using active context for revoke commands * cancel tidy on active context * address feedback	2018-08-01 21:39:39 -04:00
Jeff Mitchell	4261618d10	Add request timeouts in normal request path and to expirations (#4971 ) * Add request timeouts in normal request path and to expirations * Add ability to adjust default max request duration * Some test fixes * Ensure tests have defaults set for max request duration * Add context cancel checking to inmem/file * Fix tests * Fix tests * Set default max request duration to basically infinity for this release for BC * Address feedback	2018-07-24 14:50:49 -07:00
Jeff Mitchell	1d99b7fd05	Properly watch quit context in expireID instead of locking first (#4970 )	2018-07-20 17:00:09 -04:00
Brian Kassouf	57d9c335d8	Don't shutdown if we lose leadership during lease restoration (#4924 ) * Don't shutdown if we lose leadership during lease restoration * Update comment	2018-07-13 11:30:08 -07:00
Jeff Mitchell	98bf463a65	Make single-lease revocation behave like expiration (#4883 ) This change makes it so that if a lease is revoked through user action, we set the expiration time to now and update pending, just as we do with tokens. This allows the normal retry logic to apply in these cases as well, instead of just erroring out immediately. The idea being that once you tell Vault to revoke something it should keep doing its darndest to actually make that happen.	2018-07-11 15:45:35 -04:00
Jeff Mitchell	e52b554c0b	Add an idle timeout for the server (#4760 ) * Add an idle timeout for the server Because tidy operations can be long-running, this also changes all tidy operations to behave the same operationally (kick off the process, get a warning back, log errors to server log) and makes them all run in a goroutine. This could mean a sort of hard stop if Vault gets sealed because the function won't have the read lock. This should generally be okay (running tidy again should pick back up where it left off), but future work could use cleanup funcs to trigger the functions to stop. * Fix up tidy test * Add deadline to cluster connections and an idle timeout to the cluster server, plus add readheader/read timeout to api server	2018-06-16 18:21:33 -04:00
Jeff Mitchell	45da5a45ba	Store lease times suitable for export in pending (#4730 ) * Store lease times suitable for export in pending This essentially caches lease information for token lookups, preventing going to disk over and over. * Simplify logic	2018-06-11 11:58:56 -04:00
Jeff Mitchell	8916f6b625	Some atomic cleanup (#4732 ) Taking inspiration from https://github.com/golang/go/issues/17604#issuecomment-256384471 suggests that taking the address of a stack variable for use in atomics works (at least, the race detector doesn't complain) but is doing it wrong. The only other change is a change in Leader() detecting if HA is enabled to fast-path out. This value never changes after NewCore, so we don't need to grab the read lock to check it.	2018-06-09 15:35:22 -04:00
Jeff Mitchell	be64c859e1	Make sure updating pending and storage are done as a group (#4727 )	2018-06-08 17:24:44 -04:00
Jeff Mitchell	575a606db7	Move TokenEntry into logical. (#4729 ) This allows the HTTP logicalAuth handler to cache the value in the logical.Request, avoiding a lookup later when performing acl checks/counting a use.	2018-06-08 17:24:27 -04:00

191 Commits