* Fix a deadlock if a panic happens during request handling
During request handling, if a panic is created, deferred functions are
run but otherwise execution stops. #5889 changed some locks to
non-defers but had the side effect of causing the read lock to not be
released if the request panicked. This fixes that and addresses a few
other potential places where things could go wrong:
1) In sealInitCommon we always now defer a function that unlocks the
read lock if it hasn't been unlocked already
2) In StepDown we defer the RUnlock but we also had two error cases that
were calling it manually. These are unlikely to be hit but if they were
I believe would cause a panic.
* Add panic recovery test
Move audit.LogInput to sdk/logical. Allow the Data values in audited
logical.Request and Response to implement OptMarshaler, in which case
we delegate hashing/serializing responsibility to them. Add new
ClientCertificateSerialNumber audit request field.
SystemView can now be cast to ExtendedSystemView to expose the Auditor
interface, which allows submitting requests and responses to the audit
broker.
* Port over some SP v2 bits
Specifically:
* Add too-large handling to Physical (Consul only for now)
* Contextify some identity funcs
* Update SP protos
* Add size limiting to inmem storage
Increment a counter whenever a request is received.
The in-memory counter is persisted to counters/requests/YYYY/MM.
When the month wraps around, we reset the in-memory counter to
zero.
Add an endpoint for querying the request counters across all time.
* Fixes a regression in forwarding from #6115
Although removing the authentication header is good defense in depth,
for forwarding mechanisms that use the raw request, we never add it
back. This caused perf standby tests to throw errors. Instead, once
we're past the point at which we would do any raw forwarding, but before
routing the request, remove the header.
To speed this up, a flag is set in the logical.Request to indicate where
the token is sourced from. That way we don't iterate through maps
unnecessarily.
The result will still pass gofmtcheck and won't trigger additional
changes if someone isn't using goimports, but it will avoid the
piecemeal imports changes we've been seeing.
* Fix for using ExplicitMaxTTL in auth method plugins.
* Reverted pb.go files for readability of PR.
* Fixed indenting of comment.
* Reverted unintended change by go test.
* Initial work on templating
* Add check for unbalanced closing in front
* Add missing templated assignment
* Add first cut of end-to-end test on templating.
* Make template errors be 403s and finish up testing
* Review feedback
* plumbing request context to expiration manager
* moar context
* address feedback
* only using active context for revoke prefix
* using active context for revoke commands
* cancel tidy on active context
* address feedback
* core: Cancel context before taking state lock
* Create active context outside of postUnseal
* Attempt to drain requests before canceling context
* fix test
* Add request timeouts in normal request path and to expirations
* Add ability to adjust default max request duration
* Some test fixes
* Ensure tests have defaults set for max request duration
* Add context cancel checking to inmem/file
* Fix tests
* Fix tests
* Set default max request duration to basically infinity for this release for BC
* Address feedback
* Tackle #4929 a different way
This turns c.sealed into an atomic, which allows us to call sealInternal
without a lock. By doing so we can better control lock grabbing when a
condition causing the standby loop to get out of active happens. This
encapsulates that logic into two distinct pieces (although they could
be combined into one), and makes lock guarding more understandable.
* Re-add context canceling to the non-HA version of sealInternal
* Return explicitly after stopCh triggered
1) In backends, ensure they are now using TokenPolicies
2) Don't reassign auth.Policies until after expmgr registration as we
don't need them at that point
Fixes#4829
* Allow max request size to be user-specified
This turned out to be way more impactful than I'd expected because I
felt like the right granularity was per-listener, since an org may want
to treat external clients differently from internal clients. It's pretty
straightforward though.
This also introduces actually using request contexts for values, which
so far we have not done (using our own logical.Request struct instead),
but this allows non-logical methods to still get this benefit.
* Switch to ioutil.ReadAll()
If we have a panic defer functions are run but unlocks aren't. Since we
can't really trust plugins and storage, this backs out the changes for
those parts of the request path.
* This changes the way policies are reported in audit logs.
Previously, only policies tied to tokens would be reported. This could
make it difficult to perform after-the-fact analysis based on both the
initial response entry and further requests. Now, the full set of
applicable policies from both the token and any derived policies from
Identity are reported.
To keep things consistent, token authentications now also return the
full set of policies in api.Secret.Auth responses, so this both makes it
easier for users to understand their actual full set, and it matches
what the audit logs now report.
* Remove a lot of deferred functions in the request path.
There is an interesting benchmark at https://www.reddit.com/r/golang/comments/3h21nk/simple_micro_benchmark_to_measure_the_overhead_of/
It shows that defer actually adds quite a lot of overhead -- maybe 100ns
per call but we defer a *lot* of functions in the request path. So this
removes some of the ones in request handling, ha, barrier, router, and
physical cache.
One meta-note: nearly every metrics function is in a defer which means
every metrics call we add could add a non-trivial amount of time, e.g.
for every 10 extra metrics statements we add 1ms to a request. I don't
know how to solve this right now without doing what I did in some of
these cases and putting that call into a simple function call that then
goes before each return.
* Simplify barrier defer cleanup
* Hand off lease expiration to expiration manager via timers
* Use sync.Map as the cache to track token deletion state
* Add CreateOrFetchRevocationLeaseByToken to hand off token revocation to exp manager
* Update revoke and revoke-self handlers
* Fix tests
* revokeSalted: Move token entry deletion into the deferred func
* Fix test race
* Add blocking lease revocation test
* Remove test log
* Add HandlerFunc on NoopBackend, adjust locks, and add test
* Add sleep to allow for revocations to settle
* Various updates
* Rename some functions and variables to be more clear
* Change step-down and seal to use expmgr for revoke functionality like
during request handling
* Attempt to WAL the token as being invalid as soon as possible so that
further usage will fail even if revocation does not fully complete
* Address feedback
* Return invalid lease on negative TTL
* Revert "Return invalid lease on negative TTL"
This reverts commit a39597ecdc23cf7fc69fe003eef9f10d533551d8.
* Extend sleep on tests
This takes place in two parts, since working on this exposed an issue
with response wrapping when there is a raw body set. The changes are (in
diff order):
* A CurrentWrappingLookupFunc has been added to return the current
value. This is necessary for the lookahead call since we don't want the
lookahead call to be wrapped.
* Support for unwrapping < 0.6.2 tokens via the API/CLI has been
removed, because we now have backends returning 404s with data and can't
rely on the 404 trick. These can still be read manually via
cubbyhole/response.
* KV preflight version request now ensures that its calls is not
wrapped, and restores any given function after.
* When responding with a raw body, instead of always base64-decoding a
string value and erroring on failure, on failure we assume that it
simply wasn't a base64-encoded value and use it as is.
* A test that fails on master and works now that ensures that raw body
responses that are wrapped and then unwrapped return the expected
values.
* A flag for response data that indicates to the wrapping handling that
the data contained therein is already JSON decoded (more later).
* RespondWithStatusCode now defaults to a string so that the value is
HMAC'd during audit. The function always JSON encodes the body, so
before now it was always returning []byte which would skip HMACing. We
don't know what's in the data, so this is a "better safe than sorry"
issue. If different behavior is needed, backends can always manually
populate the data instead of relying on the helper function.
* We now check unwrapped data after unwrapping to see if there were raw
flags. If so, we try to detect whether the value can be unbase64'd. The
reason is that if it can it was probably originally a []byte and
shouldn't be audit HMAC'd; if not, it was probably originally a string
and should be. In either case, we then set the value as the raw body and
hit the flag indicating that it's already been JSON decoded so not to
try again before auditing. Doing it this way ensures the right typing.
* There is now a check to see if the data coming from unwrapping is
already JSON decoded and if so the decoding is skipped before setting
the audit response.
* govet cleanup in token store
* adding general ttl handling to login requests
* consolidating TTL calculation to system view
* deprecate LeaseExtend
* deprecate LeaseExtend
* set the increment to the correct value
* move calculateTTL out of SystemView
* remove unused value
* add back clearing of lease id
* implement core ttl in some backends
* removing increment and issue time from lease options
* adding ttl tests, fixing some compile issue
* adding ttl tests
* fixing some explicit max TTL logic
* fixing up some tests
* removing unneeded test
* off by one errors...
* adding back some logic for bc
* adding period to return on renewal
* tweaking max ttl capping slightly
* use the appropriate precision for ttl calculation
* deprecate proto fields instead of delete
* addressing feedback
* moving TTL handling for backends to core
* mongo is a secret backend not auth
* adding estimated ttl for backends that also manage the expiration time
* set the estimate values before calling the renew request
* moving calculate TTL to framework, revert removal of increment and issue time from logical
* minor edits
* addressing feedback
* address more feedback
* logbridge with hclog and identical output
* Initial search & replace
This compiles, but there is a fair amount of TODO
and commented out code, especially around the
plugin logclient/logserver code.
* strip logbridge
* fix majority of tests
* update logxi aliases
* WIP fixing tests
* more test fixes
* Update test to hclog
* Fix format
* Rename hclog -> log
* WIP making hclog and logxi love each other
* update logger_test.go
* clean up merged comments
* Replace RawLogger interface with a Logger
* Add some logger names
* Replace Trace with Debug
* update builtin logical logging patterns
* Fix build errors
* More log updates
* update log approach in command and builtin
* More log updates
* update helper, http, and logical directories
* Update loggers
* Log updates
* Update logging
* Update logging
* Update logging
* Update logging
* update logging in physical
* prefixing and lowercase
* Update logging
* Move phyisical logging name to server command
* Fix som tests
* address jims feedback so far
* incorporate brians feedback so far
* strip comments
* move vault.go to logging package
* update Debug to Trace
* Update go-plugin deps
* Update logging based on review comments
* Updates from review
* Unvendor logxi
* Remove null_logger.go
* Add some requirements for versioned k/v
* Add a warning message when an upgrade is triggered
* Add path help values
* Make the kv header a const
* Add the uid to mount entry instead of options map
* Pass the backend aware uuid to the mounts and plugins
* Fix comment
* Add options to secret/auth enable and tune CLI commands (#4170)
* Switch mount/tune options to use TypeKVPairs (#4171)
* switching options to TypeKVPairs, adding bool parse for versioned flag
* flipping bool check
* Fix leases coming back from non-leased pluin kv store
* add a test for updating mount options
* Fix tests
* Start work on passing context to backends
* More work on passing context
* Unindent logical system
* Unindent token store
* Unindent passthrough
* Unindent cubbyhole
* Fix tests
* use requestContext in rollback and expiration managers
* Add logic for using Auth.Period when handling auth login/renew requests
* Set auth.TTL if not set in handleLoginRequest
* Always set auth.TTL = te.TTL on handleLoginRequest, check TTL and period against sys values on RenewToken
* Get sysView from le.Path, revert tests
* Add back auth.Policies
* Fix TokenStore tests, add resp warning when capping values
* Use switch for ttl/period check on RenewToken
* Move comments around
* external identity groups
* add local LDAP groups as well to group aliases
* add group aliases for okta credential backend
* Fix panic in tests
* fix build failure
* remove duplicated struct tag
* add test steps to test out removal of group member during renewals
* Add comment for having a prefix check in router
* fix tests
* s/parent_id/canonical_id
* s/parent/canonical in comments and errors
This removes all references I could find to:
- credential provider
- authentication backend
- authentication provider
- auth provider
- auth backend
in favor of the unified:
- auth method
* porting identity to OSS
* changes that glue things together
* add testing bits
* wrapped entity id
* fix mount error
* some more changes to core
* fix storagepacker tests
* fix some more tests
* fix mount tests
* fix http mount tests
* audit changes for identity
* remove upgrade structs on the oss side
* added go-memdb to vendor
* Store original request path in WrapInfo as CreationPath
* Add wrapping_token_creation_path to CLI output
* Add CreationPath to AuditResponseWrapInfo
* Fix tests
* Add and fix tests, update API docs with new sample responses
* exclude /sys/leases/renew from registering with expiration manager
* adding sys/leases/renew to return full secret object, adding tests to catch renew errors
A change in copystructure has caused some panics due to the custom copy
function. I'm more nervous about production panics than I am about
keeping some bad code wiping out some existing warnings, so remove the
custom copy function and just allow direct setting of Warnings.
* Add /sys/config/audited-headers endpoint for configuring the headers that will be audited
* Remove some debug lines
* Add a persistant layer and refactor a bit
* update the api endpoints to be more restful
* Add comments and clean up a few functions
* Remove unneeded hash structure functionaility
* Fix existing tests
* Add tests
* Add test for Applying the header config
* Add Benchmark for the ApplyConfig method
* ResetTimer on the benchmark:
* Update the headers comment
* Add test for audit broker
* Use hyphens instead of camel case
* Add size paramater to the allocation of the result map
* Fix the tests for the audit broker
* PR feedback
* update the path and permissions on config/* paths
* Add docs file
* Fix TestSystemBackend_RootPaths test
We've disabled this in the token store, but it makes no sense to have
that disabled but have it enabled elsewhere. It's the same issue across
all, so simply remove the ability altogether.