This PR relates to a feature request logged through HashiCorp commercial
support.
Vault lacks pagination in its APIs. As a result, certain list operations
can return **very** large responses. The user's chosen audit sinks may
experience difficulty consuming audit records that swell to tens of
megabytes of JSON.
In our case, one of the systems consuming audit log data could not cope,
and failed.
The responses of list operations are typically not very interesting: they
are mostly lists of keys, and even when they include a "key_info" field,
they do not return confidential information. They become even less
interesting once HMAC-ed by the audit system.
Some example Vault "list" operations that are prone to becoming very
large in an active Vault installation are:
auth/token/accessors/
identity/entity/id/
identity/entity-alias/id/
pki/certs/
In response, I've added a new option that can be applied to audit
backends, `elide_list_responses`. When enabled, response data is elided
from audit logs, but only when the operation type is "list".
For added safety, the elision only applies to the "keys" and "key_info"
fields within the response data - these are conventionally the only
fields present in a list response - see logical.ListResponse and
logical.ListResponseWithInfo. However, other fields are technically
possible if a plugin author writes unusual code, and these will be
preserved in the audit log even with this option enabled.
The elision replaces the values of the "keys" and "key_info" fields with
an integer count of the number of entries. This allows even the elided
audit logs to still be useful for answering questions like "Was any data
returned?" or "How many records were listed?".
* fix adding clientID to request in audit log
* fix boolean statement
* use standard encoding for client ID instead of urlEncoding
* change encoding in tests
* add in client counts to request handling
* remove redundant client ID generation in request handling
* directly add clientID to req after handling token usage
* Send a test message before committing a new audit device.
Also, lower timeout on connection attempts in socket device.
* added changelog
* go mod vendor (picked up some unrelated changes.)
* Skip audit device check in integration test.
Co-authored-by: swayne275 <swayne@hashicorp.com>
* Populate a token_ttl and token_issue_time field on the Auth struct of audit log entries, and in the Auth portion of a response for login methods
* Revert go fmt, better zero checking
* Update unit tests
* changelog++
Before this change, http_raw_body was passed as a []byte, which doesn't get HMAC'd. The original non-HMACing behaviour can be obtained by adding "http_raw_body" to audit_non_hmac_response_keys. (#8130)
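For illustration, the opt-out might be applied to a mount via the Go API client along these lines (the mount path here is just an example):

```go
package main

import (
	"log"

	"github.com/hashicorp/vault/api"
)

func main() {
	client, err := api.NewClient(api.DefaultConfig())
	if err != nil {
		log.Fatal(err)
	}
	// Exclude http_raw_body from HMACing on this (example) mount.
	err = client.Sys().TuneMount("secret/", api.MountConfigInput{
		AuditNonHMACResponseKeys: []string{"http_raw_body"},
	})
	if err != nil {
		log.Fatal(err)
	}
}
```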
Add support for hashing time.Time within slices, which unbreaks auditing of requests returning the request counters.
Break Hash into struct-specific funcs like HashAuth and HashRequest. Move all the copying/hashing logic from FormatRequest/FormatResponse into the new Hash* funcs. HashStructure now modifies in place instead of copying.
Instead of returning an error when trying to hash map keys of type time.Time, ignore them, i.e. pass them through unhashed.
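A hypothetical sketch of that map-key rule (the helper below is illustrative, not the actual audit code):

```go
package audit

import (
	"reflect"
	"time"
)

// hashMapKeySketch illustrates the map-key behaviour: string keys are hashed
// with the supplied hasher, while time.Time keys are passed through unhashed
// rather than causing an error.
func hashMapKeySketch(k reflect.Value, hash func(string) string) reflect.Value {
	if _, ok := k.Interface().(time.Time); ok {
		return k // leave time.Time keys untouched
	}
	if k.Kind() == reflect.String {
		return reflect.ValueOf(hash(k.String()))
	}
	return k
}
```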
Enable auditing on test clusters by default if the caller didn't specify any audit backends. If they do, they're responsible for setting it up.
Move audit.LogInput to sdk/logical. Allow the Data values in audited
logical.Request and Response to implement OptMarshaler, in which case
we delegate hashing/serializing responsibility to them. Add new
ClientCertificateSerialNumber audit request field.
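A rough sketch of what a delegating Data value could look like, assuming the OptMarshaler shape in sdk/logical (a MarshalJSONWithOptions method receiving a MarshalOptions that carries the audit ValueHasher); the type itself is made up for illustration:

```go
package example

import (
	"encoding/json"

	"github.com/hashicorp/vault/sdk/logical"
)

// connectionDetails is a hypothetical response value that takes over its own
// audit serialization, hashing only the field it considers sensitive.
type connectionDetails struct {
	Host     string
	Password string
}

func (c connectionDetails) MarshalJSONWithOptions(opts *logical.MarshalOptions) ([]byte, error) {
	return json.Marshal(map[string]string{
		"host":     c.Host,                       // left readable in the audit log
		"password": opts.ValueHasher(c.Password), // hashed with the audit salt
	})
}
```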
SystemView can now be cast to ExtendedSystemView to expose the Auditor
interface, which allows submitting requests and responses to the audit
broker.
* This changes the way policies are reported in audit logs.
Previously, only policies tied to tokens would be reported. This could
make it difficult to perform after-the-fact analysis based on both the
initial response entry and further requests. Now, the full set of
applicable policies from both the token and any derived policies from
Identity are reported.
To keep things consistent, token authentications now also return the
full set of policies in api.Secret.Auth responses; this makes it easier
for users to understand their actual full set of policies, and it
matches what the audit logs now report.
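For illustration (the login path and credentials below are examples), reading the now-complete policy set via the Go API client might look like this:

```go
package main

import (
	"fmt"
	"log"

	"github.com/hashicorp/vault/api"
)

func main() {
	client, err := api.NewClient(api.DefaultConfig())
	if err != nil {
		log.Fatal(err)
	}
	// Illustrative login path and credentials.
	secret, err := client.Logical().Write("auth/userpass/login/alice", map[string]interface{}{
		"password": "example-password",
	})
	if err != nil {
		log.Fatal(err)
	}
	// Now the union of token policies and identity-derived policies,
	// matching what the audit log reports.
	fmt.Println(secret.Auth.Policies)
}
```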
* govet cleanup in token store
* adding general ttl handling to login requests
* consolidating TTL calculation to system view
* deprecate LeaseExtend
* deprecate LeaseExtend
* set the increment to the correct value
* move calculateTTL out of SystemView
* remove unused value
* add back clearing of lease id
* implement core ttl in some backends
* removing increment and issue time from lease options
* adding ttl tests, fixing some compile issues
* adding ttl tests
* fixing some explicit max TTL logic
* fixing up some tests
* removing unneeded test
* off-by-one errors...
* adding back some logic for backwards compatibility
* adding period to return on renewal
* tweaking max ttl capping slightly
* use the appropriate precision for ttl calculation
* deprecate proto fields instead of delete
* addressing feedback
* moving TTL handling for backends to core
* mongo is a secret backend, not auth
* adding estimated ttl for backends that also manage the expiration time
* set the estimate values before calling the renew request
* moving calculate TTL to framework, revert removal of increment and issue time from logical
* minor edits
* addressing feedback
* address more feedback
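As a rough illustration of the consolidated TTL capping these commits describe (a simplified, hypothetical version; the real framework code also handles periods, explicit max TTLs, issue times and warnings):

```go
package framework

import "time"

// calculateTTLSketch is a simplified, hypothetical version of the core TTL
// handling: the requested increment (or the backend default) is capped by the
// backend max TTL, which in turn is bounded by the system max TTL.
func calculateTTLSketch(sysMaxTTL, backendDefaultTTL, backendMaxTTL, increment time.Duration) time.Duration {
	maxTTL := backendMaxTTL
	if maxTTL <= 0 || maxTTL > sysMaxTTL {
		maxTTL = sysMaxTTL
	}

	ttl := increment
	if ttl <= 0 {
		ttl = backendDefaultTTL
	}
	if ttl <= 0 || ttl > maxTTL {
		ttl = maxTTL
	}
	return ttl
}
```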
* porting identity to OSS
* changes that glue things together
* add testing bits
* wrapped entity id
* fix mount error
* some more changes to core
* fix storagepacker tests
* fix some more tests
* fix mount tests
* fix http mount tests
* audit changes for identity
* remove upgrade structs on the oss side
* added go-memdb to vendor
* Store original request path in WrapInfo as CreationPath
* Add wrapping_token_creation_path to CLI output
* Add CreationPath to AuditResponseWrapInfo
* Fix tests
* Add and fix tests, update API docs with new sample responses
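For example (illustrative path, and assuming the client is configured to request response wrapping), the new field is visible on the wrap info returned by the Go API client:

```go
package main

import (
	"fmt"
	"log"

	"github.com/hashicorp/vault/api"
)

func main() {
	client, err := api.NewClient(api.DefaultConfig())
	if err != nil {
		log.Fatal(err)
	}
	// Assumes the client is set up to request response wrapping
	// (e.g. via SetWrappingLookupFunc or the wrap-TTL header).
	secret, err := client.Logical().Read("secret/example") // illustrative path
	if err != nil {
		log.Fatal(err)
	}
	if secret != nil && secret.WrapInfo != nil {
		fmt.Println("creation path:", secret.WrapInfo.CreationPath)
	}
}
```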
This enables audit.Hash to hash time.Time values that may exist as
direct fields in the map. This will error (instead of panic) for any
time.Time values that don't occur within map values. For example, this
does not support a time.Time within a slice. If that needs to be
supported then modifications will need to be made.
This also requires an update to reflectwalk (included in this PR). This
is a minimal change that allows SkipEntry to signal to skip an entire
struct. We do this because we don't want to walk any of time.Time since
we handle it directly.
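A minimal sketch of the walker behaviour this relies on (the struct and method set here are illustrative; the real walker lives in the audit hashing code):

```go
package audit

import (
	"reflect"
	"time"

	"github.com/mitchellh/reflectwalk"
)

// hashWalkerSketch illustrates the new reflectwalk behaviour: returning
// SkipEntry from Struct() skips the whole struct, so time.Time values are
// never walked field-by-field and can be handled directly instead.
type hashWalkerSketch struct{}

func (w *hashWalkerSketch) Struct(v reflect.Value) error {
	if _, ok := v.Interface().(time.Time); ok {
		return reflectwalk.SkipEntry // handled directly, don't descend
	}
	return nil
}

func (w *hashWalkerSketch) StructField(reflect.StructField, reflect.Value) error {
	return nil
}
```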
* audit: Added token_num_uses to audit response
* Fixed jsonx tests
* Revert logical auth to NumUses instead of TokenNumUses
* s/TokenNumUses/NumUses
* Audit: Add num uses to audit requests as well
* Added RemainingUses to distinguish NumUses in audit requests
A static token at the beginning of a log line can help systems parse
logs better. For example, rsyslog and syslog-ng will recognize the
'@cee: ' prefix and parse the rest of the line as a valid JSON message.
This is useful in environments where there is a mix of structured and
unstructured logs.
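For example, something along these lines enables a file audit device with that prefix via the Go API client (the path and options shown are illustrative):

```go
package main

import (
	"log"

	"github.com/hashicorp/vault/api"
)

func main() {
	client, err := api.NewClient(api.DefaultConfig())
	if err != nil {
		log.Fatal(err)
	}
	err = client.Sys().EnableAuditWithOptions("file", &api.EnableAuditOptions{
		Type: "file",
		Options: map[string]string{
			"file_path": "/var/log/vault_audit.log", // example path
			"prefix":    "@cee: ",                   // static token parsers can key on
		},
	})
	if err != nil {
		log.Fatal(err)
	}
}
```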