This PR relates to a feature request logged through HashiCorp commercial
support.
Vault lacks pagination in its APIs. As a result, certain list operations
can return **very** large responses. The user's chosen audit sinks may
experience difficulty consuming audit records that swell to tens of
megabytes of JSON.
In our case, one of the systems consuming audit log data could not cope,
and failed.
The responses of list operations are typically not very interesting, as
they are mostly lists of keys, or, even when they include a "key_info"
field, are not returning confidential information. They become even less
interesting once HMAC-ed by the audit system.
Some example Vault "list" operations that are prone to becoming very
large in an active Vault installation are:
auth/token/accessors/
identity/entity/id/
identity/entity-alias/id/
pki/certs/
In response, I've coded a new option that can be applied to audit
backends, `elide_list_responses`. When enabled, response data is elided
from audit logs, only when the operation type is "list".
For added safety, the elision only applies to the "keys" and "key_info"
fields within the response data - these are conventionally the only
fields present in a list response - see logical.ListResponse, and
logical.ListResponseWithInfo. However, other fields are technically
possible if a plugin author writes unusual code, and these will be
preserved in the audit log even with this option enabled.
The elision replaces the values of the "keys" and "key_info" fields with
an integer count of the number of entries. This allows even the elided
audit logs to still be useful for answering questions like "Was any data
returned?" or "How many records were listed?".
Before this it was passed as a []byte, which doesn't get HMAC'd. The original non-HMACing behaviour can be obtained by adding "http_raw_body" to audit_non_hmac_response_keys. (#8130)
Add support for hashing time.Time within slices, which unbreaks auditing of requests returning the request counters.
Break Hash into struct-specific func like HashAuth, HashRequest. Move all the copying/hashing logic from FormatRequest/FormatResponse into the new Hash* funcs. HashStructure now modifies in place instead of copying.
Instead of returning an error when trying to hash map keys of type time.Time, ignore them, i.e. pass them through unhashed.
Enable auditing on test clusters by default if the caller didn't specify any audit backends. If they do, they're responsible for setting it up.
This enables audit.Hash to hash time.Time values that may exist as
direct fields in the map. This will error (instead of panic) for any
time.Time values that don't occur within map values. For example, this
does not support a time.Time within a slice. If that needs to be
supported then modifications will need to be made.
This also requires an update to reflectwalk (included in this PR). This
is a minimal change that allows SkipEntry to signal to skip an entire
struct. We do this because we don't want to walk any of time.Time since
we handle it directly.
with a new endpoint '/sys/audit-hash', which returns the given input
string hashed with the given audit backend's hash function and salt
(currently, always HMAC-SHA256 and a backend-specific salt).
In the process of adding the HTTP handler, this also removes the custom
HTTP handlers for the other audit endpoints, which were simply
forwarding to the logical system backend. This means that the various
audit functions will now redirect correctly from a standby to master.
(Tests all pass.)
Fixes#784