We're having issues with leases in the GCS backend storage being
corrupted and failing MAC checking. When that happens, we need to know
the lease ID so we can address the corruption by hand and take
appropriate action.
This will hopefully prevent any instances of incomplete data being sent
to GSS
* The added method customTLSDial() creates a tls connection to the zookeeper backend when 'tls_enabled' is set to true in config
* Update to the document for TLS configuration that is required to enable TLS connection to Zookeeper backend
* Minor formatting update
* Minor update to the description for example config
* As per review comments from @kenbreeman, additional property description indicating support for multiple Root CAs in a single file has been added
* minor formatting
* Slight cleanup around mysql ha lock implementation
* Removes some duplication around lock table naming
* Escapes lock table name with backticks to handle weird characters
* Lock table defaults to regular table name + "_lock"
* Drop lock table after tests run
* Add `ha_enabled` option for mysql storage
It defaults to false, and we gate a few things like creating the lock
table and preparing lock related statements on it
* storage/gcs: fix race condition in releasing lock
Previously we were deleting a lock without first checking if the lock we were deleting was our own. There existed a small period of time where vault-0 would lose leadership and vault-1 would get leadership. vault-0 would delete the lock key while vault-1 would write it. If vault-0 won, there'd be another leader election, etc.
This fixes the race by using a CAS operation instead.
* storage/gcs: properly break out of loop during stop
* storage/spanner: properly break out of loop during stop
when use mysql storage, set` database = "dev-dassets-bc"` , create database and create table will throw exceptions as follows:
Error initializing storage of type mysql: failed to create mysql database: Error 1064: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '-dassets-bc' at line 1
Error initializing storage of type mysql: failed to create mysql table: Error 1046: No database selected
cause of `-` is a MySQL built-in symbol. so add backtick for create database sql\create table sql \dml sqls.
* Add request timeouts in normal request path and to expirations
* Add ability to adjust default max request duration
* Some test fixes
* Ensure tests have defaults set for max request duration
* Add context cancel checking to inmem/file
* Fix tests
* Fix tests
* Set default max request duration to basically infinity for this release for BC
* Address feedback
* Add an idle timeout for the server
Because tidy operations can be long-running, this also changes all tidy
operations to behave the same operationally (kick off the process, get a
warning back, log errors to server log) and makes them all run in a
goroutine.
This could mean a sort of hard stop if Vault gets sealed because the
function won't have the read lock. This should generally be okay
(running tidy again should pick back up where it left off), but future
work could use cleanup funcs to trigger the functions to stop.
* Fix up tidy test
* Add deadline to cluster connections and an idle timeout to the cluster server, plus add readheader/read timeout to api server
If we have a panic defer functions are run but unlocks aren't. Since we
can't really trust plugins and storage, this backs out the changes for
those parts of the request path.
* Remove a lot of deferred functions in the request path.
There is an interesting benchmark at https://www.reddit.com/r/golang/comments/3h21nk/simple_micro_benchmark_to_measure_the_overhead_of/
It shows that defer actually adds quite a lot of overhead -- maybe 100ns
per call but we defer a *lot* of functions in the request path. So this
removes some of the ones in request handling, ha, barrier, router, and
physical cache.
One meta-note: nearly every metrics function is in a defer which means
every metrics call we add could add a non-trivial amount of time, e.g.
for every 10 extra metrics statements we add 1ms to a request. I don't
know how to solve this right now without doing what I did in some of
these cases and putting that call into a simple function call that then
goes before each return.
* Simplify barrier defer cleanup
Taking inspiration from
https://github.com/golang/go/issues/17604#issuecomment-256384471
suggests that taking the address of a stack variable for use in atomics
works (at least, the race detector doesn't complain) but is doing it
wrong.
The only other change is a change in Leader() detecting if HA is enabled
to fast-path out. This value never changes after NewCore, so we don't
need to grab the read lock to check it.
* Do some best-effort cleanup in file backend
If put results in an encoding error and after the file is closed we
detect it's zero bytes, it could be caused by an out of space error on
the disk since file info is often stored in filesystem metadata with
reserved space. This tries to detect that scenario and perform
best-effort cleanup. We only do this on zero length files to ensure that
if an encode fails to write but the system hasn't already performed
truncation, we leave the existing data alone.
Vault should never write a zero-byte file (as opposed to a zero-byte
value in the encoded JSON) so if this case is hit it's always an error.
* Also run a check on Get