Establishing leadership should be very fast and never make external API
calls.
This fixes a situation where there is a long backlog of Vault tokens to
be revoked on when leadership is gained. In such case, revoking the
tokens will significantly slow down leadership establishment and slow
down processing. Worse, the revocation call does not honor leadership
`stopCh` signals, so it will not stop when the leader loses leadership.
Vault 1.2.0 deprecated `period` field in favor of `token_period` in auth
role:
> * Token store roles use new, common token fields for the values
> that overlap with other auth backends. `period`, `explicit_max_ttl`, and
> `bound_cidrs` will continue to work, with priority being given to the
> `token_` prefixed versions of those parameters. They will also be returned
> when doing a read on the role if they were used to provide values initially;
> however, in Vault 1.4 if `period` or `explicit_max_ttl` is zero they will no
> longer be returned. (`explicit_max_ttl` was already not returned if empty.)
https://github.com/hashicorp/vault/blob/master/CHANGELOG.md#120-july-30th-2019
This seems to be the minimum viable patch for fixing a deadlock between
establishConnection and SetConfig.
SetConfig calls tomb.Kill+tomb.Wait while holding v.lock.
establishConnection needs to acquire v.lock to exit but SetConfig is
holding v.lock until tomb.Wait exits. tomb.Wait can't exit until
establishConnect does!
```
SetConfig -> tomb.Wait
^ |
| v
v.lock <- establishConnection
```
Keep attempting to renew Vault token past locally recorded expiry, just
in case the token was renewed out of band, e.g. on another Nomad server,
until Vault returns an unrecoverable error.
This PR removes enforcement that the Vault token role disallows orphaned
tokens and recommends orphaned tokens to simplify the
bootstrapping/upgrading of Nomad clusters. The requirement that Nomad's
Vault token never expire and be shared by all instances of Nomad servers
is not operationally friendly.
The Vault API returns a nil secret and nil error when reading an object
that doesn't exist. The old code assumed an error would be returned and
thus will panic when trying to validate a non-existant role.
This PR fixes our vet script and fixes all the missed vet changes.
It also fixes pointers being printed in `nomad stop <job>` and `nomad
node-status <node>`.
Token revocation
Remove from the statestore
Revoke tokens
Don't error when Vault is disabled as this could cause issue if the operator ever goes from enabled to disabled
update server interface to allow enable/disable and config loading
test the new functions
Leader revoke
Use active