If the local Consul agent is not available while attempting to step down from, or up to, active, retry once a second. Allow for concurrent state changes by funneling them through a single registration updater. Fix standby initialization.
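A minimal sketch of that retry behavior in Go; `updateRegistration` is a hypothetical stand-in for the call that pushes Vault's state to the local Consul agent, not the actual function name:
```go
package main

import (
	"errors"
	"log"
	"time"
)

// updateRegistration is a hypothetical stand-in for the call that updates
// Vault's service registration (active/standby, sealed/unsealed) in the
// local Consul agent.
func updateRegistration() error {
	return errors.New("local Consul agent not reachable")
}

// retryUntilRegistered retries once a second until the local agent accepts
// the update or shutdown is signaled.
func retryUntilRegistered(shutdown <-chan struct{}) {
	for {
		if err := updateRegistration(); err == nil {
			return
		}
		select {
		case <-time.After(time.Second):
		case <-shutdown:
			return
		}
	}
}

func main() {
	shutdown := make(chan struct{})
	go retryUntilRegistered(shutdown)
	time.Sleep(3 * time.Second)
	close(shutdown)
	log.Println("stopped retrying registration")
}
```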
Vault will now register itself with Consul. The active node can be found using `active.vault.service.consul`. All standby vaults are available via `standby.vault.service.consul`. All unsealed vaults are considered healthy and available via `vault.service.consul`. Changes in status and registration are event driven and should happen at the speed of a write to Consul (~network RTT + ~1x fsync(2)).
Healthy/active:
```
curl -X GET 'http://127.0.0.1:8500/v1/health/service/vault?pretty' && echo;
[
  {
    "Node": {
      "Node": "vm1",
      "Address": "127.0.0.1",
      "TaggedAddresses": {
        "wan": "127.0.0.1"
      },
      "CreateIndex": 3,
      "ModifyIndex": 20
    },
    "Service": {
      "ID": "vault:127.0.0.1:8200",
      "Service": "vault",
      "Tags": [
        "active"
      ],
      "Address": "127.0.0.1",
      "Port": 8200,
      "EnableTagOverride": false,
      "CreateIndex": 17,
      "ModifyIndex": 20
    },
    "Checks": [
      {
        "Node": "vm1",
        "CheckID": "serfHealth",
        "Name": "Serf Health Status",
        "Status": "passing",
        "Notes": "",
        "Output": "Agent alive and reachable",
        "ServiceID": "",
        "ServiceName": "",
        "CreateIndex": 3,
        "ModifyIndex": 3
      },
      {
        "Node": "vm1",
        "CheckID": "vault-sealed-check",
        "Name": "Vault Sealed Status",
        "Status": "passing",
        "Notes": "Vault service is healthy when Vault is in an unsealed status and can become an active Vault server",
        "Output": "",
        "ServiceID": "vault:127.0.0.1:8200",
        "ServiceName": "vault",
        "CreateIndex": 19,
        "ModifyIndex": 19
      }
    ]
  }
]
```
Healthy/standby:
```
[snip]
    "Service": {
      "ID": "vault:127.0.0.2:8200",
      "Service": "vault",
      "Tags": [
        "standby"
      ],
      "Address": "127.0.0.2",
      "Port": 8200,
      "EnableTagOverride": false,
      "CreateIndex": 17,
      "ModifyIndex": 20
    },
    "Checks": [
      {
        "Node": "vm2",
        "CheckID": "serfHealth",
        "Name": "Serf Health Status",
        "Status": "passing",
        "Notes": "",
        "Output": "Agent alive and reachable",
        "ServiceID": "",
        "ServiceName": "",
        "CreateIndex": 3,
        "ModifyIndex": 3
      },
      {
        "Node": "vm2",
        "CheckID": "vault-sealed-check",
        "Name": "Vault Sealed Status",
        "Status": "passing",
        "Notes": "Vault service is healthy when Vault is in an unsealed status and can become an active Vault server",
        "Output": "",
        "ServiceID": "vault:127.0.0.2:8200",
        "ServiceName": "vault",
        "CreateIndex": 19,
        "ModifyIndex": 19
      }
    ]
  }
]
```
Sealed:
```
    "Checks": [
      {
        "Node": "vm2",
        "CheckID": "serfHealth",
        "Name": "Serf Health Status",
        "Status": "passing",
        "Notes": "",
        "Output": "Agent alive and reachable",
        "ServiceID": "",
        "ServiceName": "",
        "CreateIndex": 3,
        "ModifyIndex": 3
      },
      {
        "Node": "vm2",
        "CheckID": "vault-sealed-check",
        "Name": "Vault Sealed Status",
        "Status": "critical",
        "Notes": "Vault service is healthy when Vault is in an unsealed status and can become an active Vault server",
        "Output": "Vault Sealed",
        "ServiceID": "vault:127.0.0.2:8200",
        "ServiceName": "vault",
        "CreateIndex": 19,
        "ModifyIndex": 38
      }
    ]
```
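The DNS names mentioned above (`active.vault.service.consul`, `standby.vault.service.consul`, `vault.service.consul`) can also be resolved programmatically. A small Go sketch, assuming the host's resolver forwards the `.consul` domain to the local Consul agent's DNS interface (port 8600 by default):
```go
package main

import (
	"fmt"
	"log"
	"net"
)

func main() {
	// Resolve the currently active Vault node via Consul DNS. This only
	// works if .consul queries actually reach the Consul agent.
	addrs, err := net.LookupHost("active.vault.service.consul")
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println("active vault node(s):", addrs)
}
```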
From the PostgreSQL docs
(http://www.postgresql.org/docs/9.4/static/datatype-character.html):
> Tip: There is no performance difference among these three types,
> apart from increased storage space when using the blank-padded type,
> and a few extra CPU cycles to check the length when storing into a
> length-constrained column. While character(n) has performance
> advantages in some other database systems, there is no such advantage
> in PostgreSQL; in fact character(n) is usually the slowest of the
> three because of its additional storage costs. In most situations
> text or character varying should be used instead.
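In line with that tip, a key/value table can simply declare its columns as `text` rather than `character(n)`. A purely illustrative Go sketch; the table layout, driver choice, and connection string are assumptions, not the actual backend's schema:
```go
package main

import (
	"database/sql"
	"log"

	_ "github.com/lib/pq" // Postgres driver, illustrative choice
)

func main() {
	db, err := sql.Open("postgres", "postgres://localhost/vault?sslmode=disable")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	// Hypothetical schema: text for the key (no performance penalty vs.
	// character(n) in PostgreSQL), bytea for the value.
	_, err = db.Exec(`CREATE TABLE IF NOT EXISTS kv_store (
		key   TEXT PRIMARY KEY,
		value BYTEA
	)`)
	if err != nil {
		log.Fatal(err)
	}
}
```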
Some etcd configurations (such as that provided by compose.io) place the etcd cluster behind multiple load balancers or proxies. In this configuration, calling Sync (or AutoSync) on the etcd client will replace the load balancer addresses with the underlying etcd server addresses.
This will cause the etcd client to bypass the load balancers, and may cause the connection to fail completely if the etcd servers are protected by a firewall.
This patch provides a "sync" option for the etcd backend, which defaults to the current behavior but can be used to turn syncing off. This corresponds to etcdctl's --no-sync option.
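Roughly, the option gates the client's Sync call. A sketch assuming the CoreOS etcd v2 client; the config parsing and endpoint address here are illustrative, not the backend's exact code:
```go
package etcdbackend

import (
	"context"
	"strconv"

	etcd "github.com/coreos/etcd/client"
)

// newEtcdClient builds an etcd client from a backend config map. When
// "sync" is false the client keeps talking to the configured endpoints
// (e.g. a load balancer) instead of the discovered member addresses.
func newEtcdClient(conf map[string]string) (etcd.Client, error) {
	c, err := etcd.New(etcd.Config{
		Endpoints: []string{"http://etcd-lb.example.com:2379"}, // illustrative LB address
	})
	if err != nil {
		return nil, err
	}

	// Default to syncing (the previous behavior); allow the option to
	// disable it for load-balanced or firewalled clusters.
	sync := true
	if raw, ok := conf["sync"]; ok {
		if b, err := strconv.ParseBool(raw); err == nil {
			sync = b
		}
	}
	if sync {
		if err := c.Sync(context.Background()); err != nil {
			return nil, err
		}
	}
	return c, nil
}
```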
When Vault is killed without the chance to clean up its lock entry in DynamoDB, no further Vault nodes can become leader afterwards.
To recover from this situation, this commit adds an environment variable and a configuration flag that, when set to "1", cause Vault to delete the lock entry from DynamoDB.
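The gist of that recovery path, sketched in Go; the option names and the `deleteStaleLock` helper are hypothetical stand-ins for whatever the backend actually uses:
```go
package dynamodbbackend

import (
	"errors"
	"os"
)

// deleteStaleLock is a hypothetical helper that removes the leftover lock
// item from the DynamoDB table so a new node can acquire leadership.
func deleteStaleLock(table, key string) error {
	return errors.New("not implemented in this sketch")
}

// maybeRecoverLock deletes the stale lock entry only when the operator has
// explicitly opted in, either via config or an environment variable.
func maybeRecoverLock(conf map[string]string, table, key string) error {
	wantRecovery := conf["recovery_mode"] == "1" || // hypothetical option name
		os.Getenv("DYNAMODB_RECOVERY") == "1" // hypothetical variable name
	if !wantRecovery {
		return nil
	}
	return deleteStaleLock(table, key)
}
```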
The permit pool controls the number of outstanding operations that can
be queued for Consul (and inmem, for testing purposes). This prevents
possible situations where Vault launches thousands of concurrent
connections to Consul if e.g. a huge number of leases need to be
expired.
Fixes #677
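The idea is a counting semaphore in front of the storage client. A minimal sketch; the type and method names echo the concept rather than necessarily matching Vault's exact implementation:
```go
package physical

// PermitPool caps the number of storage operations that may be in flight
// at once.
type PermitPool struct {
	sem chan struct{}
}

// NewPermitPool returns a pool allowing up to permits concurrent operations.
func NewPermitPool(permits int) *PermitPool {
	if permits < 1 {
		permits = 1
	}
	return &PermitPool{sem: make(chan struct{}, permits)}
}

// Acquire blocks until a permit is available.
func (p *PermitPool) Acquire() { p.sem <- struct{}{} }

// Release returns a permit to the pool.
func (p *PermitPool) Release() { <-p.sem }
```
Each Consul (or inmem) operation is then bracketed by Acquire/Release, so no more than the configured number of requests hit the backend concurrently, however many leases expire at once.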
This strips out http.DefaultClient everywhere I could immediately find
it. Too many things use it and then modify it in incompatible ways.
Fixes #700, I believe.
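Instead of the shared global, each consumer constructs its own client; a small sketch of the pattern, with illustrative timeout and transport settings:
```go
package main

import (
	"fmt"
	"net/http"
	"time"
)

// newHTTPClient returns a dedicated client so that timeout or transport
// tweaks made here cannot leak into (or be clobbered by) other packages
// that mutate http.DefaultClient.
func newHTTPClient() *http.Client {
	return &http.Client{
		Timeout: 30 * time.Second,
		Transport: &http.Transport{
			MaxIdleConnsPerHost: 10,
		},
	}
}

func main() {
	client := newHTTPClient()
	resp, err := client.Get("https://example.com/")
	if err != nil {
		fmt.Println("request failed:", err)
		return
	}
	defer resp.Body.Close()
	fmt.Println("status:", resp.Status)
}
```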
The semaphore key is used to determine whether we are the leader and is set to expire after a TTL of 15 seconds. There was no logic to renew the key before it expired, which caused the leader to step down and change every 15 seconds. A periodic timer is now added to update the key every 5 seconds, renewing its TTL.
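A sketch of that renewal loop; `renewSemaphoreKey` is a hypothetical stand-in for the backend call that rewrites the key and thereby resets its TTL:
```go
package main

import (
	"errors"
	"log"
	"time"
)

// renewSemaphoreKey is a hypothetical stand-in for the backend call that
// rewrites the semaphore key and resets its 15-second TTL.
func renewSemaphoreKey() error {
	return errors.New("not implemented in this sketch")
}

// holdLeadership renews the key every 5 seconds, well inside the 15-second
// TTL, until stop is closed.
func holdLeadership(stop <-chan struct{}) {
	ticker := time.NewTicker(5 * time.Second)
	defer ticker.Stop()
	for {
		select {
		case <-ticker.C:
			if err := renewSemaphoreKey(); err != nil {
				log.Printf("failed to renew semaphore key: %v", err)
			}
		case <-stop:
			return
		}
	}
}

func main() {
	stop := make(chan struct{})
	go holdLeadership(stop)
	time.Sleep(12 * time.Second)
	close(stop)
}
```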
This thing is SUPER slow and has some dumb edge cases. It is only really
meant for development at this point and is commented as such. We won't
document it publicly unless we make it good.