Updates documentation for ACL replication.

This commit is contained in:
James Phillips 2016-08-05 00:23:28 -07:00
parent 4a931ae12e
commit 5577b8ef66
No known key found for this signature in database
GPG Key ID: 77183E682AC5FC11
3 changed files with 222 additions and 113 deletions

View File

@ -17,6 +17,7 @@ The following endpoints are supported:
* [`/v1/acl/info/<id>`](#acl_info): Queries the policy of a given token
* [`/v1/acl/clone/<id>`](#acl_clone): Creates a new token by cloning an existing token
* [`/v1/acl/list`](#acl_list): Lists all the active tokens
* [`/v1/acl/replication`](#acl_replication_status): Checks status of ACL replication
### <a name="acl_create"></a> /v1/acl/create
@ -166,3 +167,56 @@ It returns a JSON body like this:
...
]
```
### <a name="acl_replication_status"></a> /v1/acl/replication
The endpoint must be hit with a GET. It returns the status of the
[ACL replication](/docs/internals/acl.html#replication) process in
the datacenter. This is intended to be used by operators, or by
automation checking the health of ACL replication.
By default, the datacenter of the agent is queried; however, the dc can be provided
using the "?dc=" query parameter.
It returns a JSON body like this:
```javascript
{
"Enabled": true,
"Running": true,
"SourceDatacenter": "dc1",
"ReplicatedIndex": 1976,
"LastSuccess": "2016-08-05T06:28:58Z",
"LastError": "2016-08-05T06:28:28Z"
}
```
`Enabled` reports whether ACL replication is enabled for the datacenter.
`Running` reports whether the ACL replication process is running. The process
may take approximately 60 seconds to begin running after a leader election occurs.
`SourceDatacenter` is the authoritative ACL datacenter that ACLs are being
replicated from, and will match the
[`acl_datacenter`](/docs/agent/options.html#acl_datacenter) configuration.
`ReplicatedIndex` is the last index that was successfully replicated. You can
compare this to the `X-Consul-Index` header returned by the [`/v1/acl/list`](#acl_list)
endpoint to determine if the replication process has gotten all available
ACLs. Note that replication runs as a background process approximately every 30
seconds, and that local updates are rate limited to 100 update/second, so so it
may take several minutes to perform the initial sync of a large set of ACLs.
After the initial sync, replica lag should be on the order of about 30 seconds.
`LastSuccess` is the UTC time of the last successful sync operation. Note that
since ACL replication is done with a blocking query, this may not update for up
to 5 minutes if there have been no ACL changes to replicate. A zero value of
"0001-01-01T00:00:00Z" will be present if no sync has been successful.
`LastError` is the UTC time of the last error encountered during a sync operation.
If this time is later than `LastSuccess`, you can assume the replication process
is not in a good state. A zero value of "0001-01-01T00:00:00Z" will be present if
no sync has resulted in an error.
Please see the [ACL replication](/docs/internals/acl.html#replication)
section of the internals guide for more details.

View File

@ -351,6 +351,17 @@ Consul will not enable TLS for the HTTP API unless the `https` port has been ass
token. When you provide a value, it can be any string value. Using a UUID would ensure that it looks
the same as the other tokens, but isn't strictly necessary.
* <a name="acl_replication_token"></a><a href="#acl_replication_token">`acl_replication_token`</a> -
Only used for servers outside the [`acl_datacenter`](#acl_datacenter) running Consul 0.7 or later.
When provided, this will enable [ACL replication](/docs/internals/acl.html#replication) using this
token to retrieve and replicate the ACLs to the non-authoritative local datacenter.
<br><br>
If there's a partition or other outage affecting the authoritative datacenter, and the
[`acl_down_policy`](/docs/agent/options.html#acl_down_policy) is set to "extend-cache", tokens not
in the cache can be resolved during the outage using the replicated set of ACLs. Please see the
[ACL replication](/docs/internals/acl.html#replication) section of the internals guide for more
details.
* <a name="acl_token"></a><a href="#acl_token">`acl_token`</a> - When provided, the agent will use this
token when making requests to the Consul servers. Clients can override this token on a per-request
basis by providing the "?token" query parameter. When not provided, the empty token, which maps to

View File

@ -39,14 +39,24 @@ prior versions do not provide a token. This is handled by the special "anonymous
token. If no token is provided, the rules associated with the anonymous token are
automatically applied: this allows policy to be enforced on legacy clients.
ACLs can also act in either a whitelist or blacklist mode depending
on the configuration of
[`acl_default_policy`](/docs/agent/options.html#acl_default_policy). If the
default policy is to deny all actions, then token rules can be set to whitelist
specific actions. In the inverse, the allow all default behavior is a blacklist
where rules are used to prohibit actions. By default, Consul will allow all
actions.
#### ACL Datacenter
Enforcement is always done by the server nodes. All servers must be configured
to provide an [`acl_datacenter`](/docs/agent/options.html#acl_datacenter) which
enables ACL enforcement but also specifies the authoritative datacenter. Consul does not
replicate data cross-WAN and instead relies on [RPC forwarding](/docs/internals/architecture.html)
to support Multi-Datacenter configurations. However, because requests can be made
enables ACL enforcement but also specifies the authoritative datacenter. Consul
relies on [RPC forwarding](/docs/internals/architecture.html) to support
Multi-Datacenter configurations. However, because requests can be made
across datacenter boundaries, ACL tokens must be valid globally. To avoid
replication issues, a single datacenter is considered authoritative and stores
all the tokens.
consistency issues, a single datacenter is considered authoritative and stores
the canonical set of tokens.
When a request is made to a server in a non-authoritative datacenter server, it
must be resolved into the appropriate policy. This is done by reading the token
@ -55,7 +65,9 @@ from the authoritative server and caching the result for a configurable
of caching is that the cache TTL is an upper bound on the staleness of policy
that is enforced. It is possible to set a zero TTL, but this has adverse
performance impacts, as every request requires refreshing the policy via a
cross-datacenter WAN call.
cross-datacenter WAN RPC call.
#### Outages and ACL Replication
The Consul ACL system is designed with flexible rules to accommodate for an outage
of the [`acl_datacenter`](/docs/agent/options.html#acl_datacenter) or networking
@ -66,114 +78,46 @@ choices to tune behavior. It is possible to deny or permit all actions or to ign
cache TTLs and enter a fail-safe mode. The default is to ignore cache TTLs
for any previously resolved tokens and to deny any uncached tokens.
ACLs can also act in either a whitelist or blacklist mode depending
on the configuration of
[`acl_default_policy`](/docs/agent/options.html#acl_default_policy). If the
default policy is to deny all actions, then token rules can be set to whitelist
specific actions. In the inverse, the allow all default behavior is a blacklist
where rules are used to prohibit actions. By default, Consul will allow all
actions.
<a name="replication"></a>
Consul 0.7 added an ACL Replication capability that can allow non-authoritative
datacenter servers to resolve even uncached tokens. This is enabled by setting an
[`acl_replication_token`](/docs/agent/options.html#acl_replication_token) in the
configuration on the servers in the non-authoritative datacenters. With replication
enabled, the servers will maintain a replica of the authoritative datacenter's full
set of ACLs on the non-authoritative servers.
### Blacklist mode and `consul exec`
Replication occurs with a background process that looks for new ACLs approximately
every 30 seconds. Replicated changes are written at a rate that's throttled to
100 updates/second, so it may take several minutes to perform the initial sync of
a large set of ACLs.
If you set [`acl_default_policy`](/docs/agent/options.html#acl_default_policy)
to `deny`, the `anonymous` token won't have permission to read the default
`_rexec` prefix; therefore, Consul agents using the `anonymous` token
won't be able to perform [`consul exec`](/docs/commands/exec.html) actions.
If there's a partition or other outage affecting the authoritative datacenter,
and the [`acl_down_policy`](/docs/agent/options.html#acl_down_policy)
is set to "extend-cache", tokens will be resolved during the outage using the
replicated set of ACLs. An [ACL replication status](http://localhost:4567/docs/agent/http/acl.html#acl_replication_status)
endpoint is available to monitor the health of the replication process.
Here's why: the agents need read/write permission to the `_rexec` prefix for
[`consul exec`](/docs/commands/exec.html) to work properly. They use that prefix
as the transport for most data.
Locally-resolved ACLs will be cached using the [`acl_ttl`](/docs/agent/options.html#acl_ttl)
setting of the non-authoritative datacenter, so these entries may persist in the
cache for up to the TTL, even after the authoritative datacenter comes back online.
You can enable [`consul exec`](/docs/commands/exec.html) from agents that are not
configured with a token by allowing the `anonymous` token to access that prefix.
This can be done by giving this rule to the `anonymous` token:
ACL replication can also be used to migrate ACLs from one datacenter to another
using a process like this:
```javascript
key "_rexec/" {
policy = "write"
}
```
1. Enable ACL replication in all datacenters to allow continuation of service
during the migration, and to populate the target datacenter. Verify replication
is healthy and caught up to the current ACL index in the target datacenter
using the [ACL replication status](http://localhost:4567/docs/agent/http/acl.html#acl_replication_status)
endpoint.
2. Turn down the old authoritative datacenter servers.
3. Rolling restart the servers in the target datacenter and change the
`acl_datacenter` configuration to itself. This will automatically turn off
replication and will enable the datacenter to start acting as the authoritative
datacenter, using its replicated ACLs from before.
3. Rolling restart the servers in other datacenters and change their `acl_datacenter`
configuration to the target datacenter.
Alternatively, you can, of course, add an explicit
[`acl_token`](/docs/agent/options.html#acl_token) to each agent, giving it access
to that prefix.
### Blacklist mode and Service Discovery
If your [`acl_default_policy`](/docs/agent/options.html#acl_default_policy) is
set to `deny`, the `anonymous` token will be unable to read any service
information. This will cause the service discovery mechanisms in the REST API
and the DNS interface to return no results for any service queries. This is
because internally the API's and DNS interface consume the RPC interface, which
will filter results for services the token has no access to.
You can allow all services to be discovered, mimicing the behavior of pre-0.6.0
releases, by configuring this ACL rule for the `anonymous` token:
```
service "" {
policy = "read"
}
```
Note that the above will allow access for reading service information only. This
level of access allows discovering other services in the system, but is not
enough to allow the agent to sync its services and checks into the global
catalog during [anti-entropy](/docs/internals/anti-entropy.html).
The most secure way of handling service registration and discovery is to run
Consul 0.6+ and issue tokens with explicit access for the services or service
prefixes which are expected to run on each agent.
### Blacklist mode and Events
Similar to the above, if your
[`acl_default_policy`](/docs/agent/options.html#acl_default_policy) is set to
`deny`, the `anonymous` token will have no access to allow firing user events.
This deviates from pre-0.6.0 builds, where user events were completely
unrestricted.
Events have their own first-class expression in the ACL syntax. To restore
access to user events from arbitrary agents, configure an ACL rule like the
following for the `anonymous` token:
```
event "" {
policy = "write"
}
```
As always, the more secure way to handle user events is to explicitly grant
access to each API token based on the events they should be able to fire.
### Blacklist mode and Prepared Queries
After Consul 0.6.3, significant changes were made to ACLs for prepared queries,
including a new `query` ACL policy. See [Prepared Query ACLs](#prepared_query_acls) below for more details.
### Blacklist mode and Keyring Operations
Consul 0.6 and later supports securing the encryption keyring operations using
ACL's. Encryption is an optional component of the gossip layer. More information
about Consul's keyring operations can be found on the [keyring
command](/docs/commands/keyring.html) documentation page.
If your [`acl_default_policy`](/docs/agent/options.html#acl_default_policy) is
set to `deny`, then the `anonymous` token will not have access to read or write
to the encryption keyring. The keyring policy is yet another first-class citizen
in the ACL syntax. You can configure the anonymous token to have free reign over
the keyring using a policy like the following:
```
keyring = "write"
```
Encryption keyring operations are sensitive and should be properly secured. It
is recommended that instead of configuring a wide-open policy like above, a
per-token policy is applied to maximize security.
### Bootstrapping ACLs
#### Bootstrapping ACLs
Bootstrapping the ACL system is done by providing an initial [`acl_master_token`
configuration](/docs/agent/options.html#acl_master_token) which will be created
@ -187,8 +131,7 @@ for all servers. Once this is done, restart the current leader to force a leader
## Rule Specification
A core part of the ACL system is a rule language which is used to describe the policy
that must be enforced. Consul supports ACLs for both [K/Vs](/intro/getting-started/kv.html)
and [services](/intro/getting-started/services.html).
that must be enforced.
Key policies are defined by coupling a prefix with a policy. The rules are enforced
using a longest-prefix match policy: Consul picks the most specific policy possible. The
@ -309,7 +252,108 @@ This is equivalent to the following JSON input:
}
```
## Services and Checks with ACLs
## Building ACL Policies
#### Blacklist mode and `consul exec`
If you set [`acl_default_policy`](/docs/agent/options.html#acl_default_policy)
to `deny`, the `anonymous` token won't have permission to read the default
`_rexec` prefix; therefore, Consul agents using the `anonymous` token
won't be able to perform [`consul exec`](/docs/commands/exec.html) actions.
Here's why: the agents need read/write permission to the `_rexec` prefix for
[`consul exec`](/docs/commands/exec.html) to work properly. They use that prefix
as the transport for most data.
You can enable [`consul exec`](/docs/commands/exec.html) from agents that are not
configured with a token by allowing the `anonymous` token to access that prefix.
This can be done by giving this rule to the `anonymous` token:
```javascript
key "_rexec/" {
policy = "write"
}
```
Alternatively, you can, of course, add an explicit
[`acl_token`](/docs/agent/options.html#acl_token) to each agent, giving it access
to that prefix.
#### Blacklist mode and Service Discovery
If your [`acl_default_policy`](/docs/agent/options.html#acl_default_policy) is
set to `deny`, the `anonymous` token will be unable to read any service
information. This will cause the service discovery mechanisms in the REST API
and the DNS interface to return no results for any service queries. This is
because internally the API's and DNS interface consume the RPC interface, which
will filter results for services the token has no access to.
You can allow all services to be discovered, mimicing the behavior of pre-0.6.0
releases, by configuring this ACL rule for the `anonymous` token:
```
service "" {
policy = "read"
}
```
Note that the above will allow access for reading service information only. This
level of access allows discovering other services in the system, but is not
enough to allow the agent to sync its services and checks into the global
catalog during [anti-entropy](/docs/internals/anti-entropy.html).
The most secure way of handling service registration and discovery is to run
Consul 0.6+ and issue tokens with explicit access for the services or service
prefixes which are expected to run on each agent.
#### Blacklist mode and Events
Similar to the above, if your
[`acl_default_policy`](/docs/agent/options.html#acl_default_policy) is set to
`deny`, the `anonymous` token will have no access to allow firing user events.
This deviates from pre-0.6.0 builds, where user events were completely
unrestricted.
Events have their own first-class expression in the ACL syntax. To restore
access to user events from arbitrary agents, configure an ACL rule like the
following for the `anonymous` token:
```
event "" {
policy = "write"
}
```
As always, the more secure way to handle user events is to explicitly grant
access to each API token based on the events they should be able to fire.
#### Blacklist mode and Prepared Queries
After Consul 0.6.3, significant changes were made to ACLs for prepared queries,
including a new `query` ACL policy. See [Prepared Query ACLs](#prepared_query_acls) below for more details.
#### Blacklist mode and Keyring Operations
Consul 0.6 and later supports securing the encryption keyring operations using
ACL's. Encryption is an optional component of the gossip layer. More information
about Consul's keyring operations can be found on the [keyring
command](/docs/commands/keyring.html) documentation page.
If your [`acl_default_policy`](/docs/agent/options.html#acl_default_policy) is
set to `deny`, then the `anonymous` token will not have access to read or write
to the encryption keyring. The keyring policy is yet another first-class citizen
in the ACL syntax. You can configure the anonymous token to have free reign over
the keyring using a policy like the following:
```
keyring = "write"
```
Encryption keyring operations are sensitive and should be properly secured. It
is recommended that instead of configuring a wide-open policy like above, a
per-token policy is applied to maximize security.
#### Services and Checks with ACLs
Consul allows configuring ACL policies which may control access to service and
check registration. In order to successfully register a service or check with
@ -330,7 +374,7 @@ methods of configuring ACL tokens to use for registration events:
[HTTP API](/docs/agent/http.html) for operations that require them.
<a name="discovery_acls"></a>
## Restricting service discovery with ACLs
#### Restricting service discovery with ACLs
In Consul 0.6, the ACL system was extended to support restricting read access to
service registrations. This allows tighter access control and limits the ability
@ -413,7 +457,7 @@ Capturing ACL Tokens is analogous to
Token is similar to the complementary `SECURITY INVOKER` attribute.
<a name="prepared_query_acl_changes"></a>
#### ACL Implementation Changes
#### ACL Implementation Changes for Prepared Queries
Prepared queries were originally introduced in Consul 0.6.0, and ACL behavior remained
unchanged through version 0.6.3, but was then changed to allow better management of the