Acl upgrade guide (#4880)

* Very WIP upgrade docs. Actual Upgrade notes doneish; token migration guide WIP.

* Token migration guide

* Complete ACL migration guide

* Upgrade guide cleanup

* Updated upgrade and migration guides

* Typo fix

Co-Authored-By: banks <banks@banksco.de>

* Update website/source/docs/guides/acl-migrate-tokens.html.md

Co-Authored-By: banks <banks@banksco.de>

* Update website/source/docs/guides/acl-migrate-tokens.html.md

Co-Authored-By: banks <banks@banksco.de>

* Update upgrade-specific.html.md

* Update website/source/docs/guides/acl-migrate-tokens.html.md

* Update website/source/docs/guides/acl-migrate-tokens.html.md

* Note Multi-DC changes in upgrade guide.

* Update website/source/docs/upgrade-specific.html.md
This commit is contained in:
Paul Banks 2018-11-14 15:40:02 +00:00 committed by GitHub
parent 0a80a60f8e
commit a0ee09f254
No known key found for this signature in database
GPG Key ID: 4AEE18F83AFDEB23
3 changed files with 556 additions and 0 deletions

View File

@ -0,0 +1,453 @@
---
layout: "docs"
page_title: "ACL Token Migration"
sidebar_current: "docs-guides-acl-migrate-tokens"
description: |-
Consul 1.4.0 introduces a new ACL system with improvements for the security and
management of ACL tokens and policies. This guide documents how to upgrade
existing (now called "legacy") tokens after upgrading to 1.4.0.
---
# ACL Token Migration
Consul 1.4.0 introduces a new ACL system with improvements for the security and
management of ACL tokens and policies. This guide documents how to upgrade
existing (now called "legacy") tokens after upgrading to 1.4.0.
Since the policy syntax changed to be more precise and flexible to manage, it's
necessary to manually translate old tokens into new ones to take advantage of
the new ACL system features. Tooling is provided to help automate this and this
guide describes the overall process.
~> **Note:** **1.4.0 retains full support for "legacy" ACL tokens** so upgrades
from Consul 1.3.0 are safe. Existing tokens will continue to work in the same
way for at least two "major" releases (1.5.x, 1.6.x, etc; note HashiCorp does
not use SemVer for our products).
This document will briefly describe [what changed](#what-changed), and then walk
through the [high-level migration process options](#migration-process), finally
giving some [specific examples](#migration-examples) of migration strategies.
## New ACL System Differences
The [ACL guide](/docs/guides/acl.html) and [legacy ACL
guide](/docs/guides/acl-legacy.html) describes the new and old systems in
detail. Below is a summary of the changes that need to be considered when
migrating legacy tokens to the new system.
### Token and Policy Separation
You can use a single policy in the new system for all tokens that share access
rules. For example, all tokens created using the clone endpoint in the legacy
system can be represented with a single policy and a set of tokens that map to
that policy.
### Rule Syntax Changes
The most significant change is that rules with selectors _no longer prefix match
by default_. In the legacy system the following rules would grant access to
nodes, services and keys _prefixed_ with foo.
```
node "foo" { policy = "write" }
service "foo" { policy = "write" }
key "foo" { policy = "write" }
```
In the new system the same syntax will only perform _exact_ match on the whole
node name, service name or key.
In general, exact match is what most operators intended most of the time so the
same policy can be kept, however if you rely on prefix match behavior then using
the same syntax will break behavior.
Prefix matching can be expressed in the new ACL system explicitly, making the
following rules in the new system exactly the same as the rules above in the
old.
```
node_prefix "foo" { policy = "write" }
service_prefix "foo" { policy = "write" }
key_prefix "foo" { policy = "write" }
```
### API Separation
The "old" API endpoints below continue to work for backwards compatibility but
will continue to create or show only "legacy" tokens that can't take full
advantage of the new ACL system improvements. They are documented fully under
[Legacy Tokens](/api/acl/legacy.html).
- [`PUT /acl/create` - Create Legacy Token](/api/acl/legacy.html#create-acl-token)
- [`PUT /acl/update` - Update Legacy Token](/api/acl/legacy.html#update-acl-token)
- [`PUT /acl/destroy/:uuid` - Delete Legacy Token](/api/acl/legacy.html#delete-acl-token)
- [`GET /acl/info/:uuid` - Read Legacy Token](/api/acl/legacy.html#read-acl-token)
- [`PUT /acl/clone/:uuid` - Clone Legacy Token](/api/acl/legacy.html#clone-acl-token)
- [`GET /acl/list` - List Legacy Tokens](/api/acl/legacy.html#list-acls)
The new ACL system includes new API endpoints to manage
the [ACL System](/api/acl/acl.html), [Tokens](/api/acl/tokens.html)
and [Policies](/api/acl/policies.html).
## Migration Process
While "legacy" tokens will continue to work for several major releases, it's
advisable to plan on migrating existing tokens as soon as is convenient.
Migrating also enables using the new policy management improvements, stricter
policy syntax rules and other features of the new system without
re-issuing all the secrets in use.
The high-level process for migrating a legacy token is as follows:
1. Create a new policy or policies that grant the required access
2. Update the existing token to use those policies
### Prerequisites
This process assumes that the 1.4.0 upgrade is complete including all legacy
ACLs having their accessor IDs populated. This might take up to several minutes
after the servers upgrade in the primary datacenter. You can tell if this is the
case by using `consul acl token list` and checking that no tokens exist with a
blank `AccessorID`.
In addition, it is assumed that all clients that might _create_ ACL tokens (e.g.
Vault's Consul secrets engine) have been updated to use the [new ACL
APIs](/docs/guides/acl-migrate-tokens.html#api-separation).
Specifically if you are using Vault's Consul secrets engine you need to be
running Vault 1.0.0 or higher, _and_ you must update all roles defined in Vault
to specify a list of policy names rather than an inline policy (which causes
Vault to use the legacy API).
~> **Note:** if you have systems still creating "legacy" tokens with the old
APIs, the migration steps below will still work, however you'll have to keep
re-running them until nothing is creating legacy tokens to ensure all tokens are
migrated.
### Creating Policies
There are a range of different strategies for creating new policies from existing
tokens. Two high-level strategies are described here although others or a
mixture of these may be most appropriate depending on the ACL tokens you already
have.
#### Strategy 1: Simple Policy Mapping
The simplest and most automatic strategy is to create one new policy for every
existing token. This is easy to automate, but may result in a lot of policies
with exactly the same rules and with non-human-readable names which will make
managing policies harder. This approach can be accomplished using the [`consul
acl policy create`](/docs/commands/acl/acl-policy.html#create) command with
`-from-token` option.
| Pros | Cons |
| ---- | ---- |
| &#9989; Simple | &#10060; May leave many duplicated policies |
| &#9989; Easy to automate | &#10060; Policy names not human-readable |
A detailed example of using this approach is [given
below](#simple-policy-mapping).
#### Strategy 2: Combining Policies
This strategy takes a more manual approach to create a more manageable set of
policies. There are a spectrum of options for how to do this which tradeoff
increasing human involvement for increasing clarity and re-usability of the
resulting policies.
For example you could use hashes of the policy rules to de-duplicate identical
token policies automatically, however naming them something meaningful for
humans would likely still need manual intervention.
Toward the other end of the spectrum it might be beneficial for security to
translate prefix matches into exact matches. This however requires the operator
knowing that clients using the token really doesn't rely on the prefix matching
semantics of the old ACL system.
To assist with this approach, there is a CLI tool and corresponding API that can
translate a legacy ACL token's rules into a new ACL policy that is exactly
equivalent. See [`consul acl
translate-rules`](/docs/commands/acl/acl-translate-rules.html).
| Pros | Cons |
| ---- | ---- |
| &#9989; Clearer, more manageable policies | &#10060; Requires more manual effort |
| &#9989; Policies can be re-used by new ACL tokens | &#10060; May take longer for large or complex existing policy sets |
A detailed example of using this approach is [given below](#combining-policies).
### Updating Existing Tokens
Once you have created one or more policies that adequately express the rules
needed for a legacy token, you can update the token via the CLI or API to use
those policies.
After updating, the token is no longer considered "legacy" and will have all the
properties of a new token, however it keeps it's `SecretID` (the secret part of
the token used in API calls) so clients already using that token will continue
to work. It is assumed that the policies you attach continue to grant the
necessary access for existing clients; this is up to the operator to ensure.
#### Update via API
Use the [`PUT /v1/acl/token/:AccessorID`](/api/acl/tokens.html#update-a-token)
endpoint. Specifically, ensure that the `Rules` field is omitted or empty. Empty
`Rules` indicates that this is now treated as a new token.
#### Update via CLI
Use the [`consul acl token update`](/docs/commands/acl/acl-token.html#update)
command to update the token. Specifically you need to use `-upgrade-legacy`
which will ensure that legacy rules are removed as well as the new policies
added.
## Migration Examples
Below are two detailed examples of the two high-level strategies for creating
polices discussed above. It should be noted these are intended to clarify the
concrete steps you might take. **We don't recommend you perform production
migrations with ad-hoc terminal commands**. Combining these or something similar
into a script might be appropriate.
### Simple Policy Mapping
This strategy uses the CLI to create a new policy for every existing legacy
token with exactly equivalent rules. It's easy to automate and clients will see
no change in behavior for their tokens, but it does leave you with a lot of
potentially identical policies to manage or clean up later.
#### Create Policies
You can get the AccessorID of every legacy token from the API. For example,
using `curl` and `jq` in bash:
```sh
$ LEGACY_IDS=$(curl -sH "X-Consul-Token: $CONSUL_HTTP_TOKEN" \
'localhost:8500/v1/acl/tokens' | jq -r '.[] | select (.Legacy) | .AccessorID')
$ echo "$LEGACY_IDS"
621cbd12-dde7-de06-9be0-e28d067b5b7f
65cecc86-eb5b-ced5-92dc-f861cf7636fe
ba464aa8-d857-3d26-472c-4d49c3bdae72
```
To create a policy for each one we can use something like:
```sh
for id in $LEGACY_IDS; do \
consul acl policy create -name "migrated-$id" -from-token $id \
-description "Migrated from legacy ACL token"; \
done
```
Each policy now has an identical set of rules to the original token. You can
inspect these:
```sh
$ consul acl policy read -name migrated-621cbd12-dde7-de06-9be0-e28d067b5b7f
ID: 573d84bd-8b08-3061-e391-d2602e1b4947
Name: migrated-621cbd12-dde7-de06-9be0-e28d067b5b7f
Description: Migrated from legacy ACL token
Datacenters:
Rules:
service_prefix "" {
policy = "write"
}
```
Notice how the policy here is `service_prefix` and not `service` since the old
ACL syntax was an implicit prefix match. This ensures any clients relying on
prefix matching behavior will still work.
#### Update Tokens
With the policies created as above, we can automatically upgrade all legacy
tokens.
```sh
for id in $LEGACY_IDS; do \
consul acl token update -id $id -policy-name "migrated-$id" -upgrade-legacy; \
done
```
The update is now complete, all legacy tokens are now new tokens with identical
secrets and enforcement rules.
### Combining Policies
This strategy has more manual elements but results in a cleaner and more
manageable set of policies than the fully automatic solutions. Note that this is
**just an example** to illustrate a few ways you may choose to merge or
manipulate policies.
#### Find All Unique Policies
You can get the AccessorID of every legacy token from the API. For example,
using `curl` and `jq` in bash:
```sh
$ LEGACY_IDS=$(curl -sH "X-Consul-Token: $CONSUL_HTTP_TOKEN" \
'localhost:8500/v1/acl/tokens' | jq -r '.[] | select (.Legacy) | .AccessorID')
$ echo "$LEGACY_IDS"
8b65fdf9-303e-0894-9f87-e71b3273600c
d9deb39b-1b30-e100-b9c5-04aba3f593a1
f2bce42e-cdcc-848d-28ca-cfd0556a22e3
```
Now we want to read the actual policy for each legacy token and de-duplicate
them. We can use the `translate-rules` helper sub-command which will read the
token's policy and return a new ACL policy that is exactly equivalent.
```sh
$ for id in $LEGACY_IDS; do \
echo "Policy for $id:"
consul acl translate-rules -token-accessor "$id"; \
done
Policy for 8b65fdf9-303e-0894-9f87-e71b3273600c:
service_prefix "bar" {
policy = "write"
}
Policy for d9deb39b-1b30-e100-b9c5-04aba3f593a1:
service_prefix "foo" {
policy = "write"
}
Policy for f2bce42e-cdcc-848d-28ca-cfd0556a22e3:
service_prefix "bar" {
policy = "write"
}
```
Notice that two policies are the same and one different.
We can change the loop above to take a hash of this policy definition to
de-duplicate the policies into a set of files locally. This example uses command
available on macOS but equivalents for other platforms should be easy to find.
```sh
$ mkdir policies
$ for id in $LEGACY_IDS; do \
# Fetch the equivalent new policy rules based on the legacy token rules
NEW_POLICY=$(consul acl translate-rules -token-accessor "$id"); \
# Sha1 hash the rules
HASH=$(echo -n "$NEW_POLICY" | shasum | awk '{ print $1 }'); \
# Write rules to a policy file named with the hash to de-duplicated
echo "$NEW_POLICY" > policies/$HASH.hcl; \
done
$ tree policies
policies
├── 024ce11f26f59436c518fb31f0999d1400485c17.hcl
└── 501b787c9444fbd62f346ab257eeb27197be2444.hcl
```
#### Cleaning Up Policies
You can now manually inspect and potentially edit these policies. For example we
could rename them according to their intended use. In this case we maintain the
hash as it will allow us to match tokens to policies later.
```sh
$ cat policies/024ce11f26f59436c518fb31f0999d1400485c17.hcl
service_prefix "bar" {
policy = "write"
}
$ # Add human-readable suffix to the file name so policies end up clearly named
$ mv policies/024ce11f26f59436c518fb31f0999d1400485c17.hcl \
policies/024ce11f26f59436c518fb31f0999d1400485c17-bar-service.hcl
```
You might also choose to tighten up the rules, for example if you know you never
rely on prefix-matching the service name `foo` you might choose to modify the
policy to use exact match.
```sh
$ cat policies/501b787c9444fbd62f346ab257eeb27197be2444.hcl
service_prefix "foo" {
policy = "write"
}
$ echo 'service "foo" { policy = "write" }' > policies/501b787c9444fbd62f346ab257eeb27197be2444.hcl
$ # Add human-readable suffix to the file name so policies end up clearly named
$ mv policies/501b787c9444fbd62f346ab257eeb27197be2444.hcl \
policies/501b787c9444fbd62f346ab257eeb27197be2444-foo-service.hcl
```
#### Creating Policies
We now have a minimal set of policies to create, with human-readable names. We
can create each one with something like the following.
```sh
$ for p in $(ls policies | grep ".hcl"); do \
# Extract the hash part of the file name
HASH=$(echo "$p" | cut -d - -f 1); \
# Extract the name suffix without .hcl
NAME=$(echo "$p" | cut -d - -f 2- | cut -d . -f 1); \
# Create new policy based on the rules in the file and the name we gave
consul acl policy create -name $NAME \
-rules "@policies/$p" \
-description "Migrated from legacy token"; \
done
ID: da2a9f9b-4e44-13f8-e308-76ce7a8dcb21
Name: bar-service
Description: Migrated from legacy token
Datacenters:
Rules:
service_prefix "bar" {
policy = "write"
}
ID: 9fbded86-9140-efe4-b661-c8bd07b6c584
Name: foo-service
Description: Migrated from legacy token
Datacenters:
Rules:
service "foo" { policy = "write" }
```
#### Upgrading Tokens
Finally we can map our existing tokens to those policies using the hash in the
policy file names. The `-upgrade-legacy` flag removes the token's legacy
embedded rules at the same time as associating them with the new policies
created from those rules.
```sh
$ for id in $LEGACY_IDS; do \
NEW_POLICY=$(consul acl translate-rules -token-accessor "$id"); \
HASH=$(echo -n "$NEW_POLICY" | shasum | awk '{ print $1 }'); \
# Lookup the hash->new policy mapping from the policy file names
POLICY_FILE=$(ls policies | grep "^$HASH"); \
POLICY_NAME=$(echo "$POLICY_FILE" | cut -d - -f 2- | cut -d . -f 1); \
echo "==> Mapping token $id to policy $POLICY_NAME"; \
consul acl token update -id $id -policy-name $POLICY_NAME -upgrade-legacy; \
done
==> Mapping token 8b65fdf9-303e-0894-9f87-e71b3273600c to policy bar-service
Token updated successfully.
AccessorID: 8b65fdf9-303e-0894-9f87-e71b3273600c
SecretID: 3dbb3981-7654-733a-3475-5ce20fc5a7b9
Description:
Local: false
Create Time: 0001-01-01 00:00:00 +0000 UTC
Policies:
da2a9f9b-4e44-13f8-e308-76ce7a8dcb21 - bar-service
==> Mapping token d9deb39b-1b30-e100-b9c5-04aba3f593a1 to policy foo-service
Token updated successfully.
AccessorID: d9deb39b-1b30-e100-b9c5-04aba3f593a1
SecretID: 5f54733b-4c76-eb74-8781-3550c20f4969
Description:
Local: false
Create Time: 0001-01-01 00:00:00 +0000 UTC
Policies:
9fbded86-9140-efe4-b661-c8bd07b6c584 - foo-service
==> Mapping token f2bce42e-cdcc-848d-28ca-cfd0556a22e3 to policy bar-service
Token updated successfully.
AccessorID: f2bce42e-cdcc-848d-28ca-cfd0556a22e3
SecretID: f3aaa3e2-2c6f-cf3c-1e86-454de728e8ab
Description:
Local: false
Create Time: 0001-01-01 00:00:00 +0000 UTC
Policies:
da2a9f9b-4e44-13f8-e308-76ce7a8dcb21 - bar-service
```
At this point all tokens are upgraded and can use new ACL features while
retaining the same secret clients are already using.

View File

@ -14,6 +14,106 @@ details provided for their upgrades as a result of new features or changed
behavior. This page is used to document those details separately from the
standard upgrade flow.
## Consul 1.4.0
There are two major features in Consul 1.4.0 that may impact upgrades: a [new ACL system](#acl-upgrade) and [multi-datacenter support for Connect](#connect-multi-datacenter) in the Enterprise version.
### ACL Upgrade
Consul 1.4.0 includes a [new ACL system](/docs/guides/acl.html) that is
designed to have a smooth upgrade path but requires care to upgrade components
in the right order.
**Note:** As with most major version upgrades, you cannot downgrade once the
upgrade to 1.4.0 is complete as it adds new state to the raft store. As always
it is _strongly_ recommended that you test the upgrade first outside of
production and ensure you take backup snapshots of all datacenters before
upgrading.
#### Primary Datacenter
The "ACL datacenter" in 1.3.x and earlier is now referred to as the "Primary
datacenter". All configuration is backwards compatible and shouldn't need to
change prior to upgrade although it's strongly recommended to migrate ACL
configuration to the new syntax soon after upgrade. This includes moving to
`primary_datacenter` rather than `acl_datacenter` and `acl_*` to the new [ACL
block](/docs/agent/options.html#acl).
Datacenters can be upgraded in any order although secondaries will remain in
[Legacy ACL mode](#legacy-acl-mode) until the primary datacenter is fully
ugraded.
Each datacenter should follow the [standard rolling upgrade
procedure](/docs/upgrading.html#standard-upgrades).
#### Legacy ACL Mode
When a 1.4.0 server first starts, it runs in "Legacy ACL mode". In this mode,
bootstrap requests and new ACL APIs will not be functional yet and will return
an error. The server advertises it's ability to support 1.4.0 ACLs via gossip
and waits.
In the primary datacenter, the servers all wait in legacy ACL mode until they
see every server in the primary datacenter advertise 1.4.0 ACL support. Once
this happens, the leader will complete the transition out of "legacy ACL mode"
and write this into the state so future restarts don't need to go through the
same transition.
In a secondary datacenter, the same process happens except that servers
_additionally_ wait for all servers in the primary datacenter making it safe to
upgrade datacenters in any order.
It should be noted that even if you are not upgrading, starting a brand new
1.4.0 cluster will transition through legacy ACL mode so you may be unable to
bootstrap ACLs until all the expected servers are up and healthy.
#### Legacy Token Accessor Migration
As soon as all servers in the primary datacenter have been upgraded to 1.4.0,
the leader will begin the process of creating new accessor IDs for all existing
ACL tokens.
This process completes in the background and is rate limited to ensure it
doesn't overload the leader. It completes upgrades in batches of 128 tokens and
will not upgrade more than one batch per second so on a cluster with 10,000
tokens, this may take several minutes.
While this is happening both old and new ACLs will work correctly with the
caveat that new ACL [Token APIs](/api/acl/tokens.html) may not return an
accessor ID for legacy tokens that are not yet migrated.
#### Migrating Existing ACLs
New ACL policies have slightly different syntax designed to fix some
shortcomings in old ACL syntax. During and after the upgrade process, any old
ACL tokens will continue to work and grant exactly the same level of access.
After upgrade, it is still possible to create "legacy" tokens using the existing
API so existing integrations that create tokens (e.g. Vault) will continue to
work. The "legacy" tokens generated though will not be able to take advantage of
new policy features. It's recommended that you complete migration of all tokens
as soon as possible after upgrade, as well as updating any integrations to work
with the the new ACL [Token](/api/acl/tokens.html) and
[Policy](/api/acl/policies.html) APIs.
More complete details on how to upgrade "legacy" tokens is available [here](/docs/guides/acl-migrate-tokens.html).
### Connect Multi-datacenter
This only applies to users upgrading from an older version of Consul Enterprise to Consul Enterprise 1.4.0 (all license types).
In addition, this upgrade will only affect clusters where [Connect is enabled](/docs/connect/configuration.html) on your servers before the migration.
Connect multi-datacenter uses the same primary/secondary approach as ACLs and will use the same [primary_datacenter](#primary-datacenter). When a secondary datacenter server restarts with 1.4.0 it will detect it is not the primary and begin an automatic bootstrap of multi-datacenter CA federation.
Datacenters can be upgraded in either order; secondary datacenters will not switch into multi-datacenter mode until all servers in both the secondary and primary datacenter are detected to be running at least Consul 1.4.0. Secondary datacenters monitor this periodically (every few minutes) and will automatically upgrade Connect to use a federated Certificate Authority when they do.
In general, migrating a Consul cluster from OSS to Enterprise will update the CA to be federated automatically and without impact on Connect traffic. When upgrading Consul Enterprise 1.3.x to Consul Enterprise 1.4.0 upgrades the CA upgrade is seamless, however depending on the size of the cluster, _new_ connection attempts in the secondary datacenter might fail for a short window (typically seconds) while the update is propagated due to the 1.3.x Beta authorization endpoint validating originating cluster in a way that was not fully forwards compatible with migrating between cluster trust domains. That issue is fixed in 1.4.0 as part of General Availability.
Once migrated (typically a few seconds). Connect will use the primary datacenter's Certificate Authority as the root of trust for all other datacenters. CA migration or root key changes in the primary will now rotate automatically and without loss of connectivity throughout all datacenters and workloads.
For more information see [Connect Multi-datacenter](/docs/enterprise/connect-multi-datacenter/index.html).
## Consul 1.1.0
#### Removal of Deprecated Features

View File

@ -379,6 +379,9 @@
<li<%= sidebar_current("docs-guides-acl-legacy") %>>
<a href="/docs/guides/acl-legacy.html">Legacy</a>
</li>
<li<%= sidebar_current("docs-guides-acl-migrate-tokens") %>>
<a href="/docs/guides/acl-migrate-tokens.html">Token Migration</a>
</li>
</ul>
</li>
<li<%= sidebar_current("docs-guides-servers") %>>