diff --git a/website/source/docs/guides/acl-migrate-tokens.html.md b/website/source/docs/guides/acl-migrate-tokens.html.md new file mode 100644 index 000000000..bf4ca3d71 --- /dev/null +++ b/website/source/docs/guides/acl-migrate-tokens.html.md @@ -0,0 +1,453 @@ +--- +layout: "docs" +page_title: "ACL Token Migration" +sidebar_current: "docs-guides-acl-migrate-tokens" +description: |- + Consul 1.4.0 introduces a new ACL system with improvements for the security and + management of ACL tokens and policies. This guide documents how to upgrade + existing (now called "legacy") tokens after upgrading to 1.4.0. +--- + +# ACL Token Migration + +Consul 1.4.0 introduces a new ACL system with improvements for the security and +management of ACL tokens and policies. This guide documents how to upgrade +existing (now called "legacy") tokens after upgrading to 1.4.0. + +Since the policy syntax changed to be more precise and flexible to manage, it's +necessary to manually translate old tokens into new ones to take advantage of +the new ACL system features. Tooling is provided to help automate this and this +guide describes the overall process. + +~> **Note:** **1.4.0 retains full support for "legacy" ACL tokens** so upgrades +from Consul 1.3.0 are safe. Existing tokens will continue to work in the same +way for at least two "major" releases (1.5.x, 1.6.x, etc; note HashiCorp does +not use SemVer for our products). + +This document will briefly describe [what changed](#what-changed), and then walk +through the [high-level migration process options](#migration-process), finally +giving some [specific examples](#migration-examples) of migration strategies. + +## New ACL System Differences + +The [ACL guide](/docs/guides/acl.html) and [legacy ACL +guide](/docs/guides/acl-legacy.html) describe the new and old systems in +detail. Below is a summary of the changes that need to be considered when +migrating legacy tokens to the new system. 
+ +### Token and Policy Separation + +You can use a single policy in the new system for all tokens that share access +rules. For example, all tokens created using the clone endpoint in the legacy +system can be represented with a single policy and a set of tokens that map to +that policy. + +### Rule Syntax Changes + +The most significant change is that rules with selectors _no longer prefix match +by default_. In the legacy system the following rules would grant access to +nodes, services and keys _prefixed_ with foo. + +``` +node "foo" { policy = "write" } +service "foo" { policy = "write" } +key "foo" { policy = "write" } +``` + +In the new system the same syntax will only perform _exact_ match on the whole +node name, service name or key. + +In general, exact match is what most operators intended most of the time so the +same policy can be kept, however if you rely on prefix match behavior then using +the same syntax will break behavior. + +Prefix matching can be expressed in the new ACL system explicitly, making the +following rules in the new system exactly the same as the rules above in the +old. + +``` +node_prefix "foo" { policy = "write" } +service_prefix "foo" { policy = "write" } +key_prefix "foo" { policy = "write" } +``` + +### API Separation + +The "old" API endpoints below continue to work for backwards compatibility but +will continue to create or show only "legacy" tokens that can't take full +advantage of the new ACL system improvements. They are documented fully under +[Legacy Tokens](/api/acl/legacy.html). 
+ +- [`PUT /acl/create` - Create Legacy Token](/api/acl/legacy.html#create-acl-token) +- [`PUT /acl/update` - Update Legacy Token](/api/acl/legacy.html#update-acl-token) +- [`PUT /acl/destroy/:uuid` - Delete Legacy Token](/api/acl/legacy.html#delete-acl-token) +- [`GET /acl/info/:uuid` - Read Legacy Token](/api/acl/legacy.html#read-acl-token) +- [`PUT /acl/clone/:uuid` - Clone Legacy Token](/api/acl/legacy.html#clone-acl-token) +- [`GET /acl/list` - List Legacy Tokens](/api/acl/legacy.html#list-acls) + +The new ACL system includes new API endpoints to manage +the [ACL System](/api/acl/acl.html), [Tokens](/api/acl/tokens.html) +and [Policies](/api/acl/policies.html). + +## Migration Process + +While "legacy" tokens will continue to work for several major releases, it's +advisable to plan on migrating existing tokens as soon as is convenient. +Migrating also enables using the new policy management improvements, stricter +policy syntax rules and other features of the new system without +re-issuing all the secrets in use. + +The high-level process for migrating a legacy token is as follows: + + 1. Create a new policy or policies that grant the required access + 2. Update the existing token to use those policies + +### Prerequisites + +This process assumes that the 1.4.0 upgrade is complete including all legacy +ACLs having their accessor IDs populated. This might take up to several minutes +after the servers upgrade in the primary datacenter. You can tell if this is the +case by using `consul acl token list` and checking that no tokens exist with a +blank `AccessorID`. + +In addition, it is assumed that all clients that might _create_ ACL tokens (e.g. +Vault's Consul secrets engine) have been updated to use the [new ACL +APIs](/docs/guides/acl-migrate-tokens.html#api-separation). 
+ +Specifically if you are using Vault's Consul secrets engine you need to be +running Vault 1.0.0 or higher, _and_ you must update all roles defined in Vault +to specify a list of policy names rather than an inline policy (which causes +Vault to use the legacy API). + +~> **Note:** if you have systems still creating "legacy" tokens with the old +APIs, the migration steps below will still work, however you'll have to keep +re-running them until nothing is creating legacy tokens to ensure all tokens are +migrated. + +### Creating Policies + +There is a range of different strategies for creating new policies from existing +tokens. Two high-level strategies are described here although others or a +mixture of these may be most appropriate depending on the ACL tokens you already +have. + +#### Strategy 1: Simple Policy Mapping + +The simplest and most automatic strategy is to create one new policy for every +existing token. This is easy to automate, but may result in a lot of policies +with exactly the same rules and with non-human-readable names which will make +managing policies harder. This approach can be accomplished using the [`consul +acl policy create`](/docs/commands/acl/acl-policy.html#create) command with +the `-from-token` option. + +| Pros | Cons | +| ---- | ---- | +| ✅ Simple | ❌ May leave many duplicated policies | +| ✅ Easy to automate | ❌ Policy names not human-readable | + +A detailed example of using this approach is [given +below](#simple-policy-mapping). + +#### Strategy 2: Combining Policies + +This strategy takes a more manual approach to create a more manageable set of +policies. There is a spectrum of options for how to do this, which trade off +increasing human involvement for increasing clarity and re-usability of the +resulting policies. + +For example you could use hashes of the policy rules to de-duplicate identical +token policies automatically, however naming them something meaningful for +humans would likely still need manual intervention. 
+ +Toward the other end of the spectrum it might be beneficial for security to +translate prefix matches into exact matches. This, however, requires the operator +to know that clients using the token really don't rely on the prefix matching +semantics of the old ACL system. + +To assist with this approach, there is a CLI tool and corresponding API that can +translate a legacy ACL token's rules into a new ACL policy that is exactly +equivalent. See [`consul acl +translate-rules`](/docs/commands/acl/acl-translate-rules.html). + +| Pros | Cons | +| ---- | ---- | +| ✅ Clearer, more manageable policies | ❌ Requires more manual effort | +| ✅ Policies can be re-used by new ACL tokens | ❌ May take longer for large or complex existing policy sets | + +A detailed example of using this approach is [given below](#combining-policies). + +### Updating Existing Tokens + +Once you have created one or more policies that adequately express the rules +needed for a legacy token, you can update the token via the CLI or API to use +those policies. + +After updating, the token is no longer considered "legacy" and will have all the +properties of a new token, however it keeps its `SecretID` (the secret part of +the token used in API calls) so clients already using that token will continue +to work. It is assumed that the policies you attach continue to grant the +necessary access for existing clients; this is up to the operator to ensure. + +#### Update via API + +Use the [`PUT /v1/acl/token/:AccessorID`](/api/acl/tokens.html#update-a-token) +endpoint. Specifically, ensure that the `Rules` field is omitted or empty. An empty +`Rules` field indicates that the token is now treated as a new token. + +#### Update via CLI + +Use the [`consul acl token update`](/docs/commands/acl/acl-token.html#update) +command to update the token. Specifically you need to use `-upgrade-legacy` +which will ensure that legacy rules are removed as well as the new policies +added. 
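As a minimal sketch of the CLI update described above, the following wraps the `consul acl token update` call in a small helper. The `-id`, `-policy-name` and `-upgrade-legacy` flags are the ones used elsewhere in this guide; the helper name `upgrade_token`, the `DRY_RUN` switch and the placeholder accessor ID and policy name are assumptions made for illustration only.

```shell
#!/bin/sh
# Sketch: upgrade a single legacy token, printing the command by default.
# ACCESSOR_ID and POLICY_NAME are hypothetical placeholder values.
ACCESSOR_ID="${ACCESSOR_ID:-621cbd12-dde7-de06-9be0-e28d067b5b7f}"
POLICY_NAME="${POLICY_NAME:-migrated-$ACCESSOR_ID}"
# DRY_RUN is an assumption of this sketch, not a Consul feature.
DRY_RUN="${DRY_RUN:-1}"

# upgrade_token ACCESSOR_ID POLICY_NAME
upgrade_token() {
  if [ "$DRY_RUN" = "1" ]; then
    # Only print the command that would be run
    echo "consul acl token update -id $1 -policy-name $2 -upgrade-legacy"
  else
    # Attach the policy and drop the legacy embedded rules in one step
    consul acl token update -id "$1" -policy-name "$2" -upgrade-legacy
  fi
}

upgrade_token "$ACCESSOR_ID" "$POLICY_NAME"
```

Running it with `DRY_RUN=0` would perform the real update, so it is worth reviewing the printed command against a test cluster first.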
+ +## Migration Examples + +Below are two detailed examples of the two high-level strategies for creating +policies discussed above. It should be noted these are intended to clarify the +concrete steps you might take. **We don't recommend you perform production +migrations with ad-hoc terminal commands**. Combining these or something similar +into a script might be appropriate. + +### Simple Policy Mapping + +This strategy uses the CLI to create a new policy for every existing legacy +token with exactly equivalent rules. It's easy to automate and clients will see +no change in behavior for their tokens, but it does leave you with a lot of +potentially identical policies to manage or clean up later. + +#### Create Policies + +You can get the AccessorID of every legacy token from the API. For example, +using `curl` and `jq` in bash: + +```sh +$ LEGACY_IDS=$(curl -sH "X-Consul-Token: $CONSUL_HTTP_TOKEN" \ + 'localhost:8500/v1/acl/tokens' | jq -r '.[] | select (.Legacy) | .AccessorID') +$ echo "$LEGACY_IDS" +621cbd12-dde7-de06-9be0-e28d067b5b7f +65cecc86-eb5b-ced5-92dc-f861cf7636fe +ba464aa8-d857-3d26-472c-4d49c3bdae72 +``` + +To create a policy for each one we can use something like: + +```sh +for id in $LEGACY_IDS; do \ + consul acl policy create -name "migrated-$id" -from-token $id \ + -description "Migrated from legacy ACL token"; \ +done +``` + +Each policy now has an identical set of rules to the original token. You can +inspect these: + +```sh +$ consul acl policy read -name migrated-621cbd12-dde7-de06-9be0-e28d067b5b7f +ID: 573d84bd-8b08-3061-e391-d2602e1b4947 +Name: migrated-621cbd12-dde7-de06-9be0-e28d067b5b7f +Description: Migrated from legacy ACL token +Datacenters: +Rules: +service_prefix "" { + policy = "write" +} +``` + +Notice how the policy here is `service_prefix` and not `service` since the old +ACL syntax was an implicit prefix match. This ensures any clients relying on +prefix matching behavior will still work. 
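To make the difference between `service` and `service_prefix` concrete, the sketch below mimics the two matching modes on plain strings. It is only an illustration of exact versus prefix comparison; the helper names `matches_exact` and `matches_prefix` are hypothetical, and this is not how Consul evaluates rules internally.

```shell
#!/bin/sh
# Illustration only: exact vs. prefix matching on a resource name.

# matches_exact NAME SELECTOR: succeeds only if NAME equals SELECTOR
matches_exact() {
  [ "$1" = "$2" ]
}

# matches_prefix NAME SELECTOR: succeeds if NAME starts with SELECTOR
matches_prefix() {
  case "$1" in
    "$2"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Compare how the selector "foo" behaves in each mode
for name in foo foobar web; do
  if matches_exact "$name" "foo"; then exact=yes; else exact=no; fi
  if matches_prefix "$name" "foo"; then prefix=yes; else prefix=no; fi
  echo "$name: service=$exact service_prefix=$prefix"
done
```

Only `foo` matches in both modes, `foobar` matches the prefix form alone, and `web` matches neither. An empty selector, as in `service_prefix ""`, is a prefix of every name, which is why the migrated policy above still covers all services.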
+ +#### Update Tokens + +With the policies created as above, we can automatically upgrade all legacy +tokens. + +```sh +for id in $LEGACY_IDS; do \ + consul acl token update -id $id -policy-name "migrated-$id" -upgrade-legacy; \ +done +``` + +The update is now complete, all legacy tokens are now new tokens with identical +secrets and enforcement rules. + +### Combining Policies + +This strategy has more manual elements but results in a cleaner and more +manageable set of policies than the fully automatic solutions. Note that this is +**just an example** to illustrate a few ways you may choose to merge or +manipulate policies. + +#### Find All Unique Policies + +You can get the AccessorID of every legacy token from the API. For example, +using `curl` and `jq` in bash: + +```sh +$ LEGACY_IDS=$(curl -sH "X-Consul-Token: $CONSUL_HTTP_TOKEN" \ + 'localhost:8500/v1/acl/tokens' | jq -r '.[] | select (.Legacy) | .AccessorID') +$ echo "$LEGACY_IDS" +8b65fdf9-303e-0894-9f87-e71b3273600c +d9deb39b-1b30-e100-b9c5-04aba3f593a1 +f2bce42e-cdcc-848d-28ca-cfd0556a22e3 +``` + +Now we want to read the actual policy for each legacy token and de-duplicate +them. We can use the `translate-rules` helper sub-command which will read the +token's policy and return a new ACL policy that is exactly equivalent. + +```sh +$ for id in $LEGACY_IDS; do \ + echo "Policy for $id:" + consul acl translate-rules -token-accessor "$id"; \ +done +Policy for 8b65fdf9-303e-0894-9f87-e71b3273600c: +service_prefix "bar" { + policy = "write" +} +Policy for d9deb39b-1b30-e100-b9c5-04aba3f593a1: +service_prefix "foo" { + policy = "write" +} +Policy for f2bce42e-cdcc-848d-28ca-cfd0556a22e3: +service_prefix "bar" { + policy = "write" +} +``` + +Notice that two policies are the same and one different. + +We can change the loop above to take a hash of this policy definition to +de-duplicate the policies into a set of files locally. 
This example uses commands +available on macOS but equivalents for other platforms should be easy to find. + +```sh +$ mkdir policies +$ for id in $LEGACY_IDS; do \ + # Fetch the equivalent new policy rules based on the legacy token rules + NEW_POLICY=$(consul acl translate-rules -token-accessor "$id"); \ + # Sha1 hash the rules + HASH=$(echo -n "$NEW_POLICY" | shasum | awk '{ print $1 }'); \ + # Write rules to a policy file named with the hash to de-duplicate + echo "$NEW_POLICY" > policies/$HASH.hcl; \ +done +$ tree policies +policies +├── 024ce11f26f59436c518fb31f0999d1400485c17.hcl +└── 501b787c9444fbd62f346ab257eeb27197be2444.hcl +``` + +#### Cleaning Up Policies + +You can now manually inspect and potentially edit these policies. For example we +could rename them according to their intended use. In this case we maintain the +hash as it will allow us to match tokens to policies later. + +```sh +$ cat policies/024ce11f26f59436c518fb31f0999d1400485c17.hcl +service_prefix "bar" { + policy = "write" +} +$ # Add human-readable suffix to the file name so policies end up clearly named +$ mv policies/024ce11f26f59436c518fb31f0999d1400485c17.hcl \ + policies/024ce11f26f59436c518fb31f0999d1400485c17-bar-service.hcl +``` + +You might also choose to tighten up the rules, for example if you know you never +rely on prefix-matching the service name `foo` you might choose to modify the +policy to use exact match. + +```sh +$ cat policies/501b787c9444fbd62f346ab257eeb27197be2444.hcl +service_prefix "foo" { + policy = "write" +} +$ echo 'service "foo" { policy = "write" }' > policies/501b787c9444fbd62f346ab257eeb27197be2444.hcl +$ # Add human-readable suffix to the file name so policies end up clearly named +$ mv policies/501b787c9444fbd62f346ab257eeb27197be2444.hcl \ + policies/501b787c9444fbd62f346ab257eeb27197be2444-foo-service.hcl +``` + +#### Creating Policies + +We now have a minimal set of policies to create, with human-readable names. 
We +can create each one with something like the following. + +```sh +$ for p in $(ls policies | grep ".hcl"); do \ + # Extract the hash part of the file name + HASH=$(echo "$p" | cut -d - -f 1); \ + # Extract the name suffix without .hcl + NAME=$(echo "$p" | cut -d - -f 2- | cut -d . -f 1); \ + # Create new policy based on the rules in the file and the name we gave + consul acl policy create -name $NAME \ + -rules "@policies/$p" \ + -description "Migrated from legacy token"; \ +done +ID: da2a9f9b-4e44-13f8-e308-76ce7a8dcb21 +Name: bar-service +Description: Migrated from legacy token +Datacenters: +Rules: +service_prefix "bar" { + policy = "write" +} + +ID: 9fbded86-9140-efe4-b661-c8bd07b6c584 +Name: foo-service +Description: Migrated from legacy token +Datacenters: +Rules: +service "foo" { policy = "write" } + +``` + +#### Upgrading Tokens + +Finally we can map our existing tokens to those policies using the hash in the +policy file names. The `-upgrade-legacy` flag removes the token's legacy +embedded rules at the same time as associating them with the new policies +created from those rules. + +```sh +$ for id in $LEGACY_IDS; do \ + NEW_POLICY=$(consul acl translate-rules -token-accessor "$id"); \ + HASH=$(echo -n "$NEW_POLICY" | shasum | awk '{ print $1 }'); \ + # Lookup the hash->new policy mapping from the policy file names + POLICY_FILE=$(ls policies | grep "^$HASH"); \ + POLICY_NAME=$(echo "$POLICY_FILE" | cut -d - -f 2- | cut -d . -f 1); \ + echo "==> Mapping token $id to policy $POLICY_NAME"; \ + consul acl token update -id $id -policy-name $POLICY_NAME -upgrade-legacy; \ +done +==> Mapping token 8b65fdf9-303e-0894-9f87-e71b3273600c to policy bar-service +Token updated successfully. 
+AccessorID: 8b65fdf9-303e-0894-9f87-e71b3273600c +SecretID: 3dbb3981-7654-733a-3475-5ce20fc5a7b9 +Description: +Local: false +Create Time: 0001-01-01 00:00:00 +0000 UTC +Policies: + da2a9f9b-4e44-13f8-e308-76ce7a8dcb21 - bar-service +==> Mapping token d9deb39b-1b30-e100-b9c5-04aba3f593a1 to policy foo-service +Token updated successfully. +AccessorID: d9deb39b-1b30-e100-b9c5-04aba3f593a1 +SecretID: 5f54733b-4c76-eb74-8781-3550c20f4969 +Description: +Local: false +Create Time: 0001-01-01 00:00:00 +0000 UTC +Policies: + 9fbded86-9140-efe4-b661-c8bd07b6c584 - foo-service +==> Mapping token f2bce42e-cdcc-848d-28ca-cfd0556a22e3 to policy bar-service +Token updated successfully. +AccessorID: f2bce42e-cdcc-848d-28ca-cfd0556a22e3 +SecretID: f3aaa3e2-2c6f-cf3c-1e86-454de728e8ab +Description: +Local: false +Create Time: 0001-01-01 00:00:00 +0000 UTC +Policies: + da2a9f9b-4e44-13f8-e308-76ce7a8dcb21 - bar-service +``` + +At this point all tokens are upgraded and can use new ACL features while +retaining the same secret clients are already using. diff --git a/website/source/docs/upgrade-specific.html.md b/website/source/docs/upgrade-specific.html.md index 882830278..c0f69a91e 100644 --- a/website/source/docs/upgrade-specific.html.md +++ b/website/source/docs/upgrade-specific.html.md @@ -14,6 +14,106 @@ details provided for their upgrades as a result of new features or changed behavior. This page is used to document those details separately from the standard upgrade flow. +## Consul 1.4.0 + +There are two major features in Consul 1.4.0 that may impact upgrades: a [new ACL system](#acl-upgrade) and [multi-datacenter support for Connect](#connect-multi-datacenter) in the Enterprise version. + +### ACL Upgrade + +Consul 1.4.0 includes a [new ACL system](/docs/guides/acl.html) that is +designed to have a smooth upgrade path but requires care to upgrade components +in the right order. 
+ +**Note:** As with most major version upgrades, you cannot downgrade once the +upgrade to 1.4.0 is complete as it adds new state to the raft store. As always +it is _strongly_ recommended that you test the upgrade first outside of +production and ensure you take backup snapshots of all datacenters before +upgrading. + +#### Primary Datacenter + +The "ACL datacenter" in 1.3.x and earlier is now referred to as the "Primary +datacenter". All configuration is backwards compatible and shouldn't need to +change prior to upgrade although it's strongly recommended to migrate ACL +configuration to the new syntax soon after upgrade. This includes using +`primary_datacenter` rather than `acl_datacenter` and moving the `acl_*` options to the new [ACL +block](/docs/agent/options.html#acl). + +Datacenters can be upgraded in any order although secondaries will remain in +[Legacy ACL mode](#legacy-acl-mode) until the primary datacenter is fully +upgraded. + +Each datacenter should follow the [standard rolling upgrade +procedure](/docs/upgrading.html#standard-upgrades). + +#### Legacy ACL Mode + +When a 1.4.0 server first starts, it runs in "Legacy ACL mode". In this mode, +bootstrap requests and new ACL APIs will not be functional yet and will return +an error. The server advertises its ability to support 1.4.0 ACLs via gossip +and waits. + +In the primary datacenter, the servers all wait in legacy ACL mode until they +see every server in the primary datacenter advertise 1.4.0 ACL support. Once +this happens, the leader will complete the transition out of "legacy ACL mode" +and write this into the state so future restarts don't need to go through the +same transition. + +In a secondary datacenter, the same process happens except that servers +_additionally_ wait for every server in the primary datacenter to advertise 1.4.0 ACL support, making it safe to +upgrade datacenters in any order. 
+ +It should be noted that even if you are not upgrading, starting a brand new +1.4.0 cluster will transition through legacy ACL mode so you may be unable to +bootstrap ACLs until all the expected servers are up and healthy. + +#### Legacy Token Accessor Migration + +As soon as all servers in the primary datacenter have been upgraded to 1.4.0, +the leader will begin the process of creating new accessor IDs for all existing +ACL tokens. + +This process completes in the background and is rate limited to ensure it +doesn't overload the leader. It completes upgrades in batches of 128 tokens and +will not upgrade more than one batch per second so on a cluster with 10,000 +tokens (at least 79 batches), this may take several minutes. + +While this is happening both old and new ACLs will work correctly with the +caveat that new ACL [Token APIs](/api/acl/tokens.html) may not return an +accessor ID for legacy tokens that are not yet migrated. + +#### Migrating Existing ACLs + +New ACL policies have slightly different syntax designed to fix some +shortcomings in old ACL syntax. During and after the upgrade process, any old +ACL tokens will continue to work and grant exactly the same level of access. + +After upgrade, it is still possible to create "legacy" tokens using the existing +API so existing integrations that create tokens (e.g. Vault) will continue to +work. The "legacy" tokens generated though will not be able to take advantage of +new policy features. It's recommended that you complete migration of all tokens +as soon as possible after upgrade, as well as updating any integrations to work +with the new ACL [Token](/api/acl/tokens.html) and +[Policy](/api/acl/policies.html) APIs. + +More complete details on how to upgrade "legacy" tokens are available [here](/docs/guides/acl-migrate-tokens.html). + +### Connect Multi-datacenter + +This only applies to users upgrading from an older version of Consul Enterprise to Consul Enterprise 1.4.0 (all license types). 
+ +In addition, this upgrade will only affect clusters where [Connect is enabled](/docs/connect/configuration.html) on your servers before the migration. + +Connect multi-datacenter uses the same primary/secondary approach as ACLs and will use the same [primary_datacenter](#primary-datacenter). When a secondary datacenter server restarts with 1.4.0 it will detect it is not the primary and begin an automatic bootstrap of multi-datacenter CA federation. + +Datacenters can be upgraded in either order; secondary datacenters will not switch into multi-datacenter mode until all servers in both the secondary and primary datacenter are detected to be running at least Consul 1.4.0. Secondary datacenters monitor this periodically (every few minutes) and will automatically upgrade Connect to use a federated Certificate Authority once that is the case. + +In general, migrating a Consul cluster from OSS to Enterprise will update the CA to be federated automatically and without impact on Connect traffic. When upgrading Consul Enterprise 1.3.x to Consul Enterprise 1.4.0, the CA upgrade is seamless; however, depending on the size of the cluster, _new_ connection attempts in the secondary datacenter might fail for a short window (typically seconds) while the update is propagated. This is because the 1.3.x Beta authorization endpoint validated the originating cluster in a way that was not fully forwards compatible with migrating between cluster trust domains. That issue is fixed in 1.4.0 as part of General Availability. + +Once migrated (typically within a few seconds), Connect will use the primary datacenter's Certificate Authority as the root of trust for all other datacenters. CA migration or root key changes in the primary will now rotate automatically and without loss of connectivity throughout all datacenters and workloads. + +For more information see [Connect Multi-datacenter](/docs/enterprise/connect-multi-datacenter/index.html). 
+ ## Consul 1.1.0 #### Removal of Deprecated Features diff --git a/website/source/layouts/docs.erb b/website/source/layouts/docs.erb index 442cc8285..057935f80 100644 --- a/website/source/layouts/docs.erb +++ b/website/source/layouts/docs.erb @@ -379,6 +379,9 @@ > Legacy + > + Token Migration + >