6bfdfa0a4d
* Document Convergent Tokenization and Token Lookup * tweaks * Fix sample response * Update website/content/api-docs/secret/transform.mdx Co-authored-by: Matt Schultz <975680+schultz-is@users.noreply.github.com> * Update website/content/api-docs/secret/transform.mdx Co-authored-by: Matt Schultz <975680+schultz-is@users.noreply.github.com> * Update website/content/api-docs/secret/transform.mdx Co-authored-by: Matt Schultz <975680+schultz-is@users.noreply.github.com> * Update website/content/api-docs/secret/transform.mdx Co-authored-by: Matt Schultz <975680+schultz-is@users.noreply.github.com> * Update website/content/api-docs/secret/transform.mdx Co-authored-by: Matt Schultz <975680+schultz-is@users.noreply.github.com> * Update website/content/api-docs/secret/transform.mdx Co-authored-by: Matt Schultz <975680+schultz-is@users.noreply.github.com> * Update website/content/docs/secrets/transform/index.mdx Co-authored-by: Matt Schultz <975680+schultz-is@users.noreply.github.com> * Update website/content/docs/secrets/transform/tokenization.mdx Co-authored-by: Matt Schultz <975680+schultz-is@users.noreply.github.com> * Update website/content/docs/secrets/transform/tokenization.mdx Co-authored-by: Matt Schultz <975680+schultz-is@users.noreply.github.com> * Update website/content/docs/secrets/transform/tokenization.mdx Co-authored-by: Matt Schultz <975680+schultz-is@users.noreply.github.com> * Update website/content/docs/secrets/transform/tokenization.mdx Co-authored-by: Matt Schultz <975680+schultz-is@users.noreply.github.com> * Update website/content/docs/secrets/transform/tokenization.mdx Co-authored-by: Matt Schultz <975680+schultz-is@users.noreply.github.com> * Update website/content/docs/secrets/transform/tokenization.mdx Co-authored-by: Matt Schultz <975680+schultz-is@users.noreply.github.com> * Update website/content/api-docs/secret/transform.mdx Co-authored-by: Matt Schultz <975680+schultz-is@users.noreply.github.com> * Update website/content/docs/secrets/transform/tokenization.mdx Co-authored-by: Matt Schultz <975680+schultz-is@users.noreply.github.com> * Update website/content/api-docs/secret/transform.mdx Co-authored-by: Matt Schultz <975680+schultz-is@users.noreply.github.com> * update awkward text Co-authored-by: Matt Schultz <975680+schultz-is@users.noreply.github.com>
360 lines
13 KiB
Plaintext
360 lines
13 KiB
Plaintext
---
|
|
layout: docs
|
|
page_title: Transform - Secrets Engines
|
|
description: >-
|
|
The Transform secrets engine for Vault performs secure data transformation.
|
|
---
|
|
|
|
# Transform Secrets Engine
|
|
|
|
-> **Note**: This secret engine requires [Vault
|
|
Enterprise](https://www.hashicorp.com/products/vault/) with the Advanced Data
|
|
Protection Transform Module.
|
|
|
|
The Transform secrets engine handles secure data transformation and tokenization
|
|
against provided input value. Transformation methods may encompass NIST vetted
|
|
cryptographic standards such as [format-preserving encryption
|
|
(FPE)](https://en.wikipedia.org/wiki/Format-preserving_encryption) via
|
|
[FF3-1](https://csrc.nist.gov/publications/detail/sp/800-38g/rev-1/draft), but
|
|
can also be pseudonymous transformations of the data through other means, such
|
|
as masking.
|
|
|
|
The secret engine currently supports `fpe`, `masking`, and `tokenization` as
|
|
data transformation types.
|
|
|
|
## Setup
|
|
|
|
Most secrets engines must be configured in advance before they can perform their
|
|
functions. These steps are usually completed by an operator or configuration
|
|
management tool.
|
|
|
|
1. Enable the Transform secrets engine:
|
|
|
|
```text
|
|
$ vault secrets enable transform
|
|
Success! Enabled the transform secrets engine at: transform/
|
|
```
|
|
|
|
By default, the secrets engine will mount at the name of the engine. To enable
|
|
the secrets engine at a different path, use the -path argument.
|
|
|
|
1. Create a named role:
|
|
|
|
```text
|
|
$ vault write transform/role/payments transformations=ccn-fpe
|
|
Success! Data written to: transform/role/payments
|
|
```
|
|
|
|
1. Create a transformation:
|
|
|
|
```text
|
|
$ vault write transform/transformation/ccn-fpe \
|
|
type=fpe \
|
|
template=ccn \
|
|
tweak_source=internal \
|
|
allowed_roles=payments
|
|
Success! Data written to: transform/transformation/ccn-fpe
|
|
```
|
|
|
|
1. Optionally, create a template:
|
|
|
|
```text
|
|
$ vault write transform/template/ccn \
|
|
type=regex \
|
|
pattern='(\d{4})[- ](\d{4})[- ](\d{4})[- ](\d{4})' \
|
|
encode_format='$1-$2-$3-$4' \
|
|
decode_formats=last-four='$4' \
|
|
alphabet=numerics
|
|
Success! Data written to: transform/template/ccn
|
|
```
|
|
|
|
1. Optionally, create an alphabet:
|
|
|
|
```text
|
|
$ vault write transform/alphabet/numerics \
|
|
alphabet="0123456789"
|
|
Success! Data written to: transform/alphabet/numerics
|
|
```
|
|
|
|
## Usage
|
|
|
|
After the secrets engine is configured and a user/machine has a Vault token with
|
|
the proper permission, it can use this secrets engine to encode and decode input
|
|
values.
|
|
|
|
1. Encode some input value using the `/encode` endpoint with a named role:
|
|
|
|
```text
|
|
$ vault write transform/encode/payments value=1111-2222-3333-4444
|
|
Key Value
|
|
--- -----
|
|
encoded_value 9300-3376-4943-8903
|
|
```
|
|
|
|
A transformation must be provided if the role contains more than one
|
|
transformation. A tweak must be provided if the tweak source for the
|
|
transformation is "supplied".
|
|
|
|
1. Decode some input value using the `/decode` endpoint with a named role:
|
|
|
|
```text
|
|
$ vault write transform/decode/payments value=9300-3376-4943-8903
|
|
Key Value
|
|
--- -----
|
|
decoded_value 1111-2222-3333-4444
|
|
```
|
|
|
|
A transformation must be provided if the role contains more than one
|
|
transformation. A tweak must be provided if the tweak source for the
|
|
transformation is "supplied" or "generated".
|
|
|
|
1. Decode some input value using the `/decode` endpoint with a named role and decode format:
|
|
|
|
```text
|
|
$ vault write transform/decode/payments/last-four value=9300-3376-4943-8903
|
|
Key Value
|
|
--- -----
|
|
decoded_value 4444
|
|
```
|
|
|
|
A transformation must be provided if the role contains more than one
|
|
transformation. A tweak must be provided if the tweak source for the
|
|
transformation is "supplied" or "generated". A decode format can optionally
|
|
be provided. If one isn't provided, the decoded output will be formatted to
|
|
match the template's pattern as in the previous example.
|
|
|
|
## Roles, Transformations, Templates, and Alphabets
|
|
|
|
The Transform secrets engine contains several types of resources that
|
|
encapsulate different aspects of the information required in order to perform
|
|
data transformation.
|
|
|
|
- **Roles** are the basic high-level construct that holds the set of
|
|
transformation that it is allowed to performed. The role name is provided when
|
|
performing encode and decode operations.
|
|
|
|
- **Transformations** hold information about a particular transformation. It
|
|
contains information about the type of transformation that we want to perform,
|
|
the template that it should use for value detection, and other
|
|
transformation-specific values such as the tweak source or the masking character
|
|
to use.
|
|
|
|
- **Templates** allow us to determine what and how to capture the value that we
|
|
want to transform.
|
|
|
|
- **Alphabets** provide the set of valid UTF-8 character contained within both
|
|
the input and transformed value on FPE transformations.
|
|
|
|
## Transformations
|
|
|
|
### Format Preserving Encryption
|
|
|
|
Format Preserving Encryption (FPE) performs cryptographically secure
|
|
transformation via FF3-1 to encode input values while maintaining its data
|
|
format and length.
|
|
|
|
#### Tweak and Tweak Source
|
|
|
|
FF3-1 uses a non-confidential parameter called the tweak along with the
|
|
ciphertext when performing encryption and decryption operations. The tweak
|
|
is precisely a 7-byte value. The secret engine consumes a base64 encoded string
|
|
of this value for its encode and decode operation whenever this input is
|
|
required.
|
|
|
|
In order to simplify the flow of encoding and decoding operations, transformation
|
|
creation can take care of generating and associating a tweak value. This allows
|
|
applications to provide a single value without having the need to generate or
|
|
store any other metadata.
|
|
|
|
In cases where more granularity is required, a tweak value can be generated by
|
|
Vault and returned, or it may be independently generated and provided.
|
|
|
|
In summary, there are three ways in which the tweak value may be sourced:
|
|
|
|
- `supplied`: This is the default behavior for FPE transformations. The tweak
|
|
value must be generated externally, and supplied into the on encode and decode
|
|
operations.
|
|
- `generated`: The secret engine will take care of generating the tweak value
|
|
on encode operations and return this back as part of the response along
|
|
with the encoded value. It is up to the application to store this value
|
|
so that it can be provided back when decoding the encoded value.
|
|
- `internal`: The secret engine will generate an internal tweak value per
|
|
transformation. This value is not returned on encode or decode operations
|
|
since it gets re-used for all encode and decode operations for the
|
|
transformation. Depending on the uniqueness of the dataset, this mode may
|
|
introduce higher risks, but provides the most convenience since the value does
|
|
not need to be stored separately. This mode should only be used if the values
|
|
being encoded are sufficiently unique.
|
|
|
|
Your team and organization should weigh the trade-offs when it comes to
|
|
choosing the proper tweak source to use. For `supplied` and `internal`
|
|
sourcing, please see [FF3-1 Tweak Usage Details](transform/ff3-tweak-details)
|
|
|
|
#### Input Limits
|
|
|
|
FF3-1 specifies both minimum and maximum limits on the length of an input.
|
|
These limits are driven by the security goals, making sure that for a given
|
|
alphabet the input size does not leave the input guessable by brute force.
|
|
|
|
Given an alphabet of length A, an input length L is valid if:
|
|
|
|
- L >= 2,
|
|
- A<sup>L</sup> >= 1,000,000
|
|
- and L <= 2 \* floor(log<sub>A</sub>(2<sup>96</sup>)).
|
|
|
|
As a concrete example, for handling credit card numbers, A is 10, L is 16, so
|
|
valid input lengths would be between 6 and 56 characters. This is because
|
|
10<sup>6</sup>=1,000,000 (already greater than 2), and 2 \* floor(log<sub>10</sub>(2<sup>96</sup>)) = 56.
|
|
|
|
Of course, in the case of credit card numbers valid input would always be
|
|
between 12 and 19 decimal digits.
|
|
|
|
#### Output Limitations
|
|
|
|
After transformation and formatting by the template, the value is an encrypted
|
|
version of the input with the format preserved. However, the value itself may
|
|
be _invalid_ with respect to other standards. For example the output credit card
|
|
number may not validate (it likely won't create a valid check digit).
|
|
|
|
So one must consider when the outputs are stored whether validation in storage
|
|
may reject them.
|
|
|
|
### Masking
|
|
|
|
Masking performs replacement of matched characters on the input value with a
|
|
desired character. This form of transformation is non-reversible and thus does
|
|
not support retrieving the original value back using the decode operation.
|
|
|
|
### Tokenization
|
|
|
|
[Tokenization](transform/tokenization) exchanges a
|
|
sensitive value for an unrelated value called a _token_. The original sensitive
|
|
value cannot be recovered from a token alone, they are irreversible.
|
|
|
|
#### Inputs
|
|
|
|
Tokenization inputs are not processed by templates or alphabets, as they do not
|
|
preserve any of the contents or format of the input.
|
|
|
|
#### Outputs
|
|
|
|
Tokenization is not format preserving. The token output is a Base58 encoded
|
|
string value of unrelated length, and is not rendered by a template.
|
|
|
|
The decoded value is returned verbatim as it was before encoding.
|
|
|
|
#### Metadata
|
|
|
|
As tokenization isn't format preserving and is stateful, the input values can be
|
|
any length, subject to other limits in Vault's request processing. In addition,
|
|
non-sensitive _metadata_ can be encoded alongside the value, and retrieved either
|
|
with or independently of the original value.
|
|
|
|
#### Operations
|
|
|
|
In addition to encode and decode, as tokenization is stateful, it provides two
|
|
additional operations:
|
|
|
|
- Retrieve metadata given a token.
|
|
- Check whether an input value has a valid, unexpired token.
|
|
- For some configurations, retrieve a previously encoded token for a plaintext
|
|
input.
|
|
|
|
#### Stores
|
|
|
|
Tokenization is stateful. Tokenized state can be stored internally (the
|
|
default) or in an external store. Currently only PostgreSQL and MySQL are supported
|
|
for external storage.
|
|
|
|
#### Mapping Modes
|
|
|
|
[Tokenization](transform/tokenization) stores the results of an encode operation
|
|
in storage using a cryptographic construct that enhances the safety of its values.
|
|
In the `default` mapping mode, the token itself is transformed via a one way
|
|
function involving the transform key and elements of the token. As Vault does
|
|
not store the token, the values in Vault storage themselves cannot be used to
|
|
retrieve original input.
|
|
|
|
A second mapping mode, `exportable` is provided for cases where
|
|
operators may need to recover the full set of decoded inputs in an emergency via
|
|
the export operation. It is strongly recommended that one use the `default` mode if
|
|
possible, as it is resistant to more types of attack.
|
|
|
|
#### Convergent Tokenization
|
|
|
|
In addition, tokenization transformations may be configured as *convergent*, meaning
|
|
that tokenizing a plaintext and expiration more than once results in the
|
|
same token value. Enabling convergence has performance and security
|
|
[considerations](transform/tokenization#convergence).
|
|
|
|
## Deletion Behavior
|
|
|
|
The deletion of resources, aside from roles, is guarded by checking whether any
|
|
other related resources are currently using it in order to avoid accidental data
|
|
loss of any encoded value that may depend on these bits of information to
|
|
decode and reconstruct the original value. Role deletion can be safely done
|
|
since the information related to the transformation itself is contained within
|
|
transformation object and its related resources.
|
|
|
|
The following rules applies when it comes to deleting a resource:
|
|
|
|
- A transformation cannot be deleted if it's in use by a role.
|
|
- A template or store cannot be deleted if it's in use by a transformation.
|
|
- An alphabet cannot be deleted if it's in use by a template.
|
|
|
|
## Provided Builtin Resources
|
|
|
|
The secret engine provides a set of builtin templates and alphabets that are
|
|
considered common. Builtin templates cannot be deleted, and the prefix
|
|
"builtin/" on template and alphabet names is a reserved keyword.
|
|
|
|
### Templates
|
|
|
|
The following builtin templates are available for use in the secret engine:
|
|
|
|
- builtin/creditcardnumber
|
|
- builtin/socialsecuritynumber
|
|
|
|
Note that these templates only check for the matching pattern(s), and not the
|
|
validity of the value itself. For instance, the builtin credit card number
|
|
template can determine whether the provided value is in the format of commonly
|
|
issued credit cards, but not whether the credit card is a valid number from a
|
|
particular issuer.
|
|
|
|
Templates currently only accept regular expressions as the matching pattern
|
|
type. It uses Go's standard library for the regexp engine, which supports
|
|
[the RE2 syntax](https://github.com/google/re2/wiki/Syntax).
|
|
|
|
**Note**: The `builtin/any` template is only valid and is the default for the tokenization
|
|
transform.
|
|
|
|
### Alphabets
|
|
|
|
The following builtin alphabets are available for use in the secret engine:
|
|
|
|
- builtin/numeric
|
|
- builtin/alphalower
|
|
- builtin/alphaupper
|
|
- builtin/alphanumericlower
|
|
- builtin/alphanumericupper
|
|
- builtin/alphanumeric
|
|
|
|
Custom alphabets must contain between 2 and 65536 unique characters.
|
|
|
|
### Stores
|
|
|
|
The following builtin store is available (and is the default) for tokenization
|
|
transformations:
|
|
|
|
- builtin/internal
|
|
|
|
## Tutorial
|
|
|
|
Refer to the [Transform Secrets Engine](https://learn.hashicorp.com/vault/adp/transform) tutorial to learn how to use the Transform secrets engine to handle secure data transmission and tokenization against provided secrets.
|
|
|
|
## API
|
|
|
|
The Transform secrets engine has a full HTTP API. Please see the
|
|
[Transform secrets engine API](/api-docs/secret/transform) for more
|
|
details.
|