Document Convergent Tokenization and Token Lookup (#15819)

* Document Convergent Tokenization and Token Lookup

* tweaks

* Fix sample response

* Update website/content/api-docs/secret/transform.mdx

* Update website/content/docs/secrets/transform/index.mdx

* Update website/content/docs/secrets/transform/tokenization.mdx

* update awkward text

Co-authored-by: Matt Schultz <975680+schultz-is@users.noreply.github.com>
This commit is contained in:
Scott Miller 2022-06-06 13:34:08 -05:00 committed by GitHub
parent a97da32b4b
commit 6bfdfa0a4d
3 changed files with 147 additions and 5 deletions

@ -1123,7 +1123,7 @@ This endpoint decodes the provided value using a named role.
### Sample Payload
```json
{
"value": "418-56-4374",
"transformation": "example-transformation"
@ -1152,7 +1152,7 @@ $ curl \
### Sample Payload
```json
{
"value": "418-56-4374",
"transformation": "example-transformation"
@ -1303,8 +1303,8 @@ $ curl \
This endpoint determines if a provided plaintext value has a valid, unexpired
tokenized value. Note that this cannot return the token, just confirm that a
tokenized value exists, but works for all tokenization modes.
This endpoint is only valid for tokenization transformations.
| Method | Path |
| :----- | :-------------------------------- |
@ -1317,7 +1317,7 @@ transformations.
of the URL.
- `value` `(string: <required>)`
Specifies the plaintext for which to check whether it has been tokenized.
- `transformation` `(string)`
Specifies the transformation within the role that should be used for this
@ -1376,6 +1376,107 @@ $ curl \
}
```
## Lookup Token
This endpoint returns the token given a plaintext and optionally an
expiration or range of expirations. This operation is only supported
if the transformation is configured as 'convergent', or if the mapping
mode is exportable and the storage backend is external. Tokens may be
looked up with an explicit expiration, an expiration value of "any", or with a range
of acceptable expiration times. This endpoint is only valid for tokenization
transformations.
| Method | Path |
| :----- | :-------------------------------- |
| `POST` | `/transform/tokens/:role_name` |
### Parameters
- `role_name` `(string: <required>)`
Specifies the role name to use for this operation. This is specified as part
of the URL.
- `value` `(string: <required>)`
Specifies the plaintext for which to look up tokens.
- `expiration` `(string: "")` - The precise expiration of the token. If omitted,
  this specifically searches for tokens with no expiration. If set to the string
  "any", tokens with any expiration or none are returned. Otherwise,
  the string must be an RFC3339-formatted date and time of expiration. `expiration`
  may not be used at the same time as `min_expiration` and `max_expiration`.
- `min_expiration` `(string: "")` - The minimum expiration time of the token,
inclusive, as an RFC3339 formatted time and date.
`min_expiration` may not be used at the same time as `expiration`.
When provided, `max_expiration` must also be provided.
- `max_expiration` `(string: "")` - The maximum expiration time of the token,
inclusive, as an RFC3339 formatted time and date.
`max_expiration` may not be used at the same time as `expiration`.
When provided, `min_expiration` must also be provided.
- `transformation` `(string)`
Specifies the transformation within the role that should be used for this
lookup operation. If a single transformation exists for the role, this parameter
may be skipped and will be inferred. If multiple transformations exist, one
must be specified.
- `reference` `(string: "")` -
A user-supplied string that will be present in the `reference` field on the
corresponding `batch_results` item in the response, to assist in understanding
which result corresponds to a particular input. Only valid on batch requests
when using `batch_input` below.
- `batch_input` `(array<object>: nil)` -
Specifies a list of items to be looked up in a single batch. When this
parameter is set, the `value`, `transformation`, and `reference` parameters are
ignored. Instead, the aforementioned parameters should be provided within
each object in the list. In addition, batched requests can add the `reference`
field described above.
```json
[
{
"value": "1111-1111-1111-1111",
"expiration": "any",
"transformation": "ccn-tokenization"
}
]
```
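The `min_expiration`/`max_expiration` bounds must be RFC3339 timestamps. As a hedged sketch (the helper name and 24-hour window are illustrative, not part of the API), a payload covering a range of acceptable expirations could be built like this:

```python
import json
from datetime import datetime, timedelta, timezone

def lookup_payload(value, transformation, window_hours=24):
    """Build a token lookup payload whose expiration range covers the
    next `window_hours` hours, using RFC3339-formatted timestamps."""
    now = datetime.now(timezone.utc)
    return json.dumps({
        "value": value,
        "transformation": transformation,
        # Timezone-aware isoformat() yields RFC3339, e.g. 2022-06-06T03:14:15+00:00
        "min_expiration": now.isoformat(timespec="seconds"),
        "max_expiration": (now + timedelta(hours=window_hours)).isoformat(timespec="seconds"),
    })

payload = lookup_payload("1111-1111-1111-1111", "ccn-tokenization")
```

Note the zero-padded hour: RFC3339 requires two-digit time fields, so `T3:14:15` would be rejected where `T03:14:15` is accepted.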
### Sample Payload
```json
{
"value": "1111-1111-1111-1111",
  "min_expiration": "2022-06-06T03:14:15+00:00",
  "max_expiration": "2022-06-07T09:26:53+00:00",
"transformation": "ccn-tokenization"
}
```
### Sample Request
```shell-session
$ curl \
--header "X-Vault-Token: ..." \
--request POST \
--data @payload.json \
http://127.0.0.1:8200/v1/transform/tokens/example-role
```
### Sample Response
```json
{
"data": {
"tokens": [
"AHLdmFvTRknMBgrNSy6Ba7xJxG28KkZeHKqxGJ7e45G3V9UbcUr6gdv83ozwRRQwLfJgyHZvfa9rh7kU9xJXVdY"
]
}
}
```
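Matching tokens are returned as a list under `data.tokens` (empty when nothing matches). A client might unpack the body like this minimal sketch, which assumes the sample response shown above:

```python
import json

# Response body as returned by the token lookup endpoint (sample above).
response_body = """
{
  "data": {
    "tokens": [
      "AHLdmFvTRknMBgrNSy6Ba7xJxG28KkZeHKqxGJ7e45G3V9UbcUr6gdv83ozwRRQwLfJgyHZvfa9rh7kU9xJXVdY"
    ]
  }
}
"""

def tokens_from_response(body: str) -> list:
    """Extract the matching tokens; an empty list means no token matched."""
    return json.loads(body).get("data", {}).get("tokens", [])

tokens = tokens_from_response(response_body)
```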
## Retrieve Token Metadata
This endpoint retrieves metadata for a tokenized value using a named role.

@ -257,6 +257,8 @@ additional operations:
- Retrieve metadata given a token.
- Check whether an input value has a valid, unexpired token.
- For some configurations, retrieve a previously encoded token for a plaintext
input.
#### Stores
@ -278,6 +280,13 @@ operators may need to recover the full set of decoded inputs in an emergency via
the export operation. It is strongly recommended that one use the `default` mode if
possible, as it is resistant to more types of attack.
#### Convergent Tokenization
In addition, tokenization transformations may be configured as *convergent*, meaning
that tokenizing a plaintext and expiration more than once results in the
same token value. Enabling convergence has performance and security
[considerations](transform/tokenization#convergence).
## Deletion Behavior
The deletion of resources, aside from roles, is guarded by checking whether any

@ -25,6 +25,38 @@ Depending on the mapping mode, the plaintext may be decoded only with possession
of the distributed token, or may be recoverable in the export operation. See
[Security Considerations](#security-considerations) for more.
### Convergence
By default, tokenization produces a unique token for every encode operation.
This makes the resulting token fully independent of its plaintext and expiration.
Sometimes, though, it may be beneficial if a plaintext/expiration pair always
tokenizes to the same value. For example, one may want to
do a statistical analysis of the tokens as they relate to some other field
in a database (without decoding the token), or need to tokenize
in two different systems but be able to relate the results. In this case,
one can create a tokenization transformation that is *convergent*.
When enabled at transformation creation time, Vault alters the calculation so that
encoding a plaintext and expiration tokenizes to the same value every time, and
storage keeps only a single entry for that token. Like the exportable mapping
mode, convergence should only be enabled if needed. Convergent tokenization
has a small performance penalty in external stores and a larger one in the
built-in store, due to the need to avoid duplicate entries and to update
metadata when convergently encoding. If some use cases require convergence
and some do not, it is recommended to create two different tokenization
transformations, with convergence enabled on only one.
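The effect of convergence can be illustrated with a keyed deterministic function. This is purely conceptual, not Vault's actual construction: the point is only that the same plaintext/expiration pair always maps to the same token, which is what makes single-entry storage and lookup by plaintext possible:

```python
import hmac
import hashlib

def convergent_token(key: bytes, plaintext: str, expiration: str) -> str:
    """Illustrative convergent token: a keyed HMAC over plaintext and
    expiration. Vault's real tokenization scheme differs; this only
    demonstrates the determinism property."""
    msg = plaintext.encode() + b"\x00" + expiration.encode()
    return hmac.new(key, msg, hashlib.sha256).hexdigest()

key = b"demo-key"
t1 = convergent_token(key, "1111-1111-1111-1111", "2022-06-07T09:26:53+00:00")
t2 = convergent_token(key, "1111-1111-1111-1111", "2022-06-07T09:26:53+00:00")
# t1 == t2: encoding the same pair twice yields the same token,
# whereas non-convergent tokenization would produce two unrelated tokens.
```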
### Token Lookup
Some use cases may want to look up the token corresponding to a given plaintext. Ordinarily
this is contrary to the nature of tokenization, where we want to prevent an
attacker from determining that a token corresponds to a plaintext value (a known
plaintext attack). But for use cases that require it, the
[token lookup](../../../api-docs/secret/transform#token-lookup)
operation is supported, though only in some configurations of the tokenization
transformation. Token lookup is supported when convergence is enabled, or
if the mapping mode is exportable *and* the storage backend is external.
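The supported configurations above reduce to a simple predicate. This sketch encodes that rule; the function and parameter names are illustrative, not part of any Vault API:

```python
def token_lookup_supported(convergent: bool, mapping_mode: str, store: str) -> bool:
    """Token lookup works when convergence is enabled, or when the
    mapping mode is exportable AND the storage backend is external."""
    return convergent or (mapping_mode == "exportable" and store == "external")

# Examples of the rule:
#   convergent transformation, any store        -> supported
#   exportable mapping mode + external store    -> supported
#   exportable mapping mode + built-in store    -> not supported
```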
## Performance Considerations
### Builtin (Internal) Store