--- layout: docs page_title: Transform - Secrets Engines description: >- The Transform secrets engine for Vault performs secure data transformation. --- # Transform Secrets Engine -> **Note**: This secret engine requires [Vault Enterprise](https://www.hashicorp.com/products/vault/) with the Advanced Data Protection Transform Module. The Transform secrets engine handles secure data transformation and tokenization against provided input value. Transformation methods may encompass NIST vetted cryptographic standards such as [format-preserving encryption (FPE)](https://en.wikipedia.org/wiki/Format-preserving_encryption) via [FF3-1](https://csrc.nist.gov/publications/detail/sp/800-38g/rev-1/draft), but can also be pseudonymous transformations of the data through other means, such as masking. The secret engine currently supports `fpe`, `masking`, and `tokenization` as data transformation types. ## Setup Most secrets engines must be configured in advance before they can perform their functions. These steps are usually completed by an operator or configuration management tool. 1. Enable the Transform secrets engine: ```text $ vault secrets enable transform Success! Enabled the transform secrets engine at: transform/ ``` By default, the secrets engine will mount at the name of the engine. To enable the secrets engine at a different path, use the -path argument. 1. Create a named role: ```text $ vault write transform/role/payments transformations=ccn-fpe Success! Data written to: transform/role/payments ``` 1. Create a transformation: ```text $ vault write transform/transformation/ccn-fpe \ type=fpe \ template=ccn \ tweak_source=internal \ allowed_roles=payments Success! Data written to: transform/transformation/ccn-fpe ``` 1. Optionally, create a template: ```text $ vault write transform/template/ccn \ type=regex \ pattern='(\d{4})[- ](\d{4})[- ](\d{4})[- ](\d{4})' \ encode_format='$1-$2-$3-$4' \ decode_formats=last-four='$4' \ alphabet=numerics Success! Data written to: transform/template/ccn ``` 1. Optionally, create an alphabet: ```text $ vault write transform/alphabet/numerics \ alphabet="0123456789" Success! Data written to: transform/alphabet/numerics ``` ## Usage After the secrets engine is configured and a user/machine has a Vault token with the proper permission, it can use this secrets engine to encode and decode input values. 1. Encode some input value using the `/encode` endpoint with a named role: ```text $ vault write transform/encode/payments value=1111-2222-3333-4444 Key Value --- ----- encoded_value 9300-3376-4943-8903 ``` A transformation must be provided if the role contains more than one transformation. A tweak must be provided if the tweak source for the transformation is "supplied". 1. Decode some input value using the `/decode` endpoint with a named role: ```text $ vault write transform/decode/payments value=9300-3376-4943-8903 Key Value --- ----- decoded_value 1111-2222-3333-4444 ``` A transformation must be provided if the role contains more than one transformation. A tweak must be provided if the tweak source for the transformation is "supplied" or "generated". 1. Decode some input value using the `/decode` endpoint with a named role and decode format: ```text $ vault write transform/decode/payments/last-four value=9300-3376-4943-8903 Key Value --- ----- decoded_value 4444 ``` A transformation must be provided if the role contains more than one transformation. A tweak must be provided if the tweak source for the transformation is "supplied" or "generated". A decode format can optionally be provided. If one isn't provided, the decoded output will be formatted to match the template's pattern as in the previous example. ## Roles, Transformations, Templates, and Alphabets The Transform secrets engine contains several types of resources that encapsulate different aspects of the information required in order to perform data transformation. - **Roles** are the basic high-level construct that holds the set of transformation that it is allowed to performed. The role name is provided when performing encode and decode operations. - **Transformations** hold information about a particular transformation. It contains information about the type of transformation that we want to perform, the template that it should use for value detection, and other transformation-specific values such as the tweak source or the masking character to use. - **Templates** allow us to determine what and how to capture the value that we want to transform. - **Alphabets** provide the set of valid UTF-8 character contained within both the input and transformed value on FPE transformations. ## Transformations ### Format Preserving Encryption Format Preserving Encryption (FPE) performs cryptographically secure transformation via FF3-1 to encode input values while maintaining its data format and length. #### Tweak and Tweak Source FF3-1 uses a non-confidential parameter called the tweak along with the ciphertext when performing encryption and decryption operations. The tweak is precisely a 7-byte value. The secret engine consumes a base64 encoded string of this value for its encode and decode operation whenever this input is required. In order to simplify the flow of encoding and decoding operations, transformation creation can take care of generating and associating a tweak value. This allows applications to provide a single value without having the need to generate or store any other metadata. In cases where more granularity is required, a tweak value can be generated by Vault and returned, or it may be independently generated and provided. In summary, there are three ways in which the tweak value may be sourced: - `supplied`: This is the default behavior for FPE transformations. The tweak value must be generated externally, and supplied into the on encode and decode operations. - `generated`: The secret engine will take care of generating the tweak value on encode operations and return this back as part of the response along with the encoded value. It is up to the application to store this value so that it can be provided back when decoding the encoded value. - `internal`: The secret engine will generate an internal tweak value per transformation. This value is not returned on encode or decode operations since it gets re-used for all encode and decode operations for the transformation. Depending on the uniqueness of the dataset, this mode may introduce higher risks, but provides the most convenience since the value does not need to be stored separately. This mode should only be used if the values being encoded are sufficiently unique. Your team and organization should weigh the trade-offs when it comes to choosing the proper tweak source to use. For `supplied` and `internal` sourcing, please see [FF3-1 Tweak Usage Details](transform/ff3-tweak-details) #### Input Limits FF3-1 specifies both minimum and maximum limits on the length of an input. These limits are driven by the security goals, making sure that for a given alphabet the input size does not leave the input guessable by brute force. Given an alphabet of length A, an input length L is valid if: - L >= 2, - AL >= 1,000,000 - and L <= 2 \* floor(logA(296)). As a concrete example, for handling credit card numbers, A is 10, L is 16, so valid input lengths would be between 6 and 56 characters. This is because 106=1,000,000 (already greater than 2), and 2 \* floor(log10(296)) = 56. Of course, in the case of credit card numbers valid input would always be between 12 and 19 decimal digits. #### Output Limitations After transformation and formatting by the template, the value is an encrypted version of the input with the format preserved. However, the value itself may be _invalid_ with respect to other standards. For example the output credit card number may not validate (it likely won't create a valid check digit). So one must consider when the outputs are stored whether validation in storage may reject them. ### Masking Masking performs replacement of matched characters on the input value with a desired character. This form of transformation is non-reversible and thus does not support retrieving the original value back using the decode operation. ### Tokenization [Tokenization](transform/tokenization) exchanges a sensitive value for an unrelated value called a _token_. The original sensitive value cannot be recovered from a token alone, they are irreversible. #### Inputs Tokenization inputs are not processed by templates or alphabets, as they do not preserve any of the contents or format of the input. #### Outputs Tokenization is not format preserving. The token output is a Base58 encoded string value of unrelated length, and is not rendered by a template. The decoded value is returned verbatim as it was before encoding. #### Metadata As tokenization isn't format preserving and is stateful, the input values can be any length, subject to other limits in Vault's request processing. In addition, non-sensitive _metadata_ can be encoded alongside the value, and retrieved either with or independently of the original value. #### Operations In addition to encode and decode, as tokenization is stateful, it provides two additional operations: - Retrieve metadata given a token. - Check whether an input value has a valid, unexpired token. - For some configurations, retrieve a previously encoded token for a plaintext input. #### Stores Tokenization is stateful. Tokenized state can be stored internally (the default) or in an external store. Currently only PostgreSQL and MySQL are supported for external storage. #### Mapping Modes [Tokenization](transform/tokenization) stores the results of an encode operation in storage using a cryptographic construct that enhances the safety of its values. In the `default` mapping mode, the token itself is transformed via a one way function involving the transform key and elements of the token. As Vault does not store the token, the values in Vault storage themselves cannot be used to retrieve original input. A second mapping mode, `exportable` is provided for cases where operators may need to recover the full set of decoded inputs in an emergency via the export operation. It is strongly recommended that one use the `default` mode if possible, as it is resistant to more types of attack. #### Convergent Tokenization In addition, tokenization transformations may be configured as *convergent*, meaning that tokenizing a plaintext and expiration more than once results in the same token value. Enabling convergence has performance and security [considerations](transform/tokenization#convergence). ## Deletion Behavior The deletion of resources, aside from roles, is guarded by checking whether any other related resources are currently using it in order to avoid accidental data loss of any encoded value that may depend on these bits of information to decode and reconstruct the original value. Role deletion can be safely done since the information related to the transformation itself is contained within transformation object and its related resources. The following rules applies when it comes to deleting a resource: - A transformation cannot be deleted if it's in use by a role. - A template or store cannot be deleted if it's in use by a transformation. - An alphabet cannot be deleted if it's in use by a template. ## Provided Builtin Resources The secret engine provides a set of builtin templates and alphabets that are considered common. Builtin templates cannot be deleted, and the prefix "builtin/" on template and alphabet names is a reserved keyword. ### Templates The following builtin templates are available for use in the secret engine: - builtin/creditcardnumber - builtin/socialsecuritynumber Note that these templates only check for the matching pattern(s), and not the validity of the value itself. For instance, the builtin credit card number template can determine whether the provided value is in the format of commonly issued credit cards, but not whether the credit card is a valid number from a particular issuer. Templates currently only accept regular expressions as the matching pattern type. It uses Go's standard library for the regexp engine, which supports [the RE2 syntax](https://github.com/google/re2/wiki/Syntax). **Note**: The `builtin/any` template is only valid and is the default for the tokenization transform. ### Alphabets The following builtin alphabets are available for use in the secret engine: - builtin/numeric - builtin/alphalower - builtin/alphaupper - builtin/alphanumericlower - builtin/alphanumericupper - builtin/alphanumeric Custom alphabets must contain between 2 and 65536 unique characters. ### Stores The following builtin store is available (and is the default) for tokenization transformations: - builtin/internal ## Tutorial Refer to the [Transform Secrets Engine](https://learn.hashicorp.com/vault/adp/transform) tutorial to learn how to use the Transform secrets engine to handle secure data transmission and tokenization against provided secrets. ## API The Transform secrets engine has a full HTTP API. Please see the [Transform secrets engine API](/api-docs/secret/transform) for more details.