SaaS Shield Deterministic Encryption
Using SaaS Shield to encrypt the sensitive data your applications process and store significantly increases the security and privacy of your system. However, it has a downside: once you encrypt fields that contain sensitive data, you can no longer search your data store for matching records using those fields. One possible solution is to index the records in Elasticsearch or OpenSearch while using IronCore’s Cloaked Search product to protect the sensitive fields stored in the search index.

If you only need exact-match searches on sensitive fields, though, deterministic encryption offers a simpler alternative. It does not require a second service to support search. You encrypt the sensitive fields using Alloy SDK calls very similar to the ones you would normally use, and store the encrypted versions in your data store. When you need to search for a particular value, you use the SDK to generate an encrypted version of your query value, then use your data store to find exact matches for that encrypted query. The encrypted fields can be decrypted using SDK calls much like the ones you would use for fields that don’t need to support search.
How Does It Work?
An explanation of the basics of deterministic encryption is here.
SaaS Shield’s implementation enhances the basic AES-SIV operations. We add functionality to further foil attempts to analyze encrypted data and extract information. The key used to encrypt each tenant’s data is different, so even if two tenants encrypt the same value for a particular field, the deterministically encrypted values will be different. We additionally provide the ability to use a different key for each individual field that is being deterministically encrypted. We provide tools to facilitate the rotation of keys while the data is live, so you don’t need to take the system offline to complete a key rotation.
Secrets
The basic element underlying deterministic encryption in SaaS Shield is the secret. Secrets are associated with KMS Configurations. A secret is randomly generated in the Tenant Security Proxy (TSP) when it is needed, and the secret’s value is encrypted using a tenant’s primary KMS configuration. The encrypted secret is escrowed by the Configuration Broker and distributed to TSPs when they start up. A TSP decrypts a secret (using the associated KMS configuration) when it is required for a deterministic encryption operation, and when the operation is complete, the TSP discards the decrypted secret.
Data Paths
When using the deterministic encryption APIs, the consumer needs to provide a data path along with the tenant ID. This path is essential to ensure that different keys are used to encrypt each data field (like a column in a relational database table or a property in an object). Using the same data path and tenant ID for each instance of that data field will cause the Alloy SDK to use the same key to encrypt that instance, which will allow you to search that field for occurrences of a specific value.
The data path consists of two parts: the secret path and the derivation path. The TSP maintains a distinct secret for each unique combination of KMS configuration and secret path. That distinct secret is then combined with the derivation path to generate a final encryption key. The caller can decide how to split the data path to control the number of generated secrets. For example, you can generate a separate secret for every field by putting the entire data path in the secret path, or you can have a single secret for each KMS configuration by putting the entire data path in the derivation path.
There are tradeoffs to consider. If a secret must be rotated, all data encrypted with that secret must be re-encrypted, which can be a drawback when using a single secret for the KMS configuration. Managing more secrets will affect pricing, so having a separate secret for each field might not be desirable. A balanced approach could be specifying the object/type/table name in the secret path and the field/property/column name in the derivation path (if multiple fields for the type require deterministic encryption).
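The three ways of splitting a data path described above can be sketched as follows. This is only an illustration: the `DataPath` class and the helper names are hypothetical and not part of the Alloy SDK, and the use of an empty string for an unused part and a `table.column` join are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class DataPath:
    """Hypothetical container for the two parts of a data path."""
    secret_path: str      # selects which secret is used
    derivation_path: str  # combined with the secret to derive the field key


def per_field(table: str, column: str) -> DataPath:
    # Entire data path in the secret path: one secret per field.
    # (An empty derivation path is an assumption made for this sketch.)
    return DataPath(secret_path=f"{table}.{column}", derivation_path="")


def per_config(table: str, column: str) -> DataPath:
    # Entire data path in the derivation path: one secret per KMS configuration.
    return DataPath(secret_path="", derivation_path=f"{table}.{column}")


def balanced(table: str, column: str) -> DataPath:
    # Table name selects the secret; column name selects the derived key.
    return DataPath(secret_path=table, derivation_path=column)


def secrets_needed(paths: Iterable[DataPath]) -> int:
    # Distinct secret paths = distinct secrets for one KMS configuration.
    return len({p.secret_path for p in paths})
```

For three fields spread over two tables, the per-field split needs three secrets, the per-configuration split needs one, and the balanced split needs two. With the balanced split, rotating one table's secret leaves the other table's data untouched.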
Once the data path is set for a specific field, it must remain consistent every time that field is accessed, whether encrypting a new value, decrypting an existing value, or searching for specific values. Changing the data path requires re-encrypting all fields encrypted using the old path, which involves fetching the data, decrypting with the old path, re-encrypting with the new path, and updating the data store.
Deriving Keys
When the deterministic encryption API is called, the TSP uses the tenant ID to find the current KMS configuration for that tenant. If it doesn’t find a secret for that configuration and secret path, it creates a new secret, encrypts it using the primary KMS configuration, and sends the encrypted secret to the Configuration Broker for escrow.
Once the encrypted secret for the KMS configuration and secret path is found, the TSP decrypts and uses it to derive an encryption key (combining it with the tenant ID and the derivation path). The derived key is then used to encrypt, decrypt, or generate searchable values. After the key is derived, the decrypted secret is discarded.
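The derivation step can be sketched using the recipe given in the Recovering Data section: HMAC-SHA512, keyed with the decrypted secret, over the tenant ID, a hyphen, and the derivation path. This is a minimal sketch and not the TSP's actual code; the function name is hypothetical, and encoding the string as UTF-8 is an assumption.

```python
import hashlib
import hmac


def derive_field_key(secret: bytes, tenant_id: str, derivation_path: str) -> bytes:
    """Derive a per-tenant, per-field key from a decrypted secret.

    Assumption: the hashed string is UTF-8 encoded.
    """
    message = f"{tenant_id}-{derivation_path}".encode("utf-8")
    # HMAC-SHA512 produces a 64-byte digest.
    return hmac.new(secret, message, hashlib.sha512).digest()
```

Because the tenant ID and derivation path are both inputs, the same secret produces a different key for each tenant and each field, while repeated calls with the same inputs always produce the same key.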
Secret Migration
Tenants sharing a KMS configuration will share a secret. If one of those tenants then migrates to use a new KMS configuration, SaaS Shield automatically handles the secret migration. TSPs create a copy of the shared secret encrypted with the new KMS configuration. Because the same value is used for both secrets, data does not need to be re-encrypted. The new copy of the secret (and any tenants using it) won’t be affected by rotation of the original.
Although each secret is encrypted in the TSP before it is sent to the Configuration Broker, a fingerprint is computed for each secret and sent and stored along with the encrypted secret. The fingerprint is a cryptographically secure hash of the secret. TSPs use the fingerprints to determine if the Configuration Broker already has a record for a secret. You can use the fingerprint to determine which secrets were copied from an original due to KMS configuration migrations.
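As an illustration of using fingerprints to spot copied secrets, the sketch below groups secret records by fingerprint. It assumes records shaped like the entries in the Configuration Broker's secret export (shown under Recovering Data), using only the `secretFingerprint` and `tenantSecretId` fields; the function name is hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def group_copied_secrets(records: Iterable[dict]) -> Dict[str, List[int]]:
    """Map each fingerprint shared by several records to their tenantSecretIds."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for rec in records:
        groups[rec["secretFingerprint"]].append(rec["tenantSecretId"])
    # A fingerprint that appears more than once identifies records holding
    # the same secret value, such as an original and its migrated copy.
    return {fp: ids for fp, ids in groups.items() if len(ids) > 1}
```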
Tracking Secret Values
Any time a piece of data is encrypted using a secret, the ID of the secret that was used is added to the encrypted data as a prefix. If you encode the encrypted binary data as a string using an encoding like hex or base64, the encoded data will begin with a prefix that you can search for to find all the values associated with a given secret. This facilitates identifying the values that were encrypted for a given tenant and secret.
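A prefix search over encoded values might look like the sketch below. It relies on the six-byte prefix length stated in the Recovering Data section and assumes you already have the raw prefix bytes for the secret of interest, since the internal layout of the prefix is not described here. The function names are hypothetical.

```python
import base64
from typing import Iterable, List

PREFIX_LEN = 6  # prefix length in bytes, per the Recovering Data section


def encoded_prefix(prefix: bytes, encoding: str) -> str:
    """Return the string that encoded values with this binary prefix start with."""
    if len(prefix) != PREFIX_LEN:
        raise ValueError("expected a six-byte prefix")
    if encoding == "hex":
        return prefix.hex()  # lowercase hex, 12 characters
    if encoding == "base64":
        # Six bytes are a multiple of three, so they encode to exactly
        # eight characters with no padding and no bits shared with the payload.
        return base64.b64encode(prefix).decode("ascii")
    raise ValueError(f"unsupported encoding: {encoding}")


def values_with_prefix(encoded_values: Iterable[str], prefix: bytes, encoding: str) -> List[str]:
    needle = encoded_prefix(prefix, encoding)
    return [v for v in encoded_values if v.startswith(needle)]
```

In a database, the same check is typically a `LIKE 'prefix%'` query on the encoded column. If your hex values are stored in uppercase, normalize the case before comparing.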
Secret Rotation
Secrets can be rotated - replaced by a new secret. A goal of deterministic encryption is that each field with the same data path and tenant will have the same value, so when a secret is rotated, all of the data encrypted using that secret needs to be re-encrypted with the new secret. Since this re-encryption process might take a significant amount of time to complete, SaaS Shield provides a rotation process that will allow searches to continue to work correctly during the rotation period.
Each tenant that is using the same KMS configuration can be rotated independently. The vendor can use the Configuration Broker or the Vendor API to initiate rotation for a tenant’s secret. Once the rotation is seen by a TSP, a new secret will be created. Any new data encrypted for that tenant will use the new secret. While rotation is in progress, any call to generate search strings will return two values: one encrypted using the secret that is in rotation and one encrypted using the new secret. Your application can choose how to handle the two search strings, but for the most correct search results, it should look for fields that match either of the two values.
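Matching either of the two search values can be done with a single query. The sketch below uses SQLite from the Python standard library; the `customers` table and `email_enc` column are hypothetical, and the search values are assumed to be already encoded as strings in the same form as the stored data.

```python
import sqlite3
from typing import List, Sequence


def find_matches(conn: sqlite3.Connection, search_values: Sequence[str]) -> List[int]:
    """Return ids of rows whose encrypted field equals any of the search values."""
    if not search_values:
        return []
    # One placeholder per value: one value normally, two during rotation.
    placeholders = ", ".join("?" for _ in search_values)
    sql = (
        "SELECT id FROM customers "
        f"WHERE email_enc IN ({placeholders}) ORDER BY id"
    )
    return [row[0] for row in conn.execute(sql, list(search_values))]
```

Writing the query against a list of values means the same code path works before, during, and after rotation.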
Once rotation has been initiated, your system should go through the data stores and find any values that were encrypted using the secret ID that is being rotated (see Tracking Secret Values). Decrypt each value and encrypt it again using the same tenant and secret path (see rotateField); this will automatically use the new secret for the encryption. The new value should replace the existing one in the data store. When the process has completed, you can use the Configuration Broker or Vendor API to commit rotation for that tenant. After rotation is committed, calls to generate search strings will only return one value (generated using the new secret).
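The re-encryption pass can be sketched as a loop over stored values. Here the data store is modeled as an in-memory dictionary and `rotate` is a hypothetical callable standing in for whatever decrypt-and-re-encrypt operation you use (for example, a wrapper around the SDK's rotateField); neither reflects the actual SDK signatures.

```python
from typing import Callable, Dict


def reencrypt_rotating_values(
    store: Dict[str, bytes],
    old_prefix: bytes,
    rotate: Callable[[bytes], bytes],
) -> int:
    """Re-encrypt every stored value that carries the rotating secret's prefix.

    Returns the number of values updated.
    """
    updated = 0
    for record_id, value in list(store.items()):
        # Values already written with the new secret have a different prefix
        # and are skipped, so an interrupted pass can simply be run again.
        if value.startswith(old_prefix):
            store[record_id] = rotate(value)
            updated += 1
    return updated
```

When a pass returns zero, no values with the old prefix remain and rotation can be committed.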
To limit the number of possible search strings, we limit each KMS configuration to two secrets, one in rotation and one current. Rotation cannot be initiated again until the in-progress rotation has been completed.
Recovering Data
If you need to recover data encrypted using IronCore’s deterministic encryption feature in SaaS Shield without using SaaS Shield, you can independently decrypt the data in cooperation with your tenants. The Configuration Broker provides a button to download a .zip file of encrypted secret data. The contents of the file look like this:
    {
      "result": [
        {
          "tenantProvidedId": "test-tenant-four",
          "numericId": 18,
          "id": "b30217e5-9962-4021-9208-cb122d276441",
          "tenantSecretId": 122,
          "secretFingerprint": "C5RjRgfjkQ+IAim/25Z/tGwh6HKtXGqZKUcchPL4wHo=",
          "secretPath": "customerRec",
          "kmsConfigId": 495,
          "migrationStatus": 1,
          "encryptedSecret": "xxxxxx",
          "rotationStatus": 1,
          "secretType": 2,
          "created": 1689187641725,
          "updated": 1689187641725
        },
        ...
      ]
    }
Each of the encrypted secrets can be decrypted using the KMS configuration that was used to encrypt it. There is an explanation of the process to follow here. A given encrypted data value will have a six-byte prefix containing the tenant secret ID (six bytes of binary data, which might expand if you encode that data using a string representation like base64 or hex). Find the encrypted secret you need for a given tenant secret ID and secret path, then work with your tenant to get access to the KMS to decrypt the encrypted secret (or provide them with the secrets to decrypt). Once you have recovered the secret, you can use the HMAC-SHA512 algorithm with the secret as the key to hash a string containing the tenant provided ID, a hyphen, and the derivation path. The output of the hash is the key you can provide to the AES-SIV algorithm to decrypt their encrypted data, after you have stripped the prefix that specifies the secret ID.
A full NodeJS code example of this process is available in our TSC-node disaster recovery example. The same process laid out there can be applied in any language.
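As a rough illustration of the bookkeeping around that process, the sketch below finds the encrypted secret for a given tenant secret ID and secret path in the exported JSON, and strips the six-byte prefix from an encrypted value. The function names are hypothetical, and the export is assumed to be already extracted from the .zip file as a JSON string. Decrypting the secret with the KMS and running AES-SIV are left out because they depend on your KMS and crypto library.

```python
import json

PREFIX_LEN = 6  # six-byte prefix carrying the tenant secret ID


def find_encrypted_secret(export_json: str, tenant_secret_id: int, secret_path: str) -> str:
    """Look up the encryptedSecret field for one tenant secret ID and secret path."""
    for rec in json.loads(export_json)["result"]:
        if rec["tenantSecretId"] == tenant_secret_id and rec["secretPath"] == secret_path:
            return rec["encryptedSecret"]
    raise KeyError(f"no secret {tenant_secret_id} for path {secret_path!r}")


def strip_prefix(encrypted_value: bytes) -> bytes:
    """Drop the secret ID prefix, leaving the bytes to hand to AES-SIV."""
    if len(encrypted_value) <= PREFIX_LEN:
        raise ValueError("value is too short to contain a prefix and ciphertext")
    return encrypted_value[PREFIX_LEN:]
```

If your stored values are hex or base64 strings, decode them to bytes before stripping the prefix.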
Pro Tip
If your sensitive fields don’t require exact match search, we strongly recommend that you use the randomized encryption and decryption functions provided by the Alloy SDK rather than the deterministic encryption functions. That prevents anyone with access to your data store from identifying records that have the same value for a sensitive field.