What is Deterministic Encryption?
First and foremost, deterministic encryption is still encryption. Given a block of bytes and a key, a deterministic encryption algorithm produces a new block of bytes that looks random. If you have the key, you can apply it to the encrypted data to recover the original bytes. Without the key, your only option for determining the original data is brute force: try every possible key until one of them decrypts the data to something meaningful. That’s going to take a very long time when the keys are 32 bytes long, because there are 2^256 possible keys to try.
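To put a rough number on "a long time", here is a back-of-the-envelope calculation. The guess rate is an assumed figure chosen to be generous to the attacker, not a measurement of any real hardware.

```python
# Rough brute-force estimate for a 32-byte key.
# GUESSES_PER_SECOND is a hypothetical attacker speed, picked only for illustration.
KEY_BYTES = 32
GUESSES_PER_SECOND = 10**18
SECONDS_PER_YEAR = 365 * 24 * 60 * 60

key_space = 2 ** (8 * KEY_BYTES)    # number of possible 256-bit keys
expected_guesses = key_space // 2   # on average, half the key space is searched
years = expected_guesses // (GUESSES_PER_SECOND * SECONDS_PER_YEAR)

print(f"possible keys: {key_space:.3e}")
print(f"expected years to find the key: {years:.3e}")
```

Even at this assumed rate, the expected search time comes out on the order of 10^51 years.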
Our normal encryption in SaaS Shield is randomized; that is, each time you encrypt a piece of data, a fresh set of random bytes is fed into the encryption along with the data and the key. This way, if you encrypt the same data twice, even with the same key, you will get two different blocks of encrypted data. This makes it much harder to detect patterns in the encrypted data that might help you guess the original, even if you know some details about it, such as the frequency distribution of the values that were encrypted.
A consequence of this security is that once you encrypt a piece of data, you can’t search your data store for the encrypted version of that data, because encrypting the search term produces different bytes every time. This can severely degrade the functionality of apps that need to find records containing a certain data item, if that item is encrypted.
Deterministic encryption is a solution to that problem. As the name implies, this variation on encryption does not involve any randomness: given a key and a piece of data, the encrypted version of the data is always the same. This allows you to encrypt each data item using a key and store the encrypted data. Then, when you want to do a search, you encrypt the search term with the same key and look for exact matches to that encrypted value in your data store.
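The encrypt-then-match flow described above can be sketched as follows. This is a minimal illustration, not the SaaS Shield API: `det_encrypt`, `det_decrypt`, `find_record`, the in-memory `store`, and the sample emails are all hypothetical names, and the cipher is a toy built from Python's standard-library HMAC-SHA256 rather than AES. It exists only to show the property that matters here, that the same key and data always produce the same bytes.

```python
import hashlib
import hmac

def _subkey(key: bytes, label: bytes) -> bytes:
    """Derive a separate subkey for one purpose (toy key separation)."""
    return hmac.new(key, label, hashlib.sha256).digest()

def _keystream(key: bytes, iv: bytes, length: int) -> bytes:
    """Toy keystream from HMAC-SHA256 in counter mode; a stand-in for AES."""
    blocks = [
        hmac.new(key, iv + counter.to_bytes(4, "big"), hashlib.sha256).digest()
        for counter in range((length + 31) // 32)
    ]
    return b"".join(blocks)[:length]

def det_encrypt(key: bytes, plaintext: bytes) -> bytes:
    """Deterministic: the same key and plaintext always give the same output."""
    iv = hmac.new(_subkey(key, b"iv"), plaintext, hashlib.sha256).digest()[:16]
    stream = _keystream(_subkey(key, b"enc"), iv, len(plaintext))
    return iv + bytes(p ^ s for p, s in zip(plaintext, stream))

def det_decrypt(key: bytes, ciphertext: bytes) -> bytes:
    """Recover the plaintext and check it against the stored 16-byte prefix."""
    iv, body = ciphertext[:16], ciphertext[16:]
    stream = _keystream(_subkey(key, b"enc"), iv, len(body))
    plaintext = bytes(c ^ s for c, s in zip(body, stream))
    expected = hmac.new(_subkey(key, b"iv"), plaintext, hashlib.sha256).digest()[:16]
    if not hmac.compare_digest(iv, expected):
        raise ValueError("wrong key or corrupted ciphertext")
    return plaintext

# Hypothetical data store mapping encrypted email -> record id.
key = bytes(range(32))  # fixed demo key; generate keys securely in real use
store = {}
for record_id, email in enumerate([b"alice@example.com", b"bob@example.com"]):
    store[det_encrypt(key, email)] = record_id

def find_record(key: bytes, email: bytes):
    """Encrypt the search term with the same key and look for an exact match."""
    return store.get(det_encrypt(key, email))

print(find_record(key, b"bob@example.com"))    # stored as record 1
print(find_record(key, b"carol@example.com"))  # never stored, so None
```

Note that the store only ever sees encrypted values, and only exact-match lookups work; a prefix or substring search on the encrypted values would find nothing useful.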
How Does It Work?
SaaS Shield normally uses the AES-GCM algorithm to encrypt data, providing a randomly generated initialization vector (IV) with each data item that is encrypted. A simplistic approach to making this algorithm deterministic would be to always use the same IV for every encryption. However, that introduces weaknesses in the encryption that can be exploited, particularly as the amount of data encrypted using the same key increases.
An alternative that provides better security is the AES-SIV algorithm, which was standardized in RFC 5297 in 2008. It still uses the AES block cipher, but it synthesizes the initialization vector from the data being encrypted (plus any associated data), using part of the key. Identical inputs therefore always produce the same IV and the same encrypted data, while different inputs get different IVs. This reduces the ability of an attacker to extract information by analyzing the encrypted data, while maintaining the deterministic nature of the process.
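To see why a fixed IV is risky and what a synthetic IV changes, here is a toy sketch. It does not use AES at all: the `keystream` helper is an HMAC-SHA256 stand-in for a counter-mode cipher (the mode AES-GCM uses internally to encrypt), and `synthetic_iv`, the keys, and the sample plaintexts are made up for illustration. It is not the AES-SIV construction from RFC 5297, only the same idea in miniature.

```python
import hashlib
import hmac

def keystream(key: bytes, iv: bytes, length: int) -> bytes:
    """Toy counter-mode keystream; a stand-in for AES in counter mode."""
    blocks = [
        hmac.new(key, iv + counter.to_bytes(4, "big"), hashlib.sha256).digest()
        for counter in range((length + 31) // 32)
    ]
    return b"".join(blocks)[:length]

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def encrypt_with_iv(key: bytes, iv: bytes, plaintext: bytes) -> bytes:
    """XOR the plaintext with a keystream determined by the key and IV."""
    return xor(plaintext, keystream(key, iv, len(plaintext)))

def synthetic_iv(iv_key: bytes, plaintext: bytes) -> bytes:
    """Derive the IV from the plaintext itself, keyed so outsiders cannot compute it."""
    return hmac.new(iv_key, plaintext, hashlib.sha256).digest()[:16]

enc_key = bytes(range(32))     # demo keys only; generate keys securely in real use
iv_key = bytes(range(32, 64))
p1 = b"balance=0000100"
p2 = b"balance=9999999"

# Fixed IV: both messages share one keystream, so it cancels out when the
# two ciphertexts are XORed, exposing the XOR of the two plaintexts.
fixed_iv = bytes(16)
c1 = encrypt_with_iv(enc_key, fixed_iv, p1)
c2 = encrypt_with_iv(enc_key, fixed_iv, p2)
leak_fixed = xor(c1, c2) == xor(p1, p2)

# Synthetic IV: different plaintexts get different IVs and different keystreams,
# yet encrypting the same plaintext again still gives the same ciphertext.
s1 = encrypt_with_iv(enc_key, synthetic_iv(iv_key, p1), p1)
s2 = encrypt_with_iv(enc_key, synthetic_iv(iv_key, p2), p2)
leak_synthetic = xor(s1, s2) == xor(p1, p2)
still_deterministic = s1 == encrypt_with_iv(enc_key, synthetic_iv(iv_key, p1), p1)

print(leak_fixed, leak_synthetic, still_deterministic)
```

With the fixed IV, an attacker who knows or guesses one plaintext can recover the other from the two ciphertexts alone. With the synthetic IV, the only thing two ciphertexts reveal is whether their plaintexts were identical, which is exactly the property that exact-match search relies on.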