Encrypted Search

Encrypted search is a phrase that is usually shorthand for the process of searching encrypted data for items that match a query string, without actually decrypting the data first. Another term commonly used for this capability is searchable encryption.

There are multiple techniques used to implement searchable encryption, and the appropriate technique depends on the requirements of the search. For instance, you might require the ability to search for text documents that contain one or more keywords, or you might need to search shorter strings for matching substrings.

A simple way to implement search functionality is to just scan all of the data each time you do a search. Unfortunately, this has very poor performance if there is much data, and it becomes even poorer if all the data is encrypted and you need to decrypt it first to scan for matches. Implementing a performant search usually involves creating some sort of index that can be searched more efficiently. This is less straightforward when the system is storing encrypted data; securing all the data with encryption then storing substrings or keywords from the data in plaintext compromises too much privacy and security. In order to mitigate this loss of security, it is necessary to obscure the data in the index as well.

IronCore’s SDKs provide the capability to perform a substring search over short strings using a blind index technique.

Substring Search over Short Strings

This type of search allows you to find data that contains the search query as a substring, not necessarily as a full word. For example, a search for “car” should match both “My car is old” and “escargot”.

Blind Index

A blind index is an approach that obscures the data stored in the search index. Index terms are extracted from the data before it is encrypted, then each of those terms is processed using a keyed hash to produce an index token, which is a representation that cannot be reversed to recover the plaintext. To keep the index data secure, the key for the hash should be different from the key used to encrypt the data, and the hash key should not be known to the service that stores the data.

When a client wants to search the stored data, it extracts index terms from the query, applies the hash algorithm with the same key to generate index tokens, then searches the blind index for matches.

Partitioning the Index

The tokens stored in the blind index cannot be reversed directly to recover portions of the information in the encrypted data. The secret key that is used to generate the index tokens prevents an attacker with access to the blind index from generating a rainbow table (a list of the hashes of frequently occurring terms that can be matched against tokens in the index). However, there is still some potential leakage of information that lessens the security of the encrypted data. It is still possible to use frequency analysis to guess which mappings of plaintext to index tokens are most likely, especially if the attacker knows the domain of the encrypted data (e.g. names or addresses). And if the attacker does know the plaintext for an item of indexed data, they can find other entries in the index that have matching tokens and identify part of the content of those indexed items.

In order to mitigate the impact of these attacks, IronCore’s blind index implementation allows you to associate a partition with each data item that is indexed. The partition name is added to each index term before it is hashed into a token, so that the same term in separate partitions generates a different token. If data can be partitioned into smaller buckets, frequency analysis becomes much less effective, and it is no longer possible to find matches across partitions.

Note that in order to use partitioning, a client that wants to search the index must be able to supply the appropriate partition name. Also, it is not possible to search across partitions.

Indexing Data before Encrypting

When you have a blind index set up, your application can use the IronCore SDKs to generate a set of index tokens that represent the data in a string that you are going to encrypt to protect its contents. The SDK applies a process called transliteration to each string to convert it to a canonical form (all lowercase, punctuation characters removed, and characters converted to a string of equivalent ASCII characters). The transliterated string is then processed to generate a set of index tokens. Your application then stores these tokens associated with the record that contains the string, and it uses the tokens in searches.

Searching Using the Index

If a caller wants to search a collection of encrypted data that has been indexed, it can use the IronCore SDK to generate a set of index tokens for the query string. Your application then searches the stored index tokens to find records that have all of the index tokens generated by your query string. Each record that matches should be returned to your client, which can use the IronCore SDK to decrypt the sensitive data in the record. Because we purposefully obscure some of the data in the index to make it more secure, your client will need to check each record to make sure that the record isn’t a false positive (a potential match that doesn’t actually contain the query string in the sensitive data) before displaying or processing it.

Securing the Index

To minimize the opportunities for an attack to gain access to the blind index where it is stored and use that to extract information about the encrypted data, it is essential to prevent the server that stores the data from also storing the hash key value that is used to generate the index tokens. Otherwise, rainbow tables can be constructed to recover information from the index.

We protect this hash key value using IronCore’s data protection platform. The IronCore SDKs include a method to create a new blind index - this requires the caller to provide the ID of an IronCore group that should be used to protect access to the index. IronCore's orthogonal access control allows you to manage the membership in this group even after you have used it to control access to the index, so you can adjust the list of users that can access the search information at any time. The SDK generates a random value for the hash key, encrypts it to the specified group, and returns the encrypted value to the caller to be stored. The SDK also includes a method to initialize a previously created search index for use given the encrypted hash key. The function decrypts the value and holds it in memory to be used in any SDK methods that generate index tokens.

For the blind index to be used, each user that can index new data before it is encrypted or who can search encrypted data must be a member of the group that was used to encrypt the hash key.

Multiple Indices

A substring search index is best used to index a focused data set - a particular field or subset of fields from records in a database, for instance. It would be possible to use the same index for several disparate fields, or for fields from different types of records, but search results will be better if the user knows the domain of the search and can retrieve only records for that domain. To support this in applications that might handle data across multiple domains, the IronCore SDKs allow the application to create, initialize, index, and search using multiple search indices. For example, the application might use one index for the names of staff members and another index for the addresses of customers. The SDKs provide interfaces to facilitate management of multiple search indices.

Use Cases

Protecting Personal Data that Is Used for Record Location

Suppose you have a front end application that communicates with a back end service you provide, and that your system deals with customer data. Some part of that data is almost certainly Personally Identifiable Information (PII), and you should definitely protect access to that data. You probably store each customer’s name, email address, mailing address, and maybe even some more sensitive information like birth date or social security number. You can use the IronCore SDKs to encrypt that data in the front end application and to only decrypt it at the point of use, in another instance of the application. If your back end service does not need access to this PII data, you can protect the data end to end and eliminate the concern that an attacker or a curious administrator might extract the information from the back end.

However, some parts of the customer data are probably needed by your application to look up a customer’s record, such as the person’s name, email address, and mailing address. This doesn’t eliminate your opportunity to encrypt the data; you can use IronCore’s encrypted search feature to index this data so that you can encrypt the data at the point of origin and store it safely, but you can still use the search features you need to find the right records.

Suppose your application needs to look up customers by name, email, and mailing address. You will probably get the most usability and performance by indexing each of those elements separately. Your application can create three separate indices, and each time a user enters or updates data about a customer, the application uses the IronCore SDK to also generate the index tokens to represent that customer record in each of these indices. Your application and back end service are responsible for storing the index tokens in a persistent store that will allow the back end to search for matches.

Suppose your application needs to search for a customer record given a name. It uses the IronCore SDK, specifying the customer name index and the search string entered by a user, to generate a list of index tokens for the search and retrieve all the matching records from the back end. The IronCore SDK can help your application to decrypt the sensitive data in those records and filter out records that don’t match the query (false positives).

The details on how to put all these pieces together are shown in the Encrypted Search Patterns.

Encrypted Search Patterns

The following implementation patterns illustrate how you can use the IronCore SDKs in your application to do short substring searches of data that is end-to-end encrypted.

The current examples all use IronCore's ironoxide SDK and are implemented in the Rust programming language. However, the ironweb SDK, which is intended for use in web applications written in JavaScript, also supports all the encrypted search functions. The functionality is similar - you can find the details in the ironweb documentation on the search functions.

Creating an Index

The first step in updating your application to support encrypted search is to create the blind index that will be used to protect the privacy of the index data. This index encapsulates the information necessary to generate index tokens from strings. The tokenization process extracts index terms from a string, and for each of those terms, it prepends an optional partition name that is provided by your application, then uses a secret salt value (which functions as the secret key) to generate a 32-bit integer index token. As long as that salt remains secret, an attacker cannot create a rainbow table of entries for a given partition name. Our SDKs simplify the process of generating and protecting this salt value, providing a create_blind_index method to handle the details.

There is some setup work that must be done before you can create the index. In order to use an index to process new or updated data or to search, you must be able to access the salt. We protect that salt value using our transform cryptography solution, allowing you to manage which people should have access easily. You just create a scalable encryption group that includes all the users that can enter protected data or need to search for protected records. The ID of this group must be provided when you create a new index. Because IronCore’s transform cryptography supports orthogonal access control, you don’t need to have all the users assigned to the group before you can use it to generate your search index.

The IronCore SDK's create_blind_index method takes the group ID as input, and it generates a random value for the salt, encrypts that value to the group whose ID you provided, and returns the encrypted salt. Your application is responsible for storing that encrypted salt and for providing it when initializing the blind index for future use.

[Sequence diagram. Participants: Client, SDK, ICL, Backend. Steps in order: Create Blind Index(groupId); Get public key(groupId); public key for group; generate random salt; encrypt salt to group; encrypted salt; save encrypted salt; OK.]

Using IronOxide

This example uses the ironoxide SDK's group_create and create_blind_index methods to set up the group that will be used to protect your blind index's salt and to create the index.

Note: the EncryptedBlindIndexSalt that is returned implements serialization using the serde package, which allows you to decide which of the serde-supported formats to use for the serialized value. This example uses JSON for the at-rest representation of the encrypted salt.

This code assumes that there is an initialized instance of the IronOxide SDK available as the object sdk.

let group_id = GroupId::try_from("indexed_search_group")?;
let group_name = GroupName::try_from("PII Search")?;
let opts = GroupCreateOpts::new(
    Some(group_id.clone()), // ID
    Some(group_name),       // name
    true,                   // add as admin
    true,                   // add as user
    None,                   // owner - defaults to caller
    vec![],                 // additional admins
    vec![],                 // additional users
    false,                  // needs rotation
);
// Create the group that will protect the salt, then create the index.
sdk.group_create(&opts).await?;
let encrypted_salt = sdk.create_blind_index(&group_id).await?;
// Serialize the encrypted salt as JSON so that your application can store it.
let encrypted_salt_str = serde_json::to_string(&encrypted_salt)?;

Preparing an Index for Use

Once you have created a blind index and stored the encrypted salt, your application can start using the index to process new records, update existing records, or search for records. You first need to retrieve the EncryptedBlindIndexSalt that you initially created and serialized. Once you have retrieved the value and deserialized it, you can initialize the index for use, using the initialize_search method, which takes the encrypted salt and uses the SDK to decrypt it.

[Sequence diagram. Participants: Client, Backend, SDK, ICL. Steps in order: fetch encrypted salt; encrypted salt; Initialize SDK; OK; Init Blind Index(encryptedSalt); transform(encrypted salt keys); transformed keys; decrypt keys; decrypt salt; OK.]

Using IronOxide

This example uses the SDK's EncryptedBlindIndexSalt object and its initialize_search method to set up a blind_index object for further use in your application. It assumes that there is an initialized instance of the IronOxide SDK available as the object sdk.

let encrypted_salt_str = get_encrypted_salt_from_app_server();
let encrypted_salt: EncryptedBlindIndexSalt = serde_json::from_str(&encrypted_salt_str)?;
let blind_index = encrypted_salt.initialize_search(&sdk).await?;

Indexing New Data

Assuming your application has executed the steps shown above in the pattern Preparing an Index for Use, and that the sdk and blind_index are available in your application at the point where you have a new record that has a sensitive data field, you can use code like the following to index the data using the tokenize_data method, then use the SDK to encrypt the field. Let’s assume that you have a customer struct that contains a field name, a string that contains PII, and that you want to make the customer data available to the customerService group. We’ll assume that group has already been created.

In this example, after the customer name has been encrypted, the resulting bytes are base64 encoded, and this string replaces the customer name in the struct. In order to simplify the back end service, a new field name_keys is assumed to be added to the customer record to hold the EDEKs that are needed to decrypt the encrypted name field. Once the index tokens are generated and the PII in the customer record is encrypted, the index tokens and the customer record can be sent together to the back end service for storage.

[Sequence diagram. Participants: Client, SDK, Backend. Steps in order: tokenize data(blind index, PII); token set; encrypt data(PII, users and groups); encrypted PII, EDEKs; update record with encrypted PII, EDEKs; save data(record, token set); save record; save token set for record; OK.]

Using IronOxide

This example uses the ironoxide SDK's tokenize_data and document_encrypt_unmanaged methods to prepare a customer record to be saved. It assumes that there is an initialized instance of the IronOxide SDK available as the object sdk.

// Generate the index tokens from the plaintext name before it is encrypted.
let name_tokens = blind_index.tokenize_data(&customer.name, None)?;
let group_id = GroupId::try_from("customerService")?;
let encrypt_opts = DocumentEncryptOpts::with_explicit_grants(
    None,                     // document ID - create unique
    None,                     // document name
    false,                    // don't encrypt to self
    vec![(&group_id).into()], // users and groups to which to grant access
);
let enc_name = sdk
    .document_encrypt_unmanaged(&customer.name.as_bytes(), &encrypt_opts)
    .await?;
// Replace name with encoded encrypted version. Also need to store EDEKs to decrypt name.
customer.name = base64::encode(&enc_name.encrypted_data());
customer.name_keys = base64::encode(&enc_name.encrypted_deks());
save_customer(&customer, &name_tokens);

Updating an Indexed Field

If you have implemented indexing of the sensitive fields in new records before you encrypt them, as described in Indexing New Data, you will probably need to handle the case where a record is being updated, and one of the sensitive fields that has been indexed is changed. On the front end, handling this follows much the same process as you used for a new record: generate the index tokens for the new value of the field, encrypt the new value, then send the updated record and the new tokens to the back end for storage. However, on the back end, you do need to perform an additional step with the index tokens; before you save the new index tokens associated with the record, you should delete the old set of index tokens for that record. Once the old tokens are deleted, save the new set of tokens the same way you would for a new record, then update the actual record data with the new values.
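The back end step described above (delete the old tokens, then save the new ones) can be sketched as follows. This is a minimal illustration, not IronCore code: the TokenStore type, its in-memory HashMap, and the record IDs are all hypothetical stand-ins for whatever persistent store your back end uses.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical in-memory stand-in for the back end's token store.
/// Maps a record ID to the set of index tokens currently stored for it.
#[derive(Default)]
struct TokenStore {
    tokens_by_record: HashMap<String, HashSet<u32>>,
}

impl TokenStore {
    /// Replace the tokens for a record: the old set is deleted first,
    /// then the new set is saved in its place.
    fn replace_tokens(&mut self, record_id: &str, new_tokens: HashSet<u32>) {
        // Delete the old tokens so the record stops matching its old value.
        self.tokens_by_record.remove(record_id);
        // Save the new tokens exactly as for a new record.
        self.tokens_by_record
            .insert(record_id.to_string(), new_tokens);
    }

    /// Return the tokens currently stored for a record, if any.
    fn tokens_for(&self, record_id: &str) -> Option<&HashSet<u32>> {
        self.tokens_by_record.get(record_id)
    }
}
```

Replacing the set rather than merging into it matters: tokens left over from the old value would keep the record matching queries for text it no longer contains.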

[Sequence diagram. Participants: Client, SDK, Backend. Steps in order: tokenize data(blind index, new PII); token set; encrypt data(new PII, users and groups); encrypted PII, EDEKs; update record with encrypted PII, EDEKs; update data(record, token set); delete old tokens for record; update record; save token set for record; OK.]

Searching Using an Index

Once your application has started indexing the values in sensitive fields and then encrypting them, you can add the capability to search on the contents of those fields. For our example, if a user of your application wants to search for a customer by name, or some part of the name, your application just extracts a set of index tokens from the search string using the tokenize_query method, then uses those tokens to find matching records in your back end service. Given the set of index tokens generated by a search query, any record whose set of index tokens is a superset of the search tokens (i.e. it contains every one of the tokens) is a possible match.

Once your back end has found potential matches and returned the records to the front end, your application needs to filter out any false positives. To do that, it must use the IronCore SDK to decrypt the field in each record, then confirm that the field actually matches the search query. The application should apply the transliterate_string method to both the decrypted field and the search query, split the transliterated query into words on white space, and check that the transliterated field contains each of those words somewhere within it. For example, suppose your search query was "bei foo". A record is a true match only if its transliterated field contains both "bei" and "foo" as substrings.
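The word-by-word check can be sketched in a few lines. This is a minimal illustration that assumes both strings have already been transliterated; the function name contains_all_query_words is hypothetical and is not part of the IronCore SDK.

```rust
/// Returns true when every whitespace-separated word of the transliterated
/// query occurs somewhere in the transliterated field value.
/// Both arguments are assumed to be already transliterated.
fn contains_all_query_words(transliterated_field: &str, transliterated_query: &str) -> bool {
    transliterated_query
        .split_whitespace()
        .all(|word| transliterated_field.contains(word))
}
```

With the query "bei foo", an illustrative field value such as "bei jing food market" passes the check because it contains both words as substrings, while "bei jing" is rejected because it does not contain "foo". The words do not need to appear in query order.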

[Sequence diagram. Participants: Client, User, SDK, Backend. Steps in order: get search query; query string; tokenize query(blind index, query); token set; get matching records(token set); matching records; transliterate string(query); transliterated string; decrypt data(record PII); decrypted PII; transliterate string(PII); transliterated PII; filter transliterated PII(transliterated query); record (if match); search complete. The diagram marks a loop "for each matching record" around the per-record steps.]

Using IronOxide

This example uses the ironoxide SDK's tokenize_query, transliterate_string, and document_decrypt_unmanaged methods to process a user query, fetch matching records, and filter them to display the matching records for the user. It assumes that there is an initialized instance of the IronOxide SDK available as the object sdk and an instance of BlindIndexSearch called blind_index that was initialized using the encrypted salt from the previous examples. For this example, we use the blocking version of the SDK - sdk is a BlockingIronOxide.

// Decrypt the customer's name and return it only if it contains every word of the query.
fn filter_customer(sdk: &BlockingIronOxide, cust: &Customer, name_parts: &Vec<&str>) -> Result<Option<String>> {
    let cust_enc_name = base64::decode(&cust.name)?;
    let cust_name_keys = base64::decode(&cust.name_keys)?;
    let dec_result = sdk.document_decrypt_unmanaged(&cust_enc_name, &cust_name_keys)?;
    let dec_name = str::from_utf8(&dec_result.decrypted_data()).unwrap();
    let dec_name_trans = ironoxide::search::transliterate_string(&dec_name);
    if name_parts.iter().all(|&name_part| dec_name_trans.contains(name_part)) {
        Ok(Some(dec_name.to_string()))
    } else {
        // False positive: the tokens matched, but the decrypted name does not.
        Ok(None)
    }
}

fn display_matching_customers(sdk: &BlockingIronOxide, name_index: &BlindIndexSearch, query_str: &str) -> Result<()> {
    let query_tokens = name_index.tokenize_query(query_str, None)?;
    let customer_recs = search_customers(query_tokens);   // returns a Vec<&Customer>
    let trans_query = ironoxide::search::transliterate_string(&query_str);
    let name_parts: Vec<&str> = trans_query.split_whitespace().collect();
    for cust in customer_recs.iter() {
        let result = filter_customer(&sdk, &cust, &name_parts)?;
        match result {
            Some(decrypted_name) => display_customer(cust, decrypted_name),
            None  => (),
        }
    }
    Ok(())
}

let query_str = get_search_query();
display_matching_customers(&sdk, &blind_index, &query_str)?;

Using Multiple Indices

So far, our encrypted search patterns have considered a customer record that had a single field of PII, the customer name. Suppose you need to add a second field - an email address, for example. You could use the same blind index to index and search both strings, but search performance will be better if you create a second blind index for the email address. You can just invoke the create_blind_index function twice, once to create an index for the names, and again to create an index for the emails. Since the indices use different salt values, knowledge about any of the contents of the set of index tokens created by one blind index won't provide any insight into the contents of the data protected by the second blind index.

Once you have created the two blind indices, you can use them to index data in your customer record before you encrypt it. Use one blind index to create the tokens for the name field and the other to create the tokens for the email field. We recommend that you maintain these two sets of index tokens separately - you could store them in the back end persistent store in two separate tables or key-value stores, or in the same store with a type discriminator.

You have some options for the actual encryption. You can encrypt each of the fields separately, producing separate encrypted data and EDEKs, or you could put the values into a single structure that you can serialize to a byte stream and encrypt as a single element. This could be a JSON object or a structure that is serialized using protobuf. In either case, you will need to store the encrypted data and the EDEKs, plus the two sets of index tokens.

One consideration of maintaining two separate indices is that you will need to understand the context when a user enters a search query - if the user enters a string that is a potential name, you should generate the index tokens using the name index and search for matches to those tokens in the store of name tokens. Likewise, if the user enters a string that is a potential email address, you would use the email index and search the store of email tokens. If the context of the search query is difficult to determine in your app, you do have the option of generating the index tokens for the query string using each of the blind indices, then searching both of the token stores for matches. This will likely generate additional false positives, but your client-side filtering should be able to handle that.

We do recommend that you use a single scalable encryption group to protect the salt for each of the blind indices. This will simplify the administration of the groups necessary to allow access to everyone who can generate or search data.

Using IronOxide

This example is similar to the one in the pattern Creating an Index, using the ironoxide SDK's create_blind_index method to create two different indices. It assumes that there is an initialized instance of the IronOxide SDK available as the object sdk.

Creating the two indices is straightforward.

let group_id = GroupId::try_from("indexed_search_group")?;
let group_name = GroupName::try_from("PII Search")?;
let opts = GroupCreateOpts::new(
    Some(group_id.clone()), // ID
    Some(group_name),       // name
    true,                   // add as admin
    true,                   // add as user
    None,                   // owner - defaults to caller
    vec![],                 // additional admins
    vec![],                 // additional users
    false,                  // needs rotation
);
// Create the single group that protects the salts of both indices.
sdk.group_create(&opts).await?;
let name_encrypted_salt = sdk.create_blind_index(&group_id).await?;
let name_encrypted_salt_str = serde_json::to_string(&name_encrypted_salt)?;
let email_encrypted_salt = sdk.create_blind_index(&group_id).await?;
let email_encrypted_salt_str = serde_json::to_string(&email_encrypted_salt)?;
save_encrypted_salts_to_app_server(name_encrypted_salt_str, email_encrypted_salt_str);

Now suppose you have an instance of the IronOxide SDK available as the object sdk, and the two indices in name_index and email_index.

Like the pattern for Indexing New Data, this code uses the SDK's tokenize_data and document_encrypt_unmanaged methods to prepare a customer record to be saved.

// Generate the index tokens from the plaintext fields before they are encrypted.
let name_tokens = name_index.tokenize_data(&customer.name, None)?;
let email_tokens = email_index.tokenize_data(&customer.email, None)?;
let group_id = GroupId::try_from("customerService")?;
let encrypt_opts = DocumentEncryptOpts::with_explicit_grants(
    None,                                           // document ID - create unique
    None,                                           // document name
    false,                                          // don't encrypt to self
    vec![UserOrGroup::Group{id: group_id.clone()}], // users and groups to which to grant access
);

// Replace name with encoded encrypted version. Also need to store EDEKs to decrypt name.
let name_result = sdk
    .document_encrypt_unmanaged(&customer.name.as_bytes(), &encrypt_opts)
    .await?;
customer.name = base64::encode(&name_result.encrypted_data());
customer.name_keys = base64::encode(&name_result.encrypted_deks());

// Also replace email with encoded encrypted version and store its EDEKs.
let email_result = sdk
    .document_encrypt_unmanaged(&customer.email.as_bytes(), &encrypt_opts)
    .await?;
customer.email = base64::encode(&email_result.encrypted_data());
customer.email_keys = base64::encode(&email_result.encrypted_deks());
save_customer(&customer, &name_tokens, &email_tokens);

Support in other languages

Remember, although the code samples in these patterns were all written in Rust using IronCore's ironoxide, the search functionality is available in our other SDKs as well. In particular, JavaScript-based web applications that use the ironweb SDK can access all the encrypted search functions. The details are available in the ironweb documentation on the search functions.

How It Works

The following is some more technical information about how the blind index search feature is implemented in the IronCore SDKs.

Indexing Data before Encryption

Generating Index Terms

The IronCore SDKs perform a multi-step process to generate index terms from an input string:

  1. The string is first transliterated. This involves converting the string to a form that is more understandable by someone who might speak a different language. This particular transliteration translates Unicode strings into ASCII characters - accents and other modifiers are removed, and characters in other languages are converted. For example, the string “Æneid” is converted to “AEneid”, and “北亰” is converted to “Bei Jing”.
  2. The transliterated string is converted into lowercase letters, and punctuation characters are removed.
  3. This string is split into words on whitespace boundaries.
  4. The set of all possible trigrams is extracted from each word. A trigram is a string of three consecutive characters; for example, the word “gumby” generates the trigrams “gum”, “umb”, and “mby”. The word “of” generates the trigram “of-” (we use the “-” as padding for shorter strings; this is safe because the “-” is one of the punctuation characters that was stripped in step 2).
  5. These sets of trigrams are unioned together to form a complete set of terms that represents the input string.
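The steps above can be sketched as follows. This is a simplified illustration, not the SDK's implementation: normalize_ascii is a hypothetical stand-in that handles ASCII input only and does not transliterate other scripts, index_terms is a hypothetical name, and padding a one-character word with two "-" characters is an assumption based on the padding rule in step 4.

```rust
use std::collections::BTreeSet;

/// Simplified stand-in for steps 1 and 2: keeps ASCII letters, digits and
/// whitespace, lowercases the letters, and drops everything else. It does
/// NOT transliterate non-ASCII input the way the SDK does.
fn normalize_ascii(input: &str) -> String {
    input
        .chars()
        .filter(|c| c.is_ascii_alphanumeric() || c.is_whitespace())
        .map(|c| c.to_ascii_lowercase())
        .collect()
}

/// Steps 3 to 5: split on whitespace, extract every trigram from each word
/// (padding words shorter than three characters with '-'), and union them.
fn index_terms(input: &str) -> BTreeSet<String> {
    let mut terms = BTreeSet::new();
    for word in normalize_ascii(input).split_whitespace() {
        let mut chars: Vec<char> = word.chars().collect();
        // Pad short words so that every word yields at least one trigram.
        while chars.len() < 3 {
            chars.push('-');
        }
        // Slide a three-character window across the word.
        for window in chars.windows(3) {
            terms.insert(window.iter().collect::<String>());
        }
    }
    terms
}
```

Under these assumptions, index_terms("gumby") yields the three terms "gum", "umb", and "mby", and index_terms("of") yields the single term "of-".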

Converting Index Terms to Index Tokens

Once an input string has been processed into a list of index terms, those terms are converted into index tokens. Each term is prefixed by the optional partition name and a salt value that serves as the hash key, and a SHA256 hash is computed over the resulting string. This generates a 32-byte binary value; we convert the first 4 bytes into an unsigned 32-bit integer that is the index token.
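The shape of this derivation can be sketched as follows. To keep the sketch free of external crates, it uses the standard library's DefaultHasher as a non-cryptographic stand-in for SHA256, so the values it produces are not the SDK's tokens and it must not be used to protect real data. The function name index_token, the exact concatenation order, and the choice of which 32 bits to keep are assumptions made for illustration.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::Hasher;

/// Derive a 32-bit index token from one index term.
/// NOTE: DefaultHasher is a non-cryptographic stand-in for SHA256, used only
/// so that this sketch needs no external crates. Do not use it to protect data.
fn index_token(partition: Option<&str>, salt: &[u8], term: &str) -> u32 {
    let mut hasher = DefaultHasher::new();
    // Prefix the term with the optional partition name and the secret salt.
    hasher.write(partition.unwrap_or("").as_bytes());
    hasher.write(salt);
    hasher.write(term.as_bytes());
    // Keep only 32 bits of the digest, mirroring the truncation to 4 bytes.
    (hasher.finish() >> 32) as u32
}
```

The same term, salt, and partition always produce the same token, which is what makes searching possible. Changing either the salt or the partition name changes the hashed input, so the resulting token is almost certainly different.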

Random Padding

Another piece of information that can be leaked if an attacker has access to the index entries is an approximate length of the input string. It isn’t precise, because the tokens are a set (so duplicates are ignored), and because we break the string at word boundaries. However, we do take measures to further hide the length of each piece of data indexed by adding a random number of random 32-bit integer values into the collection of index tokens for a string.
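A minimal sketch of the padding idea follows. The function name pad_tokens is hypothetical, and the random values are passed in by the caller so that the sketch stays deterministic; a real implementation would draw both the number of extra values and the values themselves from a secure random number generator.

```rust
use std::collections::HashSet;

/// Mix extra random values into the real index tokens so that the size of
/// the stored set no longer tracks the length of the indexed string.
/// `random_values` is supplied by the caller in this sketch.
fn pad_tokens(real_tokens: &HashSet<u32>, random_values: &[u32]) -> HashSet<u32> {
    let mut padded = real_tokens.clone();
    padded.extend(random_values.iter().copied());
    padded
}
```

Padding only ever adds values, so every real token is still present and a query whose tokens are all in the real set still matches. The added values can coincide with tokens for unrelated terms, which is one reason that search results must be checked for false positives.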


To understand the effects that randomization and the partition ID have on the tokens produced, consider the following code snippet:

let name = "J. Fred Muggs";
let pii_tokens = pii_index.tokenize_data(&name, None)?;
println!("{:?}", pii_tokens);
let pii_tokens2 = pii_index.tokenize_data(&name, None)?;
println!("{:?}", pii_tokens2);
let pii_tokens3 = pii_index.tokenize_data(&name, Some("Part1"))?;
println!("{:?}", pii_tokens3);

The string “J. Fred Muggs” produces six index tokens. Due to randomization, each of the lists of tokens will contain at least seven tokens; the lengths might all be different. All of the lists were generated with the same blind index, so they share the same salt value. The first two lists should have six token values in common (it is possible but highly unlikely that there could be a seventh that is the same), since they used the same partition (one with no ID), but it is unlikely that the third list will have any tokens in common with either of the other two lists.

Processing Search Queries

Generating Index Terms

A search query is processed to produce a set of index terms in much the same way that the index tokens were generated when indexing data - transliteration, lowercasing, removing punctuation, splitting into words, generating trigrams, hashing with the partition name and salt. This process does not add any random terms, however.

Once the SDK has generated the set of index tokens, it is the responsibility of the application to search the stored blind index for matching entries. An entry matches if its set of index tokens contains all of the tokens generated for the query string. The application should find all entries that match and return the encrypted data to the client for decryption and processing.
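The matching rule can be sketched as a superset test over stored token sets. This is a minimal illustration: the find_candidates function and the in-memory HashMap keyed by record ID are hypothetical, and a real back end would normally run the equivalent query against its database rather than scanning every record.

```rust
use std::collections::{HashMap, HashSet};

/// Return the IDs of all records whose stored token set contains every
/// token generated for the query. Results are sorted for stable output.
fn find_candidates(
    store: &HashMap<String, HashSet<u32>>,
    query_tokens: &HashSet<u32>,
) -> Vec<String> {
    let mut ids: Vec<String> = store
        .iter()
        .filter(|(_, tokens)| tokens.is_superset(query_tokens))
        .map(|(id, _)| id.clone())
        .collect();
    ids.sort();
    ids
}
```

The records returned are only candidates. Each one still has to be decrypted and checked on the client, because extra tokens in a stored set do not prevent a match.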

Elimination of False Positives

Because the index tokens are hashes of the index terms and because we truncated the hashes to 32 bits and added random padding, it is possible that the search could return some false positives - that is, strings that had matching index tokens but don’t actually contain the query that was entered. For this reason, the client needs to actually scan the list of matches, decrypt each entry, and eliminate any non-matching entries.

To facilitate this check, the IronCore SDK includes a method that accepts a string and generates the transliterated version. This should be applied to the query string, then applied to each of the decrypted data strings. The client can confirm which of these decrypted transliterated data strings actually contains all of the words in the transliterated query string as substrings.

