Cloaked AI

Unleash generative AI projects that use private data by protecting sensitive vector embeddings with searchable data-in-use encryption.

World-class companies trust IronCore Labs

Broadcom, HubSpot, Zendesk, Norwegian Cruise Line Holdings

AI vector embeddings you can encrypt and still query

Reduce risk by protecting data at the application layer

Application-layer encryption (ALE) provides strong protection against breaches, unauthorized insider access, injection attacks, and cloud misconfigurations.

Protects your AI data while allowing GenAI workflows

Vector embeddings are vulnerable to inversion attacks that expose the source data including PII, diagnoses, images, biometrics and more.

Comply with privacy and security regulations

Meet data protection obligations without losing functionality. Comply with privacy laws and pending AI regulations by encrypting the data.

Easy to use: just a few lines of code to integrate

Load the Cloaked AI SDK, call encrypt on each vector before storing it, and call encrypt on each search vector before querying.

Why AI data matters

AI has a memory and that memory knows everything

Machine learning systems are hungry for data. More data makes them more useful, but it also makes them a bigger target. Private, confidential, and regulated data remains a risk even when it's represented as a vector embedding. That is why it's so important to build your systems with security and privacy by design from the outset.

Use Cases

AI techniques protected by Cloaked AI

Here are a few examples of how Cloaked AI protects embeddings:

Recommendation systems

What are recommendation systems?

AI recommendation systems suggest similar or related items based on insights from data sets or histories. Traditionally used to enhance consumer shopping experiences or to provide food and media recommendations (what you see in your Netflix app), recommendation systems are now used in apps for lawyers, financial managers, and even healthcare workers.

Retrieval Augmented Generation (RAG)

What is retrieval augmented generation (RAG)?

RAG is a pattern that lets a general-purpose AI model answer questions using data it wasn't trained on, which may be private or sensitive. RAG is typically used to build question-and-answer support over knowledge bases or pools of private data. This is most commonly achieved by putting the sensitive data into a vector database, using that database to find material relevant to a given query, then providing that material as context to the model so it can answer the question. The same pattern, in an application commonly called grounding, is also used to keep AI systems from "hallucinating" plausible but false answers.
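The retrieval and context-building steps described above can be sketched in a few lines. This is a minimal illustration only: the in-memory `store`, the three-dimensional embeddings, and the helper names `retrieve` and `build_prompt` are all hypothetical, and a real system would use an embedding model and a vector database instead.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def retrieve(query_vector, store, top_k=2):
    """Return the texts of the top_k entries most similar to the query."""
    ranked = sorted(
        store,
        key=lambda item: cosine_similarity(query_vector, item["vector"]),
        reverse=True,
    )
    return [item["text"] for item in ranked[:top_k]]

def build_prompt(question, passages):
    """Place the retrieved passages in the prompt as context for the model."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )

# Hypothetical knowledge base with made-up embeddings (assumption).
store = [
    {"text": "Refunds are processed within 5 business days.", "vector": [0.9, 0.1, 0.0]},
    {"text": "Support is available on weekdays.", "vector": [0.1, 0.9, 0.1]},
    {"text": "Refund requests require an order number.", "vector": [0.8, 0.2, 0.1]},
]

# Made-up embedding of the user's question (assumption).
query_vector = [1.0, 0.1, 0.0]
passages = retrieve(query_vector, store, top_k=2)
prompt = build_prompt("How long do refunds take?", passages)
```

The resulting `prompt` is what would be sent to the model, so only the retrieved passages, not the whole knowledge base, ever reach it.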

Biometric systems

What are biometric systems?

Biometric systems include face recognition, speech recognition, fingerprint recognition, iris recognition, author recognition (based on writing style), and behavior recognition. Because biometric readings inherently vary, at least at the sensor level, a vector database is a common choice for storing them. Two faces are treated as the same if their embeddings are sufficiently similar, and a search can be done for all similar faces in a repository. The same is true for voiceprints, fingerprints, and so on.
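The idea of "sufficiently similar" can be sketched as a distance threshold. The gallery, the embeddings, and the threshold value below are made up for illustration; real systems tune the threshold against measured false-accept and false-reject rates.

```python
import math

def find_matches(probe, gallery, max_distance=0.1):
    """Return the names of enrolled embeddings within max_distance of the probe."""
    return [
        name
        for name, enrolled in gallery.items()
        if math.dist(probe, enrolled) <= max_distance
    ]

# Hypothetical enrolled face embeddings (assumption).
gallery = {
    "alice": [0.10, 0.80, 0.30],
    "bob": [0.90, 0.20, 0.40],
}

# A new reading of the same face differs slightly because of sensor variance.
probe = [0.12, 0.79, 0.31]
matches = find_matches(probe, gallery)
```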

Anomaly and fraud detection

What are anomaly detection and fraud detection AI systems?

AI brings intelligence to the task of labeling data sets and clustering them so that outliers, anomalies, and other patterns can be detected. Vector databases can be used to group sets of data such as known bad behaviors (fraud, etc.) and known good behaviors. The system can then compare new behaviors against those groups to see whether they resemble good behaviors, resemble bad behaviors, or are anomalous and worthy of investigation. Vector databases also allow AI systems to constantly learn and evaluate, so bad behaviors can be identified and counteracted in real time.
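One simple way to picture this is a nearest-neighbor classifier with a distance cutoff. The labeled examples, the two-dimensional vectors, and the `max_distance` value here are assumptions chosen for illustration, not a recommended configuration.

```python
import math

def classify(vector, labeled_examples, max_distance=0.5):
    """Label a behavior by its nearest known example, or flag it as anomalous."""
    label, example = min(
        labeled_examples,
        key=lambda pair: math.dist(vector, pair[1]),
    )
    # Too far from every known behavior: neither clearly good nor clearly bad.
    if math.dist(vector, example) > max_distance:
        return "anomalous"
    return label

# Hypothetical embeddings of previously labeled behaviors (assumption).
labeled_examples = [
    ("good", [0.1, 0.1]),
    ("good", [0.2, 0.1]),
    ("bad", [0.9, 0.8]),
    ("bad", [0.8, 0.9]),
]

near_good = classify([0.15, 0.12], labeled_examples)
near_bad = classify([0.85, 0.85], labeled_examples)
outlier = classify([0.1, 0.9], labeled_examples)
```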

Similar image search

What is similar image search?

Also called reverse image search, similar image search powers tools like Google Images and TinEye where you upload an image, and a search is conducted for similar ones. Embedding models are used to capture meaning and information from the images in the form of vector embeddings. These embeddings are stored in vector databases, which perform nearest neighbor searches to find similar images.

Semantic text search

What is semantic text search?

Semantic search is another way of saying meaning-based search. Unlike keyword searches, which find a specific word or its synonyms, semantic search queries over concepts and meanings so even if different words are used it can find matches. It also avoids matches against words with multiple meanings where some are unrelated to the query. This is the future of text-based search.

Getting started

Built for easy integration and quick adoption

Using Cloaked AI is straightforward. The examples below show how to encrypt a vector before saving it when using Cloaked AI together with SaaS Shield to handle the key management concerns.

Python
    # pypi: ironcore-alloy
    plaintext = PlaintextVector([1.2, -1.23, 3.24, 2.37], "contacts", "conversation-sentiment")
    metadata = AlloyMetadata.new_simple("tenant-123")
    encrypted = await sdk.vector().encrypt(plaintext, metadata)
    # Store off encrypted_vector and paired_icl_info in your chosen vector database
Java
    // maven: ironcore-alloy
    PlaintextVector plaintext = new PlaintextVector(List.of(1.2f, -1.23f, 3.24f, 2.37f), "contacts", "conversation-sentiment");
    AlloyMetadata metadata = AlloyMetadata.Companion.newSimple("tenant-123");
    EncryptedVector encrypted = sdk.vector().encrypt(plaintext, metadata, null);
    // Store off encryptedVector and pairedIclInfo in your chosen vector database

Watch the short demo video

This demo walks through an example that uploads embeddings to Pinecone and modifies the code to first encrypt the embeddings and queries. You will learn about the threats, the code, how the encryption works, and how it impacts performance.

Cloaked AI Demo: Part One

Easy integration

Integrate with your existing vector database or index

Cloaked AI is deployed as an open-source (AGPL dual license) SDK with support for Weaviate, Elastic, Pinecone, Qdrant, pgvector, and more. The vector, once encrypted, is largely meaningless, but common operations like nearest neighbor searches, clustering, etc., can still take place over the data. Only those allowed access to the key can make sense of the vectors that are stored.
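To build intuition for how a transformed vector can look meaningless yet still support nearest neighbor search, here is a toy sketch. It is not Cloaked AI's algorithm and it is not secure: it only shuffles, sign-flips, and scales coordinates using a secret seed, which keeps every distance proportional to the original. All names (`make_key`, `toy_encrypt`, `nearest`) are hypothetical, and real schemes are considerably more involved.

```python
import math
import random

def make_key(seed, dimensions):
    """Derive a secret permutation, sign pattern, and scale from a seed."""
    rng = random.Random(seed)
    permutation = list(range(dimensions))
    rng.shuffle(permutation)
    signs = [rng.choice((-1.0, 1.0)) for _ in range(dimensions)]
    scale = rng.uniform(2.0, 10.0)
    return permutation, signs, scale

def toy_encrypt(vector, key):
    """Toy transform (NOT secure): reorder, sign-flip, and scale coordinates."""
    permutation, signs, scale = key
    return [scale * signs[i] * vector[permutation[i]] for i in range(len(vector))]

def nearest(query, vectors):
    """Index of the vector closest to the query by Euclidean distance."""
    return min(range(len(vectors)), key=lambda i: math.dist(query, vectors[i]))

key = make_key(seed=12345, dimensions=4)

stored = [
    [1.2, -1.23, 3.24, 2.37],
    [0.1, 0.5, -2.0, 1.0],
    [-3.0, 2.2, 0.4, -1.1],
]
query = [1.0, -1.0, 3.0, 2.5]

# Transform vectors before storing them, and transform the query the same way.
encrypted_stored = [toy_encrypt(v, key) for v in stored]
encrypted_query = toy_encrypt(query, key)

plain_result = nearest(query, stored)
encrypted_result = nearest(encrypted_query, encrypted_stored)
```

Both searches pick the same stored vector, even though the database only ever sees the transformed values. Without the seed, the stored coordinates no longer line up with the original ones.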

Even if the vector database is hosted by a third party, and even if it is in another country, your data remains safe from foreign subpoenas, breaches, curious service provider employees, and more.

[Diagram: Data (text, image, audio, etc.) → Model → Vector Embeddings; Vector Embeddings + Encryption Key → Cloaked AI → Encrypted Embeddings → Vector Database]
Qdrant
Pinecone
Chroma
Weaviate
OpenSearch
Redis
Vespa
Elasticsearch
KX
Milvus
LanceDB
MongoDB
pgvector
SQLite-vss

Application-layer encryption

Data protection for cloud servers

Encryption requires the correct key to unlock the data. But when you use infrastructure encryption like database or disk encryption, the key has already unlocked the data and stays in memory for as long as the server is running. This doesn’t do much to protect data in the cloud.

The solution is application-layer encryption (ALE), which encrypts data before storing it and keeps the key(s) separate from the data. It’s a simple but powerful concept that often prevents curious admins, network breaches, cloud misconfigurations, and application vulnerabilities from compromising data.

Cloaked AI FAQ

Learn more about protecting AI data

Whether it’s your sensitive intellectual property, financial information, healthcare information, consumer personal information, or just non-public information at a public company, you need to protect data that can be abused.

Protect your data with non-transparent, application-layer encryption that provides the security and privacy for the data you hold, regardless of where that data lives.