VectorLens
VectorLens is a command-line tool that gives developers and security professionals a straightforward way to find insecure embedding vectors in their datastores and to monitor for newly added, potentially vulnerable data. It builds on many of the concepts from our discussion of embedding attacks and will be updated as new techniques are introduced.
VectorLens currently supports the following embedding models: all-minilm-l6-v2, bge-m3, gtr-t5-base, text-embedding-ada-002, and text-embedding-3-large. If you need support for another model, you can request it.
Use Cases
- Making the case to non-technical decision makers that the threat posed by unencrypted vector embeddings is real.
- Monitoring encrypted vector datastores to detect when a team introduces new unencrypted PII.
- Monitoring unencrypted vector datastores that are intended to be PII-free, to catch mistakes.
- Discovering if Cloaked AI makes sense for your data.
See also the FAQs on the product page.
Installation
VectorLens ships as a single self-contained binary. Pick the download that matches your machine, drop it on your PATH, and you’re ready to go.
macOS (Apple Silicon)
For Macs with an M-series chip (M1, M2, M3, M4). VectorLens uses Apple’s built-in Metal GPU on macOS — there is no CPU-only build, and Intel Macs are not supported.
| Acceleration | Download |
|---|---|
| Metal (Apple GPU) | ironcore-vector-lens |
Linux — ARM 64-bit
For arm64 / aarch64 Linux machines, such as AWS Graviton or Ampere Altra instances.
| Acceleration | Download | Compatibility |
|---|---|---|
| CPU only | ironcore-vector-lens | Runs on any 64-bit ARM Linux |
| CUDA 12 (NVIDIA GPU) | ironcore-vector-lens | manifest.json |
| CUDA 13 (NVIDIA GPU) | ironcore-vector-lens | manifest.json |
Linux — Intel / AMD 64-bit
For standard x86_64 Linux machines — the most common server and desktop architecture.
| Acceleration | Download | Compatibility |
|---|---|---|
| CPU only | ironcore-vector-lens | Runs on any 64-bit Intel/AMD Linux |
| CUDA 12 (NVIDIA GPU) | ironcore-vector-lens | manifest.json |
| CUDA 13 (NVIDIA GPU) | ironcore-vector-lens | manifest.json |
Choosing a CUDA build
If you have an NVIDIA GPU, a CUDA build will run faster than the CPU build. Each CUDA download has a manifest.json next to it that lists which NVIDIA driver versions, CUDA toolkit versions, and GPU compute capabilities that binary supports.
To choose the right one:
- Run `nvidia-smi` on your machine to see your installed driver version and GPU model.
- Open the `manifest.json` for the build you’re considering.
- Confirm your driver version falls in the supported range and your GPU’s compute capability is listed.
If your GPU isn’t supported by either CUDA build, use the CPU only download — it has no driver requirements and runs on any 64-bit Linux machine.
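The check above can be partly scripted. The following is a minimal sketch, not part of VectorLens: it assumes `nvidia-smi` is on your PATH and that your driver is recent enough to support the `compute_cap` query field. The layout of manifest.json is not described here, so the sketch does not parse it; `driver_in_range` expects minimum and maximum driver versions that you copy by hand from the manifest, and all function names are hypothetical.

```python
import subprocess


def parse_gpu_line(line: str) -> dict:
    """Parse one CSV line from
    `nvidia-smi --query-gpu=driver_version,compute_cap,name --format=csv,noheader`.
    The GPU name comes last so a comma inside it cannot break the split."""
    driver, compute_cap, name = [part.strip() for part in line.split(",", 2)]
    return {"driver": driver, "compute_cap": compute_cap, "name": name}


def version_tuple(version: str) -> tuple:
    """Turn a dotted version such as '550.54.14' into (550, 54, 14)."""
    return tuple(int(part) for part in version.split("."))


def driver_in_range(driver: str, minimum: str, maximum: str) -> bool:
    """True if minimum <= driver <= maximum, compared numerically.
    The bounds are values you read from manifest.json yourself."""
    return version_tuple(minimum) <= version_tuple(driver) <= version_tuple(maximum)


def query_gpus() -> list:
    """Ask nvidia-smi for the driver version, compute capability, and name of each GPU."""
    result = subprocess.run(
        [
            "nvidia-smi",
            "--query-gpu=driver_version,compute_cap,name",
            "--format=csv,noheader",
        ],
        check=True,
        capture_output=True,
        text=True,
    )
    return [parse_gpu_line(line) for line in result.stdout.splitlines() if line.strip()]
```

Calling `query_gpus()` on a machine with an NVIDIA driver returns one dictionary per GPU. You can then compare each `driver` value with `driver_in_range` and look for the `compute_cap` value in the manifest's list of supported compute capabilities.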
Install
Download the binary, mark it executable, and move it onto your PATH. For example, installing the x86_64 Linux CUDA 13 build:
```bash
curl -L -o ironcore-vector-lens \
  https://storage.googleapis.com/vector-lens/releases/{latest_version}/x86_64-unknown-linux-gnu/cuda13/ironcore-vector-lens
chmod +x ironcore-vector-lens
sudo mv ironcore-vector-lens /usr/local/bin/
ironcore-vector-lens --version
```
Usage
ironcore-vector-lens scan [OPTIONS] <COMMAND>
Commands
| Command | Description |
|---|---|
| `jsonl-file` | Scan a JSONL file. Expects id and embedding fields. |
| `parquet-file` | Scan a Parquet file. Expects id and embedding columns. |
| `help` | Print help for the given subcommand. |
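Before scanning a JSONL export, it can help to confirm that every line carries the two expected fields. The sketch below is an illustration using only the Python standard library. The specific checks (a non-empty list of finite numbers, the same length on every line, blank lines skipped) are assumptions about what a sensible export looks like, not VectorLens's documented validation rules, and `validate_jsonl` is a hypothetical helper name.

```python
import json
import math


def validate_jsonl(path: str) -> dict:
    """Check that each non-blank line is a JSON object with an `id` field and an
    `embedding` field holding a non-empty list of finite numbers, and that all
    embeddings have the same length. Returns the row count and the dimension."""
    rows = 0
    dim = None
    with open(path, encoding="utf-8") as f:
        for lineno, line in enumerate(f, start=1):
            if not line.strip():
                continue  # assumption: blank lines are ignored
            record = json.loads(line)
            if not isinstance(record, dict) or "id" not in record or "embedding" not in record:
                raise ValueError(f"line {lineno}: expected an object with 'id' and 'embedding'")
            emb = record["embedding"]
            if not isinstance(emb, list) or not emb:
                raise ValueError(f"line {lineno}: 'embedding' must be a non-empty list")
            for x in emb:
                is_number = isinstance(x, (int, float)) and not isinstance(x, bool)
                if not is_number or not math.isfinite(x):
                    raise ValueError(f"line {lineno}: 'embedding' contains a non-finite or non-numeric value")
            if dim is None:
                dim = len(emb)
            elif len(emb) != dim:
                raise ValueError(f"line {lineno}: dimension {len(emb)} differs from {dim}")
            rows += 1
    return {"rows": rows, "dim": dim}
```

For example, `validate_jsonl("embeddings.jsonl")` returns a dictionary with the number of rows and the embedding dimension, which you can compare against the output dimension of the model you plan to pass to `--model`.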
Options
-m, --model <MODEL>
Model to scan with. Weights are fetched automatically unless overridden
with --svm-weights and --forest-weights.
Currently supported models are: [all-minilm-l6-v2, bge-m3, gtr-t5-base, text-embedding-ada-002, text-embedding-3-large]
-s, --svm-weights <PATH>
Path to custom SVM weights, overriding the default for the given model.
-f, --forest-weights <PATH>
Path to custom forest weights, overriding the default for the given model.
-r, --report-path <PATH>
If present, a shareable PDF report will be generated at the given path.
--license-key <KEY>
Signed license key granting access to this product. Env: LICENSE_KEY.
-v, --verbosity <LEVEL>
Logging level. Default: default.
- silent: no human output; only the exit code and written files.
- summary: print a summary at the end; no in-progress output.
- default: progress while running, plus a summary and inline PII info.
-d, --debug
Write a JSON debug log to ./ironcore_vector_lens_debug.log.
-h, --help
Print help (use -h for a one-line summary).
Examples
Export data for VectorLens
Postgres with pgvector:
```bash
psql "$DATABASE_URL" -At -c \
  "SELECT row_to_json(t) FROM (SELECT id, embedding::float4[] AS embedding FROM your_table) t" \
  > embeddings.jsonl
```
Weaviate:
```python
import json, weaviate
from weaviate.classes.init import Auth

# URL and API_KEY are placeholders: set them to your cluster URL and API key.
# Local: weaviate.connect_to_local()
# Cloud: as below
with weaviate.connect_to_weaviate_cloud(
    cluster_url=URL,
    auth_credentials=Auth.api_key(API_KEY),
) as client:
    coll = client.collections.get("MyCollection")
    with open("embeddings.jsonl", "w") as f:
        # Stream every object with its vector and write one JSON record per line.
        for obj in coll.iterator(include_vector=True):
            f.write(json.dumps({
                "id": str(obj.uuid),
                "embedding": obj.vector["default"],
            }) + "\n")
```
Milvus:
```python
import json
from pymilvus import MilvusClient

client = MilvusClient(uri="http://localhost:19530", token="root:Milvus")
client.load_collection("my_collection")
it = client.query_iterator(
    collection_name="my_collection",
    batch_size=1000,
    limit=-1,
    filter="",
    output_fields=["id", "embedding"],
)
with open("embeddings.jsonl", "w") as f:
    while True:
        batch = it.next()
        if not batch:
            it.close()
            break
        for row in batch:
            f.write(json.dumps({"id": row["id"], "embedding": row["embedding"]}) + "\n")
```
Qdrant:
```python
import json
from qdrant_client import QdrantClient

client = QdrantClient(url="http://localhost:6333")
offset = None
with open("embeddings.jsonl", "w") as f:
    while True:
        points, offset = client.scroll(
            collection_name="my_coll",
            with_vectors=True,
            with_payload=False,
            limit=1000,
            offset=offset,
        )
        for p in points:
            # p.vector is list[float] for unnamed, dict[str, list[float]] for named.
            # SparseVector values need .model_dump() before json.dumps.
            f.write(json.dumps({"id": p.id, "embedding": p.vector}) + "\n")
        if offset is None:
            break
```
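The Qdrant export notes that `p.vector` is a plain list for unnamed vectors and a dictionary for named ones. If your collection uses named vectors, a small helper like the hypothetical `dense_vector` below could pick one dense vector per point, so that each JSONL record holds a flat list in its embedding field. This is a sketch under that assumption; the vector name `"text"` in the usage note is made up, and sparse or multi-vector values are rejected rather than converted.

```python
def dense_vector(vector, name=None):
    """Return a flat list of floats from a Qdrant point's `vector` value.

    - Unnamed vector (a list): returned unchanged.
    - Named vectors (a dict): the entry stored under `name` is returned.
    Anything that is not a flat list of numbers raises TypeError.
    """
    if isinstance(vector, dict):
        if name is None:
            raise ValueError("collection uses named vectors; pass the name to export")
        vector = vector[name]
    if not isinstance(vector, list) or any(isinstance(x, list) for x in vector):
        raise TypeError("expected a flat dense vector, got a sparse or multi-vector value")
    return vector
```

In the export loop, this would replace the direct use of `p.vector`, for example `"embedding": dense_vector(p.vector, name="text")`.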
Pinecone:
```python
import json, os, time
from pinecone import Pinecone

pc = Pinecone(api_key=os.environ["PINECONE_API_KEY"])
index = pc.Index(host=os.environ["PINECONE_INDEX_HOST"])
NAMESPACE = "my-ns"
FETCH_BATCH = 1000

with open("embeddings.jsonl", "w") as f:
    buf = []
    for id_page in index.list(namespace=NAMESPACE):
        buf.extend(id_page)
        while len(buf) >= FETCH_BATCH:
            chunk, buf = buf[:FETCH_BATCH], buf[FETCH_BATCH:]
            resp = index.fetch(ids=chunk, namespace=NAMESPACE)
            for vid, v in resp.vectors.items():
                f.write(json.dumps({"id": vid, "embedding": list(v.values)}) + "\n")
            # stay under 100 req/s/index
            time.sleep(0.05)
    if buf:
        resp = index.fetch(ids=buf, namespace=NAMESPACE)
        for vid, v in resp.vectors.items():
            f.write(json.dumps({"id": vid, "embedding": list(v.values)}) + "\n")
```
Scan
ironcore-vector-lens scan jsonl-file --license-key XXXX --model text-embedding-ada-002 --report-path vector-lens-report.pdf --path embeddings.jsonl
This scans the ada-002 embeddings in embeddings.jsonl for PII and generates a PDF report showing which PII categories were detected and in which embeddings.
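For the monitoring use cases, the same scan can be driven from a script. The sketch below is one possible wrapper, not an official client: it builds the command from the flags documented above, in the order used in the example, and relies on the `LICENSE_KEY` environment variable instead of passing `--license-key` on the command line. The function names are hypothetical. The meaning of individual exit codes is not documented here, so the wrapper only returns the code without interpreting it.

```python
import os
import subprocess


def build_scan_command(path, model, report_path=None, verbosity=None):
    """Build an `ironcore-vector-lens scan jsonl-file` argument list from the
    documented flags. The license key is deliberately left out; it is expected
    to come from the LICENSE_KEY environment variable."""
    cmd = ["ironcore-vector-lens", "scan", "jsonl-file", "--model", model]
    if report_path:
        cmd += ["--report-path", report_path]
    if verbosity:
        cmd += ["--verbosity", verbosity]
    cmd += ["--path", path]
    return cmd


def run_scan(path, model, **options):
    """Run the scan and return the process exit code unchanged."""
    if "LICENSE_KEY" not in os.environ:
        raise RuntimeError("set the LICENSE_KEY environment variable before scanning")
    return subprocess.run(build_scan_command(path, model, **options)).returncode
```

A scheduled job could call `run_scan("embeddings.jsonl", "text-embedding-ada-002", report_path="vector-lens-report.pdf", verbosity="summary")` after each export and archive the generated report.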
Licensing
VectorLens is available for trial or full production use. Submit a request for a trial license key or contact us for a production license.