VectorLens

VectorLens is a command line tool created to provide developers and security professionals with a straightforward path to find insecure embedding vectors in their datastores and to monitor for new potentially vulnerable data. It’s built on many of the concepts from our discussion of embedding attacks and will be updated as new techniques are introduced.

VectorLens currently supports these embedding models: all-minilm-l6-v2, bge-m3, gtr-t5-base, text-embedding-ada-002, and text-embedding-3-large. If you need support for another model, you can request it.

Use Cases

  • Making the case to non-technical decision makers that the threat posed by unencrypted vector embeddings is real.
  • Monitoring encrypted vector datastores to detect when a team introduces new unencrypted PII.
  • Monitoring unencrypted vector datastores that are intended to be PII-free, to catch PII that was added by mistake.
  • Discovering if Cloaked AI makes sense for your data.

See also the FAQs on the product page.

Installation

VectorLens ships as a single self-contained binary. Pick the download that matches your machine, drop it on your PATH, and you’re ready to go.

macOS (Apple Silicon)

For Macs with an M-series chip (M1, M2, M3, M4). VectorLens uses Apple’s built-in Metal GPU on macOS — there is no CPU-only build, and Intel Macs are not supported.

Acceleration         Download
Metal (Apple GPU)    ironcore-vector-lens

Linux — ARM 64-bit

For arm64 / aarch64 Linux machines, such as AWS Graviton or Ampere Altra instances.

Acceleration            Download                Compatibility
CPU only                ironcore-vector-lens    Runs on any 64-bit ARM Linux
CUDA 12 (NVIDIA GPU)    ironcore-vector-lens    manifest.json
CUDA 13 (NVIDIA GPU)    ironcore-vector-lens    manifest.json

Linux — Intel / AMD 64-bit

For standard x86_64 Linux machines — the most common server and desktop architecture.

Acceleration            Download                Compatibility
CPU only                ironcore-vector-lens    Runs on any 64-bit Intel/AMD Linux
CUDA 12 (NVIDIA GPU)    ironcore-vector-lens    manifest.json
CUDA 13 (NVIDIA GPU)    ironcore-vector-lens    manifest.json

Choosing a CUDA build

If you have an NVIDIA GPU, a CUDA build will run faster than the CPU build. Each CUDA download has a manifest.json next to it that lists which NVIDIA driver versions, CUDA toolkit versions, and GPU compute capabilities that binary supports.

To choose the right one:

  1. Run nvidia-smi on your machine to see your installed driver version and GPU model.
  2. Open the manifest.json for the build you’re considering.
  3. Confirm your driver version falls in the supported range and your GPU’s compute capability is listed.

If your GPU isn’t supported by either CUDA build, use the CPU only download — it has no driver requirements and runs on any 64-bit Linux machine.
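Steps 1 and 3 above can be partly scripted. The following is a minimal sketch, not part of VectorLens: it assumes your installed driver's nvidia-smi accepts the compute_cap query field (recent drivers do; older ones may reject it), and the helper names parse_gpu_line and query_gpus are made up for illustration. It only collects the values to compare by hand against manifest.json, since the manifest's schema is not described here.

```python
import subprocess

# Assumption: recent NVIDIA drivers expose `compute_cap` as a query field.
# If yours rejects it, look up the compute capability from the GPU model.
QUERY = [
    "nvidia-smi",
    "--query-gpu=name,driver_version,compute_cap",
    "--format=csv,noheader",
]


def parse_gpu_line(line):
    """Split one CSV line of nvidia-smi output into a small dict."""
    name, driver, compute_cap = [part.strip() for part in line.split(",")]
    return {"name": name, "driver_version": driver, "compute_cap": compute_cap}


def query_gpus():
    """Run nvidia-smi and return one dict per GPU it reports."""
    out = subprocess.run(QUERY, capture_output=True, text=True, check=True).stdout
    return [parse_gpu_line(line) for line in out.splitlines() if line.strip()]
```

On a machine with the NVIDIA driver installed, calling query_gpus() returns the driver version and compute capability for each GPU, which you can then check against the ranges listed in the manifest.json of the build you are considering.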

Install

Download the binary, mark it executable, and move it onto your PATH. For example, installing the x86_64 Linux CUDA 13 build:

bash
curl -L -o ironcore-vector-lens \
  https://storage.googleapis.com/vector-lens/releases/{latest_version}/x86_64-unknown-linux-gnu/cuda13/ironcore-vector-lens
chmod +x ironcore-vector-lens
sudo mv ironcore-vector-lens /usr/local/bin/
ironcore-vector-lens --version

Usage

ironcore-vector-lens scan [OPTIONS] <COMMAND>

Commands

Command         Description
jsonl-file      Scan a JSONL file. Expects id and embedding fields.
parquet-file    Scan a Parquet file. Expects id and embedding columns.
help            Print help for the given subcommand.
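Because jsonl-file expects an id and an embedding field on every row, it can be worth checking an export before scanning it. The sketch below is one way to do that and is not part of VectorLens. The function name check_jsonl is hypothetical, and two of its checks are assumptions rather than documented requirements: that embedding is a non-empty JSON array, and that every row has the same dimension.

```python
import json


def check_jsonl(path, max_problems=10):
    """Return a list of problems; an empty list means every row looks usable."""
    problems = []
    dim = None  # dimension of the first valid embedding seen
    with open(path, encoding="utf-8") as f:
        for lineno, line in enumerate(f, start=1):
            if len(problems) >= max_problems:
                break
            if not line.strip():
                continue  # skip blank lines
            try:
                row = json.loads(line)
            except json.JSONDecodeError as exc:
                problems.append(f"line {lineno}: invalid JSON ({exc.msg})")
                continue
            if not isinstance(row, dict) or "id" not in row or "embedding" not in row:
                problems.append(f"line {lineno}: missing id or embedding field")
                continue
            emb = row["embedding"]
            # Assumption: embeddings are non-empty arrays of equal length.
            if not isinstance(emb, list) or not emb:
                problems.append(f"line {lineno}: embedding is not a non-empty list")
            elif dim is None:
                dim = len(emb)
            elif len(emb) != dim:
                problems.append(f"line {lineno}: dimension {len(emb)}, expected {dim}")
    return problems
```

Running check_jsonl("embeddings.jsonl") and printing the result before a scan gives a quick list of rows to fix in the export step.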

Options

-m, --model <MODEL>

Model to scan with. Weights are fetched automatically unless overridden with --svm-weights and --forest-weights.

Currently supported models are: [all-minilm-l6-v2, bge-m3, gtr-t5-base, text-embedding-ada-002, text-embedding-3-large]

-s, --svm-weights <PATH>

Path to custom SVM weights, overriding the default for the given model.

-f, --forest-weights <PATH>

Path to custom forest weights, overriding the default for the given model.

-r, --report-path <PATH>

If present, a shareable PDF report will be generated at the given path.

--license-key <KEY>

Signed license key granting access to this product. Env: LICENSE_KEY.

-v, --verbosity <LEVEL>

Logging level. Default: default.

  • silent — no human output; only the exit code and written files.
  • summary — print a summary at the end; no in-progress output.
  • default — progress while running, plus a summary and inline PII info.

-d, --debug

Write a JSON debug log to ./ironcore_vector_lens_debug.log.

-h, --help

Print help (use -h for a one-line summary).

Examples

Export data for VectorLens

Postgres with pgvector:

psql "$DATABASE_URL" -At -c \
    "SELECT row_to_json(t) FROM (SELECT id, embedding::float4[] AS embedding FROM your_table) t" \
    > embeddings.jsonl

Weaviate:

Python
import json, weaviate
from weaviate.classes.init import Auth

# Local: weaviate.connect_to_local()
# Cloud: as below
with weaviate.connect_to_weaviate_cloud(
    cluster_url=URL,
    auth_credentials=Auth.api_key(API_KEY),
) as client:
    coll = client.collections.get("MyCollection")
    with open("embeddings.jsonl", "w") as f:
        for obj in coll.iterator(include_vector=True):
            f.write(json.dumps({
                "id": str(obj.uuid),
                "embedding": obj.vector["default"],
            }) + "\n")

Milvus:

Python
import json
from pymilvus import MilvusClient

client = MilvusClient(uri="http://localhost:19530", token="root:Milvus")
client.load_collection("my_collection")
it = client.query_iterator(
    collection_name="my_collection",
    batch_size=1000,
    limit=-1,
    filter="",
    output_fields=["id", "embedding"],
)
with open("embeddings.jsonl", "w") as f:
    while True:
        batch = it.next()
        if not batch:
            it.close()
            break
        for row in batch:
            f.write(json.dumps({"id": row["id"], "embedding": row["embedding"]}) + "\n")

Qdrant:

Python
import json
from qdrant_client import QdrantClient

client = QdrantClient(url="http://localhost:6333")
offset = None
with open("embeddings.jsonl", "w") as f:
    while True:
        points, offset = client.scroll(
            collection_name="my_coll",
            with_vectors=True,
            with_payload=False,
            limit=1000,
            offset=offset,
        )
        for p in points:
            # p.vector is list[float] for unnamed, dict[str, list[float]] for named.
            # SparseVector values need .model_dump() before json.dumps.
            f.write(json.dumps({"id": p.id, "embedding": p.vector}) + "\n")
        if offset is None:
            break
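The comments in the Qdrant example note that p.vector is a plain list for unnamed vectors and a dict for named ones. Assuming VectorLens wants embedding to be a flat list of floats (the docs only say the field must exist), a small helper can normalize both shapes before writing each row. The name dense_vector is hypothetical, and sparse or multi-vector values are deliberately not handled in this sketch.

```python
def dense_vector(vector, name=None):
    """Return a plain list of floats from a Qdrant point's vector field.

    `vector` is either a list (unnamed vector) or a dict mapping vector
    names to lists (named vectors). `name` selects which named vector to
    use; it may be omitted when the dict holds exactly one entry.
    """
    if isinstance(vector, dict):
        if name is None:
            if len(vector) != 1:
                raise ValueError("several named vectors present; pass name=...")
            name = next(iter(vector))
        vector = vector[name]
    if not isinstance(vector, list):
        # Sparse vectors and other types are out of scope for this sketch.
        raise TypeError(f"unsupported vector type: {type(vector).__name__}")
    return vector
```

In the export loop, this would replace p.vector with dense_vector(p.vector) or, for a collection with several named vectors, dense_vector(p.vector, name="...") using the name of the vector you want to scan.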

Pinecone:

Python
import json, os, time
from pinecone import Pinecone

pc = Pinecone(api_key=os.environ["PINECONE_API_KEY"])
index = pc.Index(host=os.environ["PINECONE_INDEX_HOST"])
NAMESPACE = "my-ns"
FETCH_BATCH = 1000

with open("embeddings.jsonl", "w") as f:
    buf = []
    for id_page in index.list(namespace=NAMESPACE):
        buf.extend(id_page)
        while len(buf) >= FETCH_BATCH:
            chunk, buf = buf[:FETCH_BATCH], buf[FETCH_BATCH:]
            resp = index.fetch(ids=chunk, namespace=NAMESPACE)
            for vid, v in resp.vectors.items():
                f.write(json.dumps({"id": vid, "embedding": list(v.values)}) + "\n")
            # stay under 100 req/s/index
            time.sleep(0.05)
    if buf:
        resp = index.fetch(ids=buf, namespace=NAMESPACE)
        for vid, v in resp.vectors.items():
            f.write(json.dumps({"id": vid, "embedding": list(v.values)}) + "\n")

Scan

ironcore-vector-lens scan jsonl-file --license-key XXXX --model text-embedding-ada-002 --report-path vector-lens-report.pdf --path embeddings.jsonl

This scans the text-embedding-ada-002 embeddings in embeddings.jsonl for PII and generates a PDF report showing which PII categories were detected and in which embeddings.
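For scheduled monitoring, the same scan can be launched from a script. The sketch below mirrors the command above and passes the key through the LICENSE_KEY environment variable instead of the --license-key flag, which keeps it out of the process arguments. The function names are hypothetical, and the meaning of specific exit codes is not documented on this page, so the wrapper only returns the code for the caller to interpret.

```python
import os
import subprocess


def build_scan_command(path, model, report_path=None):
    """Build the argv list for a JSONL scan, using only documented flags."""
    cmd = ["ironcore-vector-lens", "scan", "jsonl-file", "--model", model]
    if report_path is not None:
        cmd += ["--report-path", report_path]
    cmd += ["--path", path]
    return cmd


def run_scan(path, model, license_key, report_path=None):
    """Run the scan with the key in LICENSE_KEY and return the exit code."""
    env = dict(os.environ, LICENSE_KEY=license_key)
    cmd = build_scan_command(path, model, report_path)
    return subprocess.run(cmd, env=env).returncode
```

A cron job or CI step could call run_scan("embeddings.jsonl", "text-embedding-ada-002", key, "vector-lens-report.pdf") after each export and archive the resulting PDF.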

Licensing

VectorLens is available for trial or full production use. Submit a request for a trial license key or contact us for a production license.
