How do I search for similar vectors in HelixDB?

Use SearchV to find the vectors most similar to a query vector, ranked by cosine similarity. The limit argument caps how many results are returned.
SearchV<Type>(vector, limit)
Currently, Helix supports only arrays of F64 values as vectors. Support for other types, such as F32 and binary vectors, will be added in the near future. Please reach out to us if you need a different vector type.
When using the SDKs or calling the endpoint with curl, the query name must exactly match the name defined in the queries.hx file.
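To make the ranking concrete, here is a minimal, brute-force sketch of cosine-similarity search over plain Python lists of floats. It only illustrates the metric and the role of limit; it is not how Helix implements SearchV internally, and the helper names cosine_similarity and top_k are made up for this example.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length float vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)

def top_k(query, candidates, limit):
    """Return up to `limit` (similarity, vector) pairs, most similar first."""
    scored = [(cosine_similarity(query, v), v) for v in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:limit]

query = [0.1, 0.2, 0.3, 0.4]
stored = [
    [0.1, 0.2, 0.3, 0.4],  # identical to the query
    [0.4, 0.3, 0.2, 0.1],  # same values, reversed order
    [0.2, 0.4, 0.6, 0.8],  # same direction as the query, twice the length
]

for score, vector in top_k(query, stored, limit=2):
    print(round(score, 4), vector)
```

Because cosine similarity compares direction rather than magnitude, the third stored vector scores essentially the same as an exact match even though its values are doubled.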
QUERY SearchVector (vector: [F64], limit: I64) =>
    documents <- SearchV<Document>(vector, limit)
    RETURN documents

QUERY InsertVector (vector: [F64], content: String, created_at: Date) =>
    document <- AddV<Document>(vector, { content: content, created_at: created_at })
    RETURN document
Here’s how to run the queries with the Python SDK:
from datetime import datetime, timezone
from helix.client import Client

client = Client(local=True, port=6969)

vector_data = [0.1, 0.2, 0.3, 0.4]
inserted = client.query("InsertVector", {
    "vector": vector_data,
    "content": "Sample document content",
    "created_at": datetime.now(timezone.utc).isoformat(),
})

result = client.query("SearchVector", {
    "vector": vector_data,
    "limit": 10
})
print(result)
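For readers calling the endpoint directly rather than through the SDK, the sketch below builds the equivalent HTTP request using only the Python standard library. It assumes each query is exposed as a POST route named after the query on the same host and port the client above uses; that URL layout and the helper build_query_request are assumptions for illustration, so check them against your deployment before relying on them.

```python
import json
from urllib.request import Request

def build_query_request(query_name, payload, host="localhost", port=6969):
    """Build (but do not send) a JSON POST request for a named query.

    ASSUMPTION: the query is served at http://<host>:<port>/<query_name>.
    The query name must match the name in queries.hx exactly.
    """
    url = f"http://{host}:{port}/{query_name}"
    body = json.dumps(payload).encode("utf-8")
    return Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

request = build_query_request(
    "SearchVector",
    {"vector": [0.1, 0.2, 0.3, 0.4], "limit": 10},
)
print(request.get_method(), request.full_url)
```

Passing the request to urllib.request.urlopen would send it, which requires a running Helix instance. A misspelled or differently cased query name would produce a different URL, which is why the name has to match queries.hx exactly.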

Example 2: Vector search with postfiltering

QUERY SearchRecentDocuments (vector: [F64], limit: I64, cutoff_date: Date) =>
    documents <- SearchV<Document>(vector, limit)::WHERE(_::{created_at}::GTE(cutoff_date))
    RETURN documents

QUERY InsertVector (vector: [F64], content: String, created_at: Date) =>
    document <- AddV<Document>(vector, { content: content, created_at: created_at })
    RETURN document
Here’s how to run the queries with the Python SDK:
from datetime import datetime, timezone, timedelta
from helix.client import Client

client = Client(local=True, port=6969)

vector_data = [0.12, 0.34, 0.56, 0.78]

recent_date = datetime.now(timezone.utc).isoformat()
old_date = (datetime.now(timezone.utc) - timedelta(days=45)).isoformat()

client.query("InsertVector", {
    "vector": vector_data,
    "content": "Recent document content",
    "created_at": recent_date,
})

client.query("InsertVector", {
    "vector": [0.15, 0.35, 0.55, 0.75],
    "content": "Old document content",
    "created_at": old_date,
})

cutoff_date = (datetime.now(timezone.utc) - timedelta(days=30)).isoformat()

result = client.query("SearchRecentDocuments", {
    "vector": vector_data,
    "limit": 5,
    "cutoff_date": cutoff_date,
})

print(result)
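One consequence of postfiltering is worth spelling out: if the WHERE step runs on the results that SearchV has already returned, as the term suggests, a query can return fewer than limit documents. The sketch below mimics that order of operations on an in-memory list that is already ranked by similarity. The function search_then_filter and the sample documents are hypothetical and only illustrate the behavior.

```python
from datetime import datetime, timedelta, timezone

def search_then_filter(ranked_docs, limit, cutoff):
    """Keep the `limit` best-ranked docs, then drop those older than `cutoff`."""
    top = ranked_docs[:limit]  # step 1: vector search returns the top `limit` hits
    return [d for d in top if d["created_at"] >= cutoff]  # step 2: postfilter

now = datetime.now(timezone.utc)

# Already sorted by similarity to the query, best match first.
ranked = [
    {"content": "Recent document content", "created_at": now},
    {"content": "Old document content", "created_at": now - timedelta(days=45)},
    {"content": "Another recent document", "created_at": now - timedelta(days=1)},
]

cutoff = now - timedelta(days=30)
hits = search_then_filter(ranked, limit=2, cutoff=cutoff)
print(len(hits))  # 1: the old document was in the top 2 and was filtered out
```

If you need a fixed number of recent results, one option under this assumption is to request a larger limit than you need and trim the filtered list on the client.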

Example 3: Using the built-in Embed function

You can also use the built-in Embed function to search with text directly, without sending an array of floats. It uses the embedding model defined in your config.hx.json file.
All vectors in a vector type must have the same number of dimensions. If you change your embedding model (e.g., switching from text-embedding-ada-002 to a different model), the new vectors may have a different number of dimensions, which causes an error. Use the same embedding model consistently for all vectors of a given type.
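When you send vectors yourself instead of using Embed, a small client-side guard can catch a dimension mismatch before the request reaches the database. The following is a minimal sketch; check_dimension and EXPECTED_DIM are hypothetical names, and the expected size of 4 simply matches the toy vectors used in the earlier examples.

```python
def check_dimension(vector, expected_dim):
    """Return `vector` unchanged, or raise if its length is not `expected_dim`."""
    if len(vector) != expected_dim:
        raise ValueError(
            f"expected a vector with {expected_dim} dimensions, got {len(vector)}"
        )
    return vector

# Hypothetical dimension for the Document vector type in these examples.
EXPECTED_DIM = 4

ok = check_dimension([0.1, 0.2, 0.3, 0.4], EXPECTED_DIM)

try:
    check_dimension([0.1, 0.2, 0.3], EXPECTED_DIM)
except ValueError as err:
    print(err)
```

Wrapping the "vector" value in such a check before calling client.query keeps a model change or truncated array from surfacing later as a database error.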
QUERY SearchWithText (text: String, limit: I64) =>
    documents <- SearchV<Document>(Embed(text), limit)
    RETURN documents

QUERY InsertTextAsVector (content: String, created_at: Date) =>
    document <- AddV<Document>(Embed(content), { content: content, created_at: created_at })
    RETURN document
Here’s how to run the queries with the Python SDK:
from datetime import datetime, timezone
from helix.client import Client

client = Client(local=True, port=6969)

sample_texts = [
    "Introduction to machine learning algorithms",
    "Deep neural networks and AI",
    "Natural language processing techniques"
]

for text in sample_texts:
    client.query("InsertTextAsVector", {
        "content": text,
        "created_at": datetime.now(timezone.utc).isoformat(),
    })

result = client.query("SearchWithText", {
    "text": "machine learning algorithms",
    "limit": 10,
})

print(result)