HelixDB Docs

Rerank with RRF (Reciprocal Rank Fusion)

RerankRRF applies Reciprocal Rank Fusion (RRF), a technique for combining multiple ranked lists using only each item's rank, so no score calibration is required. It is particularly effective for hybrid search, where you merge results from different search methods such as vector search and BM25 keyword search.

::RerankRRF()           // Uses default k=60
::RerankRRF(k: 60)      // Custom k parameter

When calling a query through the SDKs or with curl, the query name must exactly match the name defined in the queries.hx file.

How It Works

RRF uses a simple but effective formula to combine rankings:

RRF_score(d) = Σ_i 1 / (k + rank_i(d))

Where:

d is a document/result
rank_i(d) is the rank of document d in the i-th result list (the sum runs over every list that contains d)
k is a constant that controls the impact of ranking differences

The algorithm works by:

Taking multiple ranked lists of results
For each item, calculating its RRF score based on its positions across all lists
Re-ranking items by their combined RRF scores
Items that appear highly ranked in multiple lists get boosted
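
The steps above can be sketched in plain Python. This is a minimal illustration of the RRF formula, not HelixDB's internal implementation: the function name `rrf_fuse`, the 1-based ranks, and the sample document IDs are all assumptions made for the example.

```python
from collections import defaultdict


def rrf_fuse(ranked_lists, k=60.0):
    """Fuse ranked lists of IDs with Reciprocal Rank Fusion.

    Hypothetical helper for illustration; ranks are assumed to start at 1.
    """
    scores = defaultdict(float)
    for ranked in ranked_lists:
        for rank, doc_id in enumerate(ranked, start=1):
            # Each list contributes 1 / (k + rank) for the items it contains.
            scores[doc_id] += 1.0 / (k + rank)
    # Highest combined score first; ties are broken by ID for determinism.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Made-up result lists from two search methods.
vector_hits = ["doc_a", "doc_b", "doc_c"]
keyword_hits = ["doc_b", "doc_d", "doc_a"]

fused = rrf_fuse([vector_hits, keyword_hits])
for doc_id, score in fused:
    print(f"{doc_id}: {score:.5f}")
```

In this example `doc_b` comes out first because it ranks near the top of both lists, `doc_a` follows closely, and `doc_d` and `doc_c` fall to the bottom because each appears in only one list.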

When to Use RerankRRF

Use RerankRRF when you want to:

Merge multiple search methods: Combine vector search with BM25, or multiple vector searches with different embeddings
Avoid score normalization: RRF doesn’t require calibrating or normalizing scores from different search systems
Boost consensus results: Items that rank highly across multiple search strategies will be prioritized
Implement hybrid search: Create a unified ranking from complementary search approaches

Parameters

k (optional, default: 60)

Controls how much weight is given to ranking position:

Higher values (e.g., 80-100): Reduce the impact of ranking differences, treat all highly-ranked items more equally
Lower values (e.g., 20-40): Emphasize top-ranked items more strongly, create sharper distinctions
Default (60): Provides a balanced approach suitable for most use cases

Start with the default value of 60 and adjust based on your specific data distribution and requirements.
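
To see the effect of k concretely, the snippet below compares the score contribution of a rank-1 item with that of a rank-10 item for a few k values. It is plain arithmetic on the RRF formula; the helper name `rank_contribution` is made up for this illustration.

```python
def rank_contribution(rank, k):
    """Contribution of a single list position to the RRF score (1-based rank)."""
    return 1.0 / (k + rank)


ratios = {}
for k in (20.0, 60.0, 100.0):
    # How many times more a rank-1 hit counts than a rank-10 hit.
    ratios[k] = rank_contribution(1, k) / rank_contribution(10, k)
    print(f"k={k:g}: rank 1 counts {ratios[k]:.2f}x as much as rank 10")
```

The ratio shrinks as k grows (about 1.43 at k=20, 1.15 at k=60, and 1.09 at k=100), which is what flattening rank differences means in practice.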

Example 1: Basic reranking with default parameters

QUERY SearchDocuments(query_vec: [F64]) =>
    results <- SearchV<Document>(query_vec, 100)
        ::RerankRRF()
        ::RANGE(0, 10)
    RETURN results

QUERY InsertDocument(vector: [F64], title: String, content: String) =>
    document <- AddV<Document>(vector, {
        title: title,
        content: content
    })
    RETURN document

Here’s how to run the query using the Python SDK:

from helix.client import Client

client = Client(local=True, port=6969)

documents = [
    {"vector": [0.1, 0.2, 0.3, 0.4], "title": "Machine Learning Basics", "content": "Introduction to ML"},
    {"vector": [0.2, 0.3, 0.4, 0.5], "title": "Deep Learning", "content": "Neural networks explained"},
    {"vector": [0.15, 0.25, 0.35, 0.45], "title": "AI Applications", "content": "Real-world AI uses"},
]

for doc in documents:
    client.query("InsertDocument", doc)

query_vector = [0.12, 0.22, 0.32, 0.42]
results = client.query("SearchDocuments", {"query_vec": query_vector})
print("Search results:", results)

Example 2: Custom k parameter for more aggressive reranking

QUERY SearchWithCustomK(query_vec: [F64]) =>
    results <- SearchV<Document>(query_vec, 100)
        ::RerankRRF(k: 30.0)
        ::RANGE(0, 20)
    RETURN results

QUERY InsertDocument(vector: [F64], title: String, content: String) =>
    document <- AddV<Document>(vector, {
        title: title,
        content: content
    })
    RETURN document

Here’s how to run the query using the Python SDK:

from helix.client import Client

client = Client(local=True, port=6969)

documents = [
    {"vector": [0.1, 0.2, 0.3, 0.4], "title": "Python Tutorial", "content": "Learn Python basics"},
    {"vector": [0.2, 0.3, 0.4, 0.5], "title": "Advanced Python", "content": "Python design patterns"},
    {"vector": [0.15, 0.25, 0.35, 0.45], "title": "Python for Data Science", "content": "Using pandas and numpy"},
]

for doc in documents:
    client.query("InsertDocument", doc)

query_vector = [0.12, 0.22, 0.32, 0.42]
results = client.query("SearchWithCustomK", {"query_vec": query_vector})
print("Search results with custom k:", results)

Example 3: Dynamic k parameter from query input

QUERY FlexibleRerank(query_vec: [F64], k_value: F64) =>
    results <- SearchV<Document>(query_vec, 100)
        ::RerankRRF(k: k_value)
        ::RANGE(0, 10)
    RETURN results

QUERY InsertDocument(vector: [F64], title: String, content: String) =>
    document <- AddV<Document>(vector, {
        title: title,
        content: content
    })
    RETURN document

Here’s how to run the query using the Python SDK:

from helix.client import Client

client = Client(local=True, port=6969)

documents = [
    {"vector": [0.1, 0.2, 0.3, 0.4], "title": "Web Development", "content": "HTML, CSS, JavaScript"},
    {"vector": [0.2, 0.3, 0.4, 0.5], "title": "Backend Engineering", "content": "APIs and databases"},
    {"vector": [0.15, 0.25, 0.35, 0.45], "title": "DevOps", "content": "CI/CD and infrastructure"},
]

for doc in documents:
    client.query("InsertDocument", doc)

query_vector = [0.12, 0.22, 0.32, 0.42]

# Try with different k values
for k in [30.0, 60.0, 90.0]:
    results = client.query("FlexibleRerank", {
        "query_vec": query_vector,
        "k_value": k
    })
    print(f"Results with k={k}:", results)

Best Practices

Retrieve Sufficient Candidates

Always fetch more results than you ultimately need to give RRF sufficient candidates to work with:

// Good: Fetch 100, return 10
SearchV<Document>(vec, 100)::RerankRRF()::RANGE(0, 10)

// Not ideal: Fetch 10, return 10
SearchV<Document>(vec, 10)::RerankRRF()::RANGE(0, 10)
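
Why a deeper candidate pool helps is easiest to see when two result lists are fused. The sketch below is illustrative Python rather than HelixQL, and the `fuse_ids` helper and document IDs are hypothetical. An item that sits at rank 4 in both lists is never seen when only 3 candidates are fetched per list, but it takes first place once 5 are fetched.

```python
def fuse_ids(ranked_lists, depth, k=60.0):
    """Fuse the first `depth` entries of each list; return IDs, best first."""
    scores = {}
    for ranked in ranked_lists:
        for rank, doc_id in enumerate(ranked[:depth], start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Made-up rankings: "shared" is fourth in both lists.
list_a = ["a1", "a2", "a3", "shared", "a5"]
list_b = ["b1", "b2", "b3", "shared", "b5"]

shallow = fuse_ids([list_a, list_b], depth=3)
deep = fuse_ids([list_a, list_b], depth=5)

print("shared in shallow results:", "shared" in shallow)
print("top result with deeper fetch:", deep[0])
```

With a depth of 5, `shared` scores 2/64 and beats every item that appears in only one list, whose best possible score is 1/61.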

Parameter Tuning

Start with the default k=60 and adjust based on your observations:

If results seem too similar, try a lower k (30-40) to emphasize top rankings
If you want more variety, try a higher k (80-100) to flatten differences
Test with real queries and evaluate result quality

Combining with Other Operations

RRF works well with filtering and other result operations:

QUERY AdvancedSearch(query_vec: [F64], min_score: F64) =>
    results <- SearchV<Document>(query_vec, 200)
        ::WHERE(_.distance::GT(min_score))  // Filter first
        ::RerankRRF()                        // Then rerank
        ::RANGE(0, 20)                       // Finally limit
    RETURN results

Performance Considerations

RRF is computationally cheap: scoring n candidates takes O(n) time, plus the cost of sorting them by score, which makes it suitable for real-time applications. However:

Larger candidate sets require more processing
Balance result quality with query latency
Consider caching frequently-accessed results
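
One way to act on the caching suggestion is to memoize results on the client side. The sketch below uses Python's standard `functools.lru_cache`; `run_search` is a hypothetical stand-in for a real request such as a `client.query(...)` call, and the cache size is an arbitrary choice. Cached entries go stale when the underlying data changes, so this approach only suits data that is updated infrequently.

```python
from functools import lru_cache

# Counts how often the (simulated) backend is actually hit.
stats = {"backend_calls": 0}


def run_search(query_vec):
    """Hypothetical stand-in for an expensive search-and-rerank request."""
    stats["backend_calls"] += 1
    return [f"result-for-{sum(query_vec):.2f}"]


@lru_cache(maxsize=1024)
def _cached_search(query_key):
    # lru_cache needs hashable arguments, so the vector arrives as a tuple.
    return tuple(run_search(list(query_key)))


def cached_search(query_vec):
    """Return search results, reusing a cached answer for repeated vectors."""
    return list(_cached_search(tuple(query_vec)))


first = cached_search([0.12, 0.22, 0.32, 0.42])
second = cached_search([0.12, 0.22, 0.32, 0.42])  # served from the cache
print(first == second, stats["backend_calls"])
```

The second call returns the same results without touching the backend, so the call counter stays at 1.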

Use Cases

E-commerce Product Search

Combine text search with visual similarity for product discovery:

SearchV<Product>(query_embedding, 100)::RerankRRF(k: 50)::RANGE(0, 20)

Document Retrieval

Merge semantic search with keyword matching for comprehensive document retrieval:

SearchV<Document>(semantic_vec, 150)::RerankRRF()::RANGE(0, 10)

Content Recommendation

Blend multiple recommendation signals (user preferences, popularity, recency):

SearchV<Content>(user_profile_vec, 100)::RerankRRF(k: 70)::RANGE(0, 15)

Related

RerankMMR - For diversifying search results
Vector Search - Basic vector search operations
Result Operations - Other result manipulation operations
