Rerank with MMR (Maximal Marginal Relevance)
RerankMMR is a diversification technique that balances relevance with diversity to reduce redundancy in search results. It iteratively selects results that are both relevant to the query and dissimilar to already-selected results, ensuring users see varied content instead of near-duplicates.

When using the SDKs or curling the endpoint, the query name must match what is defined in the queries.hx file exactly.

How It Works
MMR uses an iterative selection process with the following formula:

MMR = argmax[d ∈ C \ S] ( λ · Sim1(d, q) − (1 − λ) · max[d_i ∈ S] Sim2(d, d_i) )

where:
- d is a candidate document from the candidate set C
- q is the query
- d_i are already-selected documents in the set S
- λ (lambda) controls the relevance vs. diversity trade-off
- Sim1 measures relevance to the query
- Sim2 measures similarity to selected documents

The algorithm builds the final ranking by:
- Starting with the most relevant result
- For each subsequent position, calculating MMR scores for remaining candidates
- Selecting the candidate with the highest MMR score (balancing relevance and novelty)
- Repeating until all positions are filled
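The selection loop above can be sketched in a few lines of plain Python. This is an illustrative standalone implementation, not the engine's internals; the vectors and helper names are invented for the example.

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def rerank_mmr(query, docs, lam, k, sim=cosine):
    """Greedy MMR: repeatedly pick the candidate with the highest
    score = lam * Sim1(d, q) - (1 - lam) * max Sim2(d, d_i)."""
    selected = []
    remaining = list(range(len(docs)))
    while remaining and len(selected) < k:
        def mmr_score(i):
            relevance = sim(query, docs[i])
            redundancy = max((sim(docs[i], docs[j]) for j in selected), default=0.0)
            return lam * relevance - (1 - lam) * redundancy
        best = max(remaining, key=mmr_score)
        selected.append(best)
        remaining.remove(best)
    return selected

query = [1.0, 0.0]
docs = [[1.0, 0.0],    # exact match
        [0.98, 0.2],   # near-duplicate of the first doc
        [0.0, 1.0]]    # unrelated but novel

print(rerank_mmr(query, docs, lam=0.7, k=2))  # [0, 1]: relevance-leaning keeps the near-duplicate
print(rerank_mmr(query, docs, lam=0.3, k=2))  # [0, 2]: diversity-leaning swaps in the novel doc
```

Note how the first pick is always the most relevant document (no redundancy term yet), and lambda only starts to matter from the second position onward.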
When to Use RerankMMR
Use RerankMMR when you want to:

- Diversify search results: Eliminate near-duplicate content and show varied perspectives
- Reduce redundancy: Avoid showing multiple similar articles, products, or documents
- Improve user experience: Provide comprehensive coverage rather than repetitive results
- Balance exploration and relevance: Give users both highly relevant and exploratory options
Parameters
lambda (required)
A value between 0.0 and 1.0 that controls the relevance vs. diversity trade-off:

- Higher values (0.7-1.0): Favor relevance over diversity
  - Results will be more similar to the original ranking
  - Use when relevance is paramount
  - Minimal diversification
- Lower values (0.0-0.3): Favor diversity over relevance
  - Results will be maximally varied
  - Use when avoiding redundancy is critical
  - May include less relevant but diverse results
- Balanced values (0.4-0.6): Balance relevance and diversity
  - Good compromise for most use cases
  - Maintains reasonable relevance while reducing duplicates
- Typical recommendation: Start with 0.5-0.7 for most applications
The optimal lambda value depends on your specific use case. Test different values to find what works best for your users.
distance (optional, default: "cosine")
The distance metric used for calculating similarity:

- "cosine": Cosine similarity (default)
  - Works well for normalized vectors
  - Common choice for text embeddings
  - Range: -1 to 1
- "euclidean": Euclidean distance
  - Works well for absolute distances
  - Sensitive to vector magnitude
  - Good for spatial data
- "dotproduct": Dot product similarity
  - Works well for unnormalized vectors
  - Computationally efficient
  - Considers vector magnitude
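The magnitude-sensitivity differences can be made concrete with a quick standalone comparison (plain Python with invented vectors, not an SDK call):

```python
import math

a = [1.0, 2.0, 3.0]
b = [2.0, 4.0, 6.0]  # same direction as a, twice the magnitude

cosine = (sum(x * y for x, y in zip(a, b))
          / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))))
euclidean = math.dist(a, b)
dotproduct = sum(x * y for x, y in zip(a, b))

print(cosine)      # ~1.0: cosine ignores magnitude, sees identical direction
print(euclidean)   # ~3.742: euclidean grows with the magnitude gap
print(dotproduct)  # 28.0: dot product rewards larger magnitudes
```

For text embeddings, which are usually normalized, the three metrics produce similar rankings; the differences matter mostly for unnormalized custom vectors.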
Example 1: Basic diversification with default cosine distance
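A minimal standalone Python sketch of this configuration (illustrative only; the helper names and vectors are invented, and this is not the actual SDK call):

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def rerank_mmr(query, docs, lam, k, sim=cosine):
    selected, remaining = [], list(range(len(docs)))
    while remaining and len(selected) < k:
        best = max(remaining,
                   key=lambda i: lam * sim(query, docs[i])
                   - (1 - lam) * max((sim(docs[i], docs[j]) for j in selected), default=0.0))
        selected.append(best)
        remaining.remove(best)
    return selected

query = [1.0, 0.0]
docs = [[1.0, 0.0], [0.98, 0.2], [0.0, 1.0]]

# lam=0.6 keeps the ranking relevance-first while still scoring for novelty
print(rerank_mmr(query, docs, lam=0.6, k=3))  # [0, 1, 2]
```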
Example 2: High diversity with euclidean distance
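A standalone sketch of a high-diversity setting: a low lambda with euclidean distance. For illustration only, the distance is negated into a similarity so that larger still means more similar; this conversion is an assumption of the sketch, not the engine's behavior.

```python
import math

def neg_euclidean(a, b):
    """Euclidean distance negated so that 'more similar' means 'larger'."""
    return -math.dist(a, b)

def rerank_mmr(query, docs, lam, k, sim):
    selected, remaining = [], list(range(len(docs)))
    while remaining and len(selected) < k:
        best = max(remaining,
                   key=lambda i: lam * sim(query, docs[i])
                   - (1 - lam) * max((sim(docs[i], docs[j]) for j in selected), default=0.0))
        selected.append(best)
        remaining.remove(best)
    return selected

query = [0.0, 0.0]
docs = [[1.0, 0.0],   # closest to the query
        [1.1, 0.0],   # near-duplicate of the first point
        [5.0, 5.0]]   # far away, but novel

# lam=0.2 strongly penalizes redundancy, so the far point beats the near-duplicate
print(rerank_mmr(query, docs, lam=0.2, k=2, sim=neg_euclidean))  # [0, 2]
```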
Example 3: Balanced approach with dot product
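A standalone sketch of a balanced lambda with dot-product similarity (invented vectors; not the SDK API):

```python
def dotproduct(a, b):
    return sum(x * y for x, y in zip(a, b))

def rerank_mmr(query, docs, lam, k, sim):
    selected, remaining = [], list(range(len(docs)))
    while remaining and len(selected) < k:
        best = max(remaining,
                   key=lambda i: lam * sim(query, docs[i])
                   - (1 - lam) * max((sim(docs[i], docs[j]) for j in selected), default=0.0))
        selected.append(best)
        remaining.remove(best)
    return selected

query = [1.0, 0.0]
docs = [[2.0, 0.0],   # high-magnitude match
        [1.9, 0.1],   # near-duplicate of the first
        [0.0, 2.0]]   # orthogonal, novel

# lam=0.5 weighs relevance and redundancy equally; the orthogonal doc displaces the duplicate
print(rerank_mmr(query, docs, lam=0.5, k=2, sim=dotproduct))  # [0, 2]
```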
Example 4: Dynamic lambda for user-controlled diversity
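A standalone sketch of letting users control diversity by mapping a hypothetical user-facing toggle to a lambda value (the mode names, mapping, and vectors are all invented for illustration):

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def rerank_mmr(query, docs, lam, k, sim=cosine):
    selected, remaining = [], list(range(len(docs)))
    while remaining and len(selected) < k:
        best = max(remaining,
                   key=lambda i: lam * sim(query, docs[i])
                   - (1 - lam) * max((sim(docs[i], docs[j]) for j in selected), default=0.0))
        selected.append(best)
        remaining.remove(best)
    return selected

# Hypothetical mapping from a user-facing setting to a lambda value
LAMBDA_BY_MODE = {"focused": 0.9, "balanced": 0.6, "exploratory": 0.4}

def search_with_mode(query, docs, mode, k):
    return rerank_mmr(query, docs, lam=LAMBDA_BY_MODE[mode], k=k)

query = [1.0, 0.0]
docs = [[1.0, 0.0], [0.98, 0.2], [0.0, 1.0]]

print(search_with_mode(query, docs, "focused", 2))      # [0, 1]: keeps the near-duplicate
print(search_with_mode(query, docs, "exploratory", 2))  # [0, 2]: swaps in the diverse doc
```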
Chaining with RerankRRF
Combine RRF and MMR for hybrid search with diversification.

Best Practices
Retrieve Sufficient Candidates
MMR works best with a larger pool of candidates: retrieve more results than you plan to return (e.g., 100-200) so the reranker has room to diversify.

Lambda Parameter Tuning
Start with these guidelines and adjust based on your results:

- News/Articles: 0.5-0.6 (balance coverage and relevance)
- E-commerce: 0.6-0.7 (favor relevance, some variety)
- Content Discovery: 0.3-0.5 (favor diversity)
- FAQ/Support: 0.7-0.9 (favor relevance)
Choosing Distance Metrics
- Text embeddings: Use "cosine" (default)
- Image embeddings: Use "cosine" or "euclidean"
- Custom vectors: Test all three and compare results
Combining with Filtering
Apply filters before reranking for better performance.

Performance Considerations
MMR has O(n²) complexity due to pairwise comparisons. For optimal performance:
- Limit the candidate set (e.g., 100-200 results)
- Consider caching for frequently-accessed queries
- Monitor query latency and adjust candidate size accordingly

Performance depends on:
- Candidate set size: Larger sets increase computation time quadratically
- Distance metric: Dot product is fastest, euclidean is slowest
- Result count: More results require more iterations
Troubleshooting
Results seem too similar
Problem: Results still show near-duplicates.

Solutions:
- Lower the lambda value (try 0.3-0.5)
- Increase candidate pool size
- Verify your vectors capture meaningful differences
Results seem irrelevant
Problem: Diverse results but poor relevance.

Solutions:
- Increase the lambda value (try 0.7-0.9)
- Ensure initial search retrieves quality candidates
- Consider using RRF before MMR
Performance issues
Problem: Queries are too slow.

Solutions:
- Reduce candidate pool size
- Use dot product distance metric
- Return fewer final results
- Consider caching popular queries
Use Cases
News and Media
Diversify news articles to show varied perspectives.

E-commerce
Show varied product categories while maintaining relevance.

Content Discovery
Maximize exploration with diverse recommendations.

Document Retrieval
Balance comprehensive coverage with relevance.

Related
- RerankRRF - For combining multiple search results
- Rerankers Overview - General reranking concepts
- Vector Search - Basic vector search operations