Retrieval
Hybrid search over indexed documents with vector search, full-text search, filtering, and reranking.
Overview
The retrieval endpoint performs hybrid search, combining vector similarity with full-text search. Results can be filtered by built-in fields or user-defined metadata, and reranked for higher relevance.
Basic retrieval
```python
results = await client.beta.retrieval.retrieve(
    index_id="<your-index-id>",
    query="What were the Q3 revenue figures?",
    top_k=10,
)

for result in results.results:
    print(f"Score: {result.score}, Rerank: {result.rerank_score}")
    print(result.content)
```

```typescript
const results = await client.beta.retrieval.retrieve({
  index_id: "<your-index-id>",
  query: "What were the Q3 revenue figures?",
  top_k: 10,
});

for (const result of results.results) {
  console.log(`Score: ${result.score}, Rerank: ${result.rerank_score}`);
  console.log(result.content);
}
```

Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| `index_id` | string | required | ID of the index to retrieve against. |
| `query` | string | required | Natural-language query. |
| `top_k` | int | server default | Maximum number of results to return. |
| `vector_pipeline_weight` | float | null | Weight for vector search (0 to 1). |
| `full_text_pipeline_weight` | float | null | Weight for full-text search (0 to 1). |
| `score_threshold` | float | null | Minimum score for returned results. |
| `num_candidates` | int | null | Number of approximate-nearest-neighbor (ANN) candidates to consider; higher values are more accurate but slower. |
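
The optional tuning parameters can be collected in one place before a request is made. The following is a minimal sketch, not part of the SDK: `build_retrieve_kwargs` is a hypothetical helper that assumes omitted parameters fall back to the server defaults, and it checks the 0 to 1 weight range from the table before anything is sent.

```python
from typing import Any, Optional


def build_retrieve_kwargs(
    index_id: str,
    query: str,
    *,
    top_k: Optional[int] = None,
    vector_pipeline_weight: Optional[float] = None,
    full_text_pipeline_weight: Optional[float] = None,
    score_threshold: Optional[float] = None,
    num_candidates: Optional[int] = None,
) -> dict[str, Any]:
    """Hypothetical helper: collect retrieve() arguments, dropping unset ones."""
    # Both weights are documented as lying in the range 0 to 1.
    for name, weight in (
        ("vector_pipeline_weight", vector_pipeline_weight),
        ("full_text_pipeline_weight", full_text_pipeline_weight),
    ):
        if weight is not None and not 0.0 <= weight <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1, got {weight}")

    optional = {
        "top_k": top_k,
        "vector_pipeline_weight": vector_pipeline_weight,
        "full_text_pipeline_weight": full_text_pipeline_weight,
        "score_threshold": score_threshold,
        "num_candidates": num_candidates,
    }
    kwargs: dict[str, Any] = {"index_id": index_id, "query": query}
    # Assumption: leaving a parameter out lets the server apply its default.
    kwargs.update({k: v for k, v in optional.items() if v is not None})
    return kwargs
```

The returned dictionary can then be unpacked into the call, for example `client.beta.retrieval.retrieve(**kwargs)`.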
Reranking
Reranking is enabled by default. After hybrid search retrieves candidates, a reranker model re-scores them to improve relevance.

```python
# Disable reranking
results = await client.beta.retrieval.retrieve(
    index_id="<your-index-id>",
    query="summary of findings",
    rerank={"enabled": False},
)

# Keep reranking but limit to top 3
results = await client.beta.retrieval.retrieve(
    index_id="<your-index-id>",
    query="summary of findings",
    rerank={"enabled": True, "top_n": 3},
)
```

```typescript
// Disable reranking
const results = await client.beta.retrieval.retrieve({
  index_id: "<your-index-id>",
  query: "summary of findings",
  rerank: { enabled: false },
});

// Keep reranking but limit to top 3
const results2 = await client.beta.retrieval.retrieve({
  index_id: "<your-index-id>",
  query: "summary of findings",
  rerank: { enabled: true, top_n: 3 },
});
```

Filtering
You can filter results using built-in static fields (page range, chunk index) or user-defined custom metadata.
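
In the examples on this page, both filter types map a field name to an object with an `operator` and a `value`. As a minimal sketch, the hypothetical helpers below build those objects for the two operators that appear on this page (`eq` and `gte`). They are not part of the SDK, and the full set of supported operators is not listed here.

```python
from typing import Any


def eq(value: Any) -> dict[str, Any]:
    """Hypothetical helper: equality condition in the {"operator", "value"} shape."""
    return {"operator": "eq", "value": value}


def gte(value: Any) -> dict[str, Any]:
    """Hypothetical helper: greater-than-or-equal condition in the same shape."""
    return {"operator": "gte", "value": value}


# Filters keyed by field name, in the shape passed as
# custom_filters=... or static_filters=... in this page's examples.
custom_filters = {"category": eq("legal"), "year": gte(2024)}
static_filters = {"parsed_directory_file_id": eq("<file-id>")}
```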
Static filters
Static filters operate on built-in fields stored for every chunk. Currently the only supported field is `parsed_directory_file_id`, which lets you scope retrieval to specific files.

```python
results = await client.beta.retrieval.retrieve(
    index_id="<your-index-id>",
    query="revenue breakdown",
    static_filters={
        "parsed_directory_file_id": {"operator": "eq", "value": "<file-id>"},
    },
)
```

```typescript
const results = await client.beta.retrieval.retrieve({
  index_id: "<your-index-id>",
  query: "revenue breakdown",
  static_filters: {
    parsed_directory_file_id: { operator: "eq", value: "<file-id>" },
  },
});
```

Custom metadata filters
Custom filters match against user-defined metadata that was attached to files when they were added to the directory.

```python
results = await client.beta.retrieval.retrieve(
    index_id="<your-index-id>",
    query="contract terms",
    custom_filters={
        "category": {"operator": "eq", "value": "legal"},
        "year": {"operator": "gte", "value": 2024},
    },
)
```

```typescript
const results = await client.beta.retrieval.retrieve({
  index_id: "<your-index-id>",
  query: "contract terms",
  custom_filters: {
    category: { operator: "eq", value: "legal" },
    year: { operator: "gte", value: 2024 },
  },
});
```

Response format
Each result in the response contains:

| Field | Description |
|---|---|
| `content` | The text content of the retrieved chunk. |
| `score` | Hybrid search relevance score. |
| `rerank_score` | Score from the reranker (if reranking was applied). |
| `metadata` | User-defined metadata associated with the chunk. |
| `static_fields.parsed_directory_file_id` | ID of the parsed file this chunk belongs to. |
| `static_fields.page_range_start` / `static_fields.page_range_end` | First and last page covered by the chunk. |
| `static_fields.chunk_index` | Position of the chunk within the file. |
| `static_fields.chunk_token_count` | Token count of the chunk. |
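
When several chunks come from the same file, it can be convenient to regroup them before display. The sketch below assumes results expose the fields in the table above as attributes. `Chunk` and `StaticFields` are stand-in dataclasses with made-up sample values, not SDK types.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass
class StaticFields:
    """Stand-in for a result's static_fields (only the fields used here)."""
    parsed_directory_file_id: str
    chunk_index: int


@dataclass
class Chunk:
    """Stand-in for one retrieval result."""
    content: str
    score: float
    rerank_score: Optional[float]  # None when reranking was not applied
    static_fields: StaticFields


def group_by_file(results: list[Chunk]) -> dict[str, list[Chunk]]:
    """Group chunks by source file, ordered by position within each file."""
    grouped: dict[str, list[Chunk]] = defaultdict(list)
    for chunk in results:
        grouped[chunk.static_fields.parsed_directory_file_id].append(chunk)
    for chunks in grouped.values():
        chunks.sort(key=lambda c: c.static_fields.chunk_index)
    return dict(grouped)


# Made-up sample values, for illustration only.
sample = [
    Chunk("Revenue rose in Q3.", 0.82, 0.91, StaticFields("file-a", 4)),
    Chunk("Costs were flat.", 0.75, None, StaticFields("file-b", 0)),
    Chunk("Q3 overview.", 0.64, 0.70, StaticFields("file-a", 1)),
]

for file_id, chunks in group_by_file(sample).items():
    print(file_id, [c.static_fields.chunk_index for c in chunks])
```

With a real response, the same function could be called on `results.results`, provided the attribute names match the table.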