
Retrieval

Hybrid search over indexed documents with vector search, full-text search, filtering, and reranking.

The retrieval endpoint performs hybrid search combining vector similarity and full-text search. Results can be filtered by built-in fields or user-defined metadata, and reranked for higher relevance.

results = await client.beta.retrieval.retrieve(
    index_id="<your-index-id>",
    query="What were the Q3 revenue figures?",
    top_k=10,
)
for result in results.results:
    print(f"Score: {result.score}, Rerank: {result.rerank_score}")
    print(result.content)
The request accepts the following parameters:

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| index_id | string | required | ID of the index to retrieve against. |
| query | string | required | Natural-language query. |
| top_k | int | server default | Maximum number of results to return. |
| vector_pipeline_weight | float | null | Weight for vector search (0 to 1). |
| full_text_pipeline_weight | float | null | Weight for full-text search (0 to 1). |
| score_threshold | float | null | Minimum score for returned results. |
| num_candidates | int | null | Number of ANN candidates (higher = more accurate but slower). |
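The optional parameters above can be assembled in one place before the call. The following is a minimal sketch, not part of the client library: `build_retrieve_kwargs` is a hypothetical helper, and it assumes that leaving a parameter out of the request behaves the same as passing null. Whether the two weights must sum to 1 is not stated here, so the helper only checks the documented 0 to 1 range.

```python
from typing import Any, Optional


def build_retrieve_kwargs(
    index_id: str,
    query: str,
    *,
    top_k: Optional[int] = None,
    vector_pipeline_weight: Optional[float] = None,
    full_text_pipeline_weight: Optional[float] = None,
    score_threshold: Optional[float] = None,
    num_candidates: Optional[int] = None,
) -> dict[str, Any]:
    """Hypothetical helper: build keyword arguments for retrieve()."""
    for name, weight in (
        ("vector_pipeline_weight", vector_pipeline_weight),
        ("full_text_pipeline_weight", full_text_pipeline_weight),
    ):
        # The parameter table documents 0 to 1 as the range for both weights.
        if weight is not None and not 0.0 <= weight <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1, got {weight}")

    optional = {
        "top_k": top_k,
        "vector_pipeline_weight": vector_pipeline_weight,
        "full_text_pipeline_weight": full_text_pipeline_weight,
        "score_threshold": score_threshold,
        "num_candidates": num_candidates,
    }
    kwargs: dict[str, Any] = {"index_id": index_id, "query": query}
    # Assumption: an omitted parameter falls back to the server default.
    kwargs.update({k: v for k, v in optional.items() if v is not None})
    return kwargs


# Example: favour vector similarity over full-text matching.
kwargs = build_retrieve_kwargs(
    "<your-index-id>",
    "What were the Q3 revenue figures?",
    top_k=10,
    vector_pipeline_weight=0.7,
    full_text_pipeline_weight=0.3,
)
# results = await client.beta.retrieval.retrieve(**kwargs)
```

Validating the weights client-side surfaces an out-of-range value before a network round trip; the server remains the authority on what it accepts.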

Reranking is enabled by default. After hybrid search retrieves candidates, a reranker model re-scores them for improved relevance.

# Disable reranking
results = await client.beta.retrieval.retrieve(
    index_id="<your-index-id>",
    query="summary of findings",
    rerank={"enabled": False},
)

# Keep reranking but limit to top 3
results = await client.beta.retrieval.retrieve(
    index_id="<your-index-id>",
    query="summary of findings",
    rerank={"enabled": True, "top_n": 3},
)
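Because reranking can be turned off, code that orders results should not assume `rerank_score` is always populated. The sketch below shows one way to choose the ordering field. `order_results` is a hypothetical helper, and it assumes that `rerank_score` is `None` when reranking was not applied and that higher scores mean higher relevance; neither is confirmed here. The `SimpleNamespace` objects are stand-ins for real result objects.

```python
from types import SimpleNamespace


def order_results(results):
    """Sort by rerank_score when every result has one, else by score.

    Returns the sorted list and the name of the field that was used.
    """
    results = list(results)
    # Assumption: rerank_score is None when reranking was not applied.
    use_rerank = bool(results) and all(
        getattr(r, "rerank_score", None) is not None for r in results
    )
    field = "rerank_score" if use_rerank else "score"
    # Assumption: a higher score means a more relevant chunk.
    ordered = sorted(results, key=lambda r: getattr(r, field), reverse=True)
    return ordered, field


# Stand-in results with reranking applied.
reranked = [
    SimpleNamespace(content="A", score=0.9, rerank_score=0.2),
    SimpleNamespace(content="B", score=0.4, rerank_score=0.8),
]
ordered_reranked, field_reranked = order_results(reranked)

# Stand-in results with reranking disabled.
plain = [
    SimpleNamespace(content="A", score=0.4, rerank_score=None),
    SimpleNamespace(content="B", score=0.9, rerank_score=None),
]
ordered_plain, field_plain = order_results(plain)
```

The helper never mixes the two score types in a single ordering, since hybrid scores and reranker scores are not necessarily on the same scale.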

You can filter results by built-in static fields or by user-defined custom metadata.

Static filters operate on built-in fields stored for every chunk. Currently, the only supported filter field is parsed_directory_file_id, which scopes retrieval to specific files.

results = await client.beta.retrieval.retrieve(
    index_id="<your-index-id>",
    query="revenue breakdown",
    static_filters={
        "parsed_directory_file_id": {"operator": "eq", "value": "<file-id>"},
    },
)
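The example above scopes a query to a single file with the `eq` operator. To search several files, one option is to issue one filtered request per file. The sketch below does that concurrently. `retrieve_per_file` is a hypothetical helper, and it relies only on the `eq` operator shown here; if the API offers an operator that accepts a list of IDs, a single request would likely be preferable.

```python
import asyncio


async def retrieve_per_file(client, index_id, query, file_ids, **kwargs):
    """Hypothetical helper: run one file-scoped retrieval per file ID.

    Returns a dict mapping each file ID to that file's list of results.
    """

    async def retrieve_one(file_id):
        response = await client.beta.retrieval.retrieve(
            index_id=index_id,
            query=query,
            static_filters={
                "parsed_directory_file_id": {"operator": "eq", "value": file_id},
            },
            **kwargs,
        )
        return file_id, response.results

    # Issue the requests concurrently; gather preserves input order.
    pairs = await asyncio.gather(*(retrieve_one(f) for f in file_ids))
    return dict(pairs)


# Usage, inside an async function:
# by_file = await retrieve_per_file(
#     client, "<your-index-id>", "revenue breakdown", ["<file-id-1>", "<file-id-2>"]
# )
```

Results are kept separate per file instead of being merged into one ranking, because scores from independent requests may not be directly comparable.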

Custom filters match against user-defined metadata that was attached to files when they were added to the directory.

results = await client.beta.retrieval.retrieve(
    index_id="<your-index-id>",
    query="contract terms",
    custom_filters={
        "category": {"operator": "eq", "value": "legal"},
        "year": {"operator": "gte", "value": 2024},
    },
)
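Filter dictionaries like the one above can be built with small helpers and checked locally before being sent. The sketch below is illustrative only: `eq`, `gte`, and `matches` are hypothetical names, and `matches` is a local approximation that assumes every condition must hold (logical AND) and that a missing metadata key never matches. The server's actual semantics are not specified here and may differ.

```python
def eq(value):
    """Build an equality condition in the shape shown above."""
    return {"operator": "eq", "value": value}


def gte(value):
    """Build a greater-than-or-equal condition in the shape shown above."""
    return {"operator": "gte", "value": value}


# Only the two operators that appear in this guide are modelled.
_OPERATORS = {
    "eq": lambda actual, expected: actual == expected,
    "gte": lambda actual, expected: actual >= expected,
}


def matches(metadata, filters):
    """Local approximation of filter matching, useful for unit tests.

    Assumption: all conditions must hold, and a missing key never matches.
    """
    for key, condition in filters.items():
        if key not in metadata:
            return False
        compare = _OPERATORS[condition["operator"]]
        if not compare(metadata[key], condition["value"]):
            return False
    return True


custom_filters = {"category": eq("legal"), "year": gte(2024)}
# results = await client.beta.retrieval.retrieve(
#     index_id="<your-index-id>",
#     query="contract terms",
#     custom_filters=custom_filters,
# )
```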

Each result in the response contains:

| Field | Description |
| --- | --- |
| content | The text content of the retrieved chunk. |
| score | Hybrid search relevance score. |
| rerank_score | Score from the reranker (if reranking was applied). |
| metadata | User-defined metadata associated with the chunk. |
| static_fields.parsed_directory_file_id | ID of the parsed file this chunk belongs to. |
| static_fields.page_range_start / page_range_end | Page numbers covered by the chunk. |
| static_fields.chunk_index | Position of the chunk within the file. |
| static_fields.chunk_token_count | Token count of the chunk. |
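The static fields make it straightforward to attribute each chunk to its source and to budget tokens before passing chunks to a model. The sketch below is a minimal illustration: `cite`, `group_by_file`, and `total_tokens` are hypothetical helpers, and it assumes the fields are exposed as attributes (`result.static_fields.chunk_index`), mirroring the `result.score` access used earlier. The `SimpleNamespace` objects stand in for real results.

```python
from collections import defaultdict
from types import SimpleNamespace


def cite(result):
    """Format a short source reference from a result's static fields."""
    sf = result.static_fields
    start, end = sf.page_range_start, sf.page_range_end
    pages = f"p. {start}" if start == end else f"pp. {start}-{end}"
    return f"{sf.parsed_directory_file_id} ({pages}, chunk {sf.chunk_index})"


def group_by_file(results):
    """Group results by file ID, ordering each group by chunk position."""
    grouped = defaultdict(list)
    for result in results:
        grouped[result.static_fields.parsed_directory_file_id].append(result)
    for chunks in grouped.values():
        chunks.sort(key=lambda r: r.static_fields.chunk_index)
    return dict(grouped)


def total_tokens(results):
    """Sum chunk_token_count across results, e.g. for a context budget."""
    return sum(r.static_fields.chunk_token_count for r in results)


def _stub(content, file_id, start, end, index, tokens):
    # Stand-in for a real result object; attribute access is an assumption.
    return SimpleNamespace(
        content=content,
        static_fields=SimpleNamespace(
            parsed_directory_file_id=file_id,
            page_range_start=start,
            page_range_end=end,
            chunk_index=index,
            chunk_token_count=tokens,
        ),
    )


sample = [
    _stub("B", "file-1", 3, 4, 7, 120),
    _stub("A", "file-1", 2, 2, 5, 80),
    _stub("C", "file-2", 1, 1, 0, 50),
]
grouped = group_by_file(sample)
```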