Retrieval

Index V2 (Beta)

Hybrid search over indexed documents with vector search, full-text search, filtering, and reranking.

The retrieval endpoint performs hybrid search combining vector similarity and full-text search. Results can be filtered by built-in fields or user-defined metadata, and reranked for higher relevance.

Python

# Assumes `client` is an already-initialized async client
# and that this code runs inside an async function.
results = await client.beta.retrieval.retrieve(
    index_id="<your-index-id>",
    query="What were the Q3 revenue figures?",
    top_k=10,
)

for result in results.results:
    print(f"Score: {result.score}, Rerank: {result.rerank_score}")
    print(result.content)

TypeScript

// Assumes `client` is an already-initialized client.
const results = await client.beta.retrieval.retrieve({
  index_id: "<your-index-id>",
  query: "What were the Q3 revenue figures?",
  top_k: 10,
});

for (const result of results.results) {
  console.log(`Score: ${result.score}, Rerank: ${result.rerank_score}`);
  console.log(result.content);
}

Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| `index_id` | `string` | required | ID of the index to retrieve against. |
| `query` | `string` | required | Natural-language query. |
| `top_k` | `int` | server default | Maximum number of results to return. |
| `vector_pipeline_weight` | `float` | `null` | Weight for vector search (0 to 1). |
| `full_text_pipeline_weight` | `float` | `null` | Weight for full-text search (0 to 1). |
| `score_threshold` | `float` | `null` | Minimum score for returned results. |
| `num_candidates` | `int` | `null` | Number of ANN candidates (higher = more accurate but slower). |
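
To illustrate how the optional parameters fit together, here is a minimal sketch of a helper that collects them into keyword arguments. `build_retrieve_kwargs` is a hypothetical name and not part of the SDK. It checks the documented 0 to 1 range for the two pipeline weights and leaves out anything unset, on the assumption that omitting a parameter behaves the same as passing `null`.

```python
from typing import Any, Optional


def build_retrieve_kwargs(
    index_id: str,
    query: str,
    *,
    top_k: Optional[int] = None,
    vector_pipeline_weight: Optional[float] = None,
    full_text_pipeline_weight: Optional[float] = None,
    score_threshold: Optional[float] = None,
    num_candidates: Optional[int] = None,
) -> dict[str, Any]:
    """Hypothetical helper: validate and collect retrieve() keyword arguments."""
    # Both weights are documented as values between 0 and 1.
    for name, weight in (
        ("vector_pipeline_weight", vector_pipeline_weight),
        ("full_text_pipeline_weight", full_text_pipeline_weight),
    ):
        if weight is not None and not 0.0 <= weight <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1, got {weight}")

    optional = {
        "top_k": top_k,
        "vector_pipeline_weight": vector_pipeline_weight,
        "full_text_pipeline_weight": full_text_pipeline_weight,
        "score_threshold": score_threshold,
        "num_candidates": num_candidates,
    }
    kwargs: dict[str, Any] = {"index_id": index_id, "query": query}
    # Assumption: leaving a parameter out lets the server default apply.
    kwargs.update({k: v for k, v in optional.items() if v is not None})
    return kwargs


kwargs = build_retrieve_kwargs(
    "<your-index-id>",
    "What were the Q3 revenue figures?",
    top_k=5,
    vector_pipeline_weight=0.7,
    full_text_pipeline_weight=0.3,
)
# Inside an async function, with an initialized client:
# results = await client.beta.retrieval.retrieve(**kwargs)
```

Validating on the client side surfaces an out-of-range weight before a request is sent; the server may apply its own checks as well.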

Reranking

Reranking is enabled by default. After hybrid search retrieves candidates, a reranker model re-scores them for improved relevance.

Python

# Disable reranking
results = await client.beta.retrieval.retrieve(
    index_id="<your-index-id>",
    query="summary of findings",
    rerank={"enabled": False},
)

# Keep reranking but limit to top 3
results = await client.beta.retrieval.retrieve(
    index_id="<your-index-id>",
    query="summary of findings",
    rerank={"enabled": True, "top_n": 3},
)

TypeScript

// Disable reranking
const results = await client.beta.retrieval.retrieve({
  index_id: "<your-index-id>",
  query: "summary of findings",
  rerank: { enabled: false },
});

// Keep reranking but limit to top 3
const results2 = await client.beta.retrieval.retrieve({
  index_id: "<your-index-id>",
  query: "summary of findings",
  rerank: { enabled: true, top_n: 3 },
});
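
Because reranking can be switched on or off per request, client code that orders results may need to handle both cases. The sketch below is one possible approach, not SDK behavior: it sorts by `rerank_score` when every result has one and by `score` otherwise. It assumes `rerank_score` is `None` when reranking was not applied and that higher scores mean higher relevance. `best_first` is a hypothetical helper, and the `SimpleNamespace` objects stand in for real result objects.

```python
from types import SimpleNamespace


def best_first(results):
    """Hypothetical helper: order results by the most specific score available."""
    results = list(results)
    # Assumption: rerank_score is None when reranking was not applied.
    reranked = bool(results) and all(
        getattr(r, "rerank_score", None) is not None for r in results
    )
    if reranked:
        key = lambda r: r.rerank_score
    else:
        key = lambda r: r.score
    # Assumption: a higher score means a more relevant chunk.
    return sorted(results, key=key, reverse=True)


# Stand-in objects that mimic the documented result fields.
with_rerank = [
    SimpleNamespace(content="a", score=0.9, rerank_score=0.2),
    SimpleNamespace(content="b", score=0.4, rerank_score=0.8),
]
without_rerank = [
    SimpleNamespace(content="a", score=0.4, rerank_score=None),
    SimpleNamespace(content="b", score=0.9, rerank_score=None),
]

order_with = [r.content for r in best_first(with_rerank)]
order_without = [r.content for r in best_first(without_rerank)]
```

The two score types are never mixed in one ordering here, since nothing in this guide says they share a scale.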

Filtering

You can filter results by built-in static fields or by user-defined custom metadata.

Static filters

Static filters operate on built-in fields stored for every chunk. Currently the only supported field is `parsed_directory_file_id`, which lets you scope retrieval to specific files.

Python

results = await client.beta.retrieval.retrieve(
    index_id="<your-index-id>",
    query="revenue breakdown",
    static_filters={
        "parsed_directory_file_id": {"operator": "eq", "value": "<file-id>"},
    },
)

TypeScript

const results = await client.beta.retrieval.retrieve({
  index_id: "<your-index-id>",
  query: "revenue breakdown",
  static_filters: {
    parsed_directory_file_id: { operator: "eq", value: "<file-id>" },
  },
});

Custom metadata filters

Custom filters match against user-defined metadata that was attached to files when they were added to the directory.

Python

results = await client.beta.retrieval.retrieve(
    index_id="<your-index-id>",
    query="contract terms",
    custom_filters={
        "category": {"operator": "eq", "value": "legal"},
        "year": {"operator": "gte", "value": 2024},
    },
)

TypeScript

const results = await client.beta.retrieval.retrieve({
  index_id: "<your-index-id>",
  query: "contract terms",
  custom_filters: {
    category: { operator: "eq", value: "legal" },
    year: { operator: "gte", value: 2024 },
  },
});
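
Both filter types share the same `{"operator": ..., "value": ...}` shape, so the dictionaries can be built with a small helper. The following is a sketch only: `condition` and `KNOWN_OPERATORS` are hypothetical names, and the operator set is limited to the two operators that appear in this guide (`eq` and `gte`). The API may accept others.

```python
from typing import Any

# Assumption: only the operators shown in this guide are listed here;
# the API may support more.
KNOWN_OPERATORS = {"eq", "gte"}


def condition(operator: str, value: Any) -> dict[str, Any]:
    """Hypothetical helper: build one filter condition."""
    if operator not in KNOWN_OPERATORS:
        raise ValueError(f"unrecognized operator: {operator!r}")
    return {"operator": operator, "value": value}


# Static filter: scope retrieval to a single file.
static_filters = {"parsed_directory_file_id": condition("eq", "<file-id>")}

# Custom metadata filter: same values as the example above.
custom_filters = {
    "category": condition("eq", "legal"),
    "year": condition("gte", 2024),
}
```

Each dictionary can then be passed as `static_filters=` or `custom_filters=` to `client.beta.retrieval.retrieve`.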

Response format

Each result in the response contains:

| Field | Description |
| --- | --- |
| `content` | The text content of the retrieved chunk. |
| `score` | Hybrid search relevance score. |
| `rerank_score` | Score from the reranker (if reranking was applied). |
| `metadata` | User-defined metadata associated with the chunk. |
| `static_fields.parsed_directory_file_id` | ID of the parsed file this chunk belongs to. |
| `static_fields.page_range_start` / `page_range_end` | Page numbers covered by the chunk. |
| `static_fields.chunk_index` | Position of the chunk within the file. |
| `static_fields.chunk_token_count` | Token count of the chunk. |
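
As one way to use the static fields, the sketch below turns a result into a short source label suitable for citations. `describe_source` is a hypothetical helper. It assumes the fields are exposed as attributes named exactly as in the table, and the `SimpleNamespace` object with made-up values stands in for a real result.

```python
from types import SimpleNamespace


def describe_source(result) -> str:
    """Hypothetical helper: summarize where a retrieved chunk came from."""
    fields = result.static_fields
    start, end = fields.page_range_start, fields.page_range_end
    # Collapse the range when the chunk sits on a single page.
    pages = f"page {start}" if start == end else f"pages {start}-{end}"
    return (
        f"file {fields.parsed_directory_file_id}, {pages}, "
        f"chunk {fields.chunk_index} ({fields.chunk_token_count} tokens)"
    )


# Stand-in object with illustrative values, not real API output.
sample = SimpleNamespace(
    content="example chunk text",
    score=0.5,
    rerank_score=None,
    metadata={},
    static_fields=SimpleNamespace(
        parsed_directory_file_id="file-123",
        page_range_start=4,
        page_range_end=5,
        chunk_index=7,
        chunk_token_count=212,
    ),
)

label = describe_source(sample)
print(label)
```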
