
Chat

Built-in chat agent that performs RAG over your indexes with session management and streaming responses.

The chat API provides a built-in RAG agent that can answer questions over one or more of your indexes. It manages conversation sessions, streams responses as Server-Sent Events, and handles retrieval, tool use, and reasoning internally.

A session holds the conversation history. You can optionally bind it to specific indexes at creation time — once the first message is sent, the bound indexes are locked for the session’s lifetime.

# Session bound to specific indexes
session = await client.beta.chat.create(
    index_ids=["<index-id-1>", "<index-id-2>"],
)
print(session.session_id)

# Unbound session (indexes specified per-message)
session = await client.beta.chat.create()

Send a message and receive the agent’s response as a stream of Server-Sent Events. The stream includes thinking steps, tool calls (retrieval), and the final text response.

Use with_streaming_response to get the raw SSE stream and parse events line by line.

import json

response = client.beta.chat.with_streaming_response.stream(
    session.session_id,
    index_ids=["<index-id>"],
    prompt="What are the key findings in the Q3 report?",
)
async with response as stream:
    async for line in stream.iter_lines():
        # Skip blank separator lines and SSE comments (keep-alives)
        if not line or line.startswith(":"):
            continue
        if line.startswith("data: "):
            data = json.loads(line[6:])  # strip the "data: " prefix
            if data.get("type") == "text_delta":
                print(data["content"], end="", flush=True)
            elif data.get("type") == "stop":
                print("\n--- Done ---")
            else:
                print(data)  # handle other event types as needed
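The loop above reads only `data:` lines and ignores the `event:` field that labels each event. As a minimal sketch, assuming the stream follows standard SSE framing (an `event:` line, one or more `data:` lines, then a blank line) and that blank lines are passed through by the line iterator, the raw lines could be grouped into `(event_name, payload)` pairs as follows. The helper name `parse_sse_lines` and the sample lines are hypothetical and not part of the SDK.

```python
import json
from typing import Iterable, Iterator, Tuple


def parse_sse_lines(lines: Iterable[str]) -> Iterator[Tuple[str, dict]]:
    """Group raw SSE lines into (event_name, payload) pairs."""
    event_name = "message"  # SSE default when no `event:` field is sent
    data_parts = []
    for line in lines:
        if line.startswith(":"):
            continue  # comment / keep-alive line
        if line == "":
            # A blank line ends the current event: emit it if it has data.
            if data_parts:
                yield event_name, json.loads("\n".join(data_parts))
            event_name, data_parts = "message", []
            continue
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # a single leading space is not part of the value
        if field == "event":
            event_name = value
        elif field == "data":
            data_parts.append(value)
    # Lenient choice: emit a trailing event even without a final blank line.
    if data_parts:
        yield event_name, json.loads("\n".join(data_parts))


# Hand-written sample lines for illustration only (not captured from the API).
sample = [
    "event: delta",
    'data: {"type": "text_delta", "content": "Hel"}',
    "",
    ": keep-alive",
    "event: delta",
    'data: {"type": "text_delta", "content": "lo"}',
    "",
    "event: message",
    'data: {"type": "stop"}',
    "",
]
events = list(parse_sse_lines(sample))
for name, payload in events:
    print(name, payload.get("type"))
```

Keeping the parser a pure function over lines makes it easy to test without a live stream; in the async loop it would be fed lines collected from `stream.iter_lines()`.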

The stream emits two categories of events:

Delta events (event: delta) — Incremental content for real-time display:

| Type | Description |
| --- | --- |
| text_delta | A chunk of the agent’s response text. |
| thinking_delta | A chunk of the agent’s reasoning (if exposed). |

Message events (event: message) — Complete, discrete events:

| Type | Description |
| --- | --- |
| text | Complete agent response text. |
| thinking | Complete agent reasoning block. |
| user_input | The user’s original message. |
| tool_call | A tool invocation by the agent (e.g., retrieval). |
| tool_result | The result of a tool call. |
| stop | End of the stream, includes token usage. |
| warning | Informational warning (e.g., skipped indexes). |
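The event types above can be folded into a single result per turn. The sketch below is one possible approach, not the SDK's own behavior. It assumes each decoded payload is a dict with a `type` key, that delta payloads carry their text under `content` (as in the streaming example), and that a complete `text` or `thinking` message repeats what the deltas already delivered. `TurnResult` and `collect_turn` are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import Iterable, List, Optional


@dataclass
class TurnResult:
    text: str = ""
    thinking: str = ""
    tool_calls: List[dict] = field(default_factory=list)
    warnings: List[dict] = field(default_factory=list)
    stop: Optional[dict] = None


def collect_turn(payloads: Iterable[dict]) -> TurnResult:
    """Fold decoded event payloads into one result for a single turn."""
    result = TurnResult()
    for data in payloads:
        kind = data.get("type")
        if kind == "text_delta":
            result.text += data.get("content", "")
        elif kind == "thinking_delta":
            result.thinking += data.get("content", "")
        elif kind == "tool_call":
            result.tool_calls.append(data)
        elif kind == "warning":
            result.warnings.append(data)
        elif kind == "stop":
            result.stop = data
            break  # nothing is expected after the stop event
        # Assumption: complete `text` / `thinking` messages duplicate the
        # deltas, so they (and other types) are skipped in this sketch.
    return result


# Hand-written payloads for illustration only.
payloads = [
    {"type": "tool_call"},
    {"type": "text_delta", "content": "Revenue "},
    {"type": "text_delta", "content": "grew."},
    {"type": "text", "content": "Revenue grew."},
    {"type": "stop"},
]
turn = collect_turn(payloads)
print(turn.text)
```

If deltas turn out not to be sent for a given response, the complete `text` message would be the one to keep instead.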

List your chat sessions and their generated titles.

sessions = await client.beta.chat.list()
for s in sessions.items:
    print(f"{s.session_id}: {s.generated_title or '(untitled)'}")

Retrieve a session summary or the full session with its event history.

# Summary only
summary = await client.beta.chat.get_summary(session.session_id)
print(summary.generated_title)
print(summary.job_metadata)

# Full session with event history
full = await client.beta.chat.retrieve(session.session_id)
for event in full.events:
    print(event.type, event.content[:100] if hasattr(event, "content") else "")
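The `hasattr` check above guards against events with no `content` attribute, but slicing would still fail if `content` were present and not a string. As a hedged sketch, assuming events expose `type` and an optional `content` attribute as in the snippet above, the history could be summarized more defensively like this. The helpers `count_event_types` and `preview` are hypothetical, and the `SimpleNamespace` objects are stand-ins for real events.

```python
from collections import Counter
from types import SimpleNamespace
from typing import Any, Iterable


def count_event_types(events: Iterable[Any]) -> Counter:
    """Count events in a session history by their `type` attribute."""
    return Counter(getattr(event, "type", "unknown") for event in events)


def preview(event: Any, limit: int = 100) -> str:
    """Return a short text preview, tolerating missing or non-string content."""
    content = getattr(event, "content", None)
    return content[:limit] if isinstance(content, str) else ""


# Stand-in objects for illustration; real events would come from `full.events`.
history = [
    SimpleNamespace(type="user_input", content="What changed in Q3?"),
    SimpleNamespace(type="tool_call"),
    SimpleNamespace(type="tool_result", content=None),
    SimpleNamespace(type="text", content="Revenue grew."),
    SimpleNamespace(type="stop"),
]
counts = count_event_types(history)
previews = [preview(e) for e in history]
for event, text in zip(history, previews):
    print(event.type, text)
```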

Delete a session when you no longer need it.

await client.beta.chat.delete(session.session_id)