OpenAI
Installation
```bash
npm i llamaindex @llamaindex/openai
```

```typescript
import { OpenAI } from "@llamaindex/openai";
import { Settings } from "llamaindex";

Settings.llm = new OpenAI({ model: "gpt-3.5-turbo", temperature: 0, apiKey: <YOUR_API_KEY> });
```

You can also set the apiKey via an environment variable:

```bash
export OPENAI_API_KEY="<YOUR_API_KEY>"
```

You can optionally set a custom base URL, like:

```bash
export OPENAI_BASE_URL="https://api.scaleway.ai/v1"
```

or

```typescript
Settings.llm = new OpenAI({
  model: "gpt-3.5-turbo",
  temperature: 0,
  apiKey: <YOUR_API_KEY>,
  baseURL: "https://api.scaleway.ai/v1",
});
```

Using OpenAI Responses API
The OpenAI Responses API provides enhanced functionality for handling complex interactions, including built-in tools, annotations, and streaming responses. Here's how to use it:
Basic Setup
```typescript
import { openaiResponses } from "@llamaindex/openai";

const llm = openaiResponses({
  model: "gpt-4o",
  temperature: 0.1,
  maxOutputTokens: 1000,
});
```

Message Content Types
The API supports different types of message content, including text and images:
```typescript
const response = await llm.chat({
  messages: [
    {
      role: "user",
      content: [
        {
          type: "input_text",
          text: "What's in this image?",
        },
        {
          type: "input_image",
          image_url: "https://example.com/image.jpg",
          detail: "auto", // Optional: can be "auto", "low", or "high"
        },
      ],
    },
  ],
});
```

Advanced Features
Built-in Tools
```typescript
const llm = openaiResponses({
  model: "gpt-4o",
  builtInTools: [
    {
      type: "function",
      name: "search_files",
      description: "Search through available files",
    },
  ],
  strict: true, // Enable strict mode for tool calls
});
```

Response Tracking and Storage
```typescript
const llm = openaiResponses({
  trackPreviousResponses: true, // Enable response tracking
  store: true, // Store responses for future reference
  user: "user-123", // Associate responses with a user
  callMetadata: {
    // Add custom metadata
    sessionId: "session-123",
    context: "customer-support",
  },
});
```

Streaming Responses
```typescript
const response = await llm.chat({
  messages: [
    {
      role: "user",
      content: "Generate a long response",
    },
  ],
  stream: true, // Enable streaming
});

for await (const chunk of response) {
  console.log(chunk.delta); // Process each chunk of the response
}
```

Configuration Options
The OpenAI Responses API supports various configuration options:
```typescript
const llm = openaiResponses({
  // Model and basic settings
  model: "gpt-4o",
  temperature: 0.1,
  topP: 1,
  maxOutputTokens: 1000,

  // API configuration
  apiKey: "your-api-key",
  baseURL: "custom-endpoint",
  maxRetries: 10,
  timeout: 60000,

  // Response handling
  trackPreviousResponses: false,
  store: false,
  strict: false,

  // Additional options
  instructions: "Custom instructions for the model",
  truncation: "auto", // Can be "auto", "disabled", or null
  include: ["citations", "reasoning"], // Specify what to include in responses
});
```

Response Structure
The API returns responses with rich metadata and optional annotations:
```typescript
interface ResponseStructure {
  message: {
    content: string;
    role: "assistant";
    options: {
      built_in_tool_calls: Array<ToolCall>;
      annotations?: Array<Citation | URLCitation | FilePath>;
      refusal?: string;
      reasoning?: ReasoningItem;
      usage?: ResponseUsage;
      toolCall?: Array<PartialToolCall>;
    };
  };
}
```
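As a rough illustration, here is one way to read some of these fields from a chat result using the `llm` configured above. The field names follow the interface shown; the exact contents of `options` vary by model and configuration, so treat this as a sketch rather than a definitive pattern:

```typescript
// Sketch: inspecting the metadata described above on a Responses API chat result.
const response = await llm.chat({
  messages: [{ role: "user", content: "Summarize the attached report" }],
});

console.log(response.message.content);

const options = response.message.options ?? {};
if ("annotations" in options && options.annotations?.length) {
  console.log("Annotations:", options.annotations); // citations, URLs, file paths
}
if ("built_in_tool_calls" in options && options.built_in_tool_calls?.length) {
  console.log("Built-in tool calls:", options.built_in_tool_calls);
}
```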
Section titled “Best Practices”- Use trackPreviousResponseswhen you need conversation continuity
- Enable strictmode when using tools to ensure accurate function calls
- Set appropriate maxOutputTokensto control response length
- Use annotationsto track citations and references in responses
- Implement error handling for potential API failures and retries
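For the last point, here is a minimal retry-with-backoff sketch. The `chatWithRetry` helper, its attempt count, and the backoff delays are illustrative assumptions, not part of the library:

```typescript
import { openaiResponses } from "@llamaindex/openai";

// Handle retries ourselves instead of relying on the built-in maxRetries option.
const retryLlm = openaiResponses({ model: "gpt-4o", maxRetries: 0 });

// Hypothetical helper: retry a chat call with exponential backoff.
async function chatWithRetry(prompt: string, attempts = 3) {
  for (let attempt = 0; attempt < attempts; attempt++) {
    try {
      return await retryLlm.chat({ messages: [{ role: "user", content: prompt }] });
    } catch (err) {
      if (attempt === attempts - 1) throw err; // give up after the last attempt
      await new Promise((resolve) => setTimeout(resolve, 2 ** attempt * 1000));
    }
  }
}
```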
Using JSON Response Format
You can configure OpenAI to return responses in JSON format:
```typescript
Settings.llm = new OpenAI({
  model: "gpt-4o",
  temperature: 0,
  responseFormat: { type: "json_object" },
});

// You can also use a Zod schema to validate the response structure
import { z } from "zod";

const responseSchema = z.object({
  summary: z.string(),
  topics: z.array(z.string()),
  sentiment: z.enum(["positive", "negative", "neutral"]),
});

Settings.llm = new OpenAI({
  model: "gpt-4o",
  temperature: 0,
  responseFormat: responseSchema,
});
```

Response Formats
The OpenAI LLM supports different response formats to structure the output in specific ways. There are two main approaches to formatting responses:
1. JSON Object Format
The simplest way to get structured JSON responses is using the `json_object` response format:
```typescript
const llm = new OpenAI({
  model: "gpt-4o",
  temperature: 0,
  responseFormat: { type: "json_object" },
});
Settings.llm = llm;

const response = await llm.chat({
  messages: [
    {
      role: "system",
      content: "You are a helpful assistant that outputs JSON.",
    },
    {
      role: "user",
      content: "Summarize this meeting transcript",
    },
  ],
});

// Response will be valid JSON
console.log(response.message.content);
```

2. Schema Validation with Zod
For more robust type safety and validation, you can use Zod schemas to define the expected response structure:
```typescript
import { z } from "zod";

// Define the response schema
const meetingSchema = z.object({
  summary: z.string(),
  participants: z.array(z.string()),
  actionItems: z.array(z.string()),
  nextSteps: z.string(),
});

// Configure the LLM with the schema
const llm = new OpenAI({
  model: "gpt-4o",
  temperature: 0,
  responseFormat: meetingSchema,
});
Settings.llm = llm;

const response = await llm.chat({
  messages: [
    {
      role: "user",
      content: "Summarize this meeting transcript",
    },
  ],
});

// Response will be typed and validated according to the schema
const result = response.message.content;
console.log(result.summary);
console.log(result.actionItems);
```

Response Format Options
The response format can be configured in two ways:
- At LLM initialization:
```typescript
const llm = new OpenAI({
  model: "gpt-4o",
  responseFormat: { type: "json_object" }, // or a Zod schema
});
```

- Per request:
```typescript
const response = await llm.chat({
  messages: [...],
  responseFormat: { type: "json_object" }, // or a Zod schema
});
```

The response format options are:
- `{ type: "json_object" }`: returns responses as JSON objects
- A Zod schema: defines and validates the response structure (see the sketch below)
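As a quick sketch of the per-request form with a schema, reusing the `llm` instance and `meetingSchema` from the Zod example above (the prompt text is just a placeholder):

```typescript
// Sketch: overriding the response format for a single request with a Zod schema.
const perRequestResponse = await llm.chat({
  messages: [{ role: "user", content: "Summarize this meeting transcript" }],
  responseFormat: meetingSchema,
});
```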
Best Practices
- Use JSON object format for simple structured responses
- Use Zod schemas when you need:
  - Type safety
  - Response validation
  - Complex nested structures
  - Specific field constraints
- Set a low temperature (e.g. 0) when using structured outputs for more reliable formatting
- Include clear instructions in system or user messages about the expected response format
- Handle potential parsing errors when working with JSON responses (see the sketch below)
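A minimal defensive-parsing sketch follows. The `jsonLlm` instance is created locally for the example, and the `as string` cast assumes the content comes back as plain text:

```typescript
// Sketch: parse the JSON content defensively instead of trusting it blindly.
const jsonLlm = new OpenAI({
  model: "gpt-4o",
  temperature: 0,
  responseFormat: { type: "json_object" },
});

const response = await jsonLlm.chat({
  messages: [
    { role: "system", content: "You are a helpful assistant that outputs JSON." },
    { role: "user", content: "Summarize this meeting transcript" },
  ],
});

try {
  const data = JSON.parse(response.message.content as string);
  console.log(data);
} catch (err) {
  console.error("Model did not return valid JSON:", err);
}
```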
Load and index documents
For this example, we will use a single document. In a real-world scenario, you would have multiple documents to index.
```typescript
import { Document, VectorStoreIndex } from "llamaindex";

const document = new Document({ text: essay, id_: "essay" });

const index = await VectorStoreIndex.fromDocuments([document]);
const queryEngine = index.asQueryEngine();

const query = "What is the meaning of life?";

const results = await queryEngine.query({
  query,
});
```

Full Example
Section titled “Full Example”import { OpenAI } from "@llamaindex/openai";import { Document, Settings, VectorStoreIndex } from "llamaindex";
// Use the OpenAI LLMSettings.llm = new OpenAI({ model: "gpt-3.5-turbo", temperature: 0 });
async function main() {  const document = new Document({ text: essay, id_: "essay" });
  // Load and index documents  const index = await VectorStoreIndex.fromDocuments([document]);
  // get retriever  const retriever = index.asRetriever();
  // Create a query engine  const queryEngine = index.asQueryEngine({    retriever,  });
  const query = "What is the meaning of life?";
  // Query  const response = await queryEngine.query({    query,  });
  // Log the response  console.log(response.response);}API Reference
OpenAI Live LLM
The OpenAI Live LLM integration in LlamaIndex provides real-time chat capabilities with support for audio streaming and tool calling.
Basic Usage
```typescript
import { openai } from "@llamaindex/openai";
import { tool, ModalityType } from "llamaindex";

// Get the ephemeral key on the server
const serverllm = openai({
  apiKey: "your-api-key",
  model: "gpt-4o-realtime-preview-2025-06-03",
});

// Get an ephemeral key
// Usually this code is run on the server and the ephemeral key is passed to the
// client - the ephemeral key can be securely used on the client side
const ephemeralKey = await serverllm.live.getEphemeralKey();

// Create a client-side LLM instance with the ephemeral key
const llm = openai({
  apiKey: ephemeralKey,
  model: "gpt-4o-realtime-preview-2025-06-03",
});

// Create a live session
const session = await llm.live.connect({
  systemInstruction: "You are a helpful assistant.",
});

// Send a message
session.sendMessage({
  content: "Hello!",
  role: "user",
});
```

Tool Integration
Tools are handled server-side, making it simple to pass them to the live session:
```typescript
import { z } from "zod";

// Define your tools
const weatherTool = tool({
  name: "weather",
  description: "Get the weather for a location",
  parameters: z.object({
    location: z.string().describe("The location to get weather for"),
  }),
  execute: async ({ location }) => {
    return `The weather in ${location} is sunny`;
  },
});

// Create session with tools
const session = await llm.live.connect({
  systemInstruction: "You are a helpful assistant.",
  tools: [weatherTool],
});
```

Audio Support
For audio capabilities:
```typescript
// Get microphone access
const userStream = await navigator.mediaDevices.getUserMedia({
  audio: true,
});

// Create session with audio
const session = await llm.live.connect({
  audioConfig: {
    stream: userStream,
    onTrack: (remoteStream) => {
      // Handle incoming audio
      audioElement.srcObject = remoteStream;
    },
  },
});
```

Event Handling
Listen to events from the session:
```typescript
for await (const event of session.streamEvents()) {
  if (liveEvents.open.include(event)) {
    // Connection established
    console.log("Connected!");
  } else if (liveEvents.text.include(event)) {
    // Received text response
    console.log("Assistant:", event.text);
  }
}
```

Capabilities
The OpenAI Live LLM supports:
- Real-time text chat
- Audio streaming (if configured)
- Tool calling (server-side execution)
- Ephemeral key generation for secure sessions
API Reference
LiveLLM Methods
connect(config?: LiveConnectConfig)
Creates a new live session.
```typescript
interface LiveConnectConfig {
  systemInstruction?: string;
  tools?: BaseTool[];
  audioConfig?: AudioConfig;
  responseModality?: ModalityType[];
}
```
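As a rough sketch of the optional `responseModality` field, assuming `ModalityType` is imported from llamaindex as in the Basic Usage example; which modalities a given model actually accepts is model-dependent:

```typescript
// Sketch: asking the live session for audio responses instead of text only.
const audioSession = await llm.live.connect({
  systemInstruction: "You are a helpful assistant.",
  responseModality: [ModalityType.AUDIO],
});
```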
getEphemeralKey()

Gets a temporary key for the session.
LiveLLMSession Methods
sendMessage(message: ChatMessage)
Sends a message to the assistant.
```typescript
interface ChatMessage {
  content: string | MessageContentDetail[];
  role: "user" | "assistant";
}
```

disconnect()
Closes the session and cleans up resources.
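A minimal cleanup sketch, reusing the `session` and `userStream` variables from the examples above; the `try/finally` wrapper is just one way to make sure cleanup always runs:

```typescript
// Sketch: always release the session and any local audio tracks when done.
try {
  // ... interact with the session ...
} finally {
  await session.disconnect();
  userStream.getTracks().forEach((track) => track.stop());
}
```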
Error Handling
```typescript
try {
  const session = await llm.live.connect();
} catch (error) {
  if (error instanceof Error) {
    console.error("Connection failed:", error.message);
  }
}
```

Best Practices
Section titled “Best Practices”- 
Tool Definition - Keep tool implementations server-side
- Use clear descriptions for tools
- Handle tool errors gracefully
 
- 
Session Management - Always disconnect sessions when done
- Clean up audio resources
- Handle reconnection scenarios
 
- 
Security - Use ephemeral keys for sessions
- Validate tool inputs
- Secure API key handling
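For the tool-error point, here is a sketch of a tool whose `execute` catches its own failures and returns a graceful message instead of throwing; `fetchWeather` is a hypothetical helper, not part of the library:

```typescript
// Sketch: handle tool errors inside execute so the session never sees an exception.
const safeWeatherTool = tool({
  name: "weather",
  description: "Get the weather for a location",
  parameters: z.object({
    location: z.string().describe("The location to get weather for"),
  }),
  execute: async ({ location }) => {
    try {
      return await fetchWeather(location); // hypothetical external call
    } catch {
      return `Sorry, I couldn't look up the weather for ${location} right now.`;
    }
  },
});
```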