DeepEval: Evaluation and Observability for LlamaIndex
DeepEval (by Confident AI) now integrates with LlamaIndex, giving you end-to-end visibility and evaluation tools for your LlamaIndex agents.
Quickstart
Install the following packages:

!pip install -U deepeval llama-index

Log in with your Confident AI API key and configure DeepEval to instrument LlamaIndex:

import llama_index.core.instrumentation as instrument

import deepeval
from deepeval.integrations.llama_index import instrument_llama_index

deepeval.login("<your-confident-api-key>")

instrument_llama_index(instrument.get_dispatcher())

Example Agent
⚠️ Note: DeepEval may not work reliably in Jupyter notebooks due to event loop conflicts. It is recommended to run the examples in a standalone Python script instead.
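To make the event loop caveat concrete: notebook kernels typically already run an asyncio event loop, and `asyncio.run()` raises `RuntimeError` when called while a loop is running. The sketch below uses only the standard library to detect that situation. `loop_is_running` and `check_inside_coroutine` are hypothetical helper names, and this is an illustration of the general conflict, not a description of how DeepEval handles it internally.

```python
import asyncio


def loop_is_running() -> bool:
    """Return True when called from inside an already-running asyncio event loop."""
    try:
        asyncio.get_running_loop()
    except RuntimeError:
        # No loop is running in this thread: the normal case for a plain script.
        return False
    return True


async def check_inside_coroutine() -> bool:
    # A coroutine always executes inside a running loop, similar to a notebook cell.
    return loop_is_running()


# At script level no loop is running, so asyncio.run() is safe to call here.
outside = loop_is_running()
inside = asyncio.run(check_inside_coroutine())
print(f"script level: {outside}, inside coroutine: {inside}")
```

In a standalone script the first check reports that no loop is running, which is why `asyncio.run(main())` works there without conflict.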
import os
import asyncio

from llama_index.llms.openai import OpenAI
import llama_index.core.instrumentation as instrument
from llama_index.core.agent.workflow import FunctionAgent

import deepeval
from deepeval.integrations.llama_index import instrument_llama_index

# Don't forget to set up tracing
deepeval.login("<your-confident-api-key>")

# Instrument LlamaIndex
instrument_llama_index(instrument.get_dispatcher())

os.environ["OPENAI_API_KEY"] = "<your-openai-api-key>"


def multiply(a: float, b: float) -> float:
    """Useful for multiplying two numbers."""
    return a * b


agent = FunctionAgent(
    tools=[multiply],
    llm=OpenAI(model="gpt-4o-mini"),
    system_prompt="You are a helpful assistant that can perform calculations.",
)


async def main():
    response = await agent.run("What's 7 * 8?")
    print(response)


if __name__ == "__main__":
    asyncio.run(main())

You can view the traces directly in the Observatory by following the link printed in the console output.
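The `multiply` tool in the example is a plain Python function. Agent frameworks commonly build a tool's name and description from the function's name, signature, and docstring. Whether LlamaIndex uses exactly these fields is an assumption here, so check its tool documentation. The sketch below uses only the standard library to show what that metadata looks like and to check the tool in isolation before wiring it to an agent. `describe_tool` is a hypothetical helper, not part of LlamaIndex or DeepEval.

```python
import inspect


def multiply(a: float, b: float) -> float:
    """Useful for multiplying two numbers."""
    return a * b


def describe_tool(fn) -> dict:
    """Hypothetical helper: collect metadata a framework could read from a function."""
    return {
        "name": fn.__name__,
        "signature": str(inspect.signature(fn)),
        "description": inspect.getdoc(fn) or "",
    }


info = describe_tool(multiply)
print(info)

# Check the tool on its own, independent of any LLM call.
product = multiply(7, 8)
print(product)
```

A clear docstring and accurate type hints are cheap to write and make it easier to tell, when reading a trace, why an agent chose a given tool.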
Online Evaluations
You can use DeepEval to evaluate your LlamaIndex agents on Confident AI.
- Create a metric collection on Confident AI.
- Pass the metric collection name to DeepEval’s LlamaIndex agent wrapper.
import os
import asyncio

from llama_index.llms.openai import OpenAI
import llama_index.core.instrumentation as instrument

import deepeval
# DeepEval's FunctionAgent wrapper accepts a metric_collection argument
from deepeval.integrations.llama_index import FunctionAgent
from deepeval.integrations.llama_index import instrument_llama_index

deepeval.login("<your-confident-api-key>")

instrument_llama_index(instrument.get_dispatcher())

os.environ["OPENAI_API_KEY"] = "<your-openai-api-key>"


def multiply(a: float, b: float) -> float:
    """Useful for multiplying two numbers."""
    return a * b


agent = FunctionAgent(
    tools=[multiply],
    llm=OpenAI(model="gpt-4o-mini"),
    system_prompt="You are a helpful assistant that can perform calculations.",
    # Name of the metric collection created on Confident AI
    metric_collection="test_collection_1",
)


async def main():
    response = await agent.run("What's 7 * 8?")
    print(response)


if __name__ == "__main__":
    asyncio.run(main())
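The examples on this page place API keys directly in the source as placeholders. One common alternative is to read them from environment variables and fail early with a clear message when one is missing. The sketch below assumes that approach. `require_env` and the variable name `EXAMPLE_CONFIDENT_KEY` are hypothetical and chosen for illustration only, not names defined by DeepEval or LlamaIndex.

```python
import os


def require_env(name: str) -> str:
    """Return an environment variable's value, failing loudly if it is unset or empty."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"Set the {name} environment variable before running the agent.")
    return value


# Demonstration only: provide a dummy value so the sketch runs on its own.
os.environ.setdefault("EXAMPLE_CONFIDENT_KEY", "dummy-key-for-demo")

confident_key = require_env("EXAMPLE_CONFIDENT_KEY")
print("key loaded:", bool(confident_key))

# The value would then replace the hardcoded placeholder, for example:
# deepeval.login(confident_key)
```

Failing at startup keeps a missing key from surfacing later as a less obvious authentication error in the middle of an agent run.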