# Starter Tutorial (Using OpenAI)
This tutorial will show you how to get started building agents with LlamaIndex. We’ll start with a basic example and then show how to add RAG (Retrieval-Augmented Generation) capabilities.
!!! tip
    Make sure you've followed the installation steps first.

!!! tip
    Want to use local models? If you want to do our starter tutorial using only local models, check out this tutorial instead.
## Set your OpenAI API key

LlamaIndex uses OpenAI's `gpt-3.5-turbo` by default. Make sure your API key is available to your code by setting it as an environment variable:

```bash
# MacOS/Linux
export OPENAI_API_KEY=XXXXX
```

```bash
# Windows
set OPENAI_API_KEY=XXXXX
```
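If you prefer to set the key from Python (for example, in a notebook), you can also export it for the current process before constructing any OpenAI objects. A minimal sketch; the value shown is a placeholder for your own key:

```python
import os

# Placeholder value -- substitute your real key.
# Set this before creating the OpenAI LLM or embedding objects.
os.environ["OPENAI_API_KEY"] = "sk-..."
```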
!!! tip
    If you are using an OpenAI-compatible API, you can use the OpenAILike LLM class. You can find more information in the OpenAILike LLM integration and OpenAILike Embeddings integration.
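For example, pointing at a self-hosted OpenAI-compatible server might look roughly like the sketch below. This assumes the `llama-index-llms-openai-like` package is installed, and the model name and `api_base` are placeholders for your own deployment:

```python
from llama_index.llms.openai_like import OpenAILike

# Placeholder model name and endpoint -- replace with your own deployment
llm = OpenAILike(
    model="my-local-model",
    api_base="http://localhost:8000/v1",
    api_key="not-needed",
    is_chat_model=True,  # set to True if the endpoint serves a chat API
)
```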
## Basic Agent Example

Let's start with a simple example using an agent that can perform basic multiplication by calling a tool. Create a file called `starter.py`:
```python
import asyncio

from llama_index.core.agent.workflow import FunctionAgent
from llama_index.llms.openai import OpenAI


# Define a simple calculator tool
def multiply(a: float, b: float) -> float:
    """Useful for multiplying two numbers."""
    return a * b


# Create an agent workflow with our calculator tool
agent = FunctionAgent(
    tools=[multiply],
    llm=OpenAI(model="gpt-4o-mini"),
    system_prompt="You are a helpful assistant that can multiply two numbers.",
)


async def main():
    # Run the agent
    response = await agent.run("What is 1234 * 4567?")
    print(str(response))


# Run the agent
if __name__ == "__main__":
    asyncio.run(main())
```
This will output something like:

```
The result of \( 1234 \times 4567 \) is \( 5,635,678 \).
```
What happened is:

- The agent was given a question: `What is 1234 * 4567?`
- Under the hood, this question, plus the schema of the tools (name, docstring, and arguments), was passed to the LLM.
- The agent selected the `multiply` tool and wrote the arguments to the tool.
- The agent received the result from the tool and interpolated it into the final response.
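If you are curious what that tool schema looks like, you can wrap the function in a `FunctionTool` yourself and inspect its metadata. This is purely illustrative; `FunctionAgent` does this wrapping for you:

```python
from llama_index.core.tools import FunctionTool

# FunctionAgent builds an equivalent tool from the plain function automatically;
# shown here only to illustrate what the LLM sees.
tool = FunctionTool.from_defaults(fn=multiply)
print(tool.metadata.name)         # "multiply"
print(tool.metadata.description)  # the function signature plus the docstring
```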
!!! tip
    As you can see, we are using async Python functions. Many LLMs and models support async calls, and using async code is recommended to improve the performance of your application. To learn more about async code and Python, we recommend this short section on async + Python.
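One practical note if you are following along in a notebook rather than a script: Jupyter already runs an event loop, so you can `await` the agent directly instead of calling `asyncio.run()`. A rough sketch of the two styles, reusing the `main()` and `agent` defined above:

```python
# In a plain Python script: start the event loop yourself
import asyncio

asyncio.run(main())

# In a Jupyter notebook (an event loop is already running), simply await:
# response = await agent.run("What is 1234 * 4567?")
# print(str(response))
```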
## Adding Chat History

The `AgentWorkflow` is also able to remember previous messages. This is contained inside the `Context` of the `AgentWorkflow`.

If the `Context` is passed in, the agent will use it to continue the conversation.
```python
from llama_index.core.workflow import Context

# create context
ctx = Context(agent)

# run agent with context
response = await agent.run("My name is Logan", ctx=ctx)
response = await agent.run("What is my name?", ctx=ctx)
```
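Since `agent.run()` must be awaited, here is one way the snippet above could slot into the `starter.py` from the basic example. This is just a sketch that reuses the `agent` defined earlier:

```python
import asyncio

from llama_index.core.workflow import Context


async def main():
    # The Context holds the conversation state for this agent
    ctx = Context(agent)

    # The second question is answered using the chat history stored in ctx
    response = await agent.run("My name is Logan", ctx=ctx)
    response = await agent.run("What is my name?", ctx=ctx)
    print(str(response))


if __name__ == "__main__":
    asyncio.run(main())
```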
## Adding RAG Capabilities

Now let's enhance our agent by adding the ability to search through documents. First, let's get some example data using our terminal:

```bash
mkdir data
wget https://raw.githubusercontent.com/run-llama/llama_index/main/docs/docs/examples/data/paul_graham/paul_graham_essay.txt -O data/paul_graham_essay.txt
```

Your directory structure should look like this now:

```
├── starter.py
└── data
    └── paul_graham_essay.txt
```
Now we can create a tool for searching through documents using LlamaIndex. By default, our `VectorStoreIndex` will use OpenAI's `text-embedding-ada-002` embedding model to embed and retrieve the text.
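If you would rather use a different embedding model, you can override the default globally via `Settings` before building the index. A small sketch, using the newer `text-embedding-3-small` model purely as an example:

```python
from llama_index.core import Settings
from llama_index.embeddings.openai import OpenAIEmbedding

# Override the embedding model used by VectorStoreIndex
# ("text-embedding-3-small" is just an example choice)
Settings.embed_model = OpenAIEmbedding(model="text-embedding-3-small")
```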
Our modified `starter.py` should look like this:
```python
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader
from llama_index.core.agent.workflow import FunctionAgent
from llama_index.llms.openai import OpenAI
import asyncio
import os

# Create a RAG tool using LlamaIndex
documents = SimpleDirectoryReader("data").load_data()
index = VectorStoreIndex.from_documents(documents)
query_engine = index.as_query_engine()


def multiply(a: float, b: float) -> float:
    """Useful for multiplying two numbers."""
    return a * b


async def search_documents(query: str) -> str:
    """Useful for answering natural language questions about a personal essay written by Paul Graham."""
    response = await query_engine.aquery(query)
    return str(response)


# Create an enhanced workflow with both tools
agent = FunctionAgent(
    tools=[multiply, search_documents],
    llm=OpenAI(model="gpt-4o-mini"),
    system_prompt="""You are a helpful assistant that can perform calculations
    and search through documents to answer questions.""",
)


# Now we can ask questions about the documents or do calculations
async def main():
    response = await agent.run(
        "What did the author do in college? Also, what's 7 * 8?"
    )
    print(response)


# Run the agent
if __name__ == "__main__":
    asyncio.run(main())
```
The agent can now seamlessly switch between using the calculator and searching through documents to answer questions.
## Storing the RAG Index

To avoid reprocessing documents every time, you can persist the index to disk:

```python
# Save the index
index.storage_context.persist("storage")

# Later, load the index
from llama_index.core import StorageContext, load_index_from_storage

storage_context = StorageContext.from_defaults(persist_dir="storage")
index = load_index_from_storage(storage_context)
query_engine = index.as_query_engine()
```
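In practice you will often combine the two: load the persisted index if it exists, otherwise build it from the documents and persist it for next time. A minimal sketch of that pattern, using the same "storage" and "data" paths as above:

```python
import os

from llama_index.core import (
    SimpleDirectoryReader,
    StorageContext,
    VectorStoreIndex,
    load_index_from_storage,
)

PERSIST_DIR = "storage"

if os.path.exists(PERSIST_DIR):
    # Reuse the previously persisted index
    storage_context = StorageContext.from_defaults(persist_dir=PERSIST_DIR)
    index = load_index_from_storage(storage_context)
else:
    # Build the index from the documents and persist it for next time
    documents = SimpleDirectoryReader("data").load_data()
    index = VectorStoreIndex.from_documents(documents)
    index.storage_context.persist(PERSIST_DIR)

query_engine = index.as_query_engine()
```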
!!! tip
    If you used a vector store integration besides the default, chances are you can just reload from the vector store:

    ```python
    index = VectorStoreIndex.from_vector_store(vector_store)
    ```
## What's Next?

This is just the beginning of what you can do with LlamaIndex agents! You can:
- Add more tools to your agent
- Use different LLMs
- Customize the agent’s behavior using system prompts
- Add streaming capabilities
- Implement human-in-the-loop workflows
- Use multiple agents to collaborate on tasks
Some helpful next links:
- See more advanced agent examples in our Agent documentation
- Learn more about high-level concepts
- Explore how to customize things
- Check out the component guides