Router Query Engine
In this tutorial, we define a custom router query engine that selects one out of several candidate query engines to execute a query.
If youâre opening this Notebook on colab, you will probably need to install LlamaIndex đŚ.
%pip install llama-index-embeddings-openai%pip install llama-index-llms-openai
!pip install llama-index
# NOTE: This is ONLY necessary in jupyter notebook.# Details: Jupyter runs an event-loop behind the scenes.# This results in nested event-loops when we start an event-loop to make async queries.# This is normally not allowed, we use nest_asyncio to allow it for convenience.import nest_asyncio
nest_asyncio.apply()
Global Models
Section titled âGlobal Modelsâimport os
os.environ["OPENAI_API_KEY"] = "sk-..."
from llama_index.llms.openai import OpenAIfrom llama_index.embeddings.openai import OpenAIEmbeddingfrom llama_index.core import Settings
Settings.llm = OpenAI(model="gpt-3.5-turbo-1106", temperature=0.2)Settings.embed_model = OpenAIEmbedding(model="text-embedding-3-small")
Load Data
Section titled âLoad DataâWe first show how to convert a Document into a set of Nodes, and insert into a DocumentStore.
from llama_index.core import SimpleDirectoryReader
# load documentsdocuments = SimpleDirectoryReader("../data/paul_graham").load_data()
from llama_index.core import Settings
# initialize settings (set chunk size)Settings.chunk_size = 1024nodes = Settings.node_parser.get_nodes_from_documents(documents)
from llama_index.core import StorageContext
# initialize storage context (by default it's in-memory)storage_context = StorageContext.from_defaults()storage_context.docstore.add_documents(nodes)
Define Summary Index and Vector Index over Same Data
Section titled âDefine Summary Index and Vector Index over Same Dataâfrom llama_index.core import SummaryIndexfrom llama_index.core import VectorStoreIndex
summary_index = SummaryIndex(nodes, storage_context=storage_context)vector_index = VectorStoreIndex(nodes, storage_context=storage_context)
Define Query Engines and Set Metadata
Section titled âDefine Query Engines and Set Metadataâlist_query_engine = summary_index.as_query_engine( response_mode="tree_summarize", use_async=True,)vector_query_engine = vector_index.as_query_engine()
from llama_index.core.tools import QueryEngineTool
list_tool = QueryEngineTool.from_defaults( query_engine=list_query_engine, description=( "Useful for summarization questions related to Paul Graham eassy on" " What I Worked On." ),)
vector_tool = QueryEngineTool.from_defaults( query_engine=vector_query_engine, description=( "Useful for retrieving specific context from Paul Graham essay on What" " I Worked On." ),)
Define Router Query Engine
Section titled âDefine Router Query EngineâThere are several selectors available, each with some distinct attributes.
The LLM selectors use the LLM to output a JSON that is parsed, and the corresponding indexes are queried.
The Pydantic selectors (currently only supported by gpt-4-0613
and gpt-3.5-turbo-0613
(the default)) use the OpenAI Function Call API to produce pydantic selection objects, rather than parsing raw JSON.
For each type of selector, there is also the option to select 1 index to route to, or multiple.
PydanticSingleSelector
Section titled âPydanticSingleSelectorâUse the OpenAI Function API to generate/parse pydantic objects under the hood for the router selector.
from llama_index.core.query_engine import RouterQueryEnginefrom llama_index.core.selectors import LLMSingleSelector, LLMMultiSelectorfrom llama_index.core.selectors import ( PydanticMultiSelector, PydanticSingleSelector,)
query_engine = RouterQueryEngine( selector=PydanticSingleSelector.from_defaults(), query_engine_tools=[ list_tool, vector_tool, ],)
response = query_engine.query("What is the summary of the document?")print(str(response))
The document provides a comprehensive account of the author's diverse experiences, including writing, programming, founding and running startups, and investing in early-stage companies. It covers the challenges, successes, and lessons learned in these ventures, as well as the author's personal and professional growth, interactions with colleagues, and evolving interests and priorities over time.
response = query_engine.query("What did Paul Graham do after RICS?")print(str(response))
Paul Graham started painting after leaving Y Combinator. He wanted to see how good he could get if he really focused on it. After spending most of 2014 painting, he eventually ran out of steam and stopped working on it. He then started writing essays again and wrote a bunch of new ones over the next few months. Later, in March 2015, he started working on Lisp again.
LLMSingleSelector
Section titled âLLMSingleSelectorâUse OpenAI (or any other LLM) to parse generated JSON under the hood to select a sub-index for routing.
query_engine = RouterQueryEngine( selector=LLMSingleSelector.from_defaults(), query_engine_tools=[ list_tool, vector_tool, ],)
response = query_engine.query("What is the summary of the document?")print(str(response))
The document provides a comprehensive account of the author's professional journey, covering his involvement in various projects such as Viaweb, Y Combinator, and Hacker News, as well as his transition to focusing on writing essays and working on Y Combinator. It also delves into his experiences with the Summer Founders Program, the growth and challenges of Y Combinator, personal struggles, and his return to working on Lisp. The author reflects on the challenges and successes encountered throughout his career, including funding startups, developing a new version of Arc, and the impact of Hacker News. Additionally, the document touches on the author's interactions with colleagues, his time in Italy, experiences with painting, and the completion of a new Lisp called Bel. Throughout, the author shares insights and lessons learned from his diverse experiences.
response = query_engine.query("What did Paul Graham do after RICS?")print(str(response))
Paul Graham started painting after leaving Y Combinator. He wanted to see how good he could get if he really focused on it. After spending most of 2014 painting, he eventually ran out of steam and stopped working on it. He then started writing essays again and wrote a bunch of new ones over the next few months. In March 2015, he started working on Lisp again.
# [optional] look at selected resultsprint(str(response.metadata["selector_result"]))
selections=[SingleSelection(index=1, reason='The question is asking for specific context about what Paul Graham did after RICS, which would require retrieving specific information from his essay.')]
PydanticMultiSelector
Section titled âPydanticMultiSelectorâIn case you are expecting queries to be routed to multiple indexes, you should use a multi selector. The multi selector sends to query to multiple sub-indexes, and then aggregates all responses using a summary index to form a complete answer.
from llama_index.core import SimpleKeywordTableIndex
keyword_index = SimpleKeywordTableIndex(nodes, storage_context=storage_context)
keyword_tool = QueryEngineTool.from_defaults( query_engine=vector_query_engine, description=( "Useful for retrieving specific context using keywords from Paul" " Graham essay on What I Worked On." ),)
query_engine = RouterQueryEngine( selector=PydanticMultiSelector.from_defaults(), query_engine_tools=[ list_tool, vector_tool, keyword_tool, ],)
# This query could use either a keyword or vector query engine, so it will combine responses from bothresponse = query_engine.query( "What were noteable events and people from the authors time at Interleaf" " and YC?")print(str(response))
The author's time at Interleaf involved working on software for creating documents and learning valuable lessons about what not to do. Notable individuals associated with Y Combinator during the author's time there include Jessica Livingston, Robert Morris, and Sam Altman, who eventually became the second president of YC. The author's time at Y Combinator included notable events such as the creation of the Summer Founders Program, which attracted impressive individuals like Reddit, Justin Kan, Emmett Shear, Aaron Swartz, and Sam Altman.
# [optional] look at selected resultsprint(str(response.metadata["selector_result"]))
selections=[SingleSelection(index=0, reason='Summarization questions related to Paul Graham essay on What I Worked On.'), SingleSelection(index=2, reason='Retrieving specific context using keywords from Paul Graham essay on What I Worked On.')]