Usage Pattern
Get Started
Build a query engine from an index:
query_engine = index.as_query_engine()
Tip: To learn how to build an index, see Indexing.
Ask a question over your data:
response = query_engine.query("Who is Paul Graham?")
Configuring a Query Engine
High-Level API
You can build and configure a query engine directly from an index in one line of code:
query_engine = index.as_query_engine(
    response_mode="tree_summarize",
    verbose=True,
)
Note: While the high-level API optimizes for ease of use, it does not expose the full range of configurability.
See Response Modes for a full list of response modes and what they do.
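As a rough intuition for the "tree_summarize" value used above, the sketch below assumes the mode works by summarizing chunks in groups and then recursively summarizing those summaries until one answer remains. The names `toy_tree_summarize`, `toy_summarize`, and `fan_out` are hypothetical and not part of LlamaIndex, and a string join stands in for the LLM call. Treat the Response Modes page as the authority on actual behavior.

```python
from typing import Callable, List


def toy_tree_summarize(
    chunks: List[str],
    summarize: Callable[[List[str]], str],
    fan_out: int = 2,
) -> str:
    """Collapse chunks bottom-up until a single summary remains."""
    if not chunks:
        raise ValueError("need at least one chunk")
    if fan_out < 2:
        raise ValueError("fan_out must be at least 2 so each level shrinks")
    level = list(chunks)
    while len(level) > 1:
        # Summarize each group of `fan_out` neighbours into one parent node.
        level = [
            summarize(level[i : i + fan_out])
            for i in range(0, len(level), fan_out)
        ]
    # A single input chunk is returned as-is, without a summarize call.
    return level[0]


def toy_summarize(group: List[str]) -> str:
    # Stand-in for an LLM call: joins the group inside parentheses.
    return "(" + " + ".join(group) + ")"


summary = toy_tree_summarize(["a", "b", "c"], toy_summarize)
print(summary)
```

The nesting of parentheses in the printed result mirrors the shape of the tree that was built.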
Low-Level Composition API
You can use the low-level composition API if you need more granular control.
Concretely, you explicitly construct a QueryEngine object instead of calling index.as_query_engine(...).
Note: You may need to look at API references or example notebooks.
from llama_index.core import VectorStoreIndex, get_response_synthesizer
from llama_index.core.retrievers import VectorIndexRetriever
from llama_index.core.query_engine import RetrieverQueryEngine

# build index
index = VectorStoreIndex.from_documents(documents)

# configure retriever
retriever = VectorIndexRetriever(
    index=index,
    similarity_top_k=2,
)

# configure response synthesizer
response_synthesizer = get_response_synthesizer(
    response_mode="tree_summarize",
)

# assemble query engine
query_engine = RetrieverQueryEngine(
    retriever=retriever,
    response_synthesizer=response_synthesizer,
)

# query
response = query_engine.query("What did the author do growing up?")
print(response)
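To make the composition above easier to picture, here is a minimal, library-free sketch of the same retrieve-then-synthesize flow. `ToyRetriever`, `ToySynthesizer`, and `ToyQueryEngine` are hypothetical stand-ins, not LlamaIndex classes. Word overlap replaces vector similarity, and string formatting replaces the LLM call.

```python
from dataclasses import dataclass


@dataclass
class ToyRetriever:
    """Hypothetical stand-in for a retriever: ranks chunks by word overlap."""

    chunks: list[str]
    similarity_top_k: int = 2

    def retrieve(self, query_str: str) -> list[str]:
        query_words = set(query_str.lower().split())
        # Score each chunk by how many query words it shares.
        scored = [
            (len(query_words & set(chunk.lower().split())), chunk)
            for chunk in self.chunks
        ]
        # sorted() is stable, so ties keep their original order.
        ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
        return [chunk for _, chunk in ranked[: self.similarity_top_k]]


@dataclass
class ToySynthesizer:
    """Hypothetical stand-in for a response synthesizer (no LLM involved)."""

    def synthesize(self, query_str: str, nodes: list[str]) -> str:
        return f"Q: {query_str} | context: {' / '.join(nodes)}"


@dataclass
class ToyQueryEngine:
    """Wires a retriever and a synthesizer together, as in the code above."""

    retriever: ToyRetriever
    response_synthesizer: ToySynthesizer

    def query(self, query_str: str) -> str:
        nodes = self.retriever.retrieve(query_str)
        return self.response_synthesizer.synthesize(query_str, nodes)


chunks = [
    "The author wrote short stories growing up.",
    "The author later studied painting.",
    "Unrelated text about compilers.",
]
engine = ToyQueryEngine(ToyRetriever(chunks, similarity_top_k=2), ToySynthesizer())
answer = engine.query("What did the author do growing up?")
print(answer)
```

Because the retriever and synthesizer are separate objects, either one can be swapped without touching the other, which is the main reason to prefer the low-level API when you need control.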
Streaming
To enable streaming, pass the streaming=True flag:
query_engine = index.as_query_engine(
    streaming=True,
)
streaming_response = query_engine.query(
    "What did the author do growing up?",
)
streaming_response.print_response_stream()
- Read the full streaming guide
- See an end-to-end example
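Conceptually, streaming means the response text arrives in pieces and is shown as soon as each piece is available, rather than after the whole answer is ready. The sketch below illustrates that consumption pattern with a plain Python generator. `fake_token_stream` and `print_stream` are hypothetical helpers standing in for an LLM token stream and a stream printer. They are not LlamaIndex APIs.

```python
from typing import Iterator


def fake_token_stream(text: str) -> Iterator[str]:
    """Hypothetical stand-in for an LLM that emits text piece by piece."""
    for word in text.split():
        yield word + " "


def print_stream(tokens: Iterator[str]) -> str:
    """Print each piece as soon as it arrives and return the full text."""
    pieces = []
    for token in tokens:
        # flush=True makes the piece visible immediately instead of buffering.
        print(token, end="", flush=True)
        pieces.append(token)
    print()
    return "".join(pieces)


full_text = print_stream(fake_token_stream("The author wrote short stories."))
```

Collecting the pieces while printing them keeps the complete text available afterwards, for logging or further processing.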
Defining a Custom Query Engine
You can also define a custom query engine. Subclass the CustomQueryEngine class, define any attributes you want it to have (similar to defining a Pydantic class), and implement a custom_query function that returns either a Response object or a string.
from llama_index.core.query_engine import CustomQueryEngine
from llama_index.core.retrievers import BaseRetriever
from llama_index.core import get_response_synthesizer
from llama_index.core.response_synthesizers import BaseSynthesizer


class RAGQueryEngine(CustomQueryEngine):
    """RAG Query Engine."""

    retriever: BaseRetriever
    response_synthesizer: BaseSynthesizer

    def custom_query(self, query_str: str):
        nodes = self.retriever.retrieve(query_str)
        response_obj = self.response_synthesizer.synthesize(query_str, nodes)
        return response_obj
See the Custom Query Engine guide for more details.
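One way to read the "Response object or a string" contract is that the base class can normalize whatever custom_query returns, so callers of query always receive the same type. The sketch below illustrates that pattern with hypothetical classes (`ToyResponse`, `ToyCustomQueryEngine`, `EchoEngine`). It is an assumption about the design, written without LlamaIndex, and not the library's source.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Union


@dataclass
class ToyResponse:
    """Hypothetical stand-in for a Response object."""

    response: str

    def __str__(self) -> str:
        return self.response


class ToyCustomQueryEngine(ABC):
    """Hypothetical base class: subclasses only implement custom_query."""

    @abstractmethod
    def custom_query(self, query_str: str) -> Union[str, ToyResponse]:
        ...

    def query(self, query_str: str) -> ToyResponse:
        raw = self.custom_query(query_str)
        # Normalize: wrap plain strings so callers always get one type.
        return ToyResponse(raw) if isinstance(raw, str) else raw


class EchoEngine(ToyCustomQueryEngine):
    """Smallest possible subclass: returns a plain string."""

    def custom_query(self, query_str: str) -> str:
        return f"You asked: {query_str}"


result = EchoEngine().query("Who is Paul Graham?")
print(result)
```

With this shape, a subclass that only needs to return text stays short, while one that wants to attach extra data can build the response object itself.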