AIMon Rerank
If you’re opening this notebook on Colab, you will probably need to install LlamaIndex 🦙.
%%capture
!pip install llama-index
!pip install llama-index-postprocessor-aimon-rerank
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader
from llama_index.core.response.pprint_utils import pprint_response
This notebook requires both an OpenAI API key and an AIMon API key. Import them from Colab Secrets.
import os
# Import Colab Secrets userdata module.
from google.colab import userdata
os.environ["AIMON_API_KEY"] = userdata.get("AIMON_API_KEY")
os.environ["OPENAI_API_KEY"] = userdata.get("OPENAI_API_KEY")
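Outside Colab, the google.colab module is not available, so the keys have to come from somewhere else. The following is a minimal sketch of one possible fallback, not part of the original notebook: the helper name ensure_api_key and its prompt_fn parameter are hypothetical, and it simply reuses an existing environment variable or prompts for the value.

```python
import os
from getpass import getpass


def ensure_api_key(name: str, prompt_fn=getpass) -> str:
    """Return the key stored in os.environ[name], prompting only if it is missing.

    `ensure_api_key` and `prompt_fn` are hypothetical names used for illustration.
    """
    value = os.environ.get(name)
    if not value:
        # Ask the user without echoing the secret, then cache it for later cells.
        value = prompt_fn(f"Enter {name}: ")
        os.environ[name] = value
    return value


# Example usage (uncomment when running outside Colab):
# ensure_api_key("AIMON_API_KEY")
# ensure_api_key("OPENAI_API_KEY")
```

Because the value is written back to os.environ, later cells that read the environment variables should work unchanged. The AIMonRerank constructor call would then need api_key=os.environ["AIMON_API_KEY"] in place of userdata.get(...).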
Download data
!mkdir -p 'data/paul_graham/'
!wget 'https://raw.githubusercontent.com/run-llama/llama_index/main/docs/docs/examples/data/paul_graham/paul_graham_essay.txt' -O 'data/paul_graham/paul_graham_essay.txt'
--2025-03-10 18:01:07-- https://raw.githubusercontent.com/run-llama/llama_index/main/docs/docs/examples/data/paul_graham/paul_graham_essay.txt
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.108.133, 185.199.111.133, 185.199.110.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.108.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 75042 (73K) [text/plain]
Saving to: ‘data/paul_graham/paul_graham_essay.txt’
data/paul_graham/pa 100%[===================>] 73.28K --.-KB/s in 0.03s
2025-03-10 18:01:08 (2.59 MB/s) - ‘data/paul_graham/paul_graham_essay.txt’ saved [75042/75042]
Generate documents and build an index
# load documents
documents = SimpleDirectoryReader("./data/paul_graham/").load_data()
# build index
index = VectorStoreIndex.from_documents(documents=documents)
Write a task definition for the AIMon reranker and create an instance of the AIMonRerank class. The task definition serves as an explicit instruction to the system, defining what the reranking evaluation should focus on.
import os
from llama_index.postprocessor.aimon_rerank import AIMonRerank
task_definition = "Your task is to assess the actions of an individual specified in the user query against the context documents supplied."
aimon_rerank = AIMonRerank(
    top_n=2,
    api_key=userdata.get("AIMON_API_KEY"),
    task_definition=task_definition,
)
Directly retrieve top 2 most similar nodes (i.e., without using a reranker)
query_engine = index.as_query_engine(similarity_top_k=2)
response = query_engine.query("What did Sam Altman do in this essay?")
pprint_response(response, show_source=True)
Final Response: Sam Altman was asked to become the president of Y
Combinator, initially declined the offer to pursue starting a startup
focused on nuclear reactors, but eventually agreed to take over
starting with the winter 2014 batch.
______________________________________________________________________
Source Node 1/2
Node ID: 2940ea4a-69ec-4fc4-9dd4-8ed54a9d4f1b
Similarity: 0.8305926707169754
Text: When I was dealing with some urgent problem during YC, there was
about a 60% chance it had to do with HN, and a 40% chance it had do
with everything else combined. [17] As well as HN, I wrote all of
YC's internal software in Arc. But while I continued to work a good
deal in Arc, I gradually stopped working on Arc, partly because I
didn't have t...
______________________________________________________________________
Source Node 2/2
Node ID: 2f043635-e4ce-4054-92f3-b624fd90ae04
Similarity: 0.8239262662012308
Text: Up till that point YC had been controlled by the original LLC we
four had started. But we wanted YC to last for a long time, and to do
that it couldn't be controlled by the founders. So if Sam said yes,
we'd let him reorganize YC. Robert and I would retire, and Jessica and
Trevor would become ordinary partners. When we asked Sam if he wanted
to...
Retrieve top 10 most relevant nodes, but then rerank with AIMon Reranker
Explanation of the reranking process:
The diagram illustrates how a reranker refines document retrieval for a more accurate response.
- Initial Retrieval (Vector DB):
  - A query is sent to the vector database.
  - The system retrieves the top 10 most relevant records based on similarity scores (top_k = 10).
- Reranking with AIMon:
  - Instead of using only the highest-scoring records directly, these 10 records are reranked using the AIMon Reranker.
  - The reranker evaluates the documents based on their actual relevance to the query, rather than just raw similarity scores.
  - During this step, a task definition is applied, serving as an explicit instruction that defines what the reranking evaluation should focus on.
  - This ensures that the selected records are not just statistically similar but also contextually relevant to the intended task.
- Final Selection (top_n = 2):
  - After reranking, the system selects the top 2 most contextually relevant records for response generation.
  - The task definition ensures that these records align with the query’s intent, leading to a more precise and informative response.
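The two-stage flow described above can be sketched in plain Python. This is only an illustration of the retrieve-then-rerank pattern, not AIMon's implementation: the function names, the sample texts and scores, and the keyword-counting toy_rerank_score are all hypothetical stand-ins for a real reranking model guided by a task definition.

```python
from typing import Callable, List, Tuple

Doc = Tuple[str, float]  # (text, similarity score from the first stage)


def retrieve_then_rerank(
    candidates: List[Doc],
    rerank_score: Callable[[str], float],
    top_k: int = 10,
    top_n: int = 2,
) -> List[Doc]:
    # Stage 1: keep the top_k candidates by raw similarity score.
    shortlist = sorted(candidates, key=lambda d: d[1], reverse=True)[:top_k]
    # Stage 2: rescore every shortlisted text with the reranker, keep the top_n.
    rescored = [(text, rerank_score(text)) for text, _ in shortlist]
    return sorted(rescored, key=lambda d: d[1], reverse=True)[:top_n]


def toy_rerank_score(text: str) -> float:
    # Hypothetical stand-in for a reranker: the fraction of words that
    # match a few task-related keywords. A real reranker is a model.
    keywords = ("sam", "declined", "agreed")
    words = text.lower().split()
    return sum(words.count(k) for k in keywords) / max(len(words), 1)


# Made-up candidates with made-up similarity scores.
candidates = [
    ("YC was controlled by the original LLC", 0.83),
    ("I wrote internal software in Arc", 0.82),
    ("Sam declined at first but later agreed", 0.80),
    ("We asked Sam to become president", 0.79),
]

selected = retrieve_then_rerank(candidates, toy_rerank_score, top_k=4, top_n=2)
for text, score in selected:
    print(f"{score:.3f}  {text}")
```

In this toy data, the two texts with the highest similarity scores are dropped after reranking because they say nothing about the person named in the query, which is the effect the task definition is meant to produce.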
query_engine = index.as_query_engine(
    similarity_top_k=10, node_postprocessors=[aimon_rerank]
)
response = query_engine.query("What did Sam Altman do in this essay?")
Total number of batches formed: 1
Processing batch 1/1 with 10 context documents.
Finished processing. Total batches sent to AIMon reranker: 1
Top 2 nodes selected after reranking.
pprint_response(response, show_source=True)
Final Response: Sam Altman was asked to become the president of Y
Combinator, initially declined the offer to pursue starting a startup
focused on nuclear reactors, but eventually agreed to take over as
president starting with the winter 2014 batch.
______________________________________________________________________
Source Node 1/2
Node ID: 2940ea4a-69ec-4fc4-9dd4-8ed54a9d4f1b
Similarity: 0.48260445005911023
Text: When I was dealing with some urgent problem during YC, there was
about a 60% chance it had to do with HN, and a 40% chance it had do
with everything else combined. [17] As well as HN, I wrote all of
YC's internal software in Arc. But while I continued to work a good
deal in Arc, I gradually stopped working on Arc, partly because I
didn't have t...
______________________________________________________________________
Source Node 2/2
Node ID: 0baaf5af-6e6b-4889-8407-e49d1753980c
Similarity: 0.48151918284717965
Text: As Jessica and I were walking home from dinner on March 11, at
the corner of Garden and Walker streets, these three threads
converged. Screw the VCs who were taking so long to make up their
minds. We'd start our own investment firm and actually implement the
ideas we'd been talking about. I'd fund it, and Jessica could quit her
job and work for ...
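To see exactly which source nodes changed between the two runs, the node IDs can be compared directly. The sketch below uses the IDs and scores copied from the two outputs in this notebook; diff_sources is a hypothetical helper, not part of LlamaIndex. With live response objects, the same pairs could be built with [(n.node.node_id, n.score) for n in response.source_nodes].

```python
from typing import Dict, List, Tuple

Source = Tuple[str, float]  # (node ID, score) as printed by pprint_response

# Values copied from the output without a reranker (similarity_top_k=2).
baseline: List[Source] = [
    ("2940ea4a-69ec-4fc4-9dd4-8ed54a9d4f1b", 0.8305926707169754),
    ("2f043635-e4ce-4054-92f3-b624fd90ae04", 0.8239262662012308),
]

# Values copied from the output with the AIMon reranker (top_n=2).
reranked: List[Source] = [
    ("2940ea4a-69ec-4fc4-9dd4-8ed54a9d4f1b", 0.48260445005911023),
    ("0baaf5af-6e6b-4889-8407-e49d1753980c", 0.48151918284717965),
]


def diff_sources(before: List[Source], after: List[Source]) -> Dict[str, List[str]]:
    """Group node IDs by whether they were kept, dropped, or added."""
    before_ids = [node_id for node_id, _ in before]
    after_ids = [node_id for node_id, _ in after]
    return {
        "kept": [i for i in after_ids if i in before_ids],
        "dropped": [i for i in before_ids if i not in after_ids],
        "added": [i for i in after_ids if i not in before_ids],
    }


changes = diff_sources(baseline, reranked)
for label, ids in changes.items():
    print(f"{label}: {ids}")
```

One node is shared by both runs and one is replaced. The scores also differ in magnitude (about 0.83 before, about 0.48 after), presumably because the reranker assigns its own scores, so values from the two runs should not be compared directly.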
Conclusion
The AIMon reranker, using the task definition, shifted the retrieval focus from general YC leadership changes to Sam Altman’s specific actions. Initially, the high-similarity documents lacked details about his decision-making. After reranking, documents with lower similarity scores but greater contextual relevance highlighted his reluctance and timeline, producing a more accurate, task-aligned response than purely similarity-based retrieval.