Cohere Rerank
If you’re opening this notebook on Colab, you will probably need to install LlamaIndex 🦙.
%pip install llama-index > /dev/null
%pip install llama-index-postprocessor-cohere-rerank > /dev/null
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader
from llama_index.core.response.pprint_utils import pprint_response
Download Data
!mkdir -p 'data/paul_graham/'
!wget 'https://raw.githubusercontent.com/run-llama/llama_index/main/docs/docs/examples/data/paul_graham/paul_graham_essay.txt' -O 'data/paul_graham/paul_graham_essay.txt'
# load documents
documents = SimpleDirectoryReader("./data/paul_graham/").load_data()

# build index
index = VectorStoreIndex.from_documents(documents=documents)
Retrieve top 10 most relevant nodes, then filter with Cohere Rerank
import os
from llama_index.postprocessor.cohere_rerank import CohereRerank

api_key = os.environ["COHERE_API_KEY"]
cohere_rerank = CohereRerank(api_key=api_key, top_n=2)
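Reading the key with `os.environ["COHERE_API_KEY"]` raises a bare `KeyError` when the variable is unset. A minimal sketch of a friendlier lookup follows; the helper name `get_api_key` and the error message are illustrative assumptions, not part of LlamaIndex or the Cohere SDK.

```python
import os


def get_api_key(env_var: str = "COHERE_API_KEY") -> str:
    """Return the API key stored in `env_var`, or fail with a readable message.

    Hypothetical helper for illustration; it only wraps os.environ.
    """
    # Treat a missing variable and a blank one the same way.
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"Set the {env_var} environment variable before creating the reranker."
        )
    return key
```

The returned string could then be passed as `api_key=` to `CohereRerank` exactly as in the cell above.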
query_engine = index.as_query_engine(
    similarity_top_k=10,
    node_postprocessors=[cohere_rerank],
)
response = query_engine.query(
    "What did Sam Altman do in this essay?",
)
pprint_response(response, show_source=True)
Final Response: Sam Altman was asked if he wanted to be the president
of Y Combinator. Initially, he declined as he wanted to start a
startup focused on making nuclear reactors. However, after persistent
persuasion, he eventually agreed to become the president of Y
Combinator starting with the winter 2014 batch.
______________________________________________________________________
Source Node 1/2
Node ID: 7ecf4eb2-215d-45e4-ba08-44d9219c7fa6
Similarity: 0.93033177
Text: When I was dealing with some urgent problem during YC, there was
about a 60% chance it had to do with HN, and a 40% chance it had do
with everything else combined. [17] As well as HN, I wrote all of
YC's internal software in Arc. But while I continued to work a good
deal in Arc, I gradually stopped working on Arc, partly because I
didn't have t...
______________________________________________________________________
Source Node 2/2
Node ID: 88be17e9-e0a0-49e1-9ff8-f2b7aa7493ed
Similarity: 0.86269903
Text: Up till that point YC had been controlled by the original LLC we
four had started. But we wanted YC to last for a long time, and to do
that it couldn't be controlled by the founders. So if Sam said yes,
we'd let him reorganize YC. Robert and I would retire, and Jessica and
Trevor would become ordinary partners. When we asked Sam if he wanted
to...
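The query engine above works in two stages: a wide retrieval by embedding similarity (`similarity_top_k=10`), then a reranker that keeps only the best `top_n=2` candidates. The sketch below illustrates that pattern in plain Python under loud assumptions: the two-dimensional vectors, the three toy documents, and the `keyword_overlap` scorer are all invented stand-ins, and none of it reflects how Cohere's model or LlamaIndex's internals actually score text.

```python
import math
from typing import Callable, List, Sequence, Tuple

Doc = Tuple[str, Sequence[float]]  # (text, toy embedding vector)


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def keyword_overlap(query: str, text: str) -> float:
    """Toy stand-in for a reranker: fraction of query words found in the text."""
    query_words = set(query.lower().split())
    text_words = set(text.lower().split())
    return len(query_words & text_words) / len(query_words) if query_words else 0.0


def retrieve_then_rerank(
    query_vec: Sequence[float],
    query_text: str,
    docs: List[Doc],
    rerank_score: Callable[[str, str], float],
    similarity_top_k: int = 10,
    top_n: int = 2,
) -> List[Tuple[float, str]]:
    """Return the top_n (score, text) pairs after a two-stage search."""
    # Stage 1: cheap, wide retrieval by vector similarity.
    candidates = sorted(
        docs, key=lambda doc: cosine(query_vec, doc[1]), reverse=True
    )[:similarity_top_k]
    # Stage 2: rescore only the candidates against the query text.
    rescored = [(rerank_score(query_text, text), text) for text, _ in candidates]
    rescored.sort(key=lambda pair: pair[0], reverse=True)
    return rescored[:top_n]


# Hypothetical data: the most relevant document has the lowest vector similarity.
DOCS: List[Doc] = [
    ("Sam Altman agreed to become president of YC", [0.6, 0.8]),
    ("I wrote YC's internal software in Arc", [0.9, 0.1]),
    ("Most urgent problems had to do with HN", [1.0, 0.0]),
]
QUERY = "What did Sam Altman do in this essay?"
QUERY_VEC = [1.0, 0.2]

wide = retrieve_then_rerank(QUERY_VEC, QUERY, DOCS, keyword_overlap, similarity_top_k=3)
narrow = retrieve_then_rerank(QUERY_VEC, QUERY, DOCS, keyword_overlap, similarity_top_k=2)
```

In this toy setup the wide search lets the second stage promote the Sam Altman document to first place, while the narrow search never sees it. That is the motivation for retrieving more candidates than are finally kept.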
Directly retrieve top 2 most similar nodes
query_engine = index.as_query_engine(
    similarity_top_k=2,
)
response = query_engine.query(
    "What did Sam Altman do in this essay?",
)
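Without reranking, the answer rests on the two nodes with the highest embedding similarity alone. That context can be irrelevant and the response hallucinated, although in the run shown here the two retrieved nodes are the same ones the reranker selected.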
pprint_response(response, show_source=True)
Final Response: Sam Altman was asked to become the president of Y
Combinator, initially declined the offer to pursue starting a startup
focused on nuclear reactors, but eventually agreed to take over
starting with the winter 2014 batch.
______________________________________________________________________
Source Node 1/2
Node ID: 7ecf4eb2-215d-45e4-ba08-44d9219c7fa6
Similarity: 0.8308840369082053
Text: When I was dealing with some urgent problem during YC, there was
about a 60% chance it had to do with HN, and a 40% chance it had do
with everything else combined. [17] As well as HN, I wrote all of
YC's internal software in Arc. But while I continued to work a good
deal in Arc, I gradually stopped working on Arc, partly because I
didn't have t...
______________________________________________________________________
Source Node 2/2
Node ID: 88be17e9-e0a0-49e1-9ff8-f2b7aa7493ed
Similarity: 0.8230144027954406
Text: Up till that point YC had been controlled by the original LLC we
four had started. But we wanted YC to last for a long time, and to do
that it couldn't be controlled by the founders. So if Sam said yes,
we'd let him reorganize YC. Robert and I would retire, and Jessica and
Trevor would become ordinary partners. When we asked Sam if he wanted
to...
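To check whether reranking changed which nodes reached the LLM, it can help to compare the source node IDs of the two runs rather than eyeballing the printed text. The sketch below does this with the IDs and scores copied from the two outputs in this notebook; the helper `compare_runs` and the `Run` alias are hypothetical names introduced for illustration.

```python
from typing import Dict, List, Tuple

Run = List[Tuple[str, float]]  # (node ID, reported similarity score)

# Values copied from the two printed outputs in this notebook.
RERANKED: Run = [
    ("7ecf4eb2-215d-45e4-ba08-44d9219c7fa6", 0.93033177),
    ("88be17e9-e0a0-49e1-9ff8-f2b7aa7493ed", 0.86269903),
]
DIRECT: Run = [
    ("7ecf4eb2-215d-45e4-ba08-44d9219c7fa6", 0.8308840369082053),
    ("88be17e9-e0a0-49e1-9ff8-f2b7aa7493ed", 0.8230144027954406),
]


def compare_runs(first: Run, second: Run) -> Dict[str, List[str]]:
    """Split node IDs into those shared by both runs and those unique to each."""
    first_ids = [node_id for node_id, _ in first]
    second_ids = [node_id for node_id, _ in second]
    return {
        "shared": [i for i in first_ids if i in second_ids],
        "only_first": [i for i in first_ids if i not in second_ids],
        "only_second": [i for i in second_ids if i not in first_ids],
    }


# In a live session the pairs could be built from response.source_nodes,
# assuming each item exposes .node.node_id and .score.
result = compare_runs(RERANKED, DIRECT)
```

For this particular run both node IDs are shared, so the two query engines answered from the same context. The scores differ for the same node, presumably because the first run reports the reranker's relevance score and the second reports embedding similarity, so the two numbers should not be compared directly.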