Let's create a small demo corpus:
```python
sentences = [
    "BGE M3 is an embedding model supporting dense retrieval, lexical matching and multi-vector interaction.",
    "BM25 is a bag-of-words retrieval function that ranks a set of documents based on the query terms appearing in each document",
]
# Document IDs are stored as strings, so convert the index explicitly.
documents = [Document(doc_id=str(i), text=s) for i, s in enumerate(sentences)]
```

```python
# Indexing with BGE-M3 model
index = BGEM3Index.from_documents(
    documents,
    weights_for_different_modes=[
        0.4,
        0.2,
        0.4,
    ],  # [dense_weight, sparse_weight, multi_vector_weight]
)
```

Retrieve relevant documents

```python
retriever = index.as_retriever()
response = retriever.retrieve("What is BGE-M3?")
```

RAG with BGE-M3

```python
query_engine = index.as_query_engine()
response = query_engine.query("What is BGE-M3?")
```
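To get a feel for what the three values in `weights_for_different_modes` mean, the sketch below combines per-mode scores with a normalized weighted sum. This is an assumption about how hybrid scores are commonly fused, not a description of what `BGEM3Index` does internally. The helper `fuse_scores` and the sample scores are hypothetical and exist only for illustration.

```python
from typing import Sequence


def fuse_scores(
    dense: float,
    sparse: float,
    multi_vector: float,
    weights: Sequence[float] = (0.4, 0.2, 0.4),
) -> float:
    """Combine three per-mode relevance scores into one weighted average.

    Assumed weight order: [dense_weight, sparse_weight, multi_vector_weight].
    """
    if len(weights) != 3:
        raise ValueError("expected three weights: dense, sparse, multi-vector")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    w_dense, w_sparse, w_multi = weights
    return (w_dense * dense + w_sparse * sparse + w_multi * multi_vector) / total


# Made-up per-mode scores for two documents (not real model output).
mode_scores = {
    "doc_0": (0.8, 0.5, 0.75),
    "doc_1": (0.25, 0.5, 0.25),
}

# Rank documents by their fused score, highest first.
ranked = sorted(
    mode_scores,
    key=lambda doc_id: fuse_scores(*mode_scores[doc_id]),
    reverse=True,
)
print(ranked)
```

Under this assumed scheme, raising `sparse_weight` relative to the other two would favor exact term overlap, while a higher `dense_weight` or `multi_vector_weight` would favor semantic similarity.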