Let's create a small demo corpus.

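```python
# Import paths assume llama-index >= 0.10 with the
# llama-index-indices-managed-bge-m3 package installed.
from llama_index.core import Document
from llama_index.indices.managed.bge_m3 import BGEM3Index

sentences = [
    "BGE M3 is an embedding model supporting dense retrieval, lexical matching and multi-vector interaction.",
    "BM25 is a bag-of-words retrieval function that ranks a set of documents based on the query terms appearing in each document",
]
# Wrap each sentence in a Document; IDs are passed as strings.
documents = [Document(doc_id=str(i), text=s) for i, s in enumerate(sentences)]
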
# Build the index with the BGE-M3 model
index = BGEM3Index.from_documents(
    documents,
    # Order: [dense_weight, sparse_weight, multi_vector_weight]
    weights_for_different_modes=[0.4, 0.2, 0.4],
)

# Retrieve the nodes that best match the query
retriever = index.as_retriever()
nodes = retriever.retrieve("What is BGE-M3?")

# Or run the full query pipeline (retrieval plus answer synthesis)
query_engine = index.as_query_engine()
response = query_engine.query("What is BGE-M3?")
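
# Illustrative sketch (an assumption, not BGEM3Index internals):
# the three weights above are ordered [dense, sparse, multi-vector].
# One plausible way such weights combine per-mode relevance scores is a
# normalized weighted sum. `combine_mode_scores` is a hypothetical helper
# written for this example only; the library may compute its final score
# differently.
def combine_mode_scores(dense, sparse, multi_vector, weights=(0.4, 0.2, 0.4)):
    """Return the weighted average of the three per-mode scores."""
    w_dense, w_sparse, w_multi = weights
    total = w_dense + w_sparse + w_multi
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return (w_dense * dense + w_sparse * sparse + w_multi * multi_vector) / total


# Made-up scores for a single query/document pair, for illustration only.
hybrid_score = combine_mode_scores(dense=0.8, sparse=0.1, multi_vector=0.7)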