Jina Embeddings

If you’re opening this notebook on Colab, you will probably need to install LlamaIndex 🦙.

%pip install llama-index-embeddings-jinaai
%pip install llama-index-llms-openai
!pip install llama-index

You may also need other packages that do not come directly with llama-index:

!pip install Pillow

For this example, you will need an API key which you can get from https://jina.ai/embeddings/

# Initialise with your API key
import os
jinaai_api_key = "YOUR_JINAAI_API_KEY"
os.environ["JINAAI_API_KEY"] = jinaai_api_key

Embed text and queries with Jina embedding models through JinaAI API

You can encode your text and your queries using the JinaEmbedding class. Jina offers a range of models adaptable to various use cases.

| Model | Dimension | Language | MRL (matryoshka) | Context |
| --- | --- | --- | --- | --- |
| jina-embeddings-v3 | 1024 | Multilingual (89 languages) | Yes | 8192 |
| jina-embeddings-v2-base-en | 768 | English | No | 8192 |
| jina-embeddings-v2-base-de | 768 | German & English | No | 8192 |
| jina-embeddings-v2-base-es | 768 | Spanish & English | No | 8192 |
| jina-embeddings-v2-base-zh | 768 | Chinese & English | No | 8192 |

Recommended Model: jina-embeddings-v3

We recommend jina-embeddings-v3 as the latest and most performant embedding model from Jina AI. This model features 5 task-specific adapters trained on top of its backbone, optimizing various embedding use cases.

By default, the JinaEmbedding class uses jina-embeddings-v3.

Task-Specific Adapters:

Include task in your request to optimize your downstream application:

  • retrieval.query: Used to encode user queries or questions in retrieval tasks.
  • retrieval.passage: Used to encode large documents in retrieval tasks at indexing time.
  • classification: Used to encode text for text classification tasks.
  • text-matching: Used to encode text for similarity matching, such as measuring similarity between two sentences.
  • separation: Used for clustering or reranking tasks.
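As a small illustration of choosing an adapter, the sketch below maps a few application scenarios to the task names listed above. The `USE_CASE_TO_TASK` mapping and the `pick_task` helper are hypothetical names introduced here for illustration; they are not part of llama-index or the Jina AI API. Only the task strings come from the list above.

```python
# Task names as listed above. The use-case labels on the left are
# hypothetical and exist only for this illustration.
USE_CASE_TO_TASK = {
    "search_query": "retrieval.query",
    "index_document": "retrieval.passage",
    "classify": "classification",
    "compare_sentences": "text-matching",
    "cluster": "separation",
}


def pick_task(use_case: str) -> str:
    """Return the adapter name for a use case, or raise on unknown input."""
    try:
        return USE_CASE_TO_TASK[use_case]
    except KeyError:
        valid = ", ".join(sorted(USE_CASE_TO_TASK))
        raise ValueError(
            f"Unknown use case {use_case!r}; expected one of: {valid}"
        ) from None


print(pick_task("search_query"))
print(pick_task("index_document"))
```

The returned string can then be passed as the task argument when constructing a JinaEmbedding instance.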

Matryoshka Representation Learning:

jina-embeddings-v3 supports Matryoshka Representation Learning, allowing users to control the embedding dimension with minimal performance loss.
Include dimensions in your request to select the desired dimension.
By default, dimensions is set to 1024, and a number between 256 and 1024 is recommended.
You can reference the table below for hints on dimension vs. performance:

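| Dimension | 32 | 64 | 128 | 256 | 512 | 768 | 1024 |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Average Retrieval Performance (nDCG@10) | 52.54 | 58.54 | 61.64 | 62.72 | 63.16 | 63.3 | 63.35 |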
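To put the trade-off in numbers, the short sketch below computes how much of the full 1024-dimension retrieval score each smaller dimension retains, alongside the relative vector size. The nDCG@10 values are copied from the table above; the names `NDCG_BY_DIM` and `retained_percent` are introduced here only for illustration.

```python
# nDCG@10 values copied from the table above, keyed by embedding dimension.
NDCG_BY_DIM = {
    32: 52.54,
    64: 58.54,
    128: 61.64,
    256: 62.72,
    512: 63.16,
    768: 63.30,
    1024: 63.35,
}
FULL_DIM = 1024


def retained_percent(dim: int) -> float:
    """Share of the full-dimension nDCG@10 kept at a given dimension."""
    return 100.0 * NDCG_BY_DIM[dim] / NDCG_BY_DIM[FULL_DIM]


for dim in sorted(NDCG_BY_DIM):
    size_percent = 100.0 * dim / FULL_DIM
    print(
        f"{dim:>5} dims: {retained_percent(dim):6.2f}% of full nDCG@10, "
        f"{size_percent:6.2f}% of full vector size"
    )
```

At 256 dimensions, for instance, the table implies roughly 99% of the full score at a quarter of the vector size, which is consistent with the recommended range of 256 to 1024.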

Late Chunking in Long-Context Embedding Models

jina-embeddings-v3 supports Late Chunking, a technique that leverages the model’s long-context capabilities to generate contextual chunk embeddings. Include late_chunking=True in your request to enable contextual chunked representation. When it is set to true, the Jina AI API concatenates all sentences in the input field and feeds them to the model as a single string. Internally, the model embeds this long concatenated string and then performs late chunking, returning a list of embeddings that matches the size of the input list.
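To make the idea concrete, here is a toy sketch of the pooling step in late chunking. It assumes the model has already produced one contextual vector per token for the concatenated input. The token vectors, the chunk boundaries, and the `late_chunk` helper are made up for illustration, and mean pooling is an assumption about the pooling step, not a description of the Jina AI API internals.

```python
from typing import List, Tuple

Vector = List[float]


def mean_pool(vectors: List[Vector]) -> Vector:
    """Average a non-empty list of equal-length vectors element-wise."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


def late_chunk(
    token_vectors: List[Vector], spans: List[Tuple[int, int]]
) -> List[Vector]:
    """Pool token vectors per chunk; spans are (start, end), end exclusive."""
    return [mean_pool(token_vectors[start:end]) for start, end in spans]


# Toy contextual token vectors for a concatenated input of 5 tokens.
token_vectors = [
    [1.0, 0.0],
    [3.0, 2.0],
    [0.0, 4.0],
    [2.0, 2.0],
    [4.0, 0.0],
]
# Two input chunks: tokens 0-1 and tokens 2-4.
spans = [(0, 2), (2, 5)]

chunk_embeddings = late_chunk(token_vectors, spans)
print(len(chunk_embeddings))  # one embedding per input chunk
print(chunk_embeddings)
```

The point of the sketch is the ordering: the whole input is encoded first, so each token vector can reflect its surrounding context, and only afterwards are the vectors grouped back into one embedding per input item.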

from llama_index.embeddings.jinaai import JinaEmbedding

text_embed_model = JinaEmbedding(
    api_key=jinaai_api_key,
    model="jina-embeddings-v3",
    # choose `retrieval.passage` to get passage embeddings
    task="retrieval.passage",
)

embeddings = text_embed_model.get_text_embedding("This is the text to embed")
print("Text dim:", len(embeddings))
print("Text embed:", embeddings[:5])

query_embed_model = JinaEmbedding(
    api_key=jinaai_api_key,
    model="jina-embeddings-v3",
    # choose `retrieval.query` to get query embeddings, or choose your desired task type
    task="retrieval.query",
    # `dimensions` allows users to control the embedding dimension with minimal performance loss. by default it is 1024.
    # A number between 256 and 1024 is recommended.
    dimensions=512,
)

embeddings = query_embed_model.get_query_embedding(
    "This is the query to embed"
)
print("Query dim:", len(embeddings))
print("Query embed:", embeddings[:5])

Embed images and queries with Jina CLIP through JinaAI API

You can also encode your images and your queries using the JinaEmbedding class:

from llama_index.embeddings.jinaai import JinaEmbedding
from PIL import Image
import requests
from numpy import dot
from numpy.linalg import norm

embed_model = JinaEmbedding(
    api_key=jinaai_api_key,
    model="jina-clip-v1",
)

image_url = "https://encrypted-tbn0.gstatic.com/images?q=tbn:ANd9GcStMP8S3VbNCqOQd7QQQcbvC_FLa1HlftCiJw&s"
im = Image.open(requests.get(image_url, stream=True).raw)
print("Image:")
display(im)

image_embeddings = embed_model.get_image_embedding(image_url)
print("Image dim:", len(image_embeddings))
print("Image embed:", image_embeddings[:5])

text_embeddings = embed_model.get_text_embedding(
    "Logo of a pink blue llama on dark background"
)
print("Text dim:", len(text_embeddings))
print("Text embed:", text_embeddings[:5])

cos_sim = dot(image_embeddings, text_embeddings) / (
    norm(image_embeddings) * norm(text_embeddings)
)
print("Cosine similarity:", cos_sim)

You can also embed text in batches. The batch size is controlled by the embed_batch_size parameter, which defaults to 10 if not passed and should not be larger than 2048.

embed_model = JinaEmbedding(
    api_key=jinaai_api_key,
    model="jina-embeddings-v3",
    embed_batch_size=16,
    task="retrieval.passage",
)

embeddings = embed_model.get_text_embedding_batch(
    ["This is the text to embed", "More text can be provided in a batch"]
)
print(len(embeddings))
print(embeddings[0][:5])
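To see what a batch size means in practice, the sketch below splits a list of texts into groups of at most `batch_size` items. This is a conceptual illustration only: the `iter_batches` helper is a hypothetical name and is not the library's internal code, and the assumption that each group corresponds to one API request is ours.

```python
from math import ceil
from typing import Iterator, List


def iter_batches(texts: List[str], batch_size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `texts` with at most `batch_size` items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(texts), batch_size):
        yield texts[start : start + batch_size]


texts = [f"document {i}" for i in range(35)]
batches = list(iter_batches(texts, batch_size=16))

# 35 texts with a batch size of 16 give two full batches and one partial one.
print([len(batch) for batch in batches])
print(len(batches) == ceil(len(texts) / 16))
```

Under that assumption, a larger batch size means fewer round trips for the same number of texts, at the cost of larger individual requests.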

Let’s build a RAG pipeline using Jina AI Embeddings

!mkdir -p 'data/paul_graham/'
!wget 'https://raw.githubusercontent.com/run-llama/llama_index/main/docs/docs/examples/data/paul_graham/paul_graham_essay.txt' -O 'data/paul_graham/paul_graham_essay.txt'
import logging
import sys

logging.basicConfig(stream=sys.stdout, level=logging.INFO)
logging.getLogger().addHandler(logging.StreamHandler(stream=sys.stdout))

from llama_index.core import VectorStoreIndex, SimpleDirectoryReader
from llama_index.llms.openai import OpenAI
from llama_index.core.response.notebook_utils import display_source_node
from IPython.display import Markdown, display

# Load the essay downloaded above
documents = SimpleDirectoryReader("./data/paul_graham/").load_data()

your_openai_key = "YOUR_OPENAI_KEY"
llm = OpenAI(api_key=your_openai_key)

embed_model = JinaEmbedding(
    api_key=jinaai_api_key,
    model="jina-embeddings-v3",
    embed_batch_size=16,
    task="retrieval.passage",
)

# Build the index, embedding the documents with the Jina model
index = VectorStoreIndex.from_documents(
    documents=documents, embed_model=embed_model
)

search_query_retriever = index.as_retriever()
search_query_retrieved_nodes = search_query_retriever.retrieve(
    "What happened after the thesis?"
)

# Show each retrieved node along with its source text
for n in search_query_retrieved_nodes:
    display_source_node(n, source_length=2000)
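For intuition about the retrieval step, the toy sketch below ranks a few passages by cosine similarity to a query vector and keeps the top results. It is a simplified stand-in that assumes cosine similarity as the ranking metric. The vectors, passage labels, and the `top_k` helper are invented for illustration and do not reproduce the internals of VectorStoreIndex.

```python
from math import sqrt
from typing import List, Tuple


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two non-zero vectors of equal length."""
    dot_product = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    return dot_product / (norm_a * norm_b)


def top_k(
    query_vec: List[float],
    passages: List[Tuple[str, List[float]]],
    k: int = 2,
) -> List[Tuple[str, float]]:
    """Return the k passage labels with the highest similarity to the query."""
    scored = [
        (label, cosine_similarity(query_vec, vec)) for label, vec in passages
    ]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:k]


# Toy 2-dimensional embeddings; real embeddings have hundreds of dimensions.
query_vec = [1.0, 0.0]
passages = [
    ("passage_a", [1.0, 0.0]),
    ("passage_b", [0.0, 1.0]),
    ("passage_c", [1.0, 1.0]),
]

results = top_k(query_vec, passages, k=2)
for label, score in results:
    print(label, round(score, 4))
```

In the pipeline above, the passage vectors come from the Jina embedding model at indexing time and the query vector is computed when retrieve is called.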