DeepInfra
To use DeepInfra embeddings, import DeepInfraEmbedding from the @llamaindex/deepinfra package.
DeepInfra's website lists the available embedding models.
Installation
npm i llamaindex @llamaindex/deepinfra

import { Document, Settings, VectorStoreIndex } from "llamaindex";
import { DeepInfraEmbedding } from "@llamaindex/deepinfra";

// Update Embed Model
Settings.embedModel = new DeepInfraEmbedding();

// `essay` is assumed to be a string holding the text you want to index
const document = new Document({ text: essay, id_: "essay" });
const index = await VectorStoreIndex.fromDocuments([document]);
const queryEngine = index.asQueryEngine();
const query = "What is the meaning of life?";
const results = await queryEngine.query({
  query,
});

By default, DeepInfraEmbedding uses the sentence-transformers/clip-ViT-B-32 model. You can change the model by passing the model parameter to the constructor. For example:
import { Settings } from "llamaindex";
import { DeepInfraEmbedding } from "@llamaindex/deepinfra";

const model = "intfloat/e5-large-v2";
Settings.embedModel = new DeepInfraEmbedding({
  model,
});

You can also set the maxRetries and timeout parameters when initializing DeepInfraEmbedding for better control over the request behavior.
For example:
import { Settings } from "llamaindex";
import { DeepInfraEmbedding } from "@llamaindex/deepinfra";

const model = "intfloat/e5-large-v2";
const maxRetries = 5;
const timeout = 5000; // 5 seconds

Settings.embedModel = new DeepInfraEmbedding({
  model,
  maxRetries,
  timeout,
});

Standalone usage:
import { DeepInfraEmbedding } from "@llamaindex/deepinfra";
import { config } from "dotenv";

// For standalone usage, you need to configure DEEPINFRA_API_TOKEN in .env file
config();

const main = async () => {
  const model = "intfloat/e5-large-v2";
  const embeddings = new DeepInfraEmbedding({ model });
  const text = "What is the meaning of life?";
  const response = await embeddings.embed([text]);
  console.log(response);
};

main();

For questions or feedback, please contact us at feedback@deepinfra.com
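The standalone example above logs the raw embedding vectors. A common next step is to compare two vectors by cosine similarity. The sketch below is illustrative only: cosineSimilarity is a hypothetical helper, not part of @llamaindex/deepinfra, and the short arrays are toy values standing in for real embeddings, assuming each embedding is a plain array of numbers.

```typescript
// Hypothetical helper (not part of @llamaindex/deepinfra):
// cosine similarity between two embedding vectors of equal length.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error(`Dimension mismatch: ${a.length} vs ${b.length}`);
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) {
    // Convention chosen for this sketch: a zero vector is similar to nothing.
    return 0;
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Toy vectors for illustration; real embeddings have many more dimensions.
const queryVector = [1, 0, 1];
const docVector = [1, 1, 0];
console.log(cosineSimilarity(queryVector, docVector).toFixed(2)); // prints "0.50"
```

Scores close to 1 mean the two texts point in nearly the same direction in embedding space; scores near 0 mean they are largely unrelated. Only vectors produced by the same model can be compared meaningfully.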
API Reference