Heroku LLM Managed Inference Embedding

The llama-index-embeddings-heroku package provides a LlamaIndex integration for building applications with embedding models on Heroku’s Managed Inference platform. It lets you connect to and use embedding models deployed on Heroku’s infrastructure.

%pip install llama-index-embeddings-heroku

First, create an app in Heroku:

heroku create $APP_NAME

Create and attach an embedding model to your app:

heroku ai:models:create -a $APP_NAME cohere-embed-multilingual --as EMBEDDING

Export the required configuration variables:

export EMBEDDING_KEY=$(heroku config:get EMBEDDING_KEY -a $APP_NAME)
export EMBEDDING_MODEL_ID=$(heroku config:get EMBEDDING_MODEL_ID -a $APP_NAME)
export EMBEDDING_URL=$(heroku config:get EMBEDDING_URL -a $APP_NAME)

Then initialize the embedding model and generate embeddings in Python:

# Import the Heroku embedding integration
from llama_index.embeddings.heroku import HerokuEmbedding
# Initialize the Heroku Embedding
embedding_model = HerokuEmbedding()
# Get a single embedding
embedding = embedding_model.get_text_embedding("Hello, world!")
print(f"Embedding dimension: {len(embedding)}")
# Get embeddings for multiple texts
texts = ["Hello", "world", "from", "Heroku"]
embeddings = embedding_model.get_text_embedding_batch(texts)
print(f"Number of embeddings: {len(embeddings)}")
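
Once you have embeddings, a common next step is to compare them. The following is a minimal sketch, assuming each embedding is a plain list of floats as returned by get_text_embedding; cosine_similarity is a hypothetical helper written for illustration and is not part of the integration.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine similarity of two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("Vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("Cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy vectors stand in for real embeddings here (illustrative values only).
same_direction = cosine_similarity([1.0, 2.0], [2.0, 4.0])
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])

# With real embeddings from the batch call above, usage would look like:
# similarity = cosine_similarity(embeddings[0], embeddings[1])
```

Values close to 1 indicate vectors pointing in nearly the same direction, while values near 0 indicate unrelated directions.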

The integration automatically reads from environment variables:

import os
from llama_index.embeddings.heroku import HerokuEmbedding
# Set environment variables
os.environ["EMBEDDING_KEY"] = "your-embedding-key"
os.environ["EMBEDDING_URL"] = "https://us.inference.heroku.com"
os.environ["EMBEDDING_MODEL_ID"] = "cohere-embed-multilingual"
# Initialize without parameters; settings are read from the environment
embedding_model = HerokuEmbedding()
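
Because the configuration comes from the environment, it can help to confirm that all three variables are set before creating the model. This is a minimal sketch: the variable names come from the setup steps above, while REQUIRED_VARS and missing_embedding_settings are hypothetical names introduced for illustration, not part of the integration.

```python
import os
from typing import List, Mapping, Optional

# Variable names taken from the export commands in the setup steps.
REQUIRED_VARS = ("EMBEDDING_KEY", "EMBEDDING_URL", "EMBEDDING_MODEL_ID")


def missing_embedding_settings(env: Optional[Mapping[str, str]] = None) -> List[str]:
    """Return the names of required variables that are unset or blank."""
    if env is None:
        env = os.environ
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]


# Example with an explicit mapping instead of the real environment.
example_env = {"EMBEDDING_KEY": "your-embedding-key", "EMBEDDING_URL": ""}
missing = missing_embedding_settings(example_env)
if missing:
    print("Missing settings: " + ", ".join(missing))
```

Passing a mapping explicitly keeps the check easy to test; calling missing_embedding_settings() with no arguments inspects the real process environment.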

You can also pass parameters directly:

import os
from llama_index.embeddings.heroku import HerokuEmbedding
embedding_model = HerokuEmbedding(
    model=os.getenv("EMBEDDING_MODEL_ID", "cohere-embed-multilingual"),
    api_key=os.getenv("EMBEDDING_KEY", "your-embedding-key"),
    base_url=os.getenv("EMBEDDING_URL", "https://us.inference.heroku.com"),
    timeout=60.0,
)
print(embedding_model.get_text_embedding("Hello Heroku!"))

For a complete list of available models, see the Heroku Managed Inference documentation.

The integration includes error handling for common configuration issues:

  • Missing API key
  • Invalid inference URL
  • Missing model configuration
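
An invalid inference URL can also be caught early on the client side. The sketch below uses only the Python standard library and assumes a usable URL has an http or https scheme and a host; looks_like_inference_url is a hypothetical helper, and the integration's own validation may apply different rules.

```python
from urllib.parse import urlparse


def looks_like_inference_url(url: str) -> bool:
    """Return True if the URL has an http(s) scheme and a non-empty host."""
    parsed = urlparse(url.strip())
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


# The first value is the URL used in the examples above; the second omits the scheme.
valid = looks_like_inference_url("https://us.inference.heroku.com")
invalid = looks_like_inference_url("us.inference.heroku.com")
```

A check like this only confirms the shape of the URL; it does not verify that the endpoint is reachable or that the key is accepted.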

For more information about Heroku Managed Inference, visit the official documentation.
