Centralized Provider Configuration
Centralized LLM provider configuration is the deploy-time registry of model endpoints your self-hosted LlamaParse deployment can call.
The simpler provider settings under config.llms.* register the default
endpoints for each provider. config.llms.providerConfigs adds explicit
entries when you need multiple credentials, custom endpoints, custom headers,
provider-native model names, or an API gateway such as Bifrost, Portkey, or
LiteLLM.
Each entry maps a LlamaParse model_id to one provider endpoint:
- model_id is the LlamaParse model identifier that platform features request.
- provider says which client implementation to use.
- provider_model_name is the model or deployment name sent to the upstream provider or gateway. If omitted, LlamaParse uses its built-in provider-native name for that model_id.
- priority and tags are routing hints. They only affect features that look at them, such as parsing tier ordering or product-specific model preference.
A recognized model_id is not an upstream availability guarantee. The provider
account, region, Azure deployment, Vertex location, or gateway must still be
able to serve the configured provider-native model.
Use Cases
- Custom API Gateways: Route LLM requests through gateways like Bifrost, Portkey, or LiteLLM
- Custom Endpoints: Use custom base URLs for proxies or regional endpoints
- Custom Headers: Add custom HTTP headers per provider instance (e.g., for gateway authentication)
- Multiple Credentials: Configure multiple provider instances with different API keys
- Managed Embeddings: Route managed Index embeddings through centrally configured OpenAI-compatible embedding providers
Configuration Structure
Add provider configurations to your Helm values under config.llms.providerConfigs:

    config:
      llms:
        providerConfigs:
          - id: "my-config-name"            # User-defined identifier (can be anything)
            provider: "openai"              # Provider type accepted by this config entry
            model_id: "openai-gpt-4o"       # Recognized LlamaParse model identifier (fixed values, see below)
            provider_model_name: "gpt-4o"   # Optional: provider-specific model name override, usually only needed if using an LLM gateway that requires a specific model name
            enabled: true                   # Enable/disable this configuration
            tags:                           # Optional: see "Controlling Parsing Model Order"
              - "llamaparse-tier:agentic"
            priority: 100                   # Optional: higher = preferred within a tier (default 100)
            credentials:                    # Provider-specific credentials
              api_key: "sk-..."
              base_url: "https://custom.api.endpoint"   # Optional custom endpoint
              headers:                      # Custom HTTP headers (optional)
                X-Custom-Header: "value"

Most entries do not need tags. Use tags when a feature documents a specific tag convention, such as parsing tier tags.
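The Multiple Credentials use case can be sketched with two entries for the same provider, each carrying its own key. This is a minimal illustration built only from the fields in the structure above. The id values and the split of models between keys are hypothetical, and this guide does not describe how a deployment chooses between entries that share a model_id, so the sketch gives each entry a different one.

```yaml
config:
  llms:
    providerConfigs:
      # Hypothetical: first API key, used for GPT-4o requests
      - id: "openai-key-a"
        provider: "openai"
        model_id: "openai-gpt-4o"
        enabled: true
        credentials:
          api_key: "sk-..."        # placeholder for the first key

      # Hypothetical: second API key, used for embeddings
      - id: "openai-key-b"
        provider: "openai"
        model_id: "openai-text-embedding-3-small"
        enabled: true
        credentials:
          api_key: "sk-..."        # placeholder for the second key
```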
Supported Providers
These provider sections document the provider values covered by this guide and
the recognized model_id values for each provider.
OpenAI
    - id: "openai-primary"
      provider: "openai"
      model_id: "openai-gpt-4o"
      credentials:
        api_key: "sk-..."                        # Required
        org_id: "org-..."                        # Optional
        base_url: "https://api.openai.com/v1"    # Optional
        headers:                                 # Optional
          X-Custom-Header: "value"

Recognized model_id values:

- openai-gpt-4o
- openai-gpt-4o-mini
- openai-gpt-4o-mini-text-only
- openai-gpt-4o-mini-multimodal
- openai-gpt-4-1
- openai-gpt-4-1-mini
- openai-gpt-4-1-nano
- openai-gpt-5
- openai-gpt-5-mini
- openai-gpt-5-nano
- openai-gpt-5-2
- openai-gpt-5-4
- openai-gpt-5-4-mini
- openai-gpt-5-4-nano
- openai-text-embedding-3-small
- openai-text-embedding-3-large

The openai-gpt-4o-mini-text-only and openai-gpt-4o-mini-multimodal ids both
route to the gpt-4o-mini model; they select the text-only vs screenshot-based
cost_effective parsing path, while the bare
openai-gpt-4o-mini remains the legacy text-only binding.
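For the Managed Embeddings use case, an embedding model can be registered with the same entry shape as a chat model. The sketch below assumes an OpenAI-compatible gateway at a placeholder URL; the id, base_url, and header value are illustrative and not taken from a real deployment.

```yaml
- id: "openai-embeddings"                        # hypothetical identifier
  provider: "openai"
  model_id: "openai-text-embedding-3-small"
  enabled: true
  credentials:
    api_key: "sk-..."
    base_url: "https://gateway.example.com/v1"   # placeholder OpenAI-compatible endpoint
    headers:                                     # optional, e.g. for gateway authentication
      X-Custom-Header: "value"
```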
Anthropic
    - id: "anthropic-primary"
      provider: "anthropic"
      model_id: "anthropic-sonnet-4.5"
      credentials:
        api_key: "sk-ant-..."                    # Required
        base_url: "https://api.anthropic.com"    # Optional
        headers:                                 # Optional
          X-Custom-Header: "value"

Recognized model_id values:

- anthropic-sonnet-4.6
- anthropic-sonnet-4.5
- anthropic-sonnet-4.0
- anthropic-sonnet-3.7
- anthropic-sonnet-3.5
- anthropic-sonnet-3.5-v2
- anthropic-haiku-4.5
- anthropic-haiku-3.5
- anthropic-opus-4.6
- anthropic-opus-4.5

Azure OpenAI
    - id: "azure-sweden"
      provider: "azure"
      model_id: "openai-gpt-4o"
      credentials:
        api_key: "..."                                        # Required
        endpoint: "https://your-resource.openai.azure.com"    # Required
        deployment_id: "gpt-4o"                               # Optional
        api_version: "2024-08-06"                             # Optional
        headers:                                              # Optional
          X-Custom-Header: "value"

Recognized model_id values:

- openai-gpt-4o
- openai-gpt-4o-mini
- openai-gpt-4o-mini-text-only
- openai-gpt-4o-mini-multimodal
- openai-gpt-4-1
- openai-gpt-4-1-mini
- openai-gpt-4-1-nano
- openai-gpt-5
- openai-gpt-5-mini
- openai-gpt-5-nano
- openai-gpt-5-2
- openai-gpt-5-4
- openai-gpt-5-4-mini
- openai-gpt-5-4-nano
- openai-text-embedding-3-small
- openai-text-embedding-3-large

The openai-gpt-4o-mini-text-only and openai-gpt-4o-mini-multimodal ids both
route to the gpt-4o-mini model (default deployment name gpt-4o-mini); they
select the text-only vs screenshot-based cost_effective parsing path, while
the bare openai-gpt-4o-mini remains the legacy text-only binding.
For Azure-hosted embedding models, set model_id to the LlamaParse model identifier and credentials.deployment_id to your Azure deployment name.
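As a sketch of that mapping, the entry below pairs a LlamaParse embedding model_id with an Azure deployment. The id and the deployment name are hypothetical placeholders; substitute the deployment name from your own Azure resource.

```yaml
- id: "azure-embeddings"                         # hypothetical identifier
  provider: "azure"
  model_id: "openai-text-embedding-3-small"      # LlamaParse model identifier
  enabled: true
  credentials:
    api_key: "..."
    endpoint: "https://your-resource.openai.azure.com"
    deployment_id: "my-embedding-deployment"     # hypothetical: your Azure deployment name
```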
Google Gemini
    - id: "gemini-primary"
      provider: "gemini"
      model_id: "gemini-2.5-flash"
      credentials:
        api_key: "AIza..."                                       # Required
        base_url: "https://generativelanguage.googleapis.com"    # Optional
        headers:                                                 # Optional
          X-Custom-Header: "value"

Recognized model_id values:

- gemini-3.1-pro
- gemini-3.1-flash-lite
- gemini-3.0-pro
- gemini-3.0-flash
- gemini-2.5-pro
- gemini-2.5-flash
- gemini-2.5-flash-lite
- gemini-2.0-flash
- gemini-2.0-flash-lite

Google Vertex AI
Vertex AI serves both Gemini and Anthropic Claude models. By default, Vertex
configs use service-account authentication with project_id + location +
credentials (JSON-serialised service account key). Set
credentials_type: "proxy" when a configured gateway handles Vertex
authentication.

    - id: "vertex-primary"
      provider: "vertexai"
      model_id: "gemini-2.5-flash"
      credentials:
        project_id: "your-gcp-project-id"    # Required
        location: "us-central1"              # Required
        credentials: |-                      # Required (service account key JSON)
          {
            "type": "service_account",
            "project_id": "your-gcp-project-id",
            "private_key_id": "...",
            "private_key": "-----BEGIN PRIVATE KEY-----\n...\n-----END PRIVATE KEY-----\n",
            "client_email": "...@your-gcp-project-id.iam.gserviceaccount.com",
            "client_id": "..."
          }
        base_url: "https://us-central1-aiplatform.googleapis.com"   # Optional
        headers:                             # Optional
          X-Custom-Header: "value"

Recognized model_id values (Gemini on Vertex):

- gemini-3.1-pro
- gemini-3.1-flash-lite
- gemini-3.0-pro
- gemini-3.0-flash
- gemini-2.5-pro
- gemini-2.5-flash
- gemini-2.5-flash-lite
- gemini-2.0-flash
- gemini-2.0-flash-lite

Recognized model_id values (Anthropic on Vertex):

- anthropic-sonnet-4.6
- anthropic-sonnet-4.5
- anthropic-sonnet-4.0
- anthropic-sonnet-3.7
- anthropic-haiku-4.5
- anthropic-opus-4.6
- anthropic-opus-4.5

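A Claude model on Vertex would presumably use the same service-account fields as the Gemini entry, with only the model_id changed. The sketch below is an illustration under that assumption: the id is hypothetical, the location is a placeholder, and the service account JSON is abbreviated. As noted earlier, the chosen project and location must actually be able to serve the model.

```yaml
- id: "vertex-claude"                  # hypothetical identifier
  provider: "vertexai"
  model_id: "anthropic-sonnet-4.5"
  enabled: true
  credentials:
    project_id: "your-gcp-project-id"
    location: "us-east5"               # assumed placeholder; use a location where your project can serve this model
    credentials: |-                    # service account key JSON (abbreviated placeholder)
      {
        "type": "service_account",
        "project_id": "your-gcp-project-id",
        "private_key": "...",
        "client_email": "..."
      }
```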
Controlling Parsing Model Order (per tier)
Parsing runs each tier against an ordered list of models: it tries
the first model and falls back to the next whenever a model is unavailable or
fails. On a self-hosted / BYOC deployment you control which models a tier uses,
and in what order, by tagging provider config entries and setting their
priority.
This section only affects parsing tier membership and ordering. It does not control embedding, chat, or extraction model selection.
Tiers you can configure
| Tag | Tier | Used for |
|---|---|---|
| llamaparse-tier:cost_effective | Cost Effective | the low-cost parsing tier |
| llamaparse-tier:agentic | Agentic | the default agentic transcription |
| llamaparse-tier:agentic_plus | Agentic Plus | agentic transcription with the self-critique judges |
| llamaparse-tier:specialized_chart_parsing | Chart parsing | the dedicated chart-to-table model |
How ordering works
- Add llamaparse-tier:<tier> to the tags of every config entry that should be part of that tier's list. Only entries carrying the tag join the list.
- Within a tier, models are tried in descending priority order: the highest priority value is the primary model, the next is the first fallback, and so on. priority defaults to 100, so give each entry in a tier a distinct value to make the order explicit.
- A model still only runs when its credentials validate at startup and the provider is reachable; an unavailable model is skipped and the next one is tried.
- If a tier has no tagged entries, the deployment falls back to the built-in default model list for that tier in the deployed application version, so you only need to tag the tiers you want to override.
- The same entry can carry multiple tier tags (e.g. a model used by both agentic and agentic_plus).
Example: ordering the BYOC agentic tier
This makes Sonnet 4.5 the primary agentic model, GPT-4.1 the first fallback, and Haiku 4.5 the last resort:

    config:
      llms:
        providerConfigs:
          - id: "agentic-primary-sonnet"
            provider: "anthropic"
            model_id: "anthropic-sonnet-4.5"
            enabled: true
            tags:
              - "llamaparse-tier:agentic"
              - "llamaparse-tier:agentic_plus"
            priority: 30   # highest → tried first (primary)
            credentials:
              api_key: "sk-ant-..."

          - id: "agentic-fallback-gpt"
            provider: "openai"
            model_id: "openai-gpt-4-1"
            enabled: true
            tags:
              - "llamaparse-tier:agentic"
            priority: 20   # middle → first fallback
            credentials:
              api_key: "sk-..."

          - id: "agentic-fallback-haiku"
            provider: "anthropic"
            model_id: "anthropic-haiku-4.5"
            enabled: true
            tags:
              - "llamaparse-tier:agentic"
            priority: 10   # lowest → last fallback
            credentials:
              api_key: "sk-ant-..."

Resulting agentic order: anthropic-sonnet-4.5 → openai-gpt-4-1 →
anthropic-haiku-4.5. The agentic_plus tier here resolves to just
anthropic-sonnet-4.5 (the only entry tagged for it); add more
llamaparse-tier:agentic_plus entries with their own priorities to extend it.
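As a sketch of that extension, the hypothetical entry below would be appended under the same providerConfigs list. Because it is tagged only for agentic_plus and its priority is lower than Sonnet's 30, the descending-priority rule would give an agentic_plus order of anthropic-sonnet-4.5 → anthropic-opus-4.5, while the agentic order stays unchanged. The id and the choice of model are illustrative.

```yaml
config:
  llms:
    providerConfigs:
      # ...the three entries from the example above...

      - id: "agentic-plus-fallback-opus"   # hypothetical identifier
        provider: "anthropic"
        model_id: "anthropic-opus-4.5"
        enabled: true
        tags:
          - "llamaparse-tier:agentic_plus" # joins only the agentic_plus list
        priority: 20                       # below Sonnet's 30, so tried second
        credentials:
          api_key: "sk-ant-..."
```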
Example: vision-capable GPT-4o-mini for the cost_effective tier
The GPT-4o-mini runner-binding variants let a deployment run the
cost_effective tier with the screenshot-based transcription path as primary
and the text-only path as last resort — useful when documents contain
rasterized tables that are invisible to text-only transcription:

    config:
      llms:
        providerConfigs:
          - id: "ce-gpt-4o-mini-vision"
            provider: "openai"   # or "azure"
            model_id: "openai-gpt-4o-mini-multimodal"
            enabled: true
            tags:
              - "llamaparse-tier:cost_effective"
            priority: 20   # highest → tried first (primary)
            credentials:
              api_key: "sk-..."

          - id: "ce-gpt-4o-mini-text"
            provider: "openai"
            model_id: "openai-gpt-4o-mini-text-only"
            enabled: true
            tags:
              - "llamaparse-tier:cost_effective"
            priority: 10   # lowest → last fallback
            credentials:
              api_key: "sk-..."

Resulting cost_effective order: openai-gpt-4o-mini-multimodal →
openai-gpt-4o-mini-text-only. With only these two entries tagged, the tier’s
built-in default list is fully replaced — other default models are not
attempted.
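The primary entry could also be served from Azure, as the comment in the example hints. The sketch below assumes the Azure credential fields from the Azure OpenAI section apply unchanged to this model_id; the id is hypothetical, and deployment_id uses the default deployment name mentioned there, which you would replace with your own if it differs.

```yaml
config:
  llms:
    providerConfigs:
      - id: "ce-gpt-4o-mini-vision-azure"   # hypothetical identifier
        provider: "azure"
        model_id: "openai-gpt-4o-mini-multimodal"
        enabled: true
        tags:
          - "llamaparse-tier:cost_effective"
        priority: 20                        # highest in the tier, tried first
        credentials:
          api_key: "..."
          endpoint: "https://your-resource.openai.azure.com"
          deployment_id: "gpt-4o-mini"      # default deployment name; replace with yours if different
```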
Common Use Cases
Custom API Gateway
Use a custom API gateway or proxy (e.g., Bifrost, Portkey, LiteLLM):

    config:
      llms:
        providerConfigs:
          - id: "portkey-openai"
            provider: "openai"
            model_id: "openai-gpt-4o-mini"
            provider_model_name: "@openai/gpt-4o-mini"
            enabled: true
            credentials:
              api_key: "your-portkey-api-key"
              base_url: "https://api.portkey.ai/v1"
              headers:
                x-portkey-api-key: "your-portkey-api-key"

          - id: "portkey-anthropic"
            provider: "anthropic"
            model_id: "anthropic-sonnet-4.5"
            provider_model_name: "@anthropic/claude-sonnet-4-5"
            enabled: true
            credentials:
              api_key: "your-portkey-api-key"
              base_url: "https://api.portkey.ai"
              headers:
                x-portkey-api-key: "your-portkey-api-key"
                x-portkey-strict-open-ai-compliance: "False"

          - id: "portkey-vertex-gemini"
            provider: "vertexai"
            model_id: "gemini-2.5-flash"
            provider_model_name: "gemini-2.5-flash"
            enabled: true
            credentials:
              credentials_type: "proxy"
              project_id: "your-gcp-project-id"
              location: "us-central1"
              base_url: "https://api.portkey.ai/v1"
              headers:
                x-portkey-api-key: "your-portkey-api-key"
                x-portkey-provider: "@your-portkey-vertex-provider-id"
                x-portkey-strict-open-ai-compliance: "false"

Verification
After configuration, verify your setup:
- Verify in Admin UI: Check the LlamaCloud admin interface for available models
- Test parsing: Upload a document to LlamaParse to confirm the configured providers are working