Migration Guide: Parse Upload Endpoint v1 to v2
This guide will help you migrate from the v1 Parse upload endpoint to the new v2 endpoint, which introduces a structured configuration approach and improved organization of parsing options.
⚠️ Alpha Version Warning: The v2 endpoint is currently in alpha (v2alpha1) and is subject to breaking changes until the stable release. We recommend testing thoroughly and being prepared for potential API changes during development.
Overview of Changes
The v2 endpoint replaces individual form parameters with a single JSON configuration string, providing:
- Better organization: Related options are grouped into logical sections
- Type safety: Structured validation with clear schemas
- Parse mode separation: Only relevant options for your chosen parse mode are required
- Extensibility: Easier to add new features without endpoint bloat
- Validation: Better error messages and configuration validation
Key Differences
v1 Endpoint
POST /api/v1/parsing/upload
Content-Type: multipart/form-data
- 70+ individual form parameters
- Flat parameter structure
- All parameters available regardless of parse mode
v2 Endpoint
POST /api/v2alpha1/parse/upload
Content-Type: multipart/form-data
- Single 'configuration' JSON string parameter
- Hierarchical, structured configuration
- Parse mode-specific options
- Strict validation with clear error messages
Migration Steps
1. Update the Endpoint URL
Before (v1):
POST https://api.cloud.llamaindex.ai/api/v1/parsing/upload
After (v2):
POST https://api.cloud.llamaindex.ai/api/v2alpha1/parse/upload
2. Replace Form Parameters with Configuration JSON
Instead of sending individual form parameters, you now send a single configuration parameter containing a JSON string.
Note: The file parameter remains unchanged: you still upload files the same way using multipart form data. Only the configuration approach has changed.
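As a rough illustration of step 2, the sketch below serializes a configuration dict into the single configuration form field. The helper name build_form_fields is hypothetical and not part of any SDK; the configuration content is taken from the simple parse_with_agent example later in this guide.

```python
import json


def build_form_fields(config: dict) -> dict:
    """Serialize a v2 configuration dict into the single form field."""
    # The whole configuration travels as one JSON string under the
    # 'configuration' form field; the file part is still sent separately.
    return {"configuration": json.dumps(config)}


config = {
    "parse_options": {
        "parse_mode": "parse_with_agent",
        "parse_with_agent_options": {},
    }
}
fields = build_form_fields(config)
print(fields["configuration"])
```

The resulting dict is meant to be passed as the non-file form data of a multipart POST by whichever HTTP client you already use, alongside the unchanged file part.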
3. Migration Checklist
Before migrating, review this checklist:
- Check for always-enabled parameters: adaptive_long_table, high_res_ocr, merge_tables_across_pages_in_markdown, and outlined_table_extraction are always enabled in v2
- Update page indexing: Change target_pages from 0-based to 1-based indexing
- Replace deprecated parameters: Remove gpt4o_mode, premium_mode, fast_mode, etc.
- Move language parameter: Move language to the parse mode specific ocr_parameters
- Update cache parameters: Replace invalidate_cache + do_not_cache with a single disable_cache
- Convert webhooks: Change from a single webhook_url to a webhook_configurations array
- Update prompts: Move prompt parameters to parse mode specific sections
- Test thoroughly: The alpha API may have additional breaking changes
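For the cache item in the checklist, one possible way to collapse the two v1 flags is sketched below. It assumes that disable_cache should be true whenever either v1 flag was true. That reading of "combines both" is an assumption, and merge_cache_flags is a hypothetical helper name.

```python
def merge_cache_flags(invalidate_cache: bool = False,
                      do_not_cache: bool = False) -> dict:
    """Collapse the two v1 cache flags into the single v2 field."""
    # Assumption: if either v1 flag was set, caching should be disabled in v2.
    return {"disable_cache": bool(invalidate_cache or do_not_cache)}


print(merge_cache_flags(invalidate_cache=True))
```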
Configuration Structure
The v2 configuration follows this structure:
{
  "client_name": "string (optional)",
  "parse_options": {
    "parse_mode": "preset|parse_with_llm|parse_with_agent|etc.",
    // Mode-specific options (see examples below)
  },
  "source_url": {
    "url": "string (optional)",
    "http_proxy": "string (optional)"
  },
  "webhook_configurations": [...],
  "input_options": {...},
  "crop_box": {...},
  "page_ranges": {...},
  "disable_cache": "boolean (optional)",
  "output_options": {...},
  "processing_control": {...}
}
Parse Mode Options
Important: You can only include the sub-object that corresponds to your chosen parse mode. For example, if you choose parse_mode: "preset", you can only include preset_options, not parse_with_llm_options.
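This rule can be checked on the client before sending a request. The sketch below assumes the options key is always named after the mode with an _options suffix, which matches the examples in this guide but is not confirmed for every mode. The function name unexpected_option_blocks is hypothetical.

```python
def unexpected_option_blocks(parse_options: dict) -> list:
    """Return *_options keys that do not match the chosen parse_mode."""
    mode = parse_options.get("parse_mode")
    # Assumption: the only allowed sub-object is "<parse_mode>_options".
    expected = f"{mode}_options"
    return [
        key for key in parse_options
        if key.endswith("_options") and key != expected
    ]


bad = {
    "parse_mode": "preset",
    "preset_options": {"preset": "scientific"},
    "parse_with_llm_options": {"model": "gpt-4o"},
}
print(unexpected_option_blocks(bad))
```

An empty list means the configuration contains only the sub-object for its mode; the server remains the authority on what is actually valid.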
Preset Mode
v1 Example:
curl -X POST \
  -H "Authorization: Bearer $LLAMA_CLOUD_API_KEY" \
  -F "file=@document.pdf" \
  -F "preset=scientific" \
  -F "language=en,es" \
  "https://api.cloud.llamaindex.ai/api/v1/parsing/upload"
v2 Example:
curl -X POST \
  -H "Authorization: Bearer $LLAMA_CLOUD_API_KEY" \
  -F "file=@document.pdf" \
  -F 'configuration={
    "parse_options": {
      "parse_mode": "preset",
      "preset_options": {
        "preset": "scientific",
        "ocr_parameters": {
          "languages": ["en", "es"]
        }
      }
    }
  }' \
  "https://api.cloud.llamaindex.ai/api/v2alpha1/parse/upload"
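The same translation can be sketched in Python. The helper below (a hypothetical name, not an SDK function) turns the v1 preset and comma-separated language fields into the v2 structure shown in the example, assuming language codes are separated by commas as in the v1 request.

```python
def preset_v1_to_v2(preset: str, language: str = "") -> dict:
    """Build a v2 preset configuration from the v1 form values."""
    preset_options = {"preset": preset}
    # v1 sent "en,es" as one string; v2 expects a list under ocr_parameters.
    languages = [code.strip() for code in language.split(",") if code.strip()]
    if languages:
        preset_options["ocr_parameters"] = {"languages": languages}
    return {
        "parse_options": {
            "parse_mode": "preset",
            "preset_options": preset_options,
        }
    }


config = preset_v1_to_v2("scientific", "en,es")
print(config)
```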
Parse with LLM Mode
v1 Example:
curl -X POST \
  -H "Authorization: Bearer $LLAMA_CLOUD_API_KEY" \
  -F "file=@document.pdf" \
  -F "parse_mode=parse_page_with_llm" \
  -F "model=gpt-4o" \
  -F "user_prompt=Extract key information" \
  -F "disable_ocr=true" \
  "https://api.cloud.llamaindex.ai/api/v1/parsing/upload"
v2 Example:
curl -X POST \
  -H "Authorization: Bearer $LLAMA_CLOUD_API_KEY" \
  -F "file=@document.pdf" \
  -F 'configuration={
    "parse_options": {
      "parse_mode": "parse_with_llm",
      "parse_with_llm_options": {
        "model": "gpt-4o",
        "prompts": {
          "user_prompt": "Extract key information"
        },
        "ignore": {
          "ignore_text_in_image": true
        }
      }
    }
  }' \
  "https://api.cloud.llamaindex.ai/api/v2alpha1/parse/upload"
External Provider Mode (Azure OpenAI)
v1 Example:
curl -X POST \
  -H "Authorization: Bearer $LLAMA_CLOUD_API_KEY" \
  -F "file=@document.pdf" \
  -F "parse_mode=parse_page_with_lvm" \
  -F "azure_openai_endpoint=https://myresource.openai.azure.com/" \
  -F "azure_openai_deployment_name=gpt-4-vision" \
  -F "azure_openai_key=your-key" \
  -F "azure_openai_api_version=2024-02-01" \
  "https://api.cloud.llamaindex.ai/api/v1/parsing/upload"
v2 Example:
curl -X POST \
  -H "Authorization: Bearer $LLAMA_CLOUD_API_KEY" \
  -F "file=@document.pdf" \
  -F 'configuration={
    "parse_options": {
      "parse_mode": "parse_with_external_provider",
      "parse_with_external_provider_options": {
        "azure_openai": {
          "endpoint": "https://myresource.openai.azure.com/",
          "deployment_name": "gpt-4-vision",
          "api_key": "your-key",
          "api_version": "2024-02-01"
        }
      }
    }
  }' \
  "https://api.cloud.llamaindex.ai/api/v2alpha1/parse/upload"
Parameter Mapping Reference
Basic Options
v1 Parameter | v2 Location | Notes |
---|---|---|
input_url | source_url.url | Moved to structured source configuration |
http_proxy | source_url.http_proxy | Same functionality |
max_pages | page_ranges.max_pages | Same functionality |
target_pages | page_ranges.target_pages | Breaking change: Now uses 1-based indexing (user inputs “1,2,3” instead of “0,1,2”) |
invalidate_cache and do_not_cache | disable_cache | Breaking change: Single boolean combines both v1 parameters |
language | parse_options.{mode}_options.ocr_parameters.languages | Same functionality |
Important: In v1, target_pages used 0-based indexing (e.g., “0,1,2” for pages 1, 2, 3). In v2, it uses 1-based indexing (e.g., “1,2,3” for the same pages) to be consistent with the rest of the platform.
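A minimal sketch of the index shift is shown below. It increments every integer in the string, so it handles the comma-separated form shown above; treating hyphenated ranges such as "0,2-4" the same way is an assumption about the accepted syntax. shift_target_pages is a hypothetical helper name.

```python
import re


def shift_target_pages(v1_target_pages: str) -> str:
    """Convert a 0-based v1 target_pages string to 1-based for v2."""
    # Add 1 to every page number; separators are left untouched.
    return re.sub(r"\d+", lambda m: str(int(m.group()) + 1), v1_target_pages)


print(shift_target_pages("0,1,2"))  # -> 1,2,3
```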
Always Enabled in v2 (Breaking Changes)
The following parameters are always enabled in v2 and cannot be disabled. We’re doing this to simplify calling LlamaParse and because these options give better results:
v1 Parameter | v2 Behavior | Breaking Change |
---|---|---|
adaptive_long_table | Always true | Breaking: Cannot be disabled in v2 |
high_res_ocr | Always true | Breaking: Cannot be disabled in v2 |
merge_tables_across_pages_in_markdown | Always true | Breaking: Cannot be disabled in v2 |
outlined_table_extraction | Always true | Breaking: Cannot be disabled in v2 |
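Because these options can no longer be turned off, it may help to scan existing v1 calls for places where they were explicitly disabled, since those calls could behave differently after migration. The sketch below assumes v1 parameters are held in a dict whose values are booleans or "true"/"false" strings; the function name is hypothetical.

```python
ALWAYS_ENABLED_IN_V2 = (
    "adaptive_long_table",
    "high_res_ocr",
    "merge_tables_across_pages_in_markdown",
    "outlined_table_extraction",
)


def explicitly_disabled(v1_params: dict) -> list:
    """List always-on v2 options that a v1 call had switched off."""
    return [
        name for name in ALWAYS_ENABLED_IN_V2
        # Form values may arrive as strings, so compare case-insensitively.
        if str(v1_params.get(name, "true")).lower() == "false"
    ]


print(explicitly_disabled({"high_res_ocr": "false", "max_pages": 10}))
```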
Removed/Deprecated Parameters
The following v1 parameters are not supported in v2:
v1 Parameter | v2 Status | Migration Path |
---|---|---|
use_vendor_multimodal_model | Removed (was deprecated) | Use parse_mode: "parse_with_external_provider" instead |
gpt4o_mode | Removed | Use parse_mode: "parse_with_llm" with model: "gpt-4o" |
gpt4o_api_key | Removed | Use parse_mode: "parse_with_external_provider" with appropriate provider config |
premium_mode | Removed | Use appropriate parse mode instead |
fast_mode | Removed | Use parse_mode: "parse_without_ai" for faster processing |
continuous_mode | Removed | No direct equivalent |
parsing_instruction | Renamed | Use parse_options.{mode}_options.prompts.user_prompt |
formatting_instruction | Renamed | Use parse_options.{mode}_options.prompts.user_prompt |
system_prompt | Renamed | Use parse_options.{mode}_options.prompts.system_prompt_append |
bounding_box | Renamed | Use crop_box object instead |
input_s3_path and input_s3_region | Removed | Not supported in v2alpha1 |
output_s3_path_prefix and output_s3_region | Removed | Not supported in v2alpha1 |
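The removed entries in the table above can be turned into a simple pre-migration check. The sketch below only encodes rows marked "Removed"; the hint strings paraphrase the migration paths listed in the table, and removed_parameter_hints is a hypothetical helper name.

```python
REMOVED_IN_V2 = {
    "use_vendor_multimodal_model": 'use parse_mode "parse_with_external_provider"',
    "gpt4o_mode": 'use parse_mode "parse_with_llm" with model "gpt-4o"',
    "gpt4o_api_key": 'use parse_mode "parse_with_external_provider"',
    "premium_mode": "use an appropriate parse mode",
    "fast_mode": 'use parse_mode "parse_without_ai"',
    "continuous_mode": "no direct equivalent",
    "input_s3_path": "not supported in v2alpha1",
    "input_s3_region": "not supported in v2alpha1",
    "output_s3_path_prefix": "not supported in v2alpha1",
    "output_s3_region": "not supported in v2alpha1",
}


def removed_parameter_hints(v1_params: dict) -> dict:
    """Map each removed v1 parameter in use to its migration hint."""
    return {name: REMOVED_IN_V2[name] for name in v1_params if name in REMOVED_IN_V2}


print(removed_parameter_hints({"fast_mode": True, "max_pages": 5}))
```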
Webhook Configuration Breaking Changes
v1 Parameter | v2 Location | Notes |
---|---|---|
webhook_url | webhook_configurations[0].webhook_url | Breaking: Now an array, but only the first entry is currently used |
webhook_configurations (string) | webhook_configurations (array) | Breaking: Format changed from JSON string to structured array |
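Converting a single v1 webhook_url could look like the sketch below. The webhook_events value "parse.done" is copied from the complete example later in this guide; whether that field is required, and which other event names exist, is not stated here, so treat the default as an assumption. webhook_v1_to_v2 is a hypothetical helper name.

```python
def webhook_v1_to_v2(webhook_url: str, events=("parse.done",)) -> list:
    """Wrap a v1 webhook_url in the v2 webhook_configurations array."""
    # Only the first entry is used at the moment, so one entry is emitted.
    return [{"webhook_url": webhook_url, "webhook_events": list(events)}]


print(webhook_v1_to_v2("https://example.com/webhook"))
```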
Not Yet Implemented in v2
The following options exist in the v2 schema but are not yet implemented:
- ignore_strikethrough_text (exists in schema but not processed)
- input_options.pdf.password (placeholder for future implementation)
Crop Box Options
v1 Parameter | v2 Location |
---|---|
bbox_top | crop_box.top |
bbox_bottom | crop_box.bottom |
bbox_left | crop_box.left |
bbox_right | crop_box.right |
Input Format Options
v1 Parameter | v2 Location |
---|---|
html_make_all_elements_visible | input_options.html.make_all_elements_visible |
html_remove_fixed_elements | input_options.html.remove_fixed_elements |
html_remove_navigation_elements | input_options.html.remove_navigation_elements |
disable_image_extraction | input_options.pdf.disable_image_extraction |
spreadsheet_extract_sub_tables | input_options.spreadsheet.detect_sub_tables_in_sheets |
Ignore Options (Parse Mode Specific)
v1 Parameter | v2 Location | Available In Modes |
---|---|---|
skip_diagonal_text | parse_options.{mode}_options.ignore.ignore_diagonal_text | All modes except preset |
disable_ocr | parse_options.{mode}_options.ignore.ignore_text_in_image | All modes except preset |
Output Options
v1 Parameter | v2 Location |
---|---|
annotate_links | output_options.markdown.annotate_links |
page_prefix | output_options.markdown.pages.prefix |
page_separator | output_options.markdown.pages.custom_page_separator |
page_suffix | output_options.markdown.pages.suffix |
hide_headers | output_options.markdown.headers_footers.hide_headers |
hide_footers | output_options.markdown.headers_footers.hide_footers |
compact_markdown_table | output_options.markdown.tables.compact_markdown_tables |
output_tables_as_HTML | output_options.markdown.tables.output_tables_as_markdown (inverted) |
guess_xlsx_sheet_name | output_options.tables_as_spreadsheet.guess_sheet_name |
extract_layout | output_options.extract_layout.enable |
take_screenshot | output_options.screenshots.enable |
output_pdf_of_document | output_options.export_pdf.enable |
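Most rows in this table are plain renames, but output_tables_as_HTML is marked as inverted. The sketch below reads "inverted" as: the v2 flag is the logical negation of the v1 flag. That interpretation is an assumption, and the helper name is hypothetical.

```python
def tables_option_v1_to_v2(output_tables_as_html: bool) -> dict:
    """Translate v1 output_tables_as_HTML into the inverted v2 flag."""
    # Assumption: HTML tables in v1 means "not markdown tables" in v2.
    return {
        "output_options": {
            "markdown": {
                "tables": {"output_tables_as_markdown": not output_tables_as_html}
            }
        }
    }


print(tables_option_v1_to_v2(True))
```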
Processing Control
v1 Parameter | v2 Location |
---|---|
job_timeout_in_seconds | processing_control.timeouts.base_in_seconds |
job_timeout_extra_time_per_page_in_seconds | processing_control.timeouts.extra_time_per_page_in_seconds |
page_error_tolerance | processing_control.job_failure_conditions.allowed_page_failure_ratio |
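The one-to-one rows in the mapping tables above lend themselves to a table-driven translation. The sketch below covers only a subset of the rows and deliberately leaves out anything that needs special handling (parse mode options, target_pages, cache flags, webhooks, inverted flags). The names V1_TO_V2_PATHS, set_path and map_flat_params are hypothetical, and values are copied as-is, so converting form strings to booleans or numbers is left to the caller.

```python
# Subset of the direct renames listed in the tables above.
V1_TO_V2_PATHS = {
    "input_url": "source_url.url",
    "http_proxy": "source_url.http_proxy",
    "max_pages": "page_ranges.max_pages",
    "bbox_top": "crop_box.top",
    "bbox_bottom": "crop_box.bottom",
    "bbox_left": "crop_box.left",
    "bbox_right": "crop_box.right",
    "page_prefix": "output_options.markdown.pages.prefix",
    "hide_headers": "output_options.markdown.headers_footers.hide_headers",
    "extract_layout": "output_options.extract_layout.enable",
    "job_timeout_in_seconds": "processing_control.timeouts.base_in_seconds",
}


def set_path(config: dict, dotted_path: str, value) -> None:
    """Set a nested value, creating intermediate dicts as needed."""
    *parents, leaf = dotted_path.split(".")
    node = config
    for key in parents:
        node = node.setdefault(key, {})
    node[leaf] = value


def map_flat_params(v1_params: dict):
    """Return (v2 config, v1 parameters that were not mapped)."""
    config, unmapped = {}, {}
    for name, value in v1_params.items():
        path = V1_TO_V2_PATHS.get(name)
        if path is None:
            unmapped[name] = value  # needs manual, mode-specific handling
        else:
            set_path(config, path, value)
    return config, unmapped


config, unmapped = map_flat_params(
    {"max_pages": 10, "page_prefix": "## Page ", "hide_headers": True, "model": "gpt-4o"}
)
print(config)
print(unmapped)
```

Returning the unmapped parameters separately keeps the mode-specific and breaking-change cases visible instead of silently dropping them.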
Complete Migration Examples
Simple Document Parsing
v1:
curl -X POST \
  -H "Authorization: Bearer $LLAMA_CLOUD_API_KEY" \
  -F "file=@document.pdf" \
  -F "parse_mode=parse_page_with_agent" \
  "https://api.cloud.llamaindex.ai/api/v1/parsing/upload"
v2:
curl -X POST \
  -H "Authorization: Bearer $LLAMA_CLOUD_API_KEY" \
  -F "file=@document.pdf" \
  -F 'configuration={
    "parse_options": {
      "parse_mode": "parse_with_agent",
      "parse_with_agent_options": {}
    }
  }' \
  "https://api.cloud.llamaindex.ai/api/v2alpha1/parse/upload"
Complex Configuration with Custom Output
v1:
curl -X POST \
  -H "Authorization: Bearer $LLAMA_CLOUD_API_KEY" \
  -F "file=@document.pdf" \
  -F "parse_mode=parse_page_with_llm" \
  -F "model=gpt-4o" \
  -F "user_prompt=Extract financial data" \
  -F "max_pages=10" \
  -F "page_prefix=## Page " \
  -F "hide_headers=true" \
  -F "extract_layout=true" \
  -F "webhook_url=https://example.com/webhook" \
  "https://api.cloud.llamaindex.ai/api/v1/parsing/upload"
v2:
curl -X POST \
  -H "Authorization: Bearer $LLAMA_CLOUD_API_KEY" \
  -F "file=@document.pdf" \
  -F 'configuration={
    "parse_options": {
      "parse_mode": "parse_with_llm",
      "parse_with_llm_options": {
        "model": "gpt-4o",
        "prompts": {
          "user_prompt": "Extract financial data"
        }
      }
    },
    "page_ranges": {
      "max_pages": 10
    },
    "output_options": {
      "markdown": {
        "pages": {
          "prefix": "## Page "
        },
        "headers_footers": {
          "hide_headers": true
        }
      },
      "extract_layout": {
        "enable": true
      }
    },
    "webhook_configurations": [{
      "webhook_url": "https://example.com/webhook",
      "webhook_events": ["parse.done"]
    }]
  }' \
  "https://api.cloud.llamaindex.ai/api/v2alpha1/parse/upload"
Python SDK Migration
v1 (llama-parse):
from llama_cloud_services import LlamaParse

parser = LlamaParse(
    api_key="llx-...",
    result_type="markdown",
    parsing_instruction="Extract key information",
    max_pages=10
)
result = parser.load_data("document.pdf")
v2 (llama-cloud-services):
from llama_cloud_services import LlamaParse

parser = LlamaParse(
    api_key="llx-...",
    preset="scientific",
    max_pages=10
)
result = parser.parse("document.pdf")

# Advanced configuration
config = {
    "parse_options": {
        "parse_mode": "parse_with_llm",
        "parse_with_llm_options": {
            "prompts": {
                "user_prompt": "Extract key information"
            }
        }
    },
    "page_ranges": {"max_pages": 10}
}

parser = LlamaParse(api_key="llx-...", configuration=config)
result = parser.parse("document.pdf")
Error Handling
v2 provides more detailed error messages:
v1 Errors:
400: Invalid parameter combination
v2 Errors:
{
  "detail": [
    {
      "type": "value_error",
      "loc": ["parse_options", "parse_with_llm_options"],
      "msg": "parse_with_llm_options can only be used with parse_mode 'parse_with_llm'",
      "input": {...}
    }
  ]
}
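A structured error body like the one above can be turned into short, readable messages on the client. The sketch below assumes the response follows the shape shown (a detail list with loc and msg entries) and falls back to the raw value otherwise. The sample body replaces the elided input field with an empty object so that it is valid JSON, and format_validation_errors is a hypothetical helper name.

```python
import json


def format_validation_errors(body: str) -> list:
    """Render each validation error as '<dotted location>: <message>'."""
    detail = json.loads(body).get("detail", [])
    if not isinstance(detail, list):
        # Some errors may carry a plain string instead of a list.
        return [str(detail)]
    lines = []
    for err in detail:
        location = ".".join(str(part) for part in err.get("loc", []))
        lines.append(f"{location}: {err.get('msg', '')}")
    return lines


sample = json.dumps({
    "detail": [{
        "type": "value_error",
        "loc": ["parse_options", "parse_with_llm_options"],
        "msg": "parse_with_llm_options can only be used with parse_mode 'parse_with_llm'",
        "input": {},  # shortened; the real payload echoes the offending input
    }]
})
for line in format_validation_errors(sample):
    print(line)
```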
The v1 endpoint will remain available for the foreseeable future, so you can migrate at your own pace. However, new features and improvements will be focused on the v2 endpoint structure.