
REST API Guide

Parse REST API endpoints, request/response shapes, expand values, and presigned URL patterns.

For the complete SDK and API documentation, see the Parse API Reference.

This page covers REST API usage patterns: endpoint overview, request examples, response shapes, and the expand parameter quick-reference.

| Method | URL | Use case |
| --- | --- | --- |
| POST | /api/v2/parse | Parse a file by ID or URL (JSON body) |
| GET | /api/v2/parse | List and filter parse jobs with pagination |
| GET | /api/v2/parse/{job_id} | Check job status and retrieve results |

All endpoints require the header Authorization: Bearer YOUR_API_KEY.

To start a parse job, POST a JSON body with file_id (or source_url), tier, version, and any options:

curl -X POST 'https://api.cloud.llamaindex.ai/api/v2/parse' \
  -H 'Content-Type: application/json' \
  -H "Authorization: Bearer $LLAMA_CLOUD_API_KEY" \
  --data '{
    "file_id": "<file_id>",
    "tier": "agentic",
    "version": "latest"
  }'

The response includes a job.id. Use it to retrieve results with the expand query parameter (see Retrieving results below). See Configuring Parse for every available option.
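The same request can be issued from Python. The following is a minimal sketch using only the standard library, not the official SDK. The helper names build_parse_request and submit_parse_job are hypothetical; the endpoint, headers, and body fields are taken from the curl example above, and the API key is assumed to be in the LLAMA_CLOUD_API_KEY environment variable.

```python
import json
import os
import urllib.request

API_BASE = "https://api.cloud.llamaindex.ai/api/v2/parse"


def build_parse_request(file_id, tier="agentic", version="latest", api_key=None):
    """Build (but do not send) the POST request that starts a parse job."""
    # Assumption: the key lives in LLAMA_CLOUD_API_KEY, as in the curl examples.
    api_key = api_key or os.environ.get("LLAMA_CLOUD_API_KEY", "")
    body = json.dumps({"file_id": file_id, "tier": tier, "version": version})
    return urllib.request.Request(
        API_BASE,
        data=body.encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
    )


def submit_parse_job(file_id, **kwargs):
    """Send the request and return the decoded JSON response."""
    with urllib.request.urlopen(build_parse_request(file_id, **kwargs)) as resp:
        return json.load(resp)
```

If the response nests the ID the way the job.id notation suggests, it should be readable as submit_parse_job("<file_id>")["job"]["id"].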


To list parse jobs, optionally filtering by status and limiting the page size:

curl 'https://api.cloud.llamaindex.ai/api/v2/parse?page_size=10&status=COMPLETED' \
  -H "Authorization: Bearer $LLAMA_CLOUD_API_KEY"

Query parameters:

| Parameter | Type | Description |
| --- | --- | --- |
| page_size | integer | Max items per page (optional) |
| page_token | string | Token for the next page (from a previous response) |
| status | string | Filter: PENDING, RUNNING, COMPLETED, FAILED, CANCELLED |

Response:

{
  "items": [
    {
      "id": "job-uuid-1",
      "project_id": "project-uuid",
      "status": "COMPLETED",
      "error_message": null
    }
  ],
  "next_page_token": "eyJsYXN0X2lkIjogImpvYi...",
  "total_size": 42
}
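Walking all pages means feeding each next_page_token back as page_token. The sketch below shows one way to do that in Python. The function iter_parse_jobs and its fetch_page argument are hypothetical: fetch_page stands for any caller-supplied function that performs the GET with the given query parameters and returns the decoded JSON. It also assumes that a missing or empty next_page_token marks the last page.

```python
def iter_parse_jobs(fetch_page, status=None, page_size=10):
    """Yield every job across all pages of the list endpoint.

    fetch_page is a caller-supplied function (hypothetical) that takes a dict
    of query parameters, performs the GET, and returns the decoded JSON body.
    """
    params = {"page_size": page_size}
    if status:
        params["status"] = status
    while True:
        # Pass a copy so the caller cannot mutate our running parameters.
        page = fetch_page(dict(params))
        yield from page.get("items", [])
        token = page.get("next_page_token")
        if not token:  # Assumption: a missing or empty token means last page.
            break
        params["page_token"] = token
```

Because the function is a generator, a caller can stop early (for example after finding a specific job) without fetching the remaining pages.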

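Retrieving results

Fetch a job's status and results by ID, selecting the outputs you need with the expand query parameter:

curl 'https://api.cloud.llamaindex.ai/api/v2/parse/{job_id}?expand=markdown,items,metadata' \
  -H "Authorization: Bearer $LLAMA_CLOUD_API_KEY"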

The response always includes a job object with id, project_id, status, and error_message. Additional fields depend on the expand parameter.
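Since the job object always carries a status, a client can poll this endpoint until the job finishes. The following is a rough sketch, not a documented client pattern. The function wait_for_job and its fetch_job argument are hypothetical, and it assumes that COMPLETED, FAILED, and CANCELLED (taken from the status filter on the list endpoint) are the terminal statuses.

```python
import time

# Assumption: these statuses (from the list-jobs status filter) are final.
TERMINAL_STATUSES = {"COMPLETED", "FAILED", "CANCELLED"}


def wait_for_job(fetch_job, interval=2.0, timeout=300.0, sleep=time.sleep):
    """Poll until the job reaches a terminal status, then return the response.

    fetch_job is a caller-supplied function (hypothetical) that performs
    GET /api/v2/parse/{job_id} and returns the decoded JSON body.
    """
    deadline = time.monotonic() + timeout
    while True:
        response = fetch_job()
        job = response["job"]
        if job["status"] in TERMINAL_STATUSES:
            if job["status"] != "COMPLETED":
                raise RuntimeError(
                    f"Job {job['id']} {job['status']}: {job['error_message']}"
                )
            return response
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"Job {job['id']} still {job['status']} after {timeout}s"
            )
        sleep(interval)
```

The sleep parameter is injectable so the loop can be exercised in tests without real delays.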

| Value | What it returns |
| --- | --- |
| text | Plain text per page |
| markdown | Markdown per page |
| items | Structured JSON items (tables, headings, paragraphs per page) |
| metadata | Page-level metadata (confidence, speaker notes, cost_optimized, etc.) |
| job_metadata | Usage and processing details |
| text_full | Full plain text as a single string (all pages concatenated) |
| markdown_full | Full markdown as a single string (all pages concatenated) |
| text_content_metadata | Text file presigned download URL |
| markdown_content_metadata | Markdown file presigned download URL |
| items_content_metadata | Items file presigned download URL |
| metadata_content_metadata | Metadata file presigned download URL |
| text_full_content_metadata | Full text file presigned download URL |
| markdown_full_content_metadata | Full markdown file presigned download URL |
| xlsx_content_metadata | XLSX file presigned download URL |
| output_pdf_content_metadata | Output PDF presigned download URL |
| images_content_metadata | Images metadata with per-image presigned URLs |

Combine multiple values: ?expand=markdown,items,images_content_metadata

For detailed guidance on choosing expand values, see Retrieving Results.
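A typo in an expand value is easy to miss, so it can help to check values client-side before building the URL. The sketch below is illustrative only: build_result_url is a hypothetical helper, and the set of allowed values is copied from the table above rather than fetched from the API.

```python
from urllib.parse import quote, urlencode

# Expand values copied from the table above.
EXPAND_VALUES = {
    "text", "markdown", "items", "metadata", "job_metadata",
    "text_full", "markdown_full",
    "text_content_metadata", "markdown_content_metadata",
    "items_content_metadata", "metadata_content_metadata",
    "text_full_content_metadata", "markdown_full_content_metadata",
    "xlsx_content_metadata", "output_pdf_content_metadata",
    "images_content_metadata",
}


def build_result_url(job_id, expand,
                     base="https://api.cloud.llamaindex.ai/api/v2/parse"):
    """Return the GET URL for a job with a comma-separated expand parameter."""
    unknown = [value for value in expand if value not in EXPAND_VALUES]
    if unknown:
        raise ValueError(f"Unknown expand value(s): {', '.join(unknown)}")
    # safe="," keeps the commas literal, matching the curl examples.
    query = urlencode({"expand": ",".join(expand)}, safe=",")
    return f"{base}/{quote(job_id, safe='')}?{query}"
```

For example, build_result_url("<job_id>", ["markdown", "items"]) produces a URL ending in ?expand=markdown,items.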

When requesting *_content_metadata expand values, the response includes presigned URLs for direct download:

XLSX and PDF:

{
  "result_content_metadata": {
    "xlsx": {
      "size_bytes": 15234,
      "exists": true,
      "presigned_url": "https://s3.amazonaws.com/..."
    },
    "outputPDF": {
      "size_bytes": 102400,
      "exists": true,
      "presigned_url": "https://s3.amazonaws.com/..."
    }
  }
}

Images:

{
  "images_content_metadata": {
    "total_count": 3,
    "images": [
      {
        "index": 0,
        "filename": "image_0.png",
        "content_type": "image/png",
        "size_bytes": 12345,
        "presigned_url": "https://s3.amazonaws.com/..."
      }
    ]
  }
}

You can filter images with the image_filenames query parameter:

curl 'https://api.cloud.llamaindex.ai/api/v2/parse/{job_id}?expand=images_content_metadata&image_filenames=image_0.png,image_1.jpg' \
  -H "Authorization: Bearer $LLAMA_CLOUD_API_KEY"

Presigned URLs are temporary. Download files promptly after retrieving them, or call the endpoint again for fresh URLs.
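Because the URLs expire, one reasonable approach is to download every image in the same step that reads the response. The sketch below assumes the images_content_metadata shape shown above. The helper names image_downloads and download_images are hypothetical, and it assumes the presigned URLs need no Authorization header, as is usual for presigned storage links.

```python
import os
import urllib.request


def image_downloads(response):
    """Return (filename, presigned_url) pairs from the response."""
    meta = response.get("images_content_metadata") or {}
    return [(img["filename"], img["presigned_url"])
            for img in meta.get("images", [])]


def download_images(response, dest_dir="."):
    """Download each image immediately, since presigned URLs are temporary."""
    os.makedirs(dest_dir, exist_ok=True)
    paths = []
    for filename, url in image_downloads(response):
        # basename() guards against path separators in a server-supplied name.
        path = os.path.join(dest_dir, os.path.basename(filename))
        # Assumption: the presigned URL is self-authorizing (no API key sent).
        urllib.request.urlretrieve(url, path)
        paths.append(path)
    return paths
```

If a download fails because a URL has expired, calling the job endpoint again and retrying with the fresh response follows the guidance above.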


The v2 API returns structured validation errors. For example, an unsupported tier value produces:

{
  "detail": [
    {
      "type": "value_error",
      "loc": ["tier"],
      "msg": "Unsupported tier: invalid_tier. Must be one of: fast, cost_effective, agentic, agentic_plus",
      "input": {}
    }
  ]
}
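A client can flatten this structure into readable messages for logs or user-facing errors. The following is a small sketch that assumes the body shape shown above. The function name format_validation_errors is hypothetical, and the handling of a plain-string detail is a defensive assumption, not documented behavior.

```python
def format_validation_errors(body):
    """Turn a validation error body into readable 'field: message' lines."""
    detail = body.get("detail", [])
    # Assumption: some error responses may carry a plain string instead of a list.
    if isinstance(detail, str):
        return [detail]
    lines = []
    for err in detail:
        # loc is a path to the offending field, e.g. ["tier"].
        location = ".".join(str(part) for part in err.get("loc", [])) or "(request)"
        message = err.get("msg", "unknown error")
        lines.append(f"{location}: {message} [{err.get('type', '?')}]")
    return lines
```

Applied to the example above, this yields a single line that begins with "tier: Unsupported tier: invalid_tier." and ends with "[value_error]".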