Embedded Images Options

The embedded images option controls whether images found inside a document (such as charts, diagrams, photos, or illustrations) are extracted and made available as separate image files alongside the parsed text content.

In v2, embedded image extraction is controlled by the images_to_save array in output_options. Include "embedded" in the array to enable it.

Enable extraction of embedded images from documents:

{
  "output_options": {
    "images_to_save": ["embedded"]
  }
}

You can combine embedded images with other image types:

{
  "output_options": {
    "images_to_save": ["screenshot", "embedded", "layout"]
  }
}

Available image categories:

  • "screenshot" - Full page screenshots
  • "embedded" - Images embedded within the document
  • "layout" - Cropped images from layout detection

Note: If images_to_save is not specified, no images (including embedded images) are saved by default.
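
As a minimal sketch of the configuration described above, the following Python helper builds the output_options object and rejects categories that are not in the list. The helper name build_output_options and the constant ALLOWED_IMAGE_CATEGORIES are hypothetical and not part of any SDK; only the three category strings and the JSON shape come from this page.

```python
# Image categories listed on this page; anything else is rejected up front.
ALLOWED_IMAGE_CATEGORIES = ("screenshot", "embedded", "layout")


def build_output_options(*categories: str) -> dict:
    """Return an output_options object with a de-duplicated images_to_save list.

    Hypothetical helper for illustration; it only assembles the JSON shape
    shown above and does not call the API.
    """
    unknown = [c for c in categories if c not in ALLOWED_IMAGE_CATEGORIES]
    if unknown:
        raise ValueError(f"unknown image categories: {unknown}")
    # dict.fromkeys keeps first-seen order while dropping duplicates.
    return {"output_options": {"images_to_save": list(dict.fromkeys(categories))}}


print(build_output_options("embedded"))
print(build_output_options("screenshot", "embedded", "layout"))
```

Validating the category names locally is a design choice for catching typos early; how the API itself responds to an unknown category is not covered here.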

Parse the Document with Embedded Images Extraction Enabled

curl -X 'POST' \
  'https://api.cloud.llamaindex.ai/api/v2/parse' \
  -H 'Accept: application/json' \
  -H 'Content-Type: application/json' \
  -H "Authorization: Bearer $LLAMA_CLOUD_API_KEY" \
  --data '{
    "file_id": "<file_id>",
    "tier": "agentic",
    "version": "latest",
    "output_options": {
      "images_to_save": ["embedded"]
    }
  }'
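
For readers calling the API from Python, the request above might be assembled with the standard library roughly as follows. This is a sketch, not an official client: the function name build_parse_request is hypothetical, and the request is only built, not sent, because the shape of the POST response is not described on this page.

```python
import json
import os
import urllib.request

PARSE_URL = "https://api.cloud.llamaindex.ai/api/v2/parse"


def build_parse_request(file_id: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) the POST request from the curl example."""
    payload = {
        "file_id": file_id,
        "tier": "agentic",
        "version": "latest",
        "output_options": {"images_to_save": ["embedded"]},
    }
    return urllib.request.Request(
        PARSE_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Accept": "application/json",
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


# "<file_id>" is the same placeholder used in the curl example.
request = build_parse_request("<file_id>", os.environ.get("LLAMA_CLOUD_API_KEY", ""))
print(request.get_method(), request.full_url)

# Sending is left commented out so the sketch runs without network access:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```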

After parsing completes, retrieve image metadata using the images_content_metadata expand parameter. This returns presigned URLs for direct download:

curl -X 'GET' \
  'https://api.cloud.llamaindex.ai/api/v2/parse/{job_id}?expand=images_content_metadata' \
  -H 'Accept: application/json' \
  -H "Authorization: Bearer $LLAMA_CLOUD_API_KEY"

The response includes metadata with presigned URLs:

{
  "job": { ... },
  "images_content_metadata": {
    "total_count": 3,
    "images": [
      {
        "filename": "image_0.png",
        "content_type": "image/png",
        "size_bytes": 12345,
        "presigned_url": "https://s3.amazonaws.com/..."
      },
      {
        "filename": "image_1.jpg",
        "content_type": "image/jpeg",
        "size_bytes": 23456,
        "presigned_url": "https://s3.amazonaws.com/..."
      }
    ]
  }
}
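
Once the response has been decoded from JSON, the presigned URLs can be pulled out with a few lines of Python. The sketch below assumes the response has the shape shown above; the function name image_urls is hypothetical, and treating a missing images_content_metadata key as "no images" is an assumption rather than documented behavior.

```python
def image_urls(response: dict) -> dict[str, str]:
    """Map each image filename to its presigned URL.

    Assumption: when images_content_metadata is absent or null,
    this returns an empty mapping instead of raising.
    """
    metadata = response.get("images_content_metadata") or {}
    return {img["filename"]: img["presigned_url"] for img in metadata.get("images", [])}


# Trimmed copy of the sample response; the URLs stay elided as in the original.
sample = {
    "images_content_metadata": {
        "images": [
            {
                "filename": "image_0.png",
                "content_type": "image/png",
                "size_bytes": 12345,
                "presigned_url": "https://s3.amazonaws.com/...",
            },
            {
                "filename": "image_1.jpg",
                "content_type": "image/jpeg",
                "size_bytes": 23456,
                "presigned_url": "https://s3.amazonaws.com/...",
            },
        ]
    }
}

for name, url in image_urls(sample).items():
    print(name, url)
```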

You can also filter to specific image filenames:

curl -X 'GET' \
  'https://api.cloud.llamaindex.ai/api/v2/parse/{job_id}?expand=images_content_metadata&image_filenames=image_0.png,image_1.jpg' \
  -H 'Accept: application/json' \
  -H "Authorization: Bearer $LLAMA_CLOUD_API_KEY"
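
The query string for both GET requests can be built programmatically. The following sketch uses only urllib.parse from the Python standard library; the function name build_result_url is hypothetical, and the comma-separated image_filenames format is taken from the curl example above.

```python
from typing import Sequence
from urllib.parse import urlencode

BASE_URL = "https://api.cloud.llamaindex.ai/api/v2/parse"


def build_result_url(job_id: str, image_filenames: Sequence[str] = ()) -> str:
    """Build the GET URL that expands image metadata, optionally filtered by filename."""
    query = {"expand": "images_content_metadata"}
    if image_filenames:
        query["image_filenames"] = ",".join(image_filenames)
    # safe="," keeps the comma separators literal, matching the curl example.
    return f"{BASE_URL}/{job_id}?{urlencode(query, safe=',')}"


# "job123" is a made-up job ID for illustration.
print(build_result_url("job123"))
print(build_result_url("job123", ["image_0.png", "image_1.jpg"]))
```

Other characters in a filename, such as spaces, are percent-encoded by urlencode. A filename that itself contains a comma would be ambiguous in this format, and this sketch does not handle that case.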

Use each image's presigned_url to download the file directly. The URLs are temporary and expire after a limited time.
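
A download loop over the images list could look like the sketch below. It assumes each entry has the filename and presigned_url fields shown in the sample response, and that the presigned URL needs no Authorization header (typical for presigned URLs, but not stated on this page). The names fetch_bytes and download_images are hypothetical.

```python
import urllib.request
from pathlib import Path
from typing import Callable


def fetch_bytes(url: str) -> bytes:
    """Download a URL with the standard library (assumes no auth header is needed)."""
    with urllib.request.urlopen(url) as response:
        return response.read()


def download_images(
    images: list[dict],
    dest_dir: str,
    fetch: Callable[[str], bytes] = fetch_bytes,
) -> list[Path]:
    """Save every image in the metadata list to dest_dir and return the written paths."""
    target = Path(dest_dir)
    target.mkdir(parents=True, exist_ok=True)
    saved = []
    for image in images:
        # Keep only the final path component so a filename cannot escape dest_dir.
        path = target / Path(image["filename"]).name
        path.write_bytes(fetch(image["presigned_url"]))
        saved.append(path)
    return saved


# Usage, given a decoded response from the metadata request:
# download_images(response["images_content_metadata"]["images"], "extracted_images")
```

The fetch parameter exists so the loop can be exercised without network access. Because the URLs expire, run the download soon after fetching the metadata, and request the metadata again if a download fails with an authorization error.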