Screenshots Options
Screenshots options allow you to configure whether page screenshots are generated during document parsing.
How It Works
When enabled, this feature takes a screenshot of each page in the document, providing a visual representation of the original page layout alongside the extracted text content.
Configuration
In v2, screenshot generation is controlled via the images_to_save array. Include "screenshot" in the array to enable screenshot generation.
Enable Screenshots
Enable screenshot generation for each page of the document:
{
  "output_options": {
    "images_to_save": ["screenshot"]
  }
}

Combine with Other Image Types
You can combine screenshots with other image types:
{
  "output_options": {
    "images_to_save": ["screenshot", "embedded", "layout"]
  }
}

Disable Screenshots
To disable screenshots, omit "screenshot" from the array, or use an empty array to disable all image saving:
{
  "output_options": {
    "images_to_save": []
  }
}

Note: If images_to_save is not specified, no images (including screenshots) are saved by default.
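The configurations above differ only in the contents of the images_to_save array, so the request fragment can be assembled programmatically. The following is a minimal sketch, not part of any SDK: `build_output_options` and `KNOWN_IMAGE_TYPES` are hypothetical names, and the set of image types is limited to the three that appear in the examples above, which may not be exhaustive.

```python
import json

# Image type names taken from the examples above; this set is an
# assumption and may not cover every type the API accepts.
KNOWN_IMAGE_TYPES = ("screenshot", "embedded", "layout")


def build_output_options(*image_types: str) -> dict:
    """Return a request fragment with the requested image types.

    Calling it with no arguments yields an empty array, which
    disables all image saving.
    """
    unknown = [t for t in image_types if t not in KNOWN_IMAGE_TYPES]
    if unknown:
        raise ValueError(f"Unrecognized image types: {unknown}")
    # Drop duplicates while preserving the caller's order.
    unique = list(dict.fromkeys(image_types))
    return {"output_options": {"images_to_save": unique}}


# Screenshots only
print(json.dumps(build_output_options("screenshot")))
# All image saving disabled
print(json.dumps(build_output_options()))
```

Validating the names locally is a design choice for catching typos early; if the API accepts additional image types, extend the tuple rather than removing the check.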
Complete API Request Example
Parse the Document with Screenshot Generation Enabled
curl -X 'POST' \
  'https://api.cloud.llamaindex.ai/api/v2/parse' \
  -H 'Accept: application/json' \
  -H 'Content-Type: application/json' \
  -H "Authorization: Bearer $LLAMA_CLOUD_API_KEY" \
  --data '{
    "file_id": "<file_id>",
    "tier": "cost_effective",
    "version": "latest",
    "output_options": {
      "images_to_save": ["screenshot"]
    }
  }'

Retrieving Screenshots
After parsing completes, retrieve screenshot metadata using the images_content_metadata expand parameter. This returns presigned URLs for direct download:
curl -X 'GET' \
  'https://api.cloud.llamaindex.ai/api/v2/parse/{job_id}?expand=images_content_metadata' \
  -H 'Accept: application/json' \
  -H "Authorization: Bearer $LLAMA_CLOUD_API_KEY"

The response includes metadata with presigned URLs:
{
  "job": { ... },
  "images_content_metadata": {
    "total_count": 5,
    "images": [
      {
        "filename": "screenshot_page_0.png",
        "content_type": "image/png",
        "size_bytes": 45678,
        "presigned_url": "https://s3.amazonaws.com/..."
      },
      {
        "filename": "screenshot_page_1.png",
        "content_type": "image/png",
        "size_bytes": 52341,
        "presigned_url": "https://s3.amazonaws.com/..."
      }
    ]
  }
}

Use the presigned_url for each image to download the screenshots directly. The URLs are temporary and valid only for a limited time.
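When screenshots are requested together with other image types, it can help to pick out only the screenshot entries and key them by page. The sketch below works on the already-decoded JSON response and makes two assumptions that the text above does not confirm: that screenshot filenames always follow the `screenshot_page_<n>.png` pattern seen in the example response, and that other image types appear in the same images list under different names. `screenshots_by_page` is a hypothetical helper, and the `img_p0_1.png` entry is an invented placeholder for a non-screenshot image.

```python
import re

# Assumed naming pattern, inferred from the example response above.
SCREENSHOT_NAME = re.compile(r"^screenshot_page_(\d+)\.png$")


def screenshots_by_page(response: dict) -> dict[int, str]:
    """Map page index -> presigned URL, keeping screenshot entries only."""
    metadata = response.get("images_content_metadata") or {}
    pages = {}
    for image in metadata.get("images", []):
        match = SCREENSHOT_NAME.match(image.get("filename", ""))
        if match:
            pages[int(match.group(1))] = image["presigned_url"]
    # Return the pages in ascending order.
    return dict(sorted(pages.items()))


sample = {
    "images_content_metadata": {
        "images": [
            {"filename": "screenshot_page_1.png",
             "presigned_url": "https://s3.amazonaws.com/..."},
            # Hypothetical non-screenshot entry, for illustration only.
            {"filename": "img_p0_1.png",
             "presigned_url": "https://s3.amazonaws.com/..."},
            {"filename": "screenshot_page_0.png",
             "presigned_url": "https://s3.amazonaws.com/..."},
        ]
    }
}

for page, url in screenshots_by_page(sample).items():
    print(page, url)
```

Because the presigned URLs expire, a mapping like this is best built and consumed right after fetching the job result rather than stored for later use.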
import httpx
from llama_cloud import LlamaCloud

# Replace the placeholder with your LlamaCloud API key
client = LlamaCloud(api_key="LLAMA_CLOUD_API_KEY")

# Upload and parse a document, requesting image content metadata.
# LlamaCloud is the synchronous client, so the call is not awaited.
result = client.parsing.parse(
    upload_file="example_file.pdf",
    tier="agentic",
    version="latest",
    # Same setting as the JSON configuration shown earlier
    output_options={
        "images_to_save": ["screenshot"],
    },
    expand=["images_content_metadata"],
)

# Download extracted images using presigned URLs
with httpx.Client() as http_client:
    for image in result.images_content_metadata.images:
        print(f"Downloading {image.filename}, {image.size_bytes} bytes")
        response = http_client.get(image.presigned_url)
        # Fail loudly if the URL has expired or the download failed
        response.raise_for_status()
        with open(image.filename, "wb") as img_file:
            img_file.write(response.content)

import fs from "fs";
import { LlamaCloud } from "@llamaindex/llama-cloud";
// Replace the placeholder with your LlamaCloud API key
const client = new LlamaCloud({
  apiKey: "LLAMA_CLOUD_API_KEY",
});

// Upload and parse a document, requesting image content metadata
const result = await client.parsing.parse({
  upload_file: fs.createReadStream("example_file.pdf"),
  tier: "agentic",
  version: "latest",
  expand: ["images_content_metadata"],
  // Same setting as the JSON configuration shown earlier
  output_options: {
    images_to_save: ["screenshot"],
  },
});

// Download extracted images using presigned URLs
for (const image of result.images_content_metadata!.images) {
  console.log(`Downloading ${image.filename}, ${image.size_bytes} bytes`);
  const response = await fetch(image.presigned_url);
  const arrayBuffer = await response.arrayBuffer();
  fs.writeFileSync(image.filename, Buffer.from(arrayBuffer));
}