Getting Started

Start parsing documents with LlamaParse in minutes, whether you prefer Python, TypeScript, the REST API, or the web UI. This guide walks you through creating an API key and running your first parse job.

Get your API Key

🔑 Before you begin: You’ll need an API key to access LlamaParse services.

Get your API key →

Choose Your Setup

Using LlamaParse in the Web UI

If you don't write code, or just want to try LlamaParse quickly, the web interface is the easiest way to get started.

Step-by-Step Workflow

Go to LlamaCloud
Choose a parsing Tier from Recommended Settings or switch to Advanced settings for a custom configuration
Upload your document
Click Parse and view your parsed results right in the browser

Choosing a Tier

LlamaParse offers four main tiers:

Cost Effective – Optimized for speed and cost. Best for text-heavy documents with minimal structure.
Agentic – Works well with documents that have images and diagrams, but may struggle with complex layouts.
Agentic Plus – Maximum fidelity. Best for complex layouts, tables, and visual structure.
Fast – A special mode that only outputs the spatial text of your documents, but not the markdown.
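
The tier descriptions above can be read as a simple decision rule. The sketch below is only a heuristic illustration, not an official selection algorithm. The function name `choose_tier` is hypothetical, and the identifiers `"cost_effective"` and `"fast"` are assumptions (only `"agentic"` and `"agentic_plus"` appear in the request examples in this guide).

```python
def choose_tier(has_images: bool, complex_layout: bool, needs_markdown: bool = True) -> str:
    """Pick a tier name from rough document traits (heuristic sketch only)."""
    if not needs_markdown:
        # Fast outputs spatial text only, without markdown.
        return "fast"  # assumed identifier
    if complex_layout:
        # Complex layouts, tables, and visual structure call for maximum fidelity.
        return "agentic_plus"
    if has_images:
        # Images and diagrams without complex layout.
        return "agentic"
    # Text-heavy documents with minimal structure.
    return "cost_effective"  # assumed identifier


print(choose_tier(has_images=False, complex_layout=False))
print(choose_tier(has_images=True, complex_layout=True))
```

Complex layout is checked before images because the Agentic tier is described as possibly struggling with complex layouts.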

Install the package

pip install "llama_cloud>=1.0"

Parse in Python

Usage

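import re

import httpx
from llama_cloud import AsyncLlamaCloud

# The calls below use `await`, so run this in an async context
# (for example a notebook, or inside a coroutine started with asyncio.run).
client = AsyncLlamaCloud(api_key="llx-...")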

# Upload and parse a document
file_obj = await client.files.create(file="./attention_is_all_you_need.pdf", purpose="parse")

result = await client.parsing.parse(
    file_id=file_obj.id,
    tier="agentic",
    version="latest",

    # Options specific to the input file type, e.g. html, spreadsheet, presentation, etc.
    input_options={},

    # Control the output structure and markdown styling
    output_options={
        "markdown": {
            "tables": {
                "output_tables_as_markdown": False,
            },
        },
        # Saving images for later retrieval
        "images_to_save": ["screenshot"],
    },

    # Options for controlling how we process the document
    processing_options={
        "ignore": {
            "ignore_diagonal_text": True,
        },
        "ocr_parameters": {
            # OCR language(s) should match the document; the sample paper is in English
            "languages": ["en"],
        },
    },

    # Parsed content to include in the returned response
    expand=["text", "markdown", "items", "images_content_metadata"],
)

print(result.markdown.pages[0].markdown)
print(result.text.pages[0].text)

# Iterate over page items to find tables
for page in result.items.pages:
    for item in page.items:
        # Check the item's type field, as the TypeScript example does, so no extra type import is needed
        if getattr(item, "type", None) == "table":
            print(f"Table found on page {page.page_number} with {len(item.rows)} rows and {item.b_box} location")

def is_page_screenshot(image_name: str) -> bool:
    return re.match(r"^page_(\d+)\.jpg$", image_name) is not None

# Iterate over results looking for page screenshots
for image in result.images_content_metadata.images:
    if image.presigned_url is None or not is_page_screenshot(image.filename):
        continue

    print(f"Downloading {image.filename}, {image.size_bytes} bytes")
    # Download first, then write, so a failed request does not leave an empty file behind
    async with httpx.AsyncClient() as http_client:
        response = await http_client.get(image.presigned_url)
        response.raise_for_status()
    with open(image.filename, "wb") as img_file:
        img_file.write(response.content)
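
The screenshot filter above relies on the `page_<n>.jpg` naming pattern. As a small, hedged sketch, the same pattern can also recover the page number, which helps when ordering downloaded screenshots (a plain string sort would place `page_10.jpg` before `page_2.jpg`). The helper names and sample filenames here are hypothetical.

```python
import re
from typing import List, Optional

# Same pattern as the screenshot filter in the usage example
PAGE_SCREENSHOT = re.compile(r"^page_(\d+)\.jpg$")


def screenshot_page_number(image_name: str) -> Optional[int]:
    """Return the page number encoded in a screenshot filename, or None."""
    match = PAGE_SCREENSHOT.match(image_name)
    return int(match.group(1)) if match else None


def sorted_screenshots(filenames: List[str]) -> List[str]:
    """Keep only page screenshots and order them by numeric page number."""
    numbered = [(screenshot_page_number(name), name) for name in filenames]
    return [name for page, name in sorted((p, n) for p, n in numbered if p is not None)]


# Hypothetical filenames for illustration
print(sorted_screenshots(["page_10.jpg", "figure_1.png", "page_2.jpg"]))
```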

Install the package

npm install @llamaindex/llama-cloud

Parse in TypeScript

Create a parse.ts file with the following code:

import LlamaCloud from '@llamaindex/llama-cloud';
import fs from 'fs';

const client = new LlamaCloud({ apiKey: process.env.LLAMA_CLOUD_API_KEY });

console.log('Uploading file...');

// Upload and parse a document
const fileObj = await client.files.create({
    file: fs.createReadStream('./attention_is_all_you_need.pdf'),
    purpose: 'parse',
});

console.log('Parsing file...');

const result = await client.parsing.parse({
    file_id: fileObj.id,
    tier: 'agentic',
    version: 'latest',

    // Options specific to the input file type, e.g. html, spreadsheet, presentation, etc.
    input_options: {},

    // Control the output structure and markdown styling
    output_options: {
        markdown: {
            tables: {
                output_tables_as_markdown: false,
            },
        },
        // Saving images for later retrieval
        images_to_save: ['screenshot'],
    },

    // Options for controlling how we process the document
    processing_options: {
        ignore: {
            ignore_diagonal_text: true,
        },
        ocr_parameters: {
            languages: ['en'],
        },
    },

    // Parsed content to include in the returned response
    expand: ['text', 'markdown', 'items', 'images_content_metadata'],
});

console.log(result.markdown.pages[0].markdown);
console.log(result.text.pages[0].text);

// Iterate over page items to find tables
for (const page of result.items.pages) {
    for (const item of page.items) {
        if (item.type === 'table') {
            console.log(
                `Table found on page ${page.page_number} with ${item.rows.length} rows and ${JSON.stringify(item.bbox)} location`
            );
        }
    }
}

const isPageScreenshot = (imageName: string): boolean => /^page_(\d+)\.jpg$/.test(imageName);

// Iterate over results looking for page screenshots
for (const image of result.images_content_metadata.images) {
    if (!image.presigned_url || !isPageScreenshot(image.filename)) {
        continue;
    }

    console.log(`Downloading ${image.filename}, ${image.size_bytes} bytes`);
    const response = await fetch(image.presigned_url);
    const buffer = Buffer.from(await response.arrayBuffer());
    fs.writeFileSync(image.filename, buffer);
}

Run the script with:

LLAMA_CLOUD_API_KEY="llx-..." npx ts-node parse.ts

Using the REST API

If you prefer to call the API directly, the REST API lets you integrate parsing into any environment without installing a client library. The sample requests below cover the full workflow.

1. Upload a file

Send a document to the File API:

curl -X POST \
  https://api.cloud.llamaindex.ai/api/v1/files/ \
  -H 'Accept: application/json' \
  -H 'Content-Type: multipart/form-data' \
  -H "Authorization: Bearer $LLAMA_CLOUD_API_KEY" \
  -F 'file=@/path/to/your/file.pdf;type=application/pdf'

Get the id from the response:

{
  "id": "cafe1337-e0dd-4762-b5f5-769fef112558",
  ...
}

2. Parse the file

Pass the file ID to LlamaParse as file_id, and select a tier:

curl -X 'POST' \
  'https://api.cloud.llamaindex.ai/api/v2/parse/' \
  -H 'Accept: application/json' \
  -H 'Content-Type: application/json' \
  -H "Authorization: Bearer $LLAMA_CLOUD_API_KEY" \
  --data '{
    "file_id": "<file_id>",
    "tier": "agentic_plus",
    "version": "latest"
  }'

The response contains the job's id, which serves as the job_id in the next step:

{
  "id": "c0defee1-76a0-42c3-bbed-094e4566b762",
  "status": "PENDING"
}
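
For environments without curl, the same parse request can be built with Python's standard library. This is a minimal sketch that mirrors the curl command above (same URL, headers, and JSON body). The helper names `build_parse_request` and `submit_parse_job` are hypothetical, and error handling is left out.

```python
import json
import urllib.request

PARSE_URL = "https://api.cloud.llamaindex.ai/api/v2/parse/"


def build_parse_request(
    file_id: str, tier: str, api_key: str, version: str = "latest"
) -> urllib.request.Request:
    """Build (but do not send) the POST request that starts a parse job."""
    body = json.dumps({"file_id": file_id, "tier": tier, "version": version})
    return urllib.request.Request(
        PARSE_URL,
        data=body.encode("utf-8"),
        method="POST",
        headers={
            "Accept": "application/json",
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
    )


def submit_parse_job(file_id: str, tier: str, api_key: str) -> str:
    """Send the request and return the job id from the JSON response."""
    request = build_parse_request(file_id, tier, api_key)
    with urllib.request.urlopen(request) as response:
        return json.load(response)["id"]
```

Splitting request construction from sending makes it possible to inspect the request before any network call is made.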

3. Check job status and retrieve results

Use the job_id returned from the parse step to check status and retrieve results. The result endpoint returns metadata (including status) along with the requested result types:

curl -X 'GET' \
  'https://api.cloud.llamaindex.ai/api/v2/parse/<job_id>/result?include_markdown=true' \
  -H 'accept: application/json' \
  -H "Authorization: Bearer $LLAMA_CLOUD_API_KEY"

The response includes a metadata object with the job status (e.g., PENDING, SUCCESS, ERROR). Once status is SUCCESS, the requested result data will be included.
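
Because a job starts out as PENDING, a client usually polls the result endpoint until the status changes. The sketch below shows one way to do that. It assumes the status is found at `response["metadata"]["status"]`, which follows the description above but is not confirmed by a sample response. The function name, polling interval, and timeout are illustrative choices, and the HTTP call is passed in as `fetch_result` so the loop stays independent of any particular client.

```python
import time
from typing import Any, Callable, Dict


def wait_for_result(
    fetch_result: Callable[[], Dict[str, Any]],
    interval_s: float = 2.0,
    timeout_s: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Call fetch_result until the job reports SUCCESS, ERROR, or the timeout passes."""
    deadline = time.monotonic() + timeout_s
    while True:
        result = fetch_result()
        status = result["metadata"]["status"]  # assumed response shape
        if status == "SUCCESS":
            return result
        if status == "ERROR":
            raise RuntimeError("Parse job reported ERROR")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Job still {status} after {timeout_s} seconds")
        sleep(interval_s)


# Demo with canned responses instead of real HTTP calls
canned = iter([
    {"metadata": {"status": "PENDING"}},
    {"metadata": {"status": "PENDING"}},
    {"metadata": {"status": "SUCCESS"}},
])
final = wait_for_result(lambda: next(canned), sleep=lambda _seconds: None)
print(final["metadata"]["status"])
```

In real use, `fetch_result` would perform the GET request shown above and return the decoded JSON body.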

Available query parameters:

include_text=true - Include spatial text result
include_markdown=true - Include markdown result
include_items=true - Include items tree result
include_json_output=true - Include JSON output result
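
As a small illustration of how these parameters fit into the request, the sketch below builds the result URL from a job ID and a set of flags. The helper name `result_url` is hypothetical, and combining several flags in one request is an assumption based on the list above.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.cloud.llamaindex.ai/api/v2/parse"
RESULT_FLAGS = ("include_text", "include_markdown", "include_items", "include_json_output")


def result_url(job_id: str, **flags: bool) -> str:
    """Build the result endpoint URL, adding only the flags set to True."""
    unknown = set(flags) - set(RESULT_FLAGS)
    if unknown:
        raise ValueError(f"Unknown result flags: {sorted(unknown)}")
    # Emit flags in a fixed order so the URL is deterministic
    query = urlencode({name: "true" for name in RESULT_FLAGS if flags.get(name)})
    url = f"{BASE_URL}/{job_id}/result"
    return f"{url}?{query}" if query else url


print(result_url("<job_id>", include_markdown=True))
```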

See more details in our API Reference

Example

Here is an example notebook for Raw API Usage

Resources

See Credit Pricing & Usage
Next steps? Check out LlamaExtract to extract structured data from unstructured documents!