---
title: Getting Started with Batches | Developer Documentation
description: Run Parse V2 or Extract V2 over every file in a directory, poll for completion, and inspect per-file results.
---

## Overview

Batches let you run the same product job over every file in a directory. A batch references:

1. A source directory containing the files to process.
2. A product configuration ID for the work to run on each file.

Use `parse_v2` with a built-in Parse preset or saved Parse configuration to parse every file. Use `extract_v2` with a saved Extract configuration to extract from every file.

Batch creation is limited to 1,000 source files.
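
If a job covers more than 1,000 files, one way to stay under the limit is to split the file list into groups and create one directory and batch per group. The sketch below shows only the splitting step. `chunk_files` and `MAX_BATCH_FILES` are hypothetical names, and whether several smaller batches suit your workflow is an assumption.

```python
from pathlib import Path
from typing import Iterator, Sequence

# Limit stated above; the constant name is illustrative.
MAX_BATCH_FILES = 1000


def chunk_files(paths: Sequence[Path], size: int = MAX_BATCH_FILES) -> Iterator[list[Path]]:
    """Yield consecutive groups of at most `size` paths."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(paths), size):
        yield list(paths[start:start + size])


# Example: 2,500 hypothetical file paths split into three groups.
paths = [Path(f"invoices/{i:04d}.pdf") for i in range(2500)]
groups = list(chunk_files(paths))
print([len(g) for g in groups])  # [1000, 1000, 500]
```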

## Prerequisites

- A [LlamaCloud account](https://cloud.llamaindex.ai) with a Pro or Enterprise plan

- An API key ([how to create one](/cloud/general/api_key/index.md))

- A configuration ID for the job to run:

  - For `parse_v2`, use a built-in preset such as `cfg-PARSE_AGENTIC`, or a saved Parse configuration ID.
  - For `extract_v2`, use a saved Extract configuration ID.

Tip: The easiest way to create configurations is from the Parse or Extract playground in the LlamaCloud UI. You can also create them through the [Product Configurations API](/llamaparse/extract/examples/using_saved_configurations/index.md).

## Create a Batch

For one-off uploads, create an `ephemeral` directory so the source directory is automatically eligible for cleanup. Then upload files into that directory and create the batch.

```python
import asyncio  # used for asyncio.sleep in the polling step
from datetime import datetime, timedelta, timezone
from pathlib import Path

from llama_cloud import AsyncLlamaCloud

# Top-level `await` works in a notebook or under `python -m asyncio`.
# In a script, wrap the calls in an `async def main()` and run it with `asyncio.run(main())`.
client = AsyncLlamaCloud(api_key="<your-api-key>")

configuration_id = "cfg-PARSE_AGENTIC"
# Set the ephemeral directory to expire two days from now.
expires_at = (datetime.now(timezone.utc) + timedelta(days=2)).isoformat()

directory = await client.beta.directories.create(
    name="invoice-batch",
    type="ephemeral",
    expires_at=expires_at,
)

# Upload every PDF in ./invoices to the source directory.
for path in Path("./invoices").glob("*.pdf"):
    await client.beta.directories.files.upload(
        directory.id,
        upload_file=path,
        display_name=path.name,
    )

# Create the batch: the parse_v2 job runs on each file in the directory.
batch = await client.batches.create(
    source_directory_id=directory.id,
    config={
        "job": {
            "type": "parse_v2",
            "configuration_id": configuration_id,
        },
    },
)

print(batch.id, batch.status)
```

```typescript
import fs from "fs";
import path from "path";
import LlamaCloud from "@llamaindex/llama-cloud";

const client = new LlamaCloud({
  apiKey: "<your-api-key>",
});

const configurationId = "cfg-PARSE_AGENTIC";
// Set the ephemeral directory to expire two days from now.
const expiresAt = new Date(Date.now() + 2 * 24 * 60 * 60 * 1000).toISOString();

const directory = await client.beta.directories.create({
  name: "invoice-batch",
  type: "ephemeral",
  expires_at: expiresAt,
});

// Upload every PDF in ./invoices to the source directory.
for (const fileName of fs.readdirSync("./invoices")) {
  if (!fileName.endsWith(".pdf")) continue;

  await client.beta.directories.files.upload(directory.id, {
    upload_file: fs.createReadStream(path.join("./invoices", fileName)),
    display_name: fileName,
  });
}

// `let` because the polling step reassigns `batch`.
let batch = await client.batches.create({
  source_directory_id: directory.id,
  config: {
    job: {
      type: "parse_v2",
      configuration_id: configurationId,
    },
  },
});

console.log(batch.id, batch.status);
```

To run extraction instead, use a saved `extract_v2` configuration ID and set `type` to `"extract_v2"`.
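
The only difference between a Parse batch and an Extract batch is the `job` block. A minimal sketch of a helper that builds the create arguments for either job type, assuming the same field names as the examples above. `build_batch_request` is a hypothetical helper, and both IDs are placeholders.

```python
from typing import Any

# Job types named in this guide.
SUPPORTED_JOB_TYPES = {"parse_v2", "extract_v2"}


def build_batch_request(source_directory_id: str, job_type: str, configuration_id: str) -> dict[str, Any]:
    """Return keyword arguments shaped like the create calls shown above."""
    if job_type not in SUPPORTED_JOB_TYPES:
        raise ValueError(f"unsupported job type: {job_type!r}")
    return {
        "source_directory_id": source_directory_id,
        "config": {
            "job": {
                "type": job_type,
                "configuration_id": configuration_id,
            },
        },
    }


# Placeholder IDs; substitute your own directory and saved Extract configuration.
request = build_batch_request("<directory-id>", "extract_v2", "<extract-configuration-id>")
print(request["config"]["job"]["type"])  # extract_v2
```

With the Python client from the example above, the result could be passed as `client.batches.create(**request)`.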

## Poll for Completion

Batch processing is asynchronous. Poll the batch status until it reaches a terminal state.

```python
terminal_statuses = {"COMPLETED", "FAILED", "CANCELLED"}

# Re-fetch the batch every 10 seconds until it reaches a terminal status.
while batch.status not in terminal_statuses:
    await asyncio.sleep(10)
    batch = await client.batches.get(batch.id)

print(batch.status)
```

```typescript
const terminalStatuses = new Set(["COMPLETED", "FAILED", "CANCELLED"]);

// Re-fetch the batch every 10 seconds until it reaches a terminal status.
while (!terminalStatuses.has(batch.status)) {
  await new Promise((resolve) => setTimeout(resolve, 10_000));
  batch = await client.batches.get(batch.id);
}

console.log(batch.status);
```

## Inspect Per-File Results

Pass `expand=results` to the get endpoint to include the mapping from each source file to its product job. `results` may be `null` while the batch is still running. Once available, it contains one entry per source file in the batch. A file that failed carries an `error_message`; a successful file carries a `job_reference` to the underlying Parse or Extract job.

A trimmed response can include successful and failed file-level entries in the same batch:

```json
{
  "id": "bat-aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee",
  "status": "COMPLETED",
  "results": [
    {
      "source_directory_file_id": "dfl-aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee",
      "job_reference": {
        "type": "parse_v2",
        "id": "pjb-aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee"
      }
    },
    {
      "source_directory_file_id": "dfl-bbbbbbbb-cccc-dddd-eeee-ffffffffffff",
      "error_message": "Unable to process source file."
    }
  ]
}
```

```python
# Re-fetch the batch with per-file results included.
batch = await client.batches.get(batch.id, expand=["results"])

if batch.status == "FAILED":
    raise RuntimeError("Batch orchestration failed")

for result in batch.results or []:
    # Per-file failure: the file could not be processed or mapped to a product job.
    if result.error_message:
        print(result.source_directory_file_id, "failed:", result.error_message)
        continue

    if result.job_reference is None:
        print(result.source_directory_file_id, "has no job yet")
        continue

    print(
        result.source_directory_file_id,
        result.job_reference.type,
        result.job_reference.id,
    )
```

```typescript
// Re-fetch the batch with per-file results included.
const detail = await client.batches.get(batch.id, {
  expand: ["results"],
});

if (detail.status === "FAILED") {
  throw new Error("Batch orchestration failed");
}

for (const result of detail.results ?? []) {
  // Per-file failure: the file could not be processed or mapped to a product job.
  if (result.error_message) {
    console.log(result.source_directory_file_id, "failed:", result.error_message);
    continue;
  }

  if (!result.job_reference) {
    console.log(result.source_directory_file_id, "has no job yet");
    continue;
  }

  console.log(
    result.source_directory_file_id,
    result.job_reference.type,
    result.job_reference.id,
  );
}
```
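
If you call the REST endpoint directly rather than through an SDK, the same triage can be done on the decoded JSON. The sketch below sorts result entries into three groups using the field names from the trimmed response above. `split_results` is a hypothetical helper, and treating an entry with neither field as pending is an assumption.

```python
from typing import Any


def split_results(batch: dict[str, Any]) -> tuple[list[dict], list[dict], list[dict]]:
    """Sort result entries into (with_job, failed, pending) lists."""
    with_job, failed, pending = [], [], []
    # `results` may be null while the batch is still running.
    for entry in batch.get("results") or []:
        if entry.get("error_message"):
            failed.append(entry)
        elif entry.get("job_reference"):
            with_job.append(entry)
        else:
            pending.append(entry)
    return with_job, failed, pending


# Trimmed response from the example above.
batch = {
    "id": "bat-aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee",
    "status": "COMPLETED",
    "results": [
        {
            "source_directory_file_id": "dfl-aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee",
            "job_reference": {"type": "parse_v2", "id": "pjb-aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee"},
        },
        {
            "source_directory_file_id": "dfl-bbbbbbbb-cccc-dddd-eeee-ffffffffffff",
            "error_message": "Unable to process source file.",
        },
    ],
}

with_job, failed, pending = split_results(batch)
print(len(with_job), len(failed), len(pending))  # 1 1 0
```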

Each `job_reference` in `results` identifies the underlying Parse or Extract job. Fetch that job through its product endpoint to inspect its status and output.

```python
for result in batch.results or []:
    ref = result.job_reference
    # Skip entries that failed or have no job yet.
    if ref is None:
        continue

    # Look the job up through the endpoint for its product.
    if ref.type == "parse_v2":
        job = await client.parsing.get(ref.id)
    else:
        job = await client.extract.get(ref.id)

    print(ref.id, job.status)
```

```typescript
for (const result of detail.results ?? []) {
  const ref = result.job_reference;
  // Skip entries that failed or have no job yet.
  if (!ref) continue;

  // Look the job up through the endpoint for its product.
  const job =
    ref.type === "parse_v2"
      ? await client.parsing.get(ref.id)
      : await client.extract.get(ref.id);

  console.log(ref.id, job.status);
}
```

## List Batches

You can list batches for the current project and filter by status or source directory.

```python
# List running batches that read from this source directory.
async for item in client.batches.list(
    status="RUNNING",
    source_directory_id=directory.id,
):
    print(item.id, item.status)
```

```typescript
// List running batches that read from this source directory.
for await (const item of client.batches.list({
  status: "RUNNING",
  source_directory_id: directory.id,
})) {
  console.log(item.id, item.status);
}
```

## Failure Semantics

Batch-level `FAILED` means the orchestration failed and the batch cannot provide a reliable per-file result set.

Per-file failures are represented in `results[*].error_message` when a source file could not be processed or mapped to a product job. If a result has a `job_reference`, use the referenced Parse or Extract job endpoint for the underlying job status and output.
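
These rules can be condensed into a single decision function. The sketch below takes a batch status and its optional results list. The returned labels are hypothetical and only mirror the distinctions described above.

```python
from typing import Any, Optional


def batch_outcome(status: str, results: Optional[list[dict[str, Any]]]) -> str:
    """Classify a batch using the batch-level status first, then per-file errors."""
    if status == "FAILED":
        # Orchestration failed: per-file results are not reliable.
        return "orchestration_failed"
    if status == "CANCELLED":
        return "cancelled"
    if status != "COMPLETED":
        return "in_progress"
    if any(entry.get("error_message") for entry in results or []):
        return "completed_with_file_errors"
    return "completed"


print(batch_outcome("FAILED", None))  # orchestration_failed
print(batch_outcome("COMPLETED", [{"error_message": "Unable to process source file."}]))
# completed_with_file_errors
```

A batch that completes without file errors still only means that jobs were created. The status of each job comes from the referenced Parse or Extract endpoint.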

## REST Endpoints

| Operation    | Endpoint                         |
| ------------ | -------------------------------- |
| Create batch | `POST /api/v2/batches`           |
| List batches | `GET /api/v2/batches`            |
| Get batch    | `GET /api/v2/batches/{batch_id}` |
