Create Batch Job
Create a batch processing job.
Processes files from a directory or a specific list of item IDs. Supports batch parsing and classification operations.
Provide either directory_id to process all files in a directory,
or item_ids for specific items. The job runs asynchronously —
poll GET /batch/{job_id} for progress.
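Because the job runs asynchronously, a caller typically loops on the status endpoint until the job reaches a final state. The following is a minimal sketch of that loop, not the SDK's own helper. The `pollBatchJob` function, the `fetchStatus` callback, and the list of terminal status values are all assumptions made for illustration: this reference only shows the `pending` status, and the SDK call that wraps GET /batch/{job_id} is not shown here, so it is left abstract.

```typescript
// Minimal shape of a batch job; field names mirror the example response on this page.
interface BatchJobStatus {
  id: string;
  status: string;
  total_items: number;
  processed_items: number;
}

// ASSUMPTION: these terminal status names are hypothetical.
// Only "pending" appears in this reference; adjust to the real values.
const ASSUMED_TERMINAL_STATUSES = ['completed', 'failed', 'cancelled'];

// Polls `fetchStatus` until the job reports a terminal status or attempts run out.
// `fetchStatus` is a caller-supplied function that performs GET /batch/{job_id}.
async function pollBatchJob(
  fetchStatus: () => Promise<BatchJobStatus>,
  opts: { intervalMs?: number; maxAttempts?: number; terminal?: string[] } = {},
): Promise<BatchJobStatus> {
  const {
    intervalMs = 2000,
    maxAttempts = 150,
    terminal = ASSUMED_TERMINAL_STATUSES,
  } = opts;

  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const job = await fetchStatus();
    if (terminal.includes(job.status)) {
      return job;
    }
    // Wait before the next poll so the API is not hammered.
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error(`Batch job did not finish after ${maxAttempts} polls`);
}
```

In practice `fetchStatus` would close over the job ID returned by the create call. A fixed interval keeps the sketch short; a backoff schedule may suit long-running directory jobs better.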
Parameters
params: BatchCreateParams { job_config, organization_id, project_id, 5 more }
job_config: BatchParseJobRecordCreate { correlation_id, job_name, parameters, 6 more } | ClassifyJob { id, project_id, rules, 9 more }
Body param: Job configuration — either a parse or classify config
BatchParseJobRecordCreate { correlation_id, job_name, parameters, 6 more }
Batch-specific parse job record for batch processing.
This model contains the metadata and configuration for a batch parse job, but excludes file-specific information. It’s used as input to the batch parent workflow and combined with DirectoryFile data to create full ParseJobRecordCreate instances for each file.
Attributes:
job_name: Must be PARSE_RAW_FILE
partitions: Partitions for job output location
parameters: Generic parse configuration (BatchParseJobConfig)
session_id: Upstream request ID for tracking
correlation_id: Correlation ID for cross-service tracking
parent_job_execution_id: Parent job execution ID if nested
user_id: User who created the job
project_id: Project this job belongs to
webhook_url: Optional webhook URL for job completion notifications
correlation_id?: string | null
The correlation ID for this job. Used for tracking the job across services.
parameters?: Parameters | null
Generic parse job configuration for batch processing.
This model contains the parsing configuration that applies to all files in a batch, but excludes file-specific fields like file_name, file_id, etc. Those file-specific fields are populated from DirectoryFile data when creating individual ParseJobRecordCreate instances for each file.
The fields in this model should be generic settings that apply uniformly to all files being processed in the batch.
output_s3_path_prefix?: string | null
If specified, LlamaParse saves the output to this path. All output files use this prefix, which should be a valid s3:// URL.
webhook_configurations?: Array<WebhookConfiguration> | null
Outbound webhook endpoints to notify on job status changes
webhook_events?: Array<"extract.pending" | "extract.success" | "extract.error" | 14 more> | null
Events to subscribe to (e.g. ‘parse.success’, ‘extract.error’). If null, all events are delivered.
webhook_headers?: Record<string, string> | null
Custom HTTP headers sent with each webhook request (e.g. auth tokens)
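As an illustration of the `parameters` fields described above, the sketch below assembles an object with an output prefix and one webhook configuration, rejecting a prefix that is not an s3:// URL before it is sent. The interfaces are local stand-ins that mirror only the fields listed in this section, not the SDK's exported types. The `webhook_url` field inside the webhook configuration, the helper names, the bucket name, and the endpoint URL are assumptions; the bucket check follows the general S3 naming rules (3 to 63 lowercase letters, digits, dots, or hyphens).

```typescript
// Local stand-in types; the real SDK types may carry more fields.
interface WebhookConfigurationSketch {
  webhook_url?: string | null; // ASSUMPTION: not listed in this section
  webhook_events?: string[] | null;
  webhook_headers?: Record<string, string> | null;
}

interface BatchParseParametersSketch {
  output_s3_path_prefix?: string | null;
  webhook_configurations?: WebhookConfigurationSketch[] | null;
}

// Returns true for "s3://bucket" or "s3://bucket/key/prefix" style URLs.
function isValidS3Prefix(prefix: string): boolean {
  return /^s3:\/\/[a-z0-9][a-z0-9.-]{1,61}[a-z0-9](\/.*)?$/.test(prefix);
}

// Builds the batch-wide parse parameters, failing fast on a bad prefix.
function buildParseParameters(
  prefix: string,
  webhook: WebhookConfigurationSketch,
): BatchParseParametersSketch {
  if (!isValidS3Prefix(prefix)) {
    throw new Error(`Not a valid s3:// URL: ${prefix}`);
  }
  return {
    output_s3_path_prefix: prefix,
    webhook_configurations: [webhook],
  };
}

// Hypothetical values for illustration only.
const exampleParameters = buildParseParameters('s3://example-bucket/parsed/', {
  webhook_url: 'https://example.com/hooks/batch',
  // Event names taken from the examples in this reference.
  webhook_events: ['parse.success', 'extract.error'],
  webhook_headers: { Authorization: 'Bearer <token>' },
});
```

Leaving `webhook_events` as null would, per the description above, subscribe the endpoint to all events.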
partitions?: Record<string, string>
The partitions for this execution. Used for determining where to save job output.
continue_as_new_threshold?: number | null
Body param: Maximum files to process per execution cycle in directory mode. Defaults to page_size.
item_ids?: Array<string> | null
Body param: List of specific item IDs to process. Either this or directory_id must be provided.
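Since a request needs either `directory_id` or `item_ids`, a small client-side check can catch a missing target before the call is made. This is a sketch under stated assumptions, not part of the SDK: `resolveBatchTarget` and `BatchTarget` are hypothetical names, and rejecting a request that sets both fields is a conservative choice made here, because this reference does not say how the API treats that case.

```typescript
// Exactly one way of selecting what the batch job should process.
type BatchTarget = { directory_id: string } | { item_ids: string[] };

// Validates the target fields and returns only the one that is set.
function resolveBatchTarget(input: {
  directory_id?: string | null;
  item_ids?: string[] | null;
}): BatchTarget {
  const hasDirectory =
    typeof input.directory_id === 'string' && input.directory_id.length > 0;
  const hasItems = Array.isArray(input.item_ids) && input.item_ids.length > 0;

  // ASSUMPTION: treating "both set" as an error; the API behavior is not documented here.
  if (hasDirectory && hasItems) {
    throw new Error('Provide directory_id or item_ids, not both');
  }
  if (hasDirectory) {
    return { directory_id: input.directory_id as string };
  }
  if (hasItems) {
    return { item_ids: input.item_ids as string[] };
  }
  throw new Error('Either directory_id or item_ids must be provided');
}

// Directory mode, using the placeholder ID format from the example response.
const directoryTarget = resolveBatchTarget({
  directory_id: 'dir-aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee',
});
```

The returned object can be spread into the create call's parameters alongside `job_config`.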
Create Batch Job
import LlamaCloud from '@llamaindex/llama-cloud';
const client = new LlamaCloud({
apiKey: process.env['LLAMA_CLOUD_API_KEY'], // This is the default and can be omitted
});
const batch = await client.beta.batch.create({ job_config: {} });
console.log(batch.id);
Returns
{
"id": "bjb-aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee",
"job_type": "parse",
"project_id": "proj-aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee",
"status": "pending",
"total_items": 0,
"completed_at": "2019-12-27T18:11:19.117Z",
"created_at": "2019-12-27T18:11:19.117Z",
"directory_id": "dir-aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee",
"effective_at": "2019-12-27T18:11:19.117Z",
"error_message": "error_message",
"failed_items": 0,
"job_record_id": "job_record_id",
"processed_items": 0,
"skipped_items": 0,
"started_at": "2019-12-27T18:11:19.117Z",
"updated_at": "2019-12-27T18:11:19.117Z",
"workflow_id": "workflow_id"
}