## Create Batch Job

`BatchCreateResponse beta().batch().create(BatchCreateParams params, RequestOptions requestOptions = RequestOptions.none())`

**post** `/api/v1/beta/batch-processing`

Create a batch processing job.

Processes files from a directory or a specific list of item IDs.
Supports batch parsing and classification operations.

Provide either `directory_id` to process all files in a directory
or `item_ids` to process specific items. The job runs asynchronously;
poll `GET /batch/{job_id}` for progress.
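
Because the job runs asynchronously, a caller typically polls in a loop until the status stops changing. The following is a minimal sketch of such a loop using only the JDK. The `fetchStatus` supplier is a hypothetical stand-in for whatever call performs the `GET` request; it is not an SDK method. Treating `completed`, `failed`, and `cancelled` as the terminal statuses is also an assumption based on the status values listed under Returns.

```java
import java.time.Duration;
import java.util.Iterator;
import java.util.List;
import java.util.Set;
import java.util.function.Supplier;

final class BatchPollingSketch {
    // Assumed terminal values, taken from the Status enum listed under "Returns".
    static final Set<String> TERMINAL = Set.of("completed", "failed", "cancelled");

    static boolean isTerminal(String status) {
        return TERMINAL.contains(status);
    }

    /**
     * Calls fetchStatus at least once and at most maxAttempts times, stopping early
     * when a terminal status is seen. fetchStatus is a hypothetical stand-in for
     * the GET request; it is not part of the SDK.
     */
    static String pollUntilDone(Supplier<String> fetchStatus, Duration interval, int maxAttempts) {
        String status = fetchStatus.get();
        for (int attempt = 1; attempt < maxAttempts && !isTerminal(status); attempt++) {
            try {
                Thread.sleep(interval.toMillis());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt and stop polling
                return status;
            }
            status = fetchStatus.get();
        }
        return status; // may still be non-terminal if the attempts ran out
    }

    public static void main(String[] args) {
        // Simulated sequence of statuses in place of real HTTP calls.
        Iterator<String> simulated = List.of("pending", "running", "completed").iterator();
        String last = pollUntilDone(simulated::next, Duration.ofMillis(10), 10);
        System.out.println("Final status: " + last);
    }
}
```

A fixed interval keeps the sketch short; a production client would likely add backoff and an overall timeout.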

### Parameters

- `BatchCreateParams params`

  - `Optional<String> organizationId`

  - `Optional<String> projectId`

  - `Optional<String> temporalNamespace`

  - `JobConfig jobConfig`

    Job configuration: either a parse or a classify config.

    - `class BatchParseJobRecordCreate:`

      Batch-specific parse job record for batch processing.

      This model contains the metadata and configuration for a batch parse job,
      but excludes file-specific information. It's used as input to the batch
      parent workflow and combined with DirectoryFile data to create full
      ParseJobRecordCreate instances for each file.

      Attributes:
      job_name: Must be PARSE_RAW_FILE
      partitions: Partitions for job output location
      parameters: Generic parse configuration (BatchParseJobConfig)
      session_id: Upstream request ID for tracking
      correlation_id: Correlation ID for cross-service tracking
      parent_job_execution_id: Parent job execution ID if nested
      user_id: User who created the job
      project_id: Project this job belongs to
      webhook_url: Optional webhook URL for job completion notifications

      - `Optional<String> correlationId`

        The correlation ID for this job. Used for tracking the job across services.

      - `Optional<JobName> jobName`

        - `PARSE_RAW_FILE_JOB("parse_raw_file_job")`

      - `Optional<Parameters> parameters`

        Generic parse job configuration for batch processing.

        This model contains the parsing configuration that applies to all files
        in a batch, but excludes file-specific fields like file_name, file_id, etc.
        Those file-specific fields are populated from DirectoryFile data when
        creating individual ParseJobRecordCreate instances for each file.

        The fields in this model should be generic settings that apply uniformly
        to all files being processed in the batch.

        - `Optional<Boolean> adaptiveLongTable`

        - `Optional<Boolean> aggressiveTableExtraction`

        - `Optional<Boolean> annotateLinks`

        - `Optional<Boolean> autoMode`

        - `Optional<String> autoModeConfigurationJson`

        - `Optional<Boolean> autoModeTriggerOnImageInPage`

        - `Optional<String> autoModeTriggerOnRegexpInPage`

        - `Optional<Boolean> autoModeTriggerOnTableInPage`

        - `Optional<String> autoModeTriggerOnTextInPage`

        - `Optional<String> azureOpenAIApiVersion`

        - `Optional<String> azureOpenAIDeploymentName`

        - `Optional<String> azureOpenAIEndpoint`

        - `Optional<String> azureOpenAIKey`

        - `Optional<Double> bboxBottom`

        - `Optional<Double> bboxLeft`

        - `Optional<Double> bboxRight`

        - `Optional<Double> bboxTop`

        - `Optional<String> boundingBox`

        - `Optional<Boolean> compactMarkdownTable`

        - `Optional<String> complementalFormattingInstruction`

        - `Optional<String> contentGuidelineInstruction`

        - `Optional<Boolean> continuousMode`

        - `Optional<CustomMetadata> customMetadata`

          The custom metadata to attach to the documents.

        - `Optional<Boolean> disableImageExtraction`

        - `Optional<Boolean> disableOcr`

        - `Optional<Boolean> disableReconstruction`

        - `Optional<Boolean> doNotCache`

        - `Optional<Boolean> doNotUnrollColumns`

        - `Optional<Boolean> enableCostOptimizer`

        - `Optional<Boolean> extractCharts`

        - `Optional<Boolean> extractLayout`

        - `Optional<Boolean> extractPrintedPageNumber`

        - `Optional<Boolean> fastMode`

        - `Optional<String> formattingInstruction`

        - `Optional<String> gpt4oApiKey`

        - `Optional<Boolean> gpt4oMode`

        - `Optional<Boolean> guessXlsxSheetName`

        - `Optional<Boolean> hideFooters`

        - `Optional<Boolean> hideHeaders`

        - `Optional<Boolean> highResOcr`

        - `Optional<Boolean> htmlMakeAllElementsVisible`

        - `Optional<Boolean> htmlRemoveFixedElements`

        - `Optional<Boolean> htmlRemoveNavigationElements`

        - `Optional<String> httpProxy`

        - `Optional<Boolean> ignoreDocumentElementsForLayoutDetection`

        - `Optional<List<ImagesToSave>> imagesToSave`

          - `SCREENSHOT("screenshot")`

          - `EMBEDDED("embedded")`

          - `LAYOUT("layout")`

        - `Optional<Boolean> inlineImagesInMarkdown`

        - `Optional<String> inputS3Path`

        - `Optional<String> inputS3Region`

          The region for the input S3 bucket.

        - `Optional<String> inputUrl`

        - `Optional<Boolean> internalIsScreenshotJob`

        - `Optional<Boolean> invalidateCache`

        - `Optional<Boolean> isFormattingInstruction`

        - `Optional<Double> jobTimeoutExtraTimePerPageInSeconds`

        - `Optional<Double> jobTimeoutInSeconds`

        - `Optional<Boolean> keepPageSeparatorWhenMergingTables`

        - `Optional<String> lang`

          The language.

        - `Optional<List<ParsingLanguages>> languages`

          - `AF("af")`

          - `AZ("az")`

          - `BS("bs")`

          - `CS("cs")`

          - `CY("cy")`

          - `DA("da")`

          - `DE("de")`

          - `EN("en")`

          - `ES("es")`

          - `ET("et")`

          - `FR("fr")`

          - `GA("ga")`

          - `HR("hr")`

          - `HU("hu")`

          - `ID("id")`

          - `IS("is")`

          - `IT("it")`

          - `KU("ku")`

          - `LA("la")`

          - `LT("lt")`

          - `LV("lv")`

          - `MI("mi")`

          - `MS("ms")`

          - `MT("mt")`

          - `NL("nl")`

          - `NO("no")`

          - `OC("oc")`

          - `PI("pi")`

          - `PL("pl")`

          - `PT("pt")`

          - `RO("ro")`

          - `RS_LATIN("rs_latin")`

          - `SK("sk")`

          - `SL("sl")`

          - `SQ("sq")`

          - `SV("sv")`

          - `SW("sw")`

          - `TL("tl")`

          - `TR("tr")`

          - `UZ("uz")`

          - `VI("vi")`

          - `AR("ar")`

          - `FA("fa")`

          - `UG("ug")`

          - `UR("ur")`

          - `BN("bn")`

          - `AS("as")`

          - `MNI("mni")`

          - `RU("ru")`

          - `RS_CYRILLIC("rs_cyrillic")`

          - `BE("be")`

          - `BG("bg")`

          - `UK("uk")`

          - `MN("mn")`

          - `ABQ("abq")`

          - `ADY("ady")`

          - `KBD("kbd")`

          - `AVA("ava")`

          - `DAR("dar")`

          - `INH("inh")`

          - `CHE("che")`

          - `LBE("lbe")`

          - `LEZ("lez")`

          - `TAB("tab")`

          - `TJK("tjk")`

          - `HI("hi")`

          - `MR("mr")`

          - `NE("ne")`

          - `BH("bh")`

          - `MAI("mai")`

          - `ANG("ang")`

          - `BHO("bho")`

          - `MAH("mah")`

          - `SCK("sck")`

          - `NEW("new")`

          - `GOM("gom")`

          - `SA("sa")`

          - `BGC("bgc")`

          - `TH("th")`

          - `CH_SIM("ch_sim")`

          - `CH_TRA("ch_tra")`

          - `JA("ja")`

          - `KO("ko")`

          - `TA("ta")`

          - `TE("te")`

          - `KN("kn")`

        - `Optional<Boolean> layoutAware`

        - `Optional<Boolean> lineLevelBoundingBox`

        - `Optional<String> markdownTableMultilineHeaderSeparator`

        - `Optional<Long> maxPages`

        - `Optional<Long> maxPagesEnforced`

        - `Optional<Boolean> mergeTablesAcrossPagesInMarkdown`

        - `Optional<String> model`

        - `Optional<Boolean> outlinedTableExtraction`

        - `Optional<Boolean> outputPdfOfDocument`

        - `Optional<String> outputS3PathPrefix`

          If specified, LlamaParse saves the output to this path. All output files use this prefix, which should be a valid `s3://` URL.

        - `Optional<String> outputS3Region`

          The region for the output S3 bucket.

        - `Optional<Boolean> outputTablesAsHtml`

        - `Optional<String> outputBucket`

          The output bucket.

        - `Optional<Double> pageErrorTolerance`

        - `Optional<String> pageFooterPrefix`

        - `Optional<String> pageFooterSuffix`

        - `Optional<String> pageHeaderPrefix`

        - `Optional<String> pageHeaderSuffix`

        - `Optional<String> pagePrefix`

        - `Optional<String> pageSeparator`

        - `Optional<String> pageSuffix`

        - `Optional<ParsingMode> parseMode`

          Enum for representing the mode of parsing to be used.

          - `PARSE_PAGE_WITHOUT_LLM("parse_page_without_llm")`

          - `PARSE_PAGE_WITH_LLM("parse_page_with_llm")`

          - `PARSE_PAGE_WITH_LVM("parse_page_with_lvm")`

          - `PARSE_PAGE_WITH_AGENT("parse_page_with_agent")`

          - `PARSE_PAGE_WITH_LAYOUT_AGENT("parse_page_with_layout_agent")`

          - `PARSE_DOCUMENT_WITH_LLM("parse_document_with_llm")`

          - `PARSE_DOCUMENT_WITH_LVM("parse_document_with_lvm")`

          - `PARSE_DOCUMENT_WITH_AGENT("parse_document_with_agent")`

        - `Optional<String> parsingInstruction`

        - `Optional<String> pipelineId`

          The pipeline ID.

        - `Optional<Boolean> preciseBoundingBox`

        - `Optional<Boolean> premiumMode`

        - `Optional<Boolean> presentationOutOfBoundsContent`

        - `Optional<Boolean> presentationSkipEmbeddedData`

        - `Optional<Boolean> preserveLayoutAlignmentAcrossPages`

        - `Optional<Boolean> preserveVerySmallText`

        - `Optional<String> preset`

        - `Optional<Priority> priority`

          The priority for the request. This field may be ignored or overwritten depending on the organization tier.

          - `LOW("low")`

          - `MEDIUM("medium")`

          - `HIGH("high")`

          - `CRITICAL("critical")`

        - `Optional<String> projectId`

        - `Optional<Boolean> removeHiddenText`

        - `Optional<FailPageMode> replaceFailedPageMode`

          Enum for representing the different available page error handling modes.

          - `RAW_TEXT("raw_text")`

          - `BLANK_PAGE("blank_page")`

          - `ERROR_MESSAGE("error_message")`

        - `Optional<String> replaceFailedPageWithErrorMessagePrefix`

        - `Optional<String> replaceFailedPageWithErrorMessageSuffix`

        - `Optional<ResourceInfo> resourceInfo`

          The resource info about the file

        - `Optional<Boolean> saveImages`

        - `Optional<Boolean> skipDiagonalText`

        - `Optional<Boolean> specializedChartParsingAgentic`

        - `Optional<Boolean> specializedChartParsingEfficient`

        - `Optional<Boolean> specializedChartParsingPlus`

        - `Optional<Boolean> specializedImageParsing`

        - `Optional<Boolean> spreadsheetExtractSubTables`

        - `Optional<Boolean> spreadsheetForceFormulaComputation`

        - `Optional<Boolean> spreadsheetIncludeHiddenSheets`

        - `Optional<Boolean> strictModeBuggyFont`

        - `Optional<Boolean> strictModeImageExtraction`

        - `Optional<Boolean> strictModeImageOcr`

        - `Optional<Boolean> strictModeReconstruction`

        - `Optional<Boolean> structuredOutput`

        - `Optional<String> structuredOutputJsonSchema`

        - `Optional<String> structuredOutputJsonSchemaName`

        - `Optional<String> systemPrompt`

        - `Optional<String> systemPromptAppend`

        - `Optional<Boolean> takeScreenshot`

        - `Optional<String> targetPages`

        - `Optional<String> tier`

        - `Optional<Type> type`

          - `PARSE("parse")`

        - `Optional<Boolean> useVendorMultimodalModel`

        - `Optional<String> userPrompt`

        - `Optional<String> vendorMultimodalApiKey`

        - `Optional<String> vendorMultimodalModelName`

        - `Optional<String> version`

        - `Optional<List<WebhookConfiguration>> webhookConfigurations`

          Outbound webhook endpoints to notify on job status changes

          - `Optional<List<WebhookEvent>> webhookEvents`

            Events to subscribe to (e.g. 'parse.success', 'extract.error'). If null, all events are delivered.

            - `EXTRACT_PENDING("extract.pending")`

            - `EXTRACT_SUCCESS("extract.success")`

            - `EXTRACT_ERROR("extract.error")`

            - `EXTRACT_PARTIAL_SUCCESS("extract.partial_success")`

            - `EXTRACT_CANCELLED("extract.cancelled")`

            - `PARSE_PENDING("parse.pending")`

            - `PARSE_RUNNING("parse.running")`

            - `PARSE_SUCCESS("parse.success")`

            - `PARSE_ERROR("parse.error")`

            - `PARSE_PARTIAL_SUCCESS("parse.partial_success")`

            - `PARSE_CANCELLED("parse.cancelled")`

            - `CLASSIFY_PENDING("classify.pending")`

            - `CLASSIFY_RUNNING("classify.running")`

            - `CLASSIFY_SUCCESS("classify.success")`

            - `CLASSIFY_ERROR("classify.error")`

            - `CLASSIFY_PARTIAL_SUCCESS("classify.partial_success")`

            - `CLASSIFY_CANCELLED("classify.cancelled")`

            - `SHEETS_PENDING("sheets.pending")`

            - `SHEETS_SUCCESS("sheets.success")`

            - `SHEETS_ERROR("sheets.error")`

            - `SHEETS_PARTIAL_SUCCESS("sheets.partial_success")`

            - `SHEETS_CANCELLED("sheets.cancelled")`

            - `UNMAPPED_EVENT("unmapped_event")`

          - `Optional<WebhookHeaders> webhookHeaders`

            Custom HTTP headers sent with each webhook request (e.g. auth tokens)

          - `Optional<String> webhookOutputFormat`

            Response format sent to the webhook: 'string' (default) or 'json'

          - `Optional<String> webhookUrl`

            URL to receive webhook POST notifications

        - `Optional<String> webhookUrl`

      - `Optional<String> parentJobExecutionId`

        The ID of the parent job execution.

      - `Optional<Partitions> partitions`

        The partitions for this execution. Used for determining where to save job output.

      - `Optional<String> projectId`

        The ID of the project this job belongs to.

      - `Optional<String> sessionId`

        The upstream request ID that created this job. Used for tracking the job across services.

      - `Optional<String> userId`

        The ID of the user that created this job

      - `Optional<String> webhookUrl`

        The URL that needs to be called at the end of the parsing job.

    - `class ClassifyJob:`

      A classify job.

      - `String id`

        Unique identifier

      - `String projectId`

        The ID of the project

      - `List<ClassifierRule> rules`

        The rules to classify the files

        - `String description`

          Natural language description of what to classify. Be specific about the content characteristics that identify this document type.

        - `String type`

          The document type to assign when this rule matches (e.g., 'invoice', 'receipt', 'contract')

      - `StatusEnum status`

        The status of the classify job

        - `PENDING("PENDING")`

        - `SUCCESS("SUCCESS")`

        - `ERROR("ERROR")`

        - `PARTIAL_SUCCESS("PARTIAL_SUCCESS")`

        - `CANCELLED("CANCELLED")`

      - `String userId`

        The ID of the user

      - `Optional<LocalDateTime> createdAt`

        Creation datetime

      - `Optional<LocalDateTime> effectiveAt`

      - `Optional<String> errorMessage`

        Error message for the latest job attempt, if any.

      - `Optional<String> jobRecordId`

        The job record ID associated with this status, if any.

      - `Optional<Mode> mode`

        The classification mode to use

        - `FAST("FAST")`

        - `MULTIMODAL("MULTIMODAL")`

      - `Optional<ClassifyParsingConfiguration> parsingConfiguration`

        The configuration for the parsing job

        - `Optional<ParsingLanguages> lang`

          The language to parse the files in

          - `AF("af")`

          - `AZ("az")`

          - `BS("bs")`

          - `CS("cs")`

          - `CY("cy")`

          - `DA("da")`

          - `DE("de")`

          - `EN("en")`

          - `ES("es")`

          - `ET("et")`

          - `FR("fr")`

          - `GA("ga")`

          - `HR("hr")`

          - `HU("hu")`

          - `ID("id")`

          - `IS("is")`

          - `IT("it")`

          - `KU("ku")`

          - `LA("la")`

          - `LT("lt")`

          - `LV("lv")`

          - `MI("mi")`

          - `MS("ms")`

          - `MT("mt")`

          - `NL("nl")`

          - `NO("no")`

          - `OC("oc")`

          - `PI("pi")`

          - `PL("pl")`

          - `PT("pt")`

          - `RO("ro")`

          - `RS_LATIN("rs_latin")`

          - `SK("sk")`

          - `SL("sl")`

          - `SQ("sq")`

          - `SV("sv")`

          - `SW("sw")`

          - `TL("tl")`

          - `TR("tr")`

          - `UZ("uz")`

          - `VI("vi")`

          - `AR("ar")`

          - `FA("fa")`

          - `UG("ug")`

          - `UR("ur")`

          - `BN("bn")`

          - `AS("as")`

          - `MNI("mni")`

          - `RU("ru")`

          - `RS_CYRILLIC("rs_cyrillic")`

          - `BE("be")`

          - `BG("bg")`

          - `UK("uk")`

          - `MN("mn")`

          - `ABQ("abq")`

          - `ADY("ady")`

          - `KBD("kbd")`

          - `AVA("ava")`

          - `DAR("dar")`

          - `INH("inh")`

          - `CHE("che")`

          - `LBE("lbe")`

          - `LEZ("lez")`

          - `TAB("tab")`

          - `TJK("tjk")`

          - `HI("hi")`

          - `MR("mr")`

          - `NE("ne")`

          - `BH("bh")`

          - `MAI("mai")`

          - `ANG("ang")`

          - `BHO("bho")`

          - `MAH("mah")`

          - `SCK("sck")`

          - `NEW("new")`

          - `GOM("gom")`

          - `SA("sa")`

          - `BGC("bgc")`

          - `TH("th")`

          - `CH_SIM("ch_sim")`

          - `CH_TRA("ch_tra")`

          - `JA("ja")`

          - `KO("ko")`

          - `TA("ta")`

          - `TE("te")`

          - `KN("kn")`

        - `Optional<Long> maxPages`

          The maximum number of pages to parse

        - `Optional<List<Long>> targetPages`

          The pages to target for parsing (0-indexed, so first page is at 0)

      - `Optional<LocalDateTime> updatedAt`

        Update datetime

  - `Optional<Long> continueAsNewThreshold`

    Maximum files to process per execution cycle in directory mode. Defaults to `page_size`.

  - `Optional<String> directoryId`

    ID of the directory containing files to process

  - `Optional<List<String>> itemIds`

    List of specific item IDs to process. Either this or `directory_id` must be provided.

  - `Optional<Long> pageSize`

    Number of files to process per batch when using directory mode
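
Since `directoryId` and `itemIds` are both optional in the type but one of them must be provided, it can help to check the combination before sending the request. The sketch below is one way to do that with plain JDK types. The class and method names are hypothetical, and rejecting a request that sets both fields is an assumption; the reference only states that one of them is required.

```java
import java.util.List;
import java.util.Optional;

final class BatchTargetSketch {
    enum Mode { DIRECTORY, ITEMS }

    /**
     * Decides which selection mode a request would use.
     * Rejecting requests that set both fields is an assumption of this sketch.
     */
    static Mode resolveMode(Optional<String> directoryId, Optional<List<String>> itemIds) {
        boolean hasDirectory = directoryId.filter(id -> !id.isBlank()).isPresent();
        boolean hasItems = itemIds.filter(ids -> !ids.isEmpty()).isPresent();
        if (hasDirectory == hasItems) {
            throw new IllegalArgumentException(
                hasDirectory ? "Set only one of directoryId or itemIds"
                             : "Set either directoryId or itemIds");
        }
        return hasDirectory ? Mode.DIRECTORY : Mode.ITEMS;
    }

    public static void main(String[] args) {
        // Placeholder IDs for illustration only.
        System.out.println(resolveMode(Optional.of("dir-123"), Optional.empty()));
        System.out.println(resolveMode(Optional.empty(), Optional.of(List.of("item-1", "item-2"))));
    }
}
```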

### Returns

- `class BatchCreateResponse:`

  Response schema for a batch processing job.

  - `String id`

    Unique identifier for the batch job

  - `JobType jobType`

    Type of processing operation (parse or classify)

    - `PARSE("parse")`

    - `EXTRACT("extract")`

    - `CLASSIFY("classify")`

  - `String projectId`

    Project this job belongs to

  - `Status status`

    Current job status

    - `PENDING("pending")`

    - `RUNNING("running")`

    - `DISPATCHED("dispatched")`

    - `COMPLETED("completed")`

    - `FAILED("failed")`

    - `CANCELLED("cancelled")`

  - `long totalItems`

    Total number of items in the job

  - `Optional<LocalDateTime> completedAt`

    Timestamp when job completed

  - `Optional<LocalDateTime> createdAt`

    Creation datetime

  - `Optional<String> directoryId`

    Directory being processed

  - `Optional<LocalDateTime> effectiveAt`

  - `Optional<String> errorMessage`

    Error message for the latest job attempt, if any.

  - `Optional<Long> failedItems`

    Number of items that failed processing

  - `Optional<String> jobRecordId`

    The job record ID associated with this status, if any.

  - `Optional<Long> processedItems`

    Number of items processed so far

  - `Optional<Long> skippedItems`

    Number of items skipped (already processed or size limit)

  - `Optional<LocalDateTime> startedAt`

    Timestamp when job processing started

  - `Optional<LocalDateTime> updatedAt`

    Update datetime

  - `Optional<String> workflowId`

    Async job tracking ID
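
The item counters in the response can be combined into a simple progress line. The sketch below shows one way to do that with plain JDK types that mirror the field types above. The class and method names are hypothetical, and it assumes an absent counter can be read as zero, which the reference does not confirm.

```java
import java.util.Locale;
import java.util.Optional;

final class BatchProgressSketch {
    /** Share of items processed so far, as a percentage between 0 and 100. */
    static double percentProcessed(long totalItems, Optional<Long> processedItems) {
        if (totalItems <= 0) {
            return 0.0; // nothing to report yet, and avoids division by zero
        }
        long processed = processedItems.orElse(0L); // assumption: absent means zero
        return Math.min(100.0, 100.0 * processed / totalItems);
    }

    /** Formats the counters as a single human-readable line. */
    static String summarize(long totalItems, Optional<Long> processedItems,
                            Optional<Long> failedItems, Optional<Long> skippedItems) {
        return String.format(Locale.ROOT, "%d/%d processed (%.1f%%), %d failed, %d skipped",
            processedItems.orElse(0L), totalItems,
            percentProcessed(totalItems, processedItems),
            failedItems.orElse(0L), skippedItems.orElse(0L));
    }

    public static void main(String[] args) {
        // Made-up counter values for illustration.
        System.out.println(summarize(200, Optional.of(25L), Optional.of(3L), Optional.empty()));
    }
}
```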

### Example

```java
package com.llamacloud_prod.api.example;

import com.llamacloud_prod.api.client.LlamaCloudClient;
import com.llamacloud_prod.api.client.okhttp.LlamaCloudOkHttpClient;
import com.llamacloud_prod.api.models.beta.batch.BatchCreateParams;
import com.llamacloud_prod.api.models.beta.batch.BatchCreateResponse;

public final class Main {
    private Main() {}

    public static void main(String[] args) {
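        // Build a client configured from the environment.
        LlamaCloudClient client = LlamaCloudOkHttpClient.fromEnv();

        // An empty parse config keeps the example short. A real request also
        // needs a directory ID or a list of item IDs, as described above.
        BatchCreateParams params = BatchCreateParams.builder()
            .jobConfig(BatchCreateParams.JobConfig.BatchParseJobRecordCreate.builder().build())
            .build();

        // Submit the job; the response carries the job ID and its initial status.
        BatchCreateResponse batch = client.beta().batch().create(params);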
    }
}
```

#### Response

```json
{
  "id": "bjb-aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee",
  "job_type": "parse",
  "project_id": "proj-aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee",
  "status": "pending",
  "total_items": 0,
  "completed_at": "2019-12-27T18:11:19.117Z",
  "created_at": "2019-12-27T18:11:19.117Z",
  "directory_id": "dir-aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee",
  "effective_at": "2019-12-27T18:11:19.117Z",
  "error_message": "error_message",
  "failed_items": 0,
  "job_record_id": "job_record_id",
  "processed_items": 0,
  "skipped_items": 0,
  "started_at": "2019-12-27T18:11:19.117Z",
  "updated_at": "2019-12-27T18:11:19.117Z",
  "workflow_id": "workflow_id"
}
```
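
The timestamps in the sample response are ISO-8601 strings with a trailing `Z`. If you read the raw JSON yourself, `java.time.Instant` can parse that form directly. The sketch below computes the time between two such values; the class and method names are hypothetical, and it makes no claim about how the SDK itself deserializes these fields.

```java
import java.time.Duration;
import java.time.Instant;

final class BatchTimestampSketch {
    /** Elapsed time between two ISO-8601 UTC timestamps such as "2019-12-27T18:11:19.117Z". */
    static Duration elapsed(String startedAt, String completedAt) {
        return Duration.between(Instant.parse(startedAt), Instant.parse(completedAt));
    }

    public static void main(String[] args) {
        // The first value comes from the sample response; the second is made up.
        Duration d = elapsed("2019-12-27T18:11:19.117Z", "2019-12-27T18:13:49.617Z");
        System.out.println(d.toMillis() + " ms");
    }
}
```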
