# Split

## Create Split Job

`SplitCreateResponse beta().split().create(SplitCreateParams params, RequestOptions requestOptions = RequestOptions.none())`

**post** `/api/v1/beta/split/jobs`

Create a document split job.

### Parameters

- `SplitCreateParams params`

  - `Optional<String> organizationId`

  - `Optional<String> projectId`

  - `SplitDocumentInput documentInput`

    Document to be split.

  - `Optional<Configuration> configuration`

    Split configuration with categories and splitting strategy.

    - `List<SplitCategory> categories`

      Categories to split documents into.

      - `String name`

        Name of the category.

      - `Optional<String> description`

        Optional description of what content belongs in this category.

    - `Optional<SplittingStrategy> splittingStrategy`

      Strategy for splitting documents.

      - `Optional<AllowUncategorized> allowUncategorized`

        Controls handling of pages that don't match any category. 'include': pages can be grouped as 'uncategorized' and included in results. 'forbid': all pages must be assigned to a defined category. 'omit': pages can be classified as 'uncategorized' but are excluded from results.

        - `INCLUDE("include")`

        - `FORBID("forbid")`

        - `OMIT("omit")`

  - `Optional<String> configurationId`

    Saved split configuration ID.
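
The three `allowUncategorized` modes differ only in how they treat pages that match no category. The sketch below models that description in plain Java so the difference is easy to see. It is not SDK code: `Mode`, `Segment`, and `visibleSegments` are hypothetical names, and treating `forbid` as an error when an uncategorized segment appears is an assumption made for illustration (the service applies these rules itself).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class AllowUncategorizedSketch {
    /** Hypothetical mirror of the three documented modes. */
    enum Mode { INCLUDE, FORBID, OMIT }

    /** Hypothetical stand-in for a classified segment. */
    static final class Segment {
        final String category;
        final List<Long> pages;

        Segment(String category, List<Long> pages) {
            this.category = category;
            this.pages = pages;
        }
    }

    static final String UNCATEGORIZED = "uncategorized";

    /** Returns the segments a caller would see in the results under the given mode. */
    static List<Segment> visibleSegments(Mode mode, List<Segment> classified) {
        List<Segment> visible = new ArrayList<>();
        for (Segment segment : classified) {
            if (!UNCATEGORIZED.equals(segment.category)) {
                visible.add(segment); // categorized segments are always kept
            } else if (mode == Mode.INCLUDE) {
                visible.add(segment); // 'include': uncategorized pages appear in results
            } else if (mode == Mode.FORBID) {
                // 'forbid': every page must have a defined category (modeled here as an error)
                throw new IllegalStateException(
                    "pages " + segment.pages + " have no category, but mode is FORBID");
            }
            // 'omit': uncategorized pages are classified but left out of the results
        }
        return visible;
    }

    public static void main(String[] args) {
        List<Segment> classified = Arrays.asList(
            new Segment("invoice", Arrays.asList(1L, 2L)),
            new Segment(UNCATEGORIZED, Arrays.asList(3L)));
        System.out.println(visibleSegments(Mode.INCLUDE, classified).size());
        System.out.println(visibleSegments(Mode.OMIT, classified).size());
    }
}
```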

### Returns

- `class SplitCreateResponse:`

  Beta response that uses a nested `document_input` object.

  - `String id`

    Unique identifier for the split job.

  - `List<SplitCategory> categories`

    Categories used for splitting.

    - `String name`

      Name of the category.

    - `Optional<String> description`

      Optional description of what content belongs in this category.

  - `SplitDocumentInput documentInput`

    Document that was split.

    - `String type`

      Type of document input. Valid values are: file_id

    - `String value`

      Document identifier.

  - `String projectId`

    Project ID this job belongs to.

  - `String status`

    Current status of the job. Valid values are: pending, processing, completed, failed, cancelled.

  - `String userId`

    User ID who created this job.

  - `Optional<String> configurationId`

    Split configuration ID used for this job.

  - `Optional<LocalDateTime> createdAt`

    Creation datetime

  - `Optional<String> errorMessage`

    Error message if the job failed.

  - `Optional<SplitResultResponse> result`

    Result of a completed split job.

    - `List<SplitSegmentResponse> segments`

      List of document segments.

      - `String category`

        Category name this split belongs to.

      - `String confidenceCategory`

        Categorical confidence level. Valid values are: high, medium, low.

      - `List<Long> pages`

        1-indexed page numbers in this split.

  - `Optional<LocalDateTime> updatedAt`

    Update datetime

### Example

```java
package com.llamacloud_prod.api.example;

import com.llamacloud_prod.api.client.LlamaCloudClient;
import com.llamacloud_prod.api.client.okhttp.LlamaCloudOkHttpClient;
import com.llamacloud_prod.api.models.beta.split.SplitCreateParams;
import com.llamacloud_prod.api.models.beta.split.SplitCreateResponse;
import com.llamacloud_prod.api.models.beta.split.SplitDocumentInput;

public final class Main {
    private Main() {}

    public static void main(String[] args) {
        LlamaCloudClient client = LlamaCloudOkHttpClient.fromEnv();

        SplitCreateParams params = SplitCreateParams.builder()
            .documentInput(SplitDocumentInput.builder()
                .type("file_id")
                .value("value")
                .build())
            .build();
        SplitCreateResponse split = client.beta().split().create(params);
    }
}
```

#### Response

```json
{
  "id": "id",
  "categories": [
    {
      "name": "x",
      "description": "x"
    }
  ],
  "document_input": {
    "type": "type",
    "value": "value"
  },
  "project_id": "project_id",
  "status": "status",
  "user_id": "user_id",
  "configuration_id": "configuration_id",
  "created_at": "2019-12-27T18:11:19.117Z",
  "error_message": "error_message",
  "result": {
    "segments": [
      {
        "category": "category",
        "confidence_category": "confidence_category",
        "pages": [
          0
        ]
      }
    ]
  },
  "updated_at": "2019-12-27T18:11:19.117Z"
}
```
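
Because `result` and `errorMessage` are optional, a caller usually branches on `status` before reading either. The following is a minimal sketch of that branching using plain `java.util.Optional` values rather than the SDK response class. `summarize` is a hypothetical helper, and the pairing of `result` with `completed` and `errorMessage` with `failed` is inferred from the field descriptions above, not guaranteed by them.

```java
import java.util.Optional;

final class JobOutcomeSketch {
    /**
     * Builds a one-line summary from the fields a split job response exposes.
     * segmentCount stands in for the size of result.segments (hypothetical simplification).
     */
    static String summarize(String status,
                            Optional<Integer> segmentCount,
                            Optional<String> errorMessage) {
        switch (status) {
            case "completed":
                return segmentCount
                    .map(n -> "completed with " + n + " segment(s)")
                    .orElse("completed, no result attached");
            case "failed":
                return "failed: " + errorMessage.orElse("no error message provided");
            case "cancelled":
                return "cancelled";
            default:
                // pending or processing
                return "not finished (" + status + ")";
        }
    }

    public static void main(String[] args) {
        System.out.println(summarize("completed", Optional.of(3), Optional.empty()));
        System.out.println(summarize("failed", Optional.empty(), Optional.of("unreadable file")));
        System.out.println(summarize("pending", Optional.empty(), Optional.empty()));
    }
}
```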

## List Split Jobs

`SplitListPage beta().split().list(SplitListParams params = SplitListParams.none(), RequestOptions requestOptions = RequestOptions.none())`

**get** `/api/v1/beta/split/jobs`

List document split jobs.

### Parameters

- `SplitListParams params`

  - `Optional<LocalDateTime> createdAtOnOrAfter`

    Include items created at or after this timestamp (inclusive)

  - `Optional<LocalDateTime> createdAtOnOrBefore`

    Include items created at or before this timestamp (inclusive)

  - `Optional<List<String>> jobIds`

    Filter by specific job IDs

  - `Optional<String> organizationId`

  - `Optional<Long> pageSize`

  - `Optional<String> pageToken`

  - `Optional<String> projectId`

  - `Optional<Status> status`

    Filter by job status (pending, processing, completed, failed, cancelled)

    - `PENDING("pending")`

    - `PROCESSING("processing")`

    - `COMPLETED("completed")`

    - `FAILED("failed")`

    - `CANCELLED("cancelled")`

### Returns

- `class SplitListResponse:`

  Beta response that uses a nested `document_input` object.

  - `String id`

    Unique identifier for the split job.

  - `List<SplitCategory> categories`

    Categories used for splitting.

    - `String name`

      Name of the category.

    - `Optional<String> description`

      Optional description of what content belongs in this category.

  - `SplitDocumentInput documentInput`

    Document that was split.

    - `String type`

      Type of document input. Valid values are: file_id

    - `String value`

      Document identifier.

  - `String projectId`

    Project ID this job belongs to.

  - `String status`

    Current status of the job. Valid values are: pending, processing, completed, failed, cancelled.

  - `String userId`

    User ID who created this job.

  - `Optional<String> configurationId`

    Split configuration ID used for this job.

  - `Optional<LocalDateTime> createdAt`

    Creation datetime

  - `Optional<String> errorMessage`

    Error message if the job failed.

  - `Optional<SplitResultResponse> result`

    Result of a completed split job.

    - `List<SplitSegmentResponse> segments`

      List of document segments.

      - `String category`

        Category name this split belongs to.

      - `String confidenceCategory`

        Categorical confidence level. Valid values are: high, medium, low.

      - `List<Long> pages`

        1-indexed page numbers in this split.

  - `Optional<LocalDateTime> updatedAt`

    Update datetime

### Example

```java
package com.llamacloud_prod.api.example;

import com.llamacloud_prod.api.client.LlamaCloudClient;
import com.llamacloud_prod.api.client.okhttp.LlamaCloudOkHttpClient;
import com.llamacloud_prod.api.models.beta.split.SplitListPage;
import com.llamacloud_prod.api.models.beta.split.SplitListParams;

public final class Main {
    private Main() {}

    public static void main(String[] args) {
        LlamaCloudClient client = LlamaCloudOkHttpClient.fromEnv();

        SplitListPage page = client.beta().split().list();
    }
}
```

#### Response

```json
{
  "items": [
    {
      "id": "id",
      "categories": [
        {
          "name": "x",
          "description": "x"
        }
      ],
      "document_input": {
        "type": "type",
        "value": "value"
      },
      "project_id": "project_id",
      "status": "status",
      "user_id": "user_id",
      "configuration_id": "configuration_id",
      "created_at": "2019-12-27T18:11:19.117Z",
      "error_message": "error_message",
      "result": {
        "segments": [
          {
            "category": "category",
            "confidence_category": "confidence_category",
            "pages": [
              0
            ]
          }
        ]
      },
      "updated_at": "2019-12-27T18:11:19.117Z"
    }
  ],
  "next_page_token": "next_page_token",
  "total_size": 0
}
```
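
The `pageToken` parameter and the `next_page_token` response field suggest token-based paging: pass the token from one page to request the next. The sketch below shows that loop in isolation. `Page`, `collectAll`, and `fakeFetch` are hypothetical stand-ins for the SDK call, and it assumes a missing or empty token marks the last page. The SDK's `SplitListPage` may provide its own paging helpers, which are not shown here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

final class TokenPagingSketch {
    /** Hypothetical page shape mirroring "items" and "next_page_token". */
    static final class Page {
        final List<String> items;
        final String nextPageToken; // null when no further page exists (assumed)

        Page(List<String> items, String nextPageToken) {
            this.items = items;
            this.nextPageToken = nextPageToken;
        }
    }

    /** Follows next-page tokens until none is returned and gathers every item. */
    static List<String> collectAll(Function<String, Page> fetchPage) {
        List<String> all = new ArrayList<>();
        String token = null; // the first request carries no token
        do {
            Page page = fetchPage.apply(token);
            all.addAll(page.items);
            token = page.nextPageToken;
        } while (token != null && !token.isEmpty());
        return all;
    }

    /** In-memory fake that serves two pages; replace with a real list call. */
    static Page fakeFetch(String token) {
        if (token == null) {
            return new Page(Arrays.asList("job-1", "job-2"), "t2");
        }
        if (token.equals("t2")) {
            return new Page(Arrays.asList("job-3"), null);
        }
        throw new IllegalArgumentException("unknown token: " + token);
    }

    public static void main(String[] args) {
        System.out.println(collectAll(TokenPagingSketch::fakeFetch));
    }
}
```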

## Get Split Job

`SplitGetResponse beta().split().get(SplitGetParams params = SplitGetParams.none(), RequestOptions requestOptions = RequestOptions.none())`

**get** `/api/v1/beta/split/jobs/{split_job_id}`

Get a document split job.

### Parameters

- `SplitGetParams params`

  - `Optional<String> splitJobId`

  - `Optional<String> organizationId`

  - `Optional<String> projectId`

### Returns

- `class SplitGetResponse:`

  Beta response that uses a nested `document_input` object.

  - `String id`

    Unique identifier for the split job.

  - `List<SplitCategory> categories`

    Categories used for splitting.

    - `String name`

      Name of the category.

    - `Optional<String> description`

      Optional description of what content belongs in this category.

  - `SplitDocumentInput documentInput`

    Document that was split.

    - `String type`

      Type of document input. Valid values are: file_id

    - `String value`

      Document identifier.

  - `String projectId`

    Project ID this job belongs to.

  - `String status`

    Current status of the job. Valid values are: pending, processing, completed, failed, cancelled.

  - `String userId`

    User ID who created this job.

  - `Optional<String> configurationId`

    Split configuration ID used for this job.

  - `Optional<LocalDateTime> createdAt`

    Creation datetime

  - `Optional<String> errorMessage`

    Error message if the job failed.

  - `Optional<SplitResultResponse> result`

    Result of a completed split job.

    - `List<SplitSegmentResponse> segments`

      List of document segments.

      - `String category`

        Category name this split belongs to.

      - `String confidenceCategory`

        Categorical confidence level. Valid values are: high, medium, low.

      - `List<Long> pages`

        1-indexed page numbers in this split.

  - `Optional<LocalDateTime> updatedAt`

    Update datetime

### Example

```java
package com.llamacloud_prod.api.example;

import com.llamacloud_prod.api.client.LlamaCloudClient;
import com.llamacloud_prod.api.client.okhttp.LlamaCloudOkHttpClient;
import com.llamacloud_prod.api.models.beta.split.SplitGetParams;
import com.llamacloud_prod.api.models.beta.split.SplitGetResponse;

public final class Main {
    private Main() {}

    public static void main(String[] args) {
        LlamaCloudClient client = LlamaCloudOkHttpClient.fromEnv();

        SplitGetResponse split = client.beta().split().get("split_job_id");
    }
}
```

#### Response

```json
{
  "id": "id",
  "categories": [
    {
      "name": "x",
      "description": "x"
    }
  ],
  "document_input": {
    "type": "type",
    "value": "value"
  },
  "project_id": "project_id",
  "status": "status",
  "user_id": "user_id",
  "configuration_id": "configuration_id",
  "created_at": "2019-12-27T18:11:19.117Z",
  "error_message": "error_message",
  "result": {
    "segments": [
      {
        "category": "category",
        "confidence_category": "confidence_category",
        "pages": [
          0
        ]
      }
    ]
  },
  "updated_at": "2019-12-27T18:11:19.117Z"
}
```
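
A job's `status` moves through the values listed above, so a caller that needs the result typically re-fetches the job until it stops changing. The sketch below shows one way to structure that loop. It assumes `completed`, `failed`, and `cancelled` are terminal (the reference does not say so explicitly), uses a `Supplier<String>` in place of a real `get` call, and picks the attempt limit and delay arbitrarily.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Iterator;
import java.util.Set;
import java.util.function.Supplier;

final class SplitJobPollingSketch {
    // Assumed terminal statuses; pending and processing are treated as in-flight.
    private static final Set<String> TERMINAL =
        new HashSet<>(Arrays.asList("completed", "failed", "cancelled"));

    static boolean isTerminal(String status) {
        return TERMINAL.contains(status);
    }

    /** Calls fetchStatus until it returns a terminal status or attempts run out. */
    static String pollUntilTerminal(Supplier<String> fetchStatus, int maxAttempts, long delayMillis) {
        String status = null;
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            status = fetchStatus.get();
            if (isTerminal(status)) {
                return status;
            }
            try {
                Thread.sleep(delayMillis); // fixed delay; a real client might back off
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag
                throw new IllegalStateException("interrupted while polling", e);
            }
        }
        throw new IllegalStateException(
            "job still " + status + " after " + maxAttempts + " attempts");
    }

    public static void main(String[] args) {
        // Fake status source standing in for repeated get calls.
        Iterator<String> fake = Arrays.asList("pending", "processing", "completed").iterator();
        System.out.println(pollUntilTerminal(fake::next, 10, 10L));
    }
}
```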

## Domain Types

### Split Category

- `class SplitCategory:`

  Category definition for document splitting.

  - `String name`

    Name of the category.

  - `Optional<String> description`

    Optional description of what content belongs in this category.

### Split Document Input

- `class SplitDocumentInput:`

  Document input specification for beta API.

  - `String type`

    Type of document input. Valid values are: file_id

  - `String value`

    Document identifier.

### Split Result Response

- `class SplitResultResponse:`

  Result of a completed split job.

  - `List<SplitSegmentResponse> segments`

    List of document segments.

    - `String category`

      Category name this split belongs to.

    - `String confidenceCategory`

      Categorical confidence level. Valid values are: high, medium, low.

    - `List<Long> pages`

      1-indexed page numbers in this split.

### Split Segment Response

- `class SplitSegmentResponse:`

  A segment of the split document.

  - `String category`

    Category name this split belongs to.

  - `String confidenceCategory`

    Categorical confidence level. Valid values are: high, medium, low.

  - `List<Long> pages`

    1-indexed page numbers in this split.
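
Since `pages` is a list of 1-indexed page numbers, it is often handy to collapse it into compact ranges for display or logging. The helper below is a small illustration of that idea. `toRanges` is a hypothetical name, not an SDK method, and it sorts a copy of the input first because the reference does not state whether pages arrive in order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class PageRangesSketch {
    /** Collapses page numbers into a string of ranges such as "1-3, 5". */
    static String toRanges(List<Long> pages) {
        if (pages.isEmpty()) {
            return "";
        }
        List<Long> sorted = new ArrayList<>(pages);
        Collections.sort(sorted);

        StringBuilder out = new StringBuilder();
        long start = sorted.get(0);
        long prev = start;
        for (int i = 1; i < sorted.size(); i++) {
            long page = sorted.get(i);
            if (page == prev) {
                continue; // ignore duplicates
            }
            if (page == prev + 1) {
                prev = page; // extend the current run
                continue;
            }
            appendRange(out, start, prev); // run ended, emit it
            start = page;
            prev = page;
        }
        appendRange(out, start, prev);
        return out.toString();
    }

    private static void appendRange(StringBuilder out, long start, long end) {
        if (out.length() > 0) {
            out.append(", ");
        }
        out.append(start);
        if (end != start) {
            out.append('-').append(end);
        }
    }

    public static void main(String[] args) {
        System.out.println(toRanges(Arrays.asList(1L, 2L, 3L, 5L, 7L, 8L))); // prints "1-3, 5, 7-8"
    }
}
```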
