Split

Create Split Job
beta.split.create(**kwargs: SplitCreateParams) -> SplitCreateResponse
POST /api/v1/beta/split/jobs
List Split Jobs
beta.split.list(**kwargs: SplitListParams) -> SyncPaginatedCursor[SplitListResponse]
GET /api/v1/beta/split/jobs
Get Split Job
beta.split.get(split_job_id: str, **kwargs: SplitGetParams) -> SplitGetResponse
GET /api/v1/beta/split/jobs/{split_job_id}
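The three endpoints above share one collection path, with the job ID appended for single-job lookups. As a minimal sketch of how those paths compose, the helpers below build the URLs with the standard library only. `BASE_URL`, `jobs_url`, and `job_url` are hypothetical names introduced for illustration; the host is a placeholder, not a documented value.

```python
from urllib.parse import quote

# Hypothetical host; replace with the base URL of your deployment.
BASE_URL = "https://api.example.com"
JOBS_PATH = "/api/v1/beta/split/jobs"


def jobs_url(base_url: str = BASE_URL) -> str:
    """URL used to create (POST) or list (GET) split jobs."""
    return base_url.rstrip("/") + JOBS_PATH


def job_url(split_job_id: str, base_url: str = BASE_URL) -> str:
    """URL used to fetch one split job (GET)."""
    # Percent-encode the ID so it always occupies a single path segment.
    return f"{jobs_url(base_url)}/{quote(split_job_id, safe='')}"


if __name__ == "__main__":
    print(jobs_url())
    print(job_url("job_123"))
```

Encoding the ID with `safe=''` keeps an unexpected `/` in an identifier from changing which route is hit.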
Models
class SplitCategory:

Category definition for document splitting.

name: str

Name of the category.

maxLength: 200
minLength: 1
description: Optional[str]

Optional description of what content belongs in this category.

maxLength: 2000
minLength: 1
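The length limits on `name` and `description` can be checked locally before a request is sent. The following is a sketch only: `CategorySpec` is a hypothetical local mirror of `SplitCategory`, not a class from the SDK, and it enforces just the limits listed above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CategorySpec:
    """Hypothetical local mirror of SplitCategory, used for pre-validation."""

    name: str
    description: Optional[str] = None

    def __post_init__(self) -> None:
        # name: minLength 1, maxLength 200
        if not 1 <= len(self.name) <= 200:
            raise ValueError("name must be 1 to 200 characters long")
        # description is optional, but when present: minLength 1, maxLength 2000
        if self.description is not None and not 1 <= len(self.description) <= 2000:
            raise ValueError("description must be 1 to 2000 characters long")

    def to_dict(self) -> dict:
        """Serialize, omitting description when it was not provided."""
        payload = {"name": self.name}
        if self.description is not None:
            payload["description"] = self.description
        return payload


if __name__ == "__main__":
    print(CategorySpec("invoice", "Billing documents with line items").to_dict())
```

Failing early with a `ValueError` avoids a round trip for input the server would presumably reject anyway.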
class SplitDocumentInput:

Document input specification for the beta API.

type: str

Type of document input. Valid values are: file_id.

value: str

Document identifier.

class SplitResultResponse:

Result of a completed split job.

segments: List[SplitSegmentResponse]

List of document segments.

category: str

Category name this split belongs to.

confidence_category: str

Categorical confidence level. Valid values are: high, medium, low.

pages: List[int]

1-indexed page numbers in this split.
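Because `pages` is a flat list of 1-indexed page numbers, downstream tools that extract page spans often want contiguous ranges instead. A minimal sketch of that conversion follows; `to_page_ranges` is a hypothetical helper, and the inclusive `(first, last)` tuple format is a choice made here, not something the API defines.

```python
from typing import List, Tuple


def to_page_ranges(pages: List[int]) -> List[Tuple[int, int]]:
    """Collapse 1-indexed page numbers into inclusive (first, last) ranges."""
    ranges: List[Tuple[int, int]] = []
    # Sort and de-duplicate so the input order does not matter.
    for page in sorted(set(pages)):
        if ranges and page == ranges[-1][1] + 1:
            # Page continues the current run: extend its upper bound.
            ranges[-1] = (ranges[-1][0], page)
        else:
            # Gap found (or first page): start a new run.
            ranges.append((page, page))
    return ranges


if __name__ == "__main__":
    print(to_page_ranges([1, 2, 3, 5, 7, 8]))
```

For the input shown, the call returns `[(1, 3), (5, 5), (7, 8)]`. Subtract 1 from each bound when passing the ranges to a 0-indexed library.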

class SplitSegmentResponse:

A segment of the split document.

category: str

Category name this split belongs to.

confidence_category: str

Categorical confidence level. Valid values are: high, medium, low.

pages: List[int]

1-indexed page numbers in this split.
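A common way to consume segments is to group their pages by category, optionally discarding low-confidence ones. The sketch below uses plain dictionaries as stand-ins for `SplitSegmentResponse` objects; `pages_by_category` and `CONFIDENCE_RANK` are hypothetical names, the sample data is invented for illustration, and the ordering low < medium < high is assumed from the value names.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Mapping

# Assumed ordering of the documented confidence values.
CONFIDENCE_RANK = {"low": 0, "medium": 1, "high": 2}


def pages_by_category(
    segments: Iterable[Mapping], min_confidence: str = "low"
) -> Dict[str, List[int]]:
    """Group page numbers by category, keeping segments at or above a threshold."""
    threshold = CONFIDENCE_RANK[min_confidence]
    grouped: Dict[str, List[int]] = defaultdict(list)
    for segment in segments:
        if CONFIDENCE_RANK[segment["confidence_category"]] >= threshold:
            grouped[segment["category"]].extend(segment["pages"])
    # Return a plain dict with pages in ascending order.
    return {category: sorted(pages) for category, pages in grouped.items()}


if __name__ == "__main__":
    # Illustrative data only, not real API output.
    sample = [
        {"category": "invoice", "confidence_category": "high", "pages": [1, 2]},
        {"category": "contract", "confidence_category": "low", "pages": [3]},
        {"category": "invoice", "confidence_category": "medium", "pages": [4]},
    ]
    print(pages_by_category(sample, min_confidence="medium"))
```

With SDK response objects, attribute access (for example `segment.category`) would replace the dictionary lookups.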

class SplitCreateResponse:

Beta response for a split job. Uses a nested document_input object.

id: str

Unique identifier for the split job.

categories: List[SplitCategory]

Categories used for splitting.

name: str

Name of the category.

maxLength: 200
minLength: 1
description: Optional[str]

Optional description of what content belongs in this category.

maxLength: 2000
minLength: 1
document_input: SplitDocumentInput

Document that was split.

type: str

Type of document input. Valid values are: file_id.

value: str

Document identifier.

project_id: str

Project ID this job belongs to.

status: str

Current status of the job. Valid values are: pending, processing, completed, failed, cancelled.

user_id: str

User ID who created this job.

configuration_id: Optional[str]

Split configuration ID used for this job.

created_at: Optional[datetime]

Creation datetime

format: date-time
error_message: Optional[str]

Error message if the job failed.

result: Optional[SplitResultResponse]

Result of a completed split job.

segments: List[SplitSegmentResponse]

List of document segments.

category: str

Category name this split belongs to.

confidence_category: str

Categorical confidence level. Valid values are: high, medium, low.

pages: List[int]

1-indexed page numbers in this split.

updated_at: Optional[datetime]

Update datetime

format: date-time
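Since a newly created job can report `pending` or `processing`, callers typically poll until the status stops changing. The sketch below assumes that `completed`, `failed`, and `cancelled` are terminal, which the status names suggest but this reference does not state. `wait_for_job` is a hypothetical helper, and `fetch_job` is any callable you supply that returns the job as a dictionary with a `status` key (for example, a thin wrapper around the Get Split Job call).

```python
import time
from typing import Callable, Mapping

# Assumption: these statuses are final and will not change again.
TERMINAL_STATUSES = frozenset({"completed", "failed", "cancelled"})


def wait_for_job(
    fetch_job: Callable[[str], Mapping],
    split_job_id: str,
    interval: float = 2.0,
    timeout: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Mapping:
    """Poll fetch_job until the job reaches a terminal status or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        job = fetch_job(split_job_id)
        if job["status"] in TERMINAL_STATUSES:
            return job
        if time.monotonic() >= deadline:
            raise TimeoutError(f"split job {split_job_id} still {job['status']!r}")
        sleep(interval)


if __name__ == "__main__":
    # Fake fetcher that simulates a job progressing through its statuses.
    statuses = iter(["pending", "processing", "completed"])
    final = wait_for_job(
        lambda job_id: {"id": job_id, "status": next(statuses)},
        "job_123",
        interval=0.0,
    )
    print(final["status"])
```

Injecting `fetch_job` and `sleep` keeps the loop independent of any particular client and makes it easy to test without network access or real delays.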
class SplitListResponse:

Beta response for a split job. Uses a nested document_input object.

id: str

Unique identifier for the split job.

categories: List[SplitCategory]

Categories used for splitting.

name: str

Name of the category.

maxLength: 200
minLength: 1
description: Optional[str]

Optional description of what content belongs in this category.

maxLength: 2000
minLength: 1
document_input: SplitDocumentInput

Document that was split.

type: str

Type of document input. Valid values are: file_id.

value: str

Document identifier.

project_id: str

Project ID this job belongs to.

status: str

Current status of the job. Valid values are: pending, processing, completed, failed, cancelled.

user_id: str

User ID who created this job.

configuration_id: Optional[str]

Split configuration ID used for this job.

created_at: Optional[datetime]

Creation datetime

format: date-time
error_message: Optional[str]

Error message if the job failed.

result: Optional[SplitResultResponse]

Result of a completed split job.

segments: List[SplitSegmentResponse]

List of document segments.

category: str

Category name this split belongs to.

confidence_category: str

Categorical confidence level. Valid values are: high, medium, low.

pages: List[int]

1-indexed page numbers in this split.

updated_at: Optional[datetime]

Update datetime

format: date-time
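When iterating over listed jobs, a quick summary by `status` plus the `error_message` of each failed job is often all that is needed. The sketch below works on plain dictionaries shaped like `SplitListResponse`; `summarize_jobs` is a hypothetical helper and the sample records are invented for illustration.

```python
from collections import Counter
from typing import Dict, Iterable, Mapping, Optional, Tuple


def summarize_jobs(
    jobs: Iterable[Mapping],
) -> Tuple[Dict[str, int], Dict[str, Optional[str]]]:
    """Return (count of jobs per status, error message per failed job ID)."""
    counts: Counter = Counter()
    failures: Dict[str, Optional[str]] = {}
    for job in jobs:
        counts[job["status"]] += 1
        if job["status"] == "failed":
            # error_message is optional, so it may be missing or None.
            failures[job["id"]] = job.get("error_message")
    return dict(counts), failures


if __name__ == "__main__":
    # Illustrative records only, not real API output.
    sample = [
        {"id": "job_1", "status": "completed"},
        {"id": "job_2", "status": "failed", "error_message": "unreadable file"},
        {"id": "job_3", "status": "completed"},
    ]
    print(summarize_jobs(sample))
```

The function makes a single pass over its input, so it can consume any iterable of jobs without first collecting them into a list.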
class SplitGetResponse:

Beta response for a split job. Uses a nested document_input object.

id: str

Unique identifier for the split job.

categories: List[SplitCategory]

Categories used for splitting.

name: str

Name of the category.

maxLength: 200
minLength: 1
description: Optional[str]

Optional description of what content belongs in this category.

maxLength: 2000
minLength: 1
document_input: SplitDocumentInput

Document that was split.

type: str

Type of document input. Valid values are: file_id.

value: str

Document identifier.

project_id: str

Project ID this job belongs to.

status: str

Current status of the job. Valid values are: pending, processing, completed, failed, cancelled.

user_id: str

User ID who created this job.

configuration_id: Optional[str]

Split configuration ID used for this job.

created_at: Optional[datetime]

Creation datetime

format: date-time
error_message: Optional[str]

Error message if the job failed.

result: Optional[SplitResultResponse]

Result of a completed split job.

segments: List[SplitSegmentResponse]

List of document segments.

category: str

Category name this split belongs to.

confidence_category: str

Categorical confidence level. Valid values are: high, medium, low.

pages: List[int]

1-indexed page numbers in this split.

updated_at: Optional[datetime]

Update datetime

format: date-time
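Because `result` and `error_message` are both optional, code that reads a fetched job should check `status` before touching `result.segments`. The following sketch shows one defensive way to do that on a plain dictionary shaped like `SplitGetResponse`. `segments_of` is a hypothetical helper, and raising `RuntimeError` for unfinished or failed jobs is a design choice made here rather than SDK behavior.

```python
from typing import List, Mapping


def segments_of(job: Mapping) -> List[Mapping]:
    """Return the segments of a completed job, or raise if it has no usable result."""
    status = job["status"]
    if status == "failed":
        # error_message may be absent, so fall back to a generic message.
        raise RuntimeError(job.get("error_message") or "split job failed")
    if status != "completed":
        raise RuntimeError(f"split job is not finished (status={status!r})")
    result = job.get("result")
    # A completed job without a result is treated as having no segments.
    return list(result["segments"]) if result else []


if __name__ == "__main__":
    # Illustrative record only, not real API output.
    job = {
        "id": "job_123",
        "status": "completed",
        "result": {
            "segments": [
                {"category": "invoice", "confidence_category": "high", "pages": [1, 2]}
            ]
        },
    }
    for segment in segments_of(job):
        print(segment["category"], segment["pages"])
```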