Parsing

Parse File
client.parsing.create(ParsingCreateParams { tier, version, organization_id, 15 more } params, RequestOptions options?): ParsingCreateResponse { id, project_id, status, 5 more }
POST /api/v2/parse
Get Parse Job
client.parsing.get(string jobID, ParsingGetParams { expand, image_filenames, organization_id, project_id } query?, RequestOptions options?): ParsingGetResponse { job, images_content_metadata, items, 8 more }
GET /api/v2/parse/{job_id}
List Parse Jobs
client.parsing.list(ParsingListParams { created_at_on_or_after, created_at_on_or_before, job_ids, 5 more } query?, RequestOptions options?): PaginatedCursor<ParsingListResponse { id, project_id, status, 5 more } >
GET /api/v2/parse
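
The Get Parse Job route takes the job ID in the path and its options as query parameters. The sketch below shows one way to assemble that URL by hand. `buildGetParseJobUrl` is a hypothetical helper, not part of the SDK. The base URL is supplied by the caller, and encoding `expand` as a repeated query key is an assumption that this reference does not confirm.

```typescript
// Hypothetical helper: builds the URL for GET /api/v2/parse/{job_id}.
// Only a subset of the documented query parameters is modelled here.
interface GetParseJobQuery {
  expand?: string[]; // assumption: sent as repeated `expand=` keys
  organization_id?: string;
  project_id?: string;
}

function buildGetParseJobUrl(
  baseUrl: string,
  jobId: string,
  query: GetParseJobQuery = {},
): string {
  const pairs: string[] = [];
  for (const value of query.expand ?? []) {
    pairs.push(`expand=${encodeURIComponent(value)}`);
  }
  if (query.organization_id !== undefined) {
    pairs.push(`organization_id=${encodeURIComponent(query.organization_id)}`);
  }
  if (query.project_id !== undefined) {
    pairs.push(`project_id=${encodeURIComponent(query.project_id)}`);
  }
  // Strip trailing slashes so the path is joined with exactly one "/".
  const path = `${baseUrl.replace(/\/+$/, "")}/api/v2/parse/${encodeURIComponent(jobId)}`;
  return pairs.length > 0 ? `${path}?${pairs.join("&")}` : path;
}
```

In practice the SDK call `client.parsing.get(jobID, query)` is expected to handle this for you. The helper is only useful when calling the REST route directly.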
Models
BBox { h, w, x, 5 more }

Bounding box with coordinates and optional metadata.

h: number

Height of the bounding box

w: number

Width of the bounding box

x: number

X coordinate of the bounding box

y: number

Y coordinate of the bounding box

confidence?: number | null

Confidence score

end_index?: number | null

End index in the text

label?: string | null

Label for the bounding box

start_index?: number | null

Start index in the text
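
A minimal sketch of working with `BBox` values, using a local interface that mirrors the fields listed above. `bboxArea` and `confidentBoxes` are hypothetical helpers. The sketch assumes that a higher `confidence` means a more reliable box, and it drops boxes with no confidence value. Neither behaviour is stated by this reference.

```typescript
// Local mirror of the documented BBox fields.
interface BBox {
  h: number;
  w: number;
  x: number;
  y: number;
  confidence?: number | null;
  end_index?: number | null;
  label?: string | null;
  start_index?: number | null;
}

// Area in whatever unit the coordinates use (the unit is not documented here).
function bboxArea(box: BBox): number {
  return box.w * box.h;
}

// Keeps boxes whose confidence is present and at least `minConfidence`.
// Accepts null/undefined because item `bbox` fields are optional and nullable.
function confidentBoxes(
  boxes: BBox[] | null | undefined,
  minConfidence: number,
): BBox[] {
  return (boxes ?? []).filter(
    (box) => box.confidence != null && box.confidence >= minConfidence,
  );
}
```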

CodeItem { md, value, bbox, 2 more }
md: string

Markdown representation preserving formatting

value: string

Code content

bbox?: Array<BBox { h, w, x, 5 more } > | null

List of bounding boxes

h: number

Height of the bounding box

w: number

Width of the bounding box

x: number

X coordinate of the bounding box

y: number

Y coordinate of the bounding box

confidence?: number | null

Confidence score

end_index?: number | null

End index in the text

label?: string | null

Label for the bounding box

start_index?: number | null

Start index in the text

language?: string | null

Programming language identifier

type?: "code"

Code block item type

FailPageMode = "raw_text" | "blank_page" | "error_message"

Enum for representing the different available page error handling modes.

One of the following:
"raw_text"
"blank_page"
"error_message"
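
When a `FailPageMode` value comes from untyped input such as a config file or environment variable, a runtime guard can narrow it to the union above. This is a sketch only. `FAIL_PAGE_MODES` and `isFailPageMode` are hypothetical names, not SDK exports.

```typescript
// The three documented modes, kept in one place so the type and the
// runtime check cannot drift apart.
const FAIL_PAGE_MODES = ["raw_text", "blank_page", "error_message"] as const;
type FailPageMode = (typeof FAIL_PAGE_MODES)[number];

// Type guard: true only for one of the documented string literals.
function isFailPageMode(value: unknown): value is FailPageMode {
  return (
    typeof value === "string" &&
    (FAIL_PAGE_MODES as readonly string[]).indexOf(value) !== -1
  );
}
```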

List of items within the footer

One of the following:
TextItem { md, value, bbox, type }
md: string

Markdown representation preserving formatting

value: string

Text content

bbox?: Array<BBox { h, w, x, 5 more } > | null

List of bounding boxes

h: number

Height of the bounding box

w: number

Width of the bounding box

x: number

X coordinate of the bounding box

y: number

Y coordinate of the bounding box

confidence?: number | null

Confidence score

end_index?: number | null

End index in the text

label?: string | null

Label for the bounding box

start_index?: number | null

Start index in the text

type?: "text"

Text item type

HeadingItem { level, md, value, 2 more }
level: number

Heading level (1-6)

md: string

Markdown representation preserving formatting

value: string

Heading text content

bbox?: Array<BBox { h, w, x, 5 more } > | null

List of bounding boxes

h: number

Height of the bounding box

w: number

Width of the bounding box

x: number

X coordinate of the bounding box

y: number

Y coordinate of the bounding box

confidence?: number | null

Confidence score

end_index?: number | null

End index in the text

label?: string | null

Label for the bounding box

start_index?: number | null

Start index in the text

type?: "heading"

Heading item type

ListItem { items, md, ordered, 2 more }
items: Array<TextItem { md, value, bbox, type } | ListItem { items, md, ordered, 2 more } >

List of nested text or list items

One of the following:
TextItem { md, value, bbox, type }
md: string

Markdown representation preserving formatting

value: string

Text content

bbox?: Array<BBox { h, w, x, 5 more } > | null

List of bounding boxes

h: number

Height of the bounding box

w: number

Width of the bounding box

x: number

X coordinate of the bounding box

y: number

Y coordinate of the bounding box

confidence?: number | null

Confidence score

end_index?: number | null

End index in the text

label?: string | null

Label for the bounding box

start_index?: number | null

Start index in the text

type?: "text"

Text item type

ListItem { items, md, ordered, 2 more }
md: string

Markdown representation preserving formatting

ordered: boolean

Whether the list is ordered or unordered

bbox?: Array<BBox { h, w, x, 5 more } > | null

List of bounding boxes

h: number

Height of the bounding box

w: number

Width of the bounding box

x: number

X coordinate of the bounding box

y: number

Y coordinate of the bounding box

confidence?: number | null

Confidence score

end_index?: number | null

End index in the text

label?: string | null

Label for the bounding box

start_index?: number | null

Start index in the text

type?: "list"

List item type

CodeItem { md, value, bbox, 2 more }
md: string

Markdown representation preserving formatting

value: string

Code content

bbox?: Array<BBox { h, w, x, 5 more } > | null

List of bounding boxes

h: number

Height of the bounding box

w: number

Width of the bounding box

x: number

X coordinate of the bounding box

y: number

Y coordinate of the bounding box

confidence?: number | null

Confidence score

end_index?: number | null

End index in the text

label?: string | null

Label for the bounding box

start_index?: number | null

Start index in the text

language?: string | null

Programming language identifier

type?: "code"

Code block item type

TableItem { csv, html, md, 6 more }
csv: string

CSV representation of the table

html: string

HTML representation of the table

md: string

Markdown representation preserving formatting

rows: Array<Array<string | number | null>>

Table data as array of arrays (string, number, or null)

One of the following:
string
number
bbox?: Array<BBox { h, w, x, 5 more } > | null

List of bounding boxes

h: number

Height of the bounding box

w: number

Width of the bounding box

x: number

X coordinate of the bounding box

y: number

Y coordinate of the bounding box

confidence?: number | null

Confidence score

end_index?: number | null

End index in the text

label?: string | null

Label for the bounding box

start_index?: number | null

Start index in the text

merged_from_pages?: Array<number> | null

List of page numbers with tables that were merged into this table (e.g., [1, 2, 3, 4])

merged_into_page?: number | null

Populated when merged into another table. Page number where the full merged table begins (used on empty tables).

parse_concerns?: Array<ParseConcern> | null

Quality concerns detected during table extraction, indicating the table may have issues

details: string

Human-readable details about the concern

type: string

Type of parse concern (e.g. header_value_type_mismatch, inconsistent_row_cell_count)

type?: "table"

Table item type

ImageItem { caption, md, url, 2 more }
caption: string

Image caption

md: string

Markdown representation preserving formatting

url: string

URL to the image

bbox?: Array<BBox { h, w, x, 5 more } > | null

List of bounding boxes

h: number

Height of the bounding box

w: number

Width of the bounding box

x: number

X coordinate of the bounding box

y: number

Y coordinate of the bounding box

confidence?: number | null

Confidence score

end_index?: number | null

End index in the text

label?: string | null

Label for the bounding box

start_index?: number | null

Start index in the text

type?: "image"

Image item type

Markdown representation preserving formatting

Display text of the link

URL of the link

List of bounding boxes

h: number

Height of the bounding box

w: number

Width of the bounding box

x: number

X coordinate of the bounding box

y: number

Y coordinate of the bounding box

confidence?: number | null

Confidence score

end_index?: number | null

End index in the text

label?: string | null

Label for the bounding box

start_index?: number | null

Start index in the text

Link item type

Markdown representation preserving formatting

List of bounding boxes

h: number

Height of the bounding box

w: number

Width of the bounding box

x: number

X coordinate of the bounding box

y: number

Y coordinate of the bounding box

confidence?: number | null

Confidence score

end_index?: number | null

End index in the text

label?: string | null

Label for the bounding box

start_index?: number | null

Start index in the text

Page footer container

HeaderItem { items, md, bbox, type }
items: Array<TextItem { md, value, bbox, type } | HeadingItem { level, md, value, 2 more } | ListItem { items, md, ordered, 2 more } | 4 more>

List of items within the header

One of the following:
TextItem { md, value, bbox, type }
md: string

Markdown representation preserving formatting

value: string

Text content

bbox?: Array<BBox { h, w, x, 5 more } > | null

List of bounding boxes

h: number

Height of the bounding box

w: number

Width of the bounding box

x: number

X coordinate of the bounding box

y: number

Y coordinate of the bounding box

confidence?: number | null

Confidence score

end_index?: number | null

End index in the text

label?: string | null

Label for the bounding box

start_index?: number | null

Start index in the text

type?: "text"

Text item type

HeadingItem { level, md, value, 2 more }
level: number

Heading level (1-6)

md: string

Markdown representation preserving formatting

value: string

Heading text content

bbox?: Array<BBox { h, w, x, 5 more } > | null

List of bounding boxes

h: number

Height of the bounding box

w: number

Width of the bounding box

x: number

X coordinate of the bounding box

y: number

Y coordinate of the bounding box

confidence?: number | null

Confidence score

end_index?: number | null

End index in the text

label?: string | null

Label for the bounding box

start_index?: number | null

Start index in the text

type?: "heading"

Heading item type

ListItem { items, md, ordered, 2 more }
items: Array<TextItem { md, value, bbox, type } | ListItem { items, md, ordered, 2 more } >

List of nested text or list items

One of the following:
TextItem { md, value, bbox, type }
md: string

Markdown representation preserving formatting

value: string

Text content

bbox?: Array<BBox { h, w, x, 5 more } > | null

List of bounding boxes

h: number

Height of the bounding box

w: number

Width of the bounding box

x: number

X coordinate of the bounding box

y: number

Y coordinate of the bounding box

confidence?: number | null

Confidence score

end_index?: number | null

End index in the text

label?: string | null

Label for the bounding box

start_index?: number | null

Start index in the text

type?: "text"

Text item type

ListItem { items, md, ordered, 2 more }
md: string

Markdown representation preserving formatting

ordered: boolean

Whether the list is ordered or unordered

bbox?: Array<BBox { h, w, x, 5 more } > | null

List of bounding boxes

h: number

Height of the bounding box

w: number

Width of the bounding box

x: number

X coordinate of the bounding box

y: number

Y coordinate of the bounding box

confidence?: number | null

Confidence score

end_index?: number | null

End index in the text

label?: string | null

Label for the bounding box

start_index?: number | null

Start index in the text

type?: "list"

List item type

CodeItem { md, value, bbox, 2 more }
md: string

Markdown representation preserving formatting

value: string

Code content

bbox?: Array<BBox { h, w, x, 5 more } > | null

List of bounding boxes

h: number

Height of the bounding box

w: number

Width of the bounding box

x: number

X coordinate of the bounding box

y: number

Y coordinate of the bounding box

confidence?: number | null

Confidence score

end_index?: number | null

End index in the text

label?: string | null

Label for the bounding box

start_index?: number | null

Start index in the text

language?: string | null

Programming language identifier

type?: "code"

Code block item type

TableItem { csv, html, md, 6 more }
csv: string

CSV representation of the table

html: string

HTML representation of the table

md: string

Markdown representation preserving formatting

rows: Array<Array<string | number | null>>

Table data as array of arrays (string, number, or null)

One of the following:
string
number
bbox?: Array<BBox { h, w, x, 5 more } > | null

List of bounding boxes

h: number

Height of the bounding box

w: number

Width of the bounding box

x: number

X coordinate of the bounding box

y: number

Y coordinate of the bounding box

confidence?: number | null

Confidence score

end_index?: number | null

End index in the text

label?: string | null

Label for the bounding box

start_index?: number | null

Start index in the text

merged_from_pages?: Array<number> | null

List of page numbers with tables that were merged into this table (e.g., [1, 2, 3, 4])

merged_into_page?: number | null

Populated when merged into another table. Page number where the full merged table begins (used on empty tables).

parse_concerns?: Array<ParseConcern> | null

Quality concerns detected during table extraction, indicating the table may have issues

details: string

Human-readable details about the concern

type: string

Type of parse concern (e.g. header_value_type_mismatch, inconsistent_row_cell_count)

type?: "table"

Table item type

ImageItem { caption, md, url, 2 more }
caption: string

Image caption

md: string

Markdown representation preserving formatting

url: string

URL to the image

bbox?: Array<BBox { h, w, x, 5 more } > | null

List of bounding boxes

h: number

Height of the bounding box

w: number

Width of the bounding box

x: number

X coordinate of the bounding box

y: number

Y coordinate of the bounding box

confidence?: number | null

Confidence score

end_index?: number | null

End index in the text

label?: string | null

Label for the bounding box

start_index?: number | null

Start index in the text

type?: "image"

Image item type

Markdown representation preserving formatting

Display text of the link

URL of the link

List of bounding boxes

h: number

Height of the bounding box

w: number

Width of the bounding box

x: number

X coordinate of the bounding box

y: number

Y coordinate of the bounding box

confidence?: number | null

Confidence score

end_index?: number | null

End index in the text

label?: string | null

Label for the bounding box

start_index?: number | null

Start index in the text

Link item type

md: string

Markdown representation preserving formatting

bbox?: Array<BBox { h, w, x, 5 more } > | null

List of bounding boxes

h: number

Height of the bounding box

w: number

Width of the bounding box

x: number

X coordinate of the bounding box

y: number

Y coordinate of the bounding box

confidence?: number | null

Confidence score

end_index?: number | null

End index in the text

label?: string | null

Label for the bounding box

start_index?: number | null

Start index in the text

type?: "header"

Page header container

HeadingItem { level, md, value, 2 more }
level: number

Heading level (1-6)

md: string

Markdown representation preserving formatting

value: string

Heading text content

bbox?: Array<BBox { h, w, x, 5 more } > | null

List of bounding boxes

h: number

Height of the bounding box

w: number

Width of the bounding box

x: number

X coordinate of the bounding box

y: number

Y coordinate of the bounding box

confidence?: number | null

Confidence score

end_index?: number | null

End index in the text

label?: string | null

Label for the bounding box

start_index?: number | null

Start index in the text

type?: "heading"

Heading item type
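
Because each `HeadingItem` carries a `level` from 1 to 6, a list of headings can be turned into an indented outline. The sketch below uses a trimmed local interface that omits `bbox`. `buildOutline` is a hypothetical helper, and it assumes the headings are already in document order.

```typescript
// Trimmed local mirror of HeadingItem (bbox omitted for brevity).
interface HeadingItem {
  level: number;
  md: string;
  value: string;
  type?: "heading";
}

// Renders one bullet per heading, indented two spaces per level below 1.
function buildOutline(headings: HeadingItem[]): string {
  return headings
    .map((h) => `${"  ".repeat(Math.max(0, h.level - 1))}- ${h.value}`)
    .join("\n");
}
```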

ImageItem { caption, md, url, 2 more }
caption: string

Image caption

md: string

Markdown representation preserving formatting

url: string

URL to the image

bbox?: Array<BBox { h, w, x, 5 more } > | null

List of bounding boxes

h: number

Height of the bounding box

w: number

Width of the bounding box

x: number

X coordinate of the bounding box

y: number

Y coordinate of the bounding box

confidence?: number | null

Confidence score

end_index?: number | null

End index in the text

label?: string | null

Label for the bounding box

start_index?: number | null

Start index in the text

type?: "image"

Image item type

Markdown representation preserving formatting

Display text of the link

URL of the link

List of bounding boxes

h: number

Height of the bounding box

w: number

Width of the bounding box

x: number

X coordinate of the bounding box

y: number

Y coordinate of the bounding box

confidence?: number | null

Confidence score

end_index?: number | null

End index in the text

label?: string | null

Label for the bounding box

start_index?: number | null

Start index in the text

Link item type

ListItem { items, md, ordered, 2 more }
items: Array<TextItem { md, value, bbox, type } | ListItem { items, md, ordered, 2 more } >

List of nested text or list items

One of the following:
TextItem { md, value, bbox, type }
md: string

Markdown representation preserving formatting

value: string

Text content

bbox?: Array<BBox { h, w, x, 5 more } > | null

List of bounding boxes

h: number

Height of the bounding box

w: number

Width of the bounding box

x: number

X coordinate of the bounding box

y: number

Y coordinate of the bounding box

confidence?: number | null

Confidence score

end_index?: number | null

End index in the text

label?: string | null

Label for the bounding box

start_index?: number | null

Start index in the text

type?: "text"

Text item type

ListItem = ListItem { items, md, ordered, 2 more }
md: string

Markdown representation preserving formatting

ordered: boolean

Whether the list is ordered or unordered

bbox?: Array<BBox { h, w, x, 5 more } > | null

List of bounding boxes

h: number

Height of the bounding box

w: number

Width of the bounding box

x: number

X coordinate of the bounding box

y: number

Y coordinate of the bounding box

confidence?: number | null

Confidence score

end_index?: number | null

End index in the text

label?: string | null

Label for the bounding box

start_index?: number | null

Start index in the text

type?: "list"

List item type
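
`ListItem.items` is recursive: each entry is either a `TextItem` or another `ListItem`. The sketch below walks that structure and produces indented lines. It uses trimmed local interfaces that omit `bbox`, and `flattenList` is a hypothetical helper. Since `type` is optional on both models, the sketch tells them apart by the presence of an `items` property instead.

```typescript
// Trimmed local mirrors of the documented models (bbox omitted).
interface TextItem {
  md: string;
  value: string;
  type?: "text";
}

interface ListItem {
  items: Array<TextItem | ListItem>;
  md: string;
  ordered: boolean;
  type?: "list";
}

// Depth-first walk: text entries become lines, nested lists recurse
// one indentation level deeper. Numbering counts text entries only.
function flattenList(list: ListItem, depth = 0): string[] {
  const lines: string[] = [];
  let position = 0;
  for (const item of list.items) {
    if ("items" in item) {
      lines.push(...flattenList(item, depth + 1));
    } else {
      position += 1;
      const marker = list.ordered ? `${position}.` : "-";
      lines.push(`${"  ".repeat(depth)}${marker} ${item.value}`);
    }
  }
  return lines;
}
```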

LlamaParseSupportedFileExtensions = ".pdf" | ".abw" | ".awt" | 141 more

Enum for supported file extensions.

One of the following:
".pdf"
".abw"
".awt"
".cgm"
".cwk"
".doc"
".docm"
".docx"
".dot"
".dotm"
".dotx"
".fodg"
".fodp"
".fopd"
".fodt"
".fb2"
".hwp"
".lwp"
".mcw"
".mw"
".mwd"
".odf"
".odt"
".otg"
".ott"
".pages"
".pbd"
".psw"
".rtf"
".sda"
".sdd"
".sdp"
".sdw"
".sgl"
".std"
".stw"
".sxd"
".sxg"
".sxm"
".sxw"
".uof"
".uop"
".uot"
".vor"
".wpd"
".wps"
".wpt"
".wri"
".wn"
".xml"
".zabw"
".key"
".odp"
".odg"
".otp"
".pot"
".potm"
".potx"
".ppt"
".pptm"
".pptx"
".sti"
".sxi"
".vsd"
".vsdm"
".vsdx"
".vdx"
".bmp"
".gif"
".jpg"
".jpeg"
".png"
".svg"
".tif"
".tiff"
".webp"
".htm"
".html"
".xhtm"
".csv"
".dbf"
".dif"
".et"
".eth"
".fods"
".numbers"
".ods"
".ots"
".prn"
".qpw"
".slk"
".stc"
".sxc"
".sylk"
".tsv"
".uos1"
".uos2"
".uos"
".wb1"
".wb2"
".wb3"
".wk1"
".wk2"
".wk3"
".wk4"
".wks"
".wq1"
".wq2"
".xlr"
".xls"
".xlsb"
".xlsm"
".xlsx"
".xlw"
".azw"
".azw3"
".azw4"
".cb7"
".cbc"
".cbr"
".cbz"
".chm"
".djvu"
".epub"
".fbz"
".htmlz"
".lit"
".lrf"
".md"
".mobi"
".pdb"
".pml"
".prc"
".rb"
".snb"
".tcr"
".txtz"
".m4a"
".mp3"
".mp4"
".mpeg"
".mpga"
".wav"
".webm"
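
A client can check a file's extension before uploading it. The sketch below deliberately lists only a small subset of the extensions above. `SUPPORTED_SUBSET` and `isSupportedFile` are hypothetical names, and treating extensions as case-insensitive is an assumption.

```typescript
// Illustrative subset only; the full enum above has many more entries.
const SUPPORTED_SUBSET = new Set<string>([
  ".pdf", ".docx", ".pptx", ".xlsx", ".png", ".html", ".md", ".mp3",
]);

// Returns the lowercased extension including the dot, or null when the
// name has no extension (a leading dot alone, as in ".env", does not count).
function extensionOf(filename: string): string | null {
  const dot = filename.lastIndexOf(".");
  if (dot <= 0) return null;
  return filename.slice(dot).toLowerCase();
}

function isSupportedFile(filename: string): boolean {
  const ext = extensionOf(filename);
  return ext !== null && SUPPORTED_SUBSET.has(ext);
}
```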
ParsingJob { id, status, error_code, error_message }

A parse job (v1).

id: string

Unique parse job identifier

status: StatusEnum

Current job status

One of the following:
"PENDING"
"SUCCESS"
"ERROR"
"PARTIAL_SUCCESS"
"CANCELLED"
error_code?: string | null

Machine-readable error code when failed

error_message?: string | null

Human-readable error details when failed

ParsingLanguages = "af" | "az" | "bs" | 83 more

Enum for representing the languages supported by the parser.

One of the following:
"af"
"az"
"bs"
"cs"
"cy"
"da"
"de"
"en"
"es"
"et"
"fr"
"ga"
"hr"
"hu"
"id"
"is"
"it"
"ku"
"la"
"lt"
"lv"
"mi"
"ms"
"mt"
"nl"
"no"
"oc"
"pi"
"pl"
"pt"
"ro"
"rs_latin"
"sk"
"sl"
"sq"
"sv"
"sw"
"tl"
"tr"
"uz"
"vi"
"ar"
"fa"
"ug"
"ur"
"bn"
"as"
"mni"
"ru"
"rs_cyrillic"
"be"
"bg"
"uk"
"mn"
"abq"
"ady"
"kbd"
"ava"
"dar"
"inh"
"che"
"lbe"
"lez"
"tab"
"tjk"
"hi"
"mr"
"ne"
"bh"
"mai"
"ang"
"bho"
"mah"
"sck"
"new"
"gom"
"sa"
"bgc"
"th"
"ch_sim"
"ch_tra"
"ja"
"ko"
"ta"
"te"
"kn"
ParsingMode = "parse_page_without_llm" | "parse_page_with_llm" | "parse_page_with_lvm" | 5 more

Enum for representing the mode of parsing to be used.

One of the following:
"parse_page_without_llm"
"parse_page_with_llm"
"parse_page_with_lvm"
"parse_page_with_agent"
"parse_page_with_layout_agent"
"parse_document_with_llm"
"parse_document_with_lvm"
"parse_document_with_agent"
StatusEnum = "PENDING" | "SUCCESS" | "ERROR" | 2 more

Enum for representing the status of a job

One of the following:
"PENDING"
"SUCCESS"
"ERROR"
"PARTIAL_SUCCESS"
"CANCELLED"
TableItem { csv, html, md, 6 more }
csv: string

CSV representation of the table

html: string

HTML representation of the table

md: string

Markdown representation preserving formatting

rows: Array<Array<string | number | null>>

Table data as array of arrays (string, number, or null)

One of the following:
string
number
bbox?: Array<BBox { h, w, x, 5 more } > | null

List of bounding boxes

h: number

Height of the bounding box

w: number

Width of the bounding box

x: number

X coordinate of the bounding box

y: number

Y coordinate of the bounding box

confidence?: number | null

Confidence score

end_index?: number | null

End index in the text

label?: string | null

Label for the bounding box

start_index?: number | null

Start index in the text

merged_from_pages?: Array<number> | null

List of page numbers with tables that were merged into this table (e.g., [1, 2, 3, 4])

merged_into_page?: number | null

Populated when merged into another table. Page number where the full merged table begins (used on empty tables).

parse_concerns?: Array<ParseConcern> | null

Quality concerns detected during table extraction, indicating the table may have issues

details: string

Human-readable details about the concern

type: string

Type of parse concern (e.g. header_value_type_mismatch, inconsistent_row_cell_count)

type?: "table"

Table item type
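
The `rows`, `merged_from_pages`, `merged_into_page`, and `parse_concerns` fields can be combined to post-process a table. The sketch below uses local interfaces that omit `bbox`, and both helpers are hypothetical. `rowsToRecords` assumes the first row holds column headers, which this reference does not guarantee.

```typescript
type Cell = string | number | null;

interface ParseConcern {
  details: string;
  type: string;
}

// Trimmed local mirror of TableItem (bbox omitted).
interface TableItem {
  csv: string;
  html: string;
  md: string;
  rows: Cell[][];
  merged_from_pages?: number[] | null;
  merged_into_page?: number | null;
  parse_concerns?: ParseConcern[] | null;
  type?: "table";
}

// Assumption: row 0 is the header. Empty header cells get a positional name,
// and short rows are padded with null.
function rowsToRecords(rows: Cell[][]): Array<Record<string, Cell>> {
  if (rows.length === 0) return [];
  const [header, ...body] = rows;
  const keys = header.map((cell, i) =>
    cell === null || cell === "" ? `column_${i + 1}` : String(cell),
  );
  return body.map((row) => {
    const record: Record<string, Cell> = {};
    keys.forEach((key, i) => {
      record[key] = i < row.length ? row[i] : null;
    });
    return record;
  });
}

// Collects human-readable notes about merging and quality concerns.
function describeTable(table: TableItem): string[] {
  const notes: string[] = [];
  if (table.merged_into_page != null) {
    notes.push(`merged into table starting on page ${table.merged_into_page}`);
  }
  if (table.merged_from_pages && table.merged_from_pages.length > 0) {
    notes.push(`merged from pages ${table.merged_from_pages.join(", ")}`);
  }
  for (const concern of table.parse_concerns ?? []) {
    notes.push(`${concern.type}: ${concern.details}`);
  }
  return notes;
}
```

A table with a non-empty `parse_concerns` array is probably worth reviewing before its rows are trusted.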

TextItem { md, value, bbox, type }
md: string

Markdown representation preserving formatting

value: string

Text content

bbox?: Array<BBox { h, w, x, 5 more } > | null

List of bounding boxes

h: number

Height of the bounding box

w: number

Width of the bounding box

x: number

X coordinate of the bounding box

y: number

Y coordinate of the bounding box

confidence?: number | null

Confidence score

end_index?: number | null

End index in the text

label?: string | null

Label for the bounding box

start_index?: number | null

Start index in the text

type?: "text"

Text item type

ParsingCreateResponse { id, project_id, status, 5 more }

A parse job.

id: string

Unique parse job identifier

project_id: string

Project this job belongs to

status: "PENDING" | "RUNNING" | "COMPLETED" | 2 more

Current job status: PENDING, RUNNING, COMPLETED, FAILED, or CANCELLED

One of the following:
"PENDING"
"RUNNING"
"COMPLETED"
"FAILED"
"CANCELLED"
created_at?: string | null

Creation datetime

format: date-time
error_message?: string | null

Error details when status is FAILED

name?: string | null

Optional display name for this parse job

tier?: string | null

Parsing tier used for this job

updated_at?: string | null

Update datetime

format: date-time
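
A job starts as `PENDING` and ends in one of several statuses, so callers typically poll until it settles. The sketch below is one possible polling loop, not the SDK's own mechanism. `waitForJob`, `isTerminal`, and the injected `getJob` and `sleep` functions are hypothetical, and treating `PENDING` and `RUNNING` as the only non-terminal statuses is an assumption.

```typescript
type JobStatus = "PENDING" | "RUNNING" | "COMPLETED" | "FAILED" | "CANCELLED";

// Trimmed local mirror of the job fields used here.
interface ParseJob {
  id: string;
  project_id: string;
  status: JobStatus;
  error_message?: string | null;
}

// Assumption: every status other than PENDING and RUNNING is final.
function isTerminal(status: JobStatus): boolean {
  return status !== "PENDING" && status !== "RUNNING";
}

// Polls `getJob` until the job reaches a terminal status or the attempt
// budget runs out. `getJob` and `sleep` are injected so the loop stays
// independent of any particular client or timer implementation.
async function waitForJob(
  getJob: (jobId: string) => Promise<ParseJob>,
  sleep: (ms: number) => Promise<void>,
  jobId: string,
  options: { intervalMs?: number; maxAttempts?: number } = {},
): Promise<ParseJob> {
  const intervalMs = options.intervalMs ?? 1000;
  const maxAttempts = options.maxAttempts ?? 60;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const job = await getJob(jobId);
    if (isTerminal(job.status)) return job;
    await sleep(intervalMs);
  }
  throw new Error(`Job ${jobId} did not finish after ${maxAttempts} attempts`);
}
```

With the SDK, `getJob` could wrap `client.parsing.get` and return its `job` field. After the loop returns, check `status` and read `error_message` when it is `FAILED`.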
ParsingGetResponse { job, images_content_metadata, items, 8 more }

Parse result response with job status and optional content or metadata.

The job field is always included. Other fields are included based on expand parameters.

job: Job { id, project_id, status, 5 more }

Parse job status and metadata

id: string

Unique parse job identifier

project_id: string

Project this job belongs to

status: "PENDING" | "RUNNING" | "COMPLETED" | 2 more

Current job status: PENDING, RUNNING, COMPLETED, FAILED, or CANCELLED

One of the following:
"PENDING"
"RUNNING"
"COMPLETED"
"FAILED"
"CANCELLED"
created_at?: string | null

Creation datetime

format: date-time
error_message?: string | null

Error details when status is FAILED

name?: string | null

Optional display name for this parse job

tier?: string | null

Parsing tier used for this job

updated_at?: string | null

Update datetime

format: date-time
images_content_metadata?: ImagesContentMetadata | null

Metadata for all extracted images.

images: Array<Image>

List of image metadata with presigned URLs

filename: string

Image filename (e.g., ‘image_0.png’)

index: number

Index of the image in the extraction order

bbox?: Bbox | null

Bounding box for an image on its page.

h: number

Height of the bounding box

w: number

Width of the bounding box

x: number

X coordinate of the bounding box

y: number

Y coordinate of the bounding box

category?: "screenshot" | "embedded" | "layout" | null

Image category: ‘screenshot’ (full page), ‘embedded’ (images in document), or ‘layout’ (cropped from layout detection)

One of the following:
"screenshot"
"embedded"
"layout"
content_type?: string | null

MIME type of the image

presigned_url?: string | null

Presigned URL to download the image

size_bytes?: number | null (deprecated)

Deprecated: always returns None. Will be removed in a future release.

total_count: number

Total number of extracted images
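
The image metadata can be filtered by `category` to find downloadable files. The sketch below uses local interfaces that omit `bbox` and `size_bytes`. `downloadableImages` is a hypothetical helper, and skipping entries without a `presigned_url` is a design choice of this sketch, not documented behaviour.

```typescript
type ImageCategory = "screenshot" | "embedded" | "layout";

// Trimmed local mirrors of the documented image metadata.
interface ImageMeta {
  filename: string;
  index: number;
  category?: ImageCategory | null;
  content_type?: string | null;
  presigned_url?: string | null;
}

interface ImagesContentMetadata {
  images: ImageMeta[];
  total_count: number;
}

// Returns filename/URL pairs for images that have a presigned URL,
// optionally restricted to a single category.
function downloadableImages(
  meta: ImagesContentMetadata | null | undefined,
  category?: ImageCategory,
): Array<{ filename: string; url: string }> {
  const result: Array<{ filename: string; url: string }> = [];
  for (const image of meta?.images ?? []) {
    if (category !== undefined && image.category !== category) continue;
    if (!image.presigned_url) continue;
    result.push({ filename: image.filename, url: image.presigned_url });
  }
  return result;
}
```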

items?: Items | null

Structured JSON result (if requested)

pages: Array<StructuredResultPage { items, page_height, page_number, 2 more } | FailedStructuredPage { error, page_number, success } >

List of structured pages or failed page entries

One of the following:
StructuredResultPage { items, page_height, page_number, 2 more }
items: Array<TextItem { md, value, bbox, type } | HeadingItem { level, md, value, 2 more } | ListItem { items, md, ordered, 2 more } | 6 more>

List of structured items on the page

One of the following:
TextItem { md, value, bbox, type }
md: string

Markdown representation preserving formatting

value: string

Text content

bbox?: Array<BBox { h, w, x, 5 more } > | null

List of bounding boxes

h: number

Height of the bounding box

w: number

Width of the bounding box

x: number

X coordinate of the bounding box

y: number

Y coordinate of the bounding box

confidence?: number | null

Confidence score

end_index?: number | null

End index in the text

label?: string | null

Label for the bounding box

start_index?: number | null

Start index in the text

type?: "text"

Text item type

HeadingItem { level, md, value, 2 more }
level: number

Heading level (1-6)

md: string

Markdown representation preserving formatting

value: string

Heading text content

bbox?: Array<BBox { h, w, x, 5 more } > | null

List of bounding boxes

h: number

Height of the bounding box

w: number

Width of the bounding box

x: number

X coordinate of the bounding box

y: number

Y coordinate of the bounding box

confidence?: number | null

Confidence score

end_index?: number | null

End index in the text

label?: string | null

Label for the bounding box

start_index?: number | null

Start index in the text

type?: "heading"

Heading item type

ListItem { items, md, ordered, 2 more }
items: Array<TextItem { md, value, bbox, type } | ListItem { items, md, ordered, 2 more } >

List of nested text or list items

One of the following:
TextItem { md, value, bbox, type }
md: string

Markdown representation preserving formatting

value: string

Text content

bbox?: Array<BBox { h, w, x, 5 more } > | null

List of bounding boxes

h: number

Height of the bounding box

w: number

Width of the bounding box

x: number

X coordinate of the bounding box

y: number

Y coordinate of the bounding box

confidence?: number | null

Confidence score

end_index?: number | null

End index in the text

label?: string | null

Label for the bounding box

start_index?: number | null

Start index in the text

type?: "text"

Text item type

ListItem { items, md, ordered, 2 more }
md: string

Markdown representation preserving formatting

ordered: boolean

Whether the list is ordered or unordered

bbox?: Array<BBox { h, w, x, 5 more } > | null

List of bounding boxes

h: number

Height of the bounding box

w: number

Width of the bounding box

x: number

X coordinate of the bounding box

y: number

Y coordinate of the bounding box

confidence?: number | null

Confidence score

end_index?: number | null

End index in the text

label?: string | null

Label for the bounding box

start_index?: number | null

Start index in the text

type?: "list"

List item type

CodeItem { md, value, bbox, 2 more }
md: string

Markdown representation preserving formatting

value: string

Code content

bbox?: Array<BBox { h, w, x, 5 more } > | null

List of bounding boxes

h: number

Height of the bounding box

w: number

Width of the bounding box

x: number

X coordinate of the bounding box

y: number

Y coordinate of the bounding box

confidence?: number | null

Confidence score

end_index?: number | null

End index in the text

label?: string | null

Label for the bounding box

start_index?: number | null

Start index in the text

language?: string | null

Programming language identifier

type?: "code"

Code block item type

TableItem { csv, html, md, 6 more }
csv: string

CSV representation of the table

html: string

HTML representation of the table

md: string

Markdown representation preserving formatting

rows: Array<Array<string | number | null>>

Table data as array of arrays (string, number, or null)

One of the following:
string
number
bbox?: Array<BBox { h, w, x, 5 more } > | null

List of bounding boxes

h: number

Height of the bounding box

w: number

Width of the bounding box

x: number

X coordinate of the bounding box

y: number

Y coordinate of the bounding box

confidence?: number | null

Confidence score

end_index?: number | null

End index in the text

label?: string | null

Label for the bounding box

start_index?: number | null

Start index in the text

merged_from_pages?: Array<number> | null

List of page numbers with tables that were merged into this table (e.g., [1, 2, 3, 4])

merged_into_page?: number | null

Populated when merged into another table. Page number where the full merged table begins (used on empty tables).

parse_concerns?: Array<ParseConcern> | null

Quality concerns detected during table extraction, indicating the table may have issues

details: string

Human-readable details about the concern

type: string

Type of parse concern (e.g. header_value_type_mismatch, inconsistent_row_cell_count)

type?: "table"

Table item type

ImageItem { caption, md, url, 2 more }
caption: string

Image caption

md: string

Markdown representation preserving formatting

url: string

URL to the image

bbox?: Array<BBox { h, w, x, 5 more } > | null

List of bounding boxes

h: number

Height of the bounding box

w: number

Width of the bounding box

x: number

X coordinate of the bounding box

y: number

Y coordinate of the bounding box

confidence?: number | null

Confidence score

end_index?: number | null

End index in the text

label?: string | null

Label for the bounding box

start_index?: number | null

Start index in the text

type?: "image"

Image item type

Markdown representation preserving formatting

Display text of the link

URL of the link

List of bounding boxes

h: number

Height of the bounding box

w: number

Width of the bounding box

x: number

X coordinate of the bounding box

y: number

Y coordinate of the bounding box

confidence?: number | null

Confidence score

end_index?: number | null

End index in the text

label?: string | null

Label for the bounding box

start_index?: number | null

Start index in the text

Link item type

HeaderItem { items, md, bbox, type }
items: Array<TextItem { md, value, bbox, type } | HeadingItem { level, md, value, 2 more } | ListItem { items, md, ordered, 2 more } | 4 more>

List of items within the header

One of the following:
TextItem { md, value, bbox, type }
md: string

Markdown representation preserving formatting

value: string

Text content

bbox?: Array<BBox { h, w, x, 5 more } > | null

List of bounding boxes

h: number

Height of the bounding box

w: number

Width of the bounding box

x: number

X coordinate of the bounding box

y: number

Y coordinate of the bounding box

confidence?: number | null

Confidence score

end_index?: number | null

End index in the text

label?: string | null

Label for the bounding box

start_index?: number | null

Start index in the text

type?: "text"

Text item type

HeadingItem { level, md, value, 2 more }
level: number

Heading level (1-6)

md: string

Markdown representation preserving formatting

value: string

Heading text content

bbox?: Array<BBox { h, w, x, 5 more } > | null

List of bounding boxes

h: number

Height of the bounding box

w: number

Width of the bounding box

x: number

X coordinate of the bounding box

y: number

Y coordinate of the bounding box

confidence?: number | null

Confidence score

end_index?: number | null

End index in the text

label?: string | null

Label for the bounding box

start_index?: number | null

Start index in the text

type?: "heading"

Heading item type

ListItem { items, md, ordered, 2 more }
items: Array<TextItem { md, value, bbox, type } | ListItem { items, md, ordered, 2 more } >

List of nested text or list items

One of the following:
TextItem { md, value, bbox, type }
md: string

Markdown representation preserving formatting

value: string

Text content

bbox?: Array<BBox { h, w, x, 5 more } > | null

List of bounding boxes

h: number

Height of the bounding box

w: number

Width of the bounding box

x: number

X coordinate of the bounding box

y: number

Y coordinate of the bounding box

confidence?: number | null

Confidence score

end_index?: number | null

End index in the text

label?: string | null

Label for the bounding box

start_index?: number | null

Start index in the text

type?: "text"

Text item type

ListItem { items, md, ordered, 2 more }
md: string

Markdown representation preserving formatting

ordered: boolean

Whether the list is ordered or unordered

bbox?: Array<BBox { h, w, x, 5 more } > | null

List of bounding boxes

h: number

Height of the bounding box

w: number

Width of the bounding box

x: number

X coordinate of the bounding box

y: number

Y coordinate of the bounding box

confidence?: number | null

Confidence score

end_index?: number | null

End index in the text

label?: string | null

Label for the bounding box

start_index?: number | null

Start index in the text

type?: "list"

List item type
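Because a `ListItem` can contain further `ListItem` entries, consumers typically need a recursive walk. The following is a minimal sketch under that reading of the schema. `TextLike`, `ListLike`, `FlatEntry`, and `flattenList` are hypothetical names, and the interfaces mirror only the fields needed here.

```typescript
// Hypothetical mirrors of the documented TextItem and ListItem shapes.
interface TextLike {
  md: string;
  value: string;
  type?: "text";
}

interface ListLike {
  items: Array<TextLike | ListLike>;
  md: string;
  ordered: boolean;
  type?: "list";
}

interface FlatEntry {
  depth: number;
  value: string;
}

// Walks the nested list depth-first and records each text entry with its nesting depth.
function flattenList(list: ListLike, depth = 0): FlatEntry[] {
  const out: FlatEntry[] = [];
  for (const item of list.items) {
    // `type` is optional in the schema, so this sketch discriminates on the `items` key instead.
    if ("items" in item) {
      out.push(...flattenList(item, depth + 1));
    } else {
      out.push({ depth, value: item.value });
    }
  }
  return out;
}
```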

CodeItem { md, value, bbox, 2 more }
md: string

Markdown representation preserving formatting

value: string

Code content

bbox?: Array<BBox { h, w, x, 5 more } > | null

List of bounding boxes

h: number

Height of the bounding box

w: number

Width of the bounding box

x: number

X coordinate of the bounding box

y: number

Y coordinate of the bounding box

confidence?: number | null

Confidence score

end_index?: number | null

End index in the text

label?: string | null

Label for the bounding box

start_index?: number | null

Start index in the text

language?: string | null

Programming language identifier

type?: "code"

Code block item type

TableItem { csv, html, md, 6 more }
csv: string

CSV representation of the table

html: string

HTML representation of the table

md: string

Markdown representation preserving formatting

rows: Array<Array<string | number | null>>

Table data as array of arrays (string, number, or null)

One of the following:
string
number
bbox?: Array<BBox { h, w, x, 5 more } > | null

List of bounding boxes

h: number

Height of the bounding box

w: number

Width of the bounding box

x: number

X coordinate of the bounding box

y: number

Y coordinate of the bounding box

confidence?: number | null

Confidence score

end_index?: number | null

End index in the text

label?: string | null

Label for the bounding box

start_index?: number | null

Start index in the text

merged_from_pages?: Array<number> | null

List of page numbers with tables that were merged into this table (e.g., [1, 2, 3, 4])

merged_into_page?: number | null

Populated when merged into another table. Page number where the full merged table begins (used on empty tables).

parse_concerns?: Array<ParseConcern> | null

Quality concerns detected during table extraction, indicating the table may have issues

details: string

Human-readable details about the concern

type: string

Type of parse concern (e.g. header_value_type_mismatch, inconsistent_row_cell_count)

type?: "table"

Table item type
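The `rows` field is a plain array of arrays, so turning it into keyed records is left to the caller. The sketch below assumes the first row holds the column headers, which the schema does not guarantee. `Cell` and `rowsToRecords` are hypothetical names.

```typescript
// Cell type as documented for TableItem.rows.
type Cell = string | number | null;

// Converts rows into records keyed by the first row.
// ASSUMPTION: the first row is a header row; the schema itself does not state this.
function rowsToRecords(rows: Cell[][]): Array<Record<string, Cell>> {
  const header = rows[0];
  if (header === undefined) return [];
  // Empty or null header cells get a positional fallback name.
  const keys = header.map((cell, i) =>
    cell === null || cell === "" ? `column_${i}` : String(cell),
  );
  return rows.slice(1).map((row) => {
    const record: Record<string, Cell> = {};
    keys.forEach((key, i) => {
      // Short rows are padded with null rather than left undefined.
      record[key] = row[i] ?? null;
    });
    return record;
  });
}
```

Tables flagged with an `inconsistent_row_cell_count` concern are the case the null padding is meant to tolerate.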

ImageItem { caption, md, url, 2 more }
caption: string

Image caption

md: string

Markdown representation preserving formatting

url: string

URL to the image

bbox?: Array<BBox { h, w, x, 5 more } > | null

List of bounding boxes

h: number

Height of the bounding box

w: number

Width of the bounding box

x: number

X coordinate of the bounding box

y: number

Y coordinate of the bounding box

confidence?: number | null

Confidence score

end_index?: number | null

End index in the text

label?: string | null

Label for the bounding box

start_index?: number | null

Start index in the text

type?: "image"

Image item type

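LinkItem { md, text, url, 2 more }
md: string

Markdown representation preserving formatting

text: string

Display text of the link

url: string

URL of the link

bbox?: Array<BBox { h, w, x, 5 more } > | null

List of bounding boxes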

h: number

Height of the bounding box

w: number

Width of the bounding box

x: number

X coordinate of the bounding box

y: number

Y coordinate of the bounding box

confidence?: number | null

Confidence score

end_index?: number | null

End index in the text

label?: string | null

Label for the bounding box

start_index?: number | null

Start index in the text

type?: "link"

Link item type

md: string

Markdown representation preserving formatting

bbox?: Array<BBox { h, w, x, 5 more } > | null

List of bounding boxes

h: number

Height of the bounding box

w: number

Width of the bounding box

x: number

X coordinate of the bounding box

y: number

Y coordinate of the bounding box

confidence?: number | null

Confidence score

end_index?: number | null

End index in the text

label?: string | null

Label for the bounding box

start_index?: number | null

Start index in the text

type?: "header"

Page header container

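FooterItem { items, md, bbox, type }
items: Array<TextItem { md, value, bbox, type } | HeadingItem { level, md, value, 2 more } | ListItem { items, md, ordered, 2 more } | 4 more>

List of items within the footer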

One of the following:
TextItem { md, value, bbox, type }
md: string

Markdown representation preserving formatting

value: string

Text content

bbox?: Array<BBox { h, w, x, 5 more } > | null

List of bounding boxes

h: number

Height of the bounding box

w: number

Width of the bounding box

x: number

X coordinate of the bounding box

y: number

Y coordinate of the bounding box

confidence?: number | null

Confidence score

end_index?: number | null

End index in the text

label?: string | null

Label for the bounding box

start_index?: number | null

Start index in the text

type?: "text"

Text item type

HeadingItem { level, md, value, 2 more }
level: number

Heading level (1-6)

md: string

Markdown representation preserving formatting

value: string

Heading text content

bbox?: Array<BBox { h, w, x, 5 more } > | null

List of bounding boxes

h: number

Height of the bounding box

w: number

Width of the bounding box

x: number

X coordinate of the bounding box

y: number

Y coordinate of the bounding box

confidence?: number | null

Confidence score

end_index?: number | null

End index in the text

label?: string | null

Label for the bounding box

start_index?: number | null

Start index in the text

type?: "heading"

Heading item type

ListItem { items, md, ordered, 2 more }
items: Array<TextItem { md, value, bbox, type } | ListItem { items, md, ordered, 2 more } >

List of nested text or list items

One of the following:
TextItem { md, value, bbox, type }
md: string

Markdown representation preserving formatting

value: string

Text content

bbox?: Array<BBox { h, w, x, 5 more } > | null

List of bounding boxes

h: number

Height of the bounding box

w: number

Width of the bounding box

x: number

X coordinate of the bounding box

y: number

Y coordinate of the bounding box

confidence?: number | null

Confidence score

end_index?: number | null

End index in the text

label?: string | null

Label for the bounding box

start_index?: number | null

Start index in the text

type?: "text"

Text item type

ListItem { items, md, ordered, 2 more }
md: string

Markdown representation preserving formatting

ordered: boolean

Whether the list is ordered or unordered

bbox?: Array<BBox { h, w, x, 5 more } > | null

List of bounding boxes

h: number

Height of the bounding box

w: number

Width of the bounding box

x: number

X coordinate of the bounding box

y: number

Y coordinate of the bounding box

confidence?: number | null

Confidence score

end_index?: number | null

End index in the text

label?: string | null

Label for the bounding box

start_index?: number | null

Start index in the text

type?: "list"

List item type

CodeItem { md, value, bbox, 2 more }
md: string

Markdown representation preserving formatting

value: string

Code content

bbox?: Array<BBox { h, w, x, 5 more } > | null

List of bounding boxes

h: number

Height of the bounding box

w: number

Width of the bounding box

x: number

X coordinate of the bounding box

y: number

Y coordinate of the bounding box

confidence?: number | null

Confidence score

end_index?: number | null

End index in the text

label?: string | null

Label for the bounding box

start_index?: number | null

Start index in the text

language?: string | null

Programming language identifier

type?: "code"

Code block item type

TableItem { csv, html, md, 6 more }
csv: string

CSV representation of the table

html: string

HTML representation of the table

md: string

Markdown representation preserving formatting

rows: Array<Array<string | number | null>>

Table data as array of arrays (string, number, or null)

One of the following:
string
number
bbox?: Array<BBox { h, w, x, 5 more } > | null

List of bounding boxes

h: number

Height of the bounding box

w: number

Width of the bounding box

x: number

X coordinate of the bounding box

y: number

Y coordinate of the bounding box

confidence?: number | null

Confidence score

end_index?: number | null

End index in the text

label?: string | null

Label for the bounding box

start_index?: number | null

Start index in the text

merged_from_pages?: Array<number> | null

List of page numbers with tables that were merged into this table (e.g., [1, 2, 3, 4])

merged_into_page?: number | null

Populated when merged into another table. Page number where the full merged table begins (used on empty tables).

parse_concerns?: Array<ParseConcern> | null

Quality concerns detected during table extraction, indicating the table may have issues

details: string

Human-readable details about the concern

type: string

Type of parse concern (e.g. header_value_type_mismatch, inconsistent_row_cell_count)

type?: "table"

Table item type

ImageItem { caption, md, url, 2 more }
caption: string

Image caption

md: string

Markdown representation preserving formatting

url: string

URL to the image

bbox?: Array<BBox { h, w, x, 5 more } > | null

List of bounding boxes

h: number

Height of the bounding box

w: number

Width of the bounding box

x: number

X coordinate of the bounding box

y: number

Y coordinate of the bounding box

confidence?: number | null

Confidence score

end_index?: number | null

End index in the text

label?: string | null

Label for the bounding box

start_index?: number | null

Start index in the text

type?: "image"

Image item type

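LinkItem { md, text, url, 2 more }
md: string

Markdown representation preserving formatting

text: string

Display text of the link

url: string

URL of the link

bbox?: Array<BBox { h, w, x, 5 more } > | null

List of bounding boxes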

h: number

Height of the bounding box

w: number

Width of the bounding box

x: number

X coordinate of the bounding box

y: number

Y coordinate of the bounding box

confidence?: number | null

Confidence score

end_index?: number | null

End index in the text

label?: string | null

Label for the bounding box

start_index?: number | null

Start index in the text

type?: "link"

Link item type

md: string

Markdown representation preserving formatting

bbox?: Array<BBox { h, w, x, 5 more } > | null

List of bounding boxes

h: number

Height of the bounding box

w: number

Width of the bounding box

x: number

X coordinate of the bounding box

y: number

Y coordinate of the bounding box

confidence?: number | null

Confidence score

end_index?: number | null

End index in the text

label?: string | null

Label for the bounding box

start_index?: number | null

Start index in the text

type?: "footer"

Page footer container

page_height: number

Height of the page in points

page_number: number

Page number of the document

page_width: number

Width of the page in points

success: true

Success indicator

FailedStructuredPage { error, page_number, success }
error: string

Error message describing the failure

page_number: number

Page number of the document

success: false

Failure indicator
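Successful and failed structured pages share the `success` field, which can serve as a discriminant. The sketch below shows one way to collect failures by page number. The interfaces are hypothetical mirrors that keep only the fields used here, and `collectFailures` is an assumed helper name.

```typescript
// Hypothetical mirrors of the two structured page variants.
interface StructuredPageLike {
  success: true;
  page_number: number;
  items: unknown[];
}

interface FailedStructuredPageLike {
  success: false;
  page_number: number;
  error: string;
}

type AnyStructuredPage = StructuredPageLike | FailedStructuredPageLike;

// Maps each failed page number to its error message.
function collectFailures(pages: AnyStructuredPage[]): Map<number, string> {
  const failures = new Map<number, string>();
  for (const page of pages) {
    // Checking `success` narrows the union, so `error` is only read on failed pages.
    if (!page.success) {
      failures.set(page.page_number, page.error);
    }
  }
  return failures;
}
```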

job_metadata?: Record<string, unknown> | null

Job execution metadata (if requested)

markdown?: Markdown | null

Markdown result (if requested)

pages: Array<MarkdownResultPage { markdown, page_number, success, 2 more } | FailedMarkdownPage { error, page_number, success } >

List of markdown pages or failed page entries

One of the following:
MarkdownResultPage { markdown, page_number, success, 2 more }
markdown: string

Markdown content of the page

page_number: number

Page number of the document

success: true

Success indicator

footer?: string | null

Footer of the page in markdown

header?: string | null

Header of the page in markdown

FailedMarkdownPage { error, page_number, success }
error: string

Error message describing the failure

page_number: number

Page number of the document

success: false

Failure indicator
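When `markdown_full` was not requested, the per-page entries can be stitched together by hand. The following sketch orders pages by `page_number` and leaves an HTML comment where a page failed. The placeholder format and the `assembleMarkdown` name are assumptions for illustration, not behavior of the API.

```typescript
// Hypothetical mirrors of the two markdown page variants.
interface MarkdownPageLike {
  success: true;
  page_number: number;
  markdown: string;
}

interface FailedMarkdownPageLike {
  success: false;
  page_number: number;
  error: string;
}

// Joins page markdown in page order, marking failed pages with an HTML comment.
function assembleMarkdown(
  pages: Array<MarkdownPageLike | FailedMarkdownPageLike>,
): string {
  return [...pages] // copy so the caller's array is not reordered
    .sort((a, b) => a.page_number - b.page_number)
    .map((page) =>
      page.success
        ? page.markdown
        : `<!-- page ${page.page_number} failed: ${page.error} -->`,
    )
    .join("\n\n");
}
```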

markdown_full?: string | null

Full raw markdown content (if requested)

metadata?: Metadata | null

Result containing metadata (page level and general) for the parsed document.

pages: Array<Page>

List of page metadata entries

page_number: number

Page number of the document

confidence?: number | null

Confidence score for the page parsing (0-1)

cost_optimized?: boolean | null

Whether cost-optimized parsing was used for the page

original_orientation_angle?: number | null

Original orientation angle of the page in degrees

printed_page_number?: string | null

Printed page number as it appears in the document

slide_section_name?: string | null

Section name from presentation slides

speaker_notes?: string | null

Speaker notes from presentation slides

triggered_auto_mode?: boolean | null

Whether auto mode was triggered for the page
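Page-level `confidence` is nullable, so a review filter has to decide what to do with pages that report no score. The sketch below keeps those pages in a separate bucket. The 0.8 default threshold is an arbitrary illustrative value, and `PageMetadataLike` and `pagesToReview` are hypothetical names.

```typescript
// Hypothetical mirror of the page metadata fields used in this sketch.
interface PageMetadataLike {
  page_number: number;
  confidence?: number | null;
}

interface ReviewBuckets {
  lowConfidence: number[];
  unknownConfidence: number[];
}

// Splits pages into those below the threshold and those with no confidence score at all.
// ASSUMPTION: 0.8 is an illustrative cutoff, not a value recommended by the API.
function pagesToReview(pages: PageMetadataLike[], minConfidence = 0.8): ReviewBuckets {
  const buckets: ReviewBuckets = { lowConfidence: [], unknownConfidence: [] };
  for (const page of pages) {
    if (page.confidence == null) {
      buckets.unknownConfidence.push(page.page_number);
    } else if (page.confidence < minConfidence) {
      buckets.lowConfidence.push(page.page_number);
    }
  }
  return buckets;
}
```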

raw_parameters?: Record<string, unknown> | null
result_content_metadata?: Record<string, ResultContentMetadata> | null

Metadata including size, existence, and presigned URLs for result files

size_bytes: number

Size of the result file in bytes

exists?: boolean

Whether the result file exists in S3

presigned_url?: string | null

Presigned URL to download the result file
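Since `result_content_metadata` is a record keyed by result name, a caller can filter it down to the entries that are actually downloadable. This is a sketch only: `downloadableResults` is a hypothetical helper, and treating a missing `exists` flag as "may exist" is an assumption of this example.

```typescript
// Hypothetical mirror of the documented ResultContentMetadata fields.
interface ResultContentMetadataLike {
  size_bytes: number;
  exists?: boolean;
  presigned_url?: string | null;
}

interface Downloadable {
  name: string;
  url: string;
  sizeBytes: number;
}

// Returns the entries that have a presigned URL and are not explicitly marked as missing.
function downloadableResults(
  meta: Record<string, ResultContentMetadataLike> | null | undefined,
): Downloadable[] {
  const out: Downloadable[] = [];
  for (const [name, entry] of Object.entries(meta ?? {})) {
    // ASSUMPTION: an absent `exists` flag does not rule the file out.
    if (entry.exists !== false && entry.presigned_url) {
      out.push({ name, url: entry.presigned_url, sizeBytes: entry.size_bytes });
    }
  }
  return out;
}
```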

text?: Text | null

Plain text result (if requested)

pages: Array<Page>

List of text pages

page_number: number

Page number of the document

text: string

Plain text content of the page

text_full?: string | null

Full raw text content (if requested)

ParsingListResponse { id, project_id, status, 5 more }

A parse job.

id: string

Unique parse job identifier

project_id: string

Project this job belongs to

status: "PENDING" | "RUNNING" | "COMPLETED" | 2 more

Current job status: PENDING, RUNNING, COMPLETED, FAILED, or CANCELLED

One of the following:
"PENDING"
"RUNNING"
"COMPLETED"
"FAILED"
"CANCELLED"
created_at?: string | null

Creation datetime

format: date-time
error_message?: string | null

Error details when status is FAILED

name?: string | null

Optional display name for this parse job

tier?: string | null

Parsing tier used for this job

updated_at?: string | null

Update datetime

format: date-time
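The five `status` values lend themselves to an exhaustive switch. The sketch below assumes that COMPLETED, FAILED, and CANCELLED are terminal states, which the listing implies but does not state outright. `JobLike`, `isTerminal`, and `describeJob` are hypothetical names.

```typescript
// Status values as documented for ParsingListResponse.
type JobStatus = "PENDING" | "RUNNING" | "COMPLETED" | "FAILED" | "CANCELLED";

// Hypothetical mirror keeping only the fields used in this sketch.
interface JobLike {
  id: string;
  status: JobStatus;
  error_message?: string | null;
}

// ASSUMPTION: these three states never transition further.
function isTerminal(status: JobStatus): boolean {
  return status === "COMPLETED" || status === "FAILED" || status === "CANCELLED";
}

// Produces a one-line summary; the switch covers every documented status.
function describeJob(job: JobLike): string {
  switch (job.status) {
    case "PENDING":
    case "RUNNING":
      return `${job.id}: still in progress (${job.status})`;
    case "COMPLETED":
      return `${job.id}: completed`;
    case "FAILED":
      // error_message is documented as carrying details when status is FAILED.
      return `${job.id}: failed (${job.error_message ?? "no error message"})`;
    case "CANCELLED":
      return `${job.id}: cancelled`;
  }
}
```

A polling loop could call `isTerminal` on each fetched job and stop once it returns true.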