Parsing

Parse File
POST /api/v2/parse
Get Parse Job
GET /api/v2/parse/{job_id}
List Parse Jobs
GET /api/v2/parse
Models
BBox = object { h, w, x, 5 more }

Bounding box with coordinates and optional metadata.

h: number

Height of the bounding box

w: number

Width of the bounding box

x: number

X coordinate of the bounding box

y: number

Y coordinate of the bounding box

confidence: optional number

Confidence score

end_index: optional number

End index in the text

label: optional string

Label for the bounding box

start_index: optional number

Start index in the text
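
As a rough illustration of how the BBox fields fit together, the sketch below loads a BBox from a decoded JSON object and merges several boxes into one enclosing box. The class and the `union` helper are hypothetical and not part of the API. The sketch assumes that `x` and `y` mark the corner with the smallest coordinates and that `w` and `h` extend from it; the reference above does not state the origin or the units.

```python
from dataclasses import dataclass
from typing import Any, Iterable, Mapping, Optional


@dataclass(frozen=True)
class BBox:
    # Required fields of the BBox model.
    x: float
    y: float
    w: float
    h: float
    # Optional metadata fields of the BBox model.
    confidence: Optional[float] = None
    label: Optional[str] = None
    start_index: Optional[float] = None
    end_index: Optional[float] = None

    @classmethod
    def from_dict(cls, data: Mapping[str, Any]) -> "BBox":
        """Build a BBox from a decoded JSON object; optional keys may be absent."""
        return cls(
            x=data["x"],
            y=data["y"],
            w=data["w"],
            h=data["h"],
            confidence=data.get("confidence"),
            label=data.get("label"),
            start_index=data.get("start_index"),
            end_index=data.get("end_index"),
        )

    def area(self) -> float:
        return self.w * self.h


def union(boxes: Iterable[BBox]) -> BBox:
    """Smallest box covering all inputs (assumes x, y is the minimum corner)."""
    boxes = list(boxes)
    left = min(b.x for b in boxes)
    top = min(b.y for b in boxes)
    right = max(b.x + b.w for b in boxes)
    bottom = max(b.y + b.h for b in boxes)
    return BBox(x=left, y=top, w=right - left, h=bottom - top)
```

Because items carry `bbox` as an array, a helper like `union` may be handy when a single region per item is wanted.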

CodeItem = object { md, value, bbox, 2 more }
md: string

Markdown representation preserving formatting

value: string

Code content

bbox: optional array of BBox { h, w, x, 5 more }

List of bounding boxes

h: number

Height of the bounding box

w: number

Width of the bounding box

x: number

X coordinate of the bounding box

y: number

Y coordinate of the bounding box

confidence: optional number

Confidence score

end_index: optional number

End index in the text

label: optional string

Label for the bounding box

start_index: optional number

Start index in the text

language: optional string

Programming language identifier

type: optional "code"

Code block item type

FailPageMode = "raw_text" or "blank_page" or "error_message"

Enum for representing the different available page error handling modes.

One of the following:
"raw_text"
"blank_page"
"error_message"

List of items within the footer

One of the following:
TextItem = object { md, value, bbox, type }
md: string

Markdown representation preserving formatting

value: string

Text content

bbox: optional array of BBox { h, w, x, 5 more }

List of bounding boxes

h: number

Height of the bounding box

w: number

Width of the bounding box

x: number

X coordinate of the bounding box

y: number

Y coordinate of the bounding box

confidence: optional number

Confidence score

end_index: optional number

End index in the text

label: optional string

Label for the bounding box

start_index: optional number

Start index in the text

type: optional "text"

Text item type

HeadingItem = object { level, md, value, 2 more }
level: number

Heading level (1-6)

md: string

Markdown representation preserving formatting

value: string

Heading text content

bbox: optional array of BBox { h, w, x, 5 more }

List of bounding boxes

h: number

Height of the bounding box

w: number

Width of the bounding box

x: number

X coordinate of the bounding box

y: number

Y coordinate of the bounding box

confidence: optional number

Confidence score

end_index: optional number

End index in the text

label: optional string

Label for the bounding box

start_index: optional number

Start index in the text

type: optional "heading"

Heading item type

ListItem = object { items, md, ordered, 2 more }
items: array of TextItem { md, value, bbox, type } or ListItem { items, md, ordered, 2 more }

List of nested text or list items

One of the following:
TextItem = object { md, value, bbox, type }
md: string

Markdown representation preserving formatting

value: string

Text content

bbox: optional array of BBox { h, w, x, 5 more }

List of bounding boxes

h: number

Height of the bounding box

w: number

Width of the bounding box

x: number

X coordinate of the bounding box

y: number

Y coordinate of the bounding box

confidence: optional number

Confidence score

end_index: optional number

End index in the text

label: optional string

Label for the bounding box

start_index: optional number

Start index in the text

type: optional "text"

Text item type

ListItem { items, md, ordered, 2 more }
md: string

Markdown representation preserving formatting

ordered: boolean

Whether the list is ordered or unordered

bbox: optional array of BBox { h, w, x, 5 more }

List of bounding boxes

h: number

Height of the bounding box

w: number

Width of the bounding box

x: number

X coordinate of the bounding box

y: number

Y coordinate of the bounding box

confidence: optional number

Confidence score

end_index: optional number

End index in the text

label: optional string

Label for the bounding box

start_index: optional number

Start index in the text

type: optional "list"

List item type

CodeItem = object { md, value, bbox, 2 more }
md: string

Markdown representation preserving formatting

value: string

Code content

bbox: optional array of BBox { h, w, x, 5 more }

List of bounding boxes

h: number

Height of the bounding box

w: number

Width of the bounding box

x: number

X coordinate of the bounding box

y: number

Y coordinate of the bounding box

confidence: optional number

Confidence score

end_index: optional number

End index in the text

label: optional string

Label for the bounding box

start_index: optional number

Start index in the text

language: optional string

Programming language identifier

type: optional "code"

Code block item type

TableItem = object { csv, html, md, 6 more }
csv: string

CSV representation of the table

html: string

HTML representation of the table

md: string

Markdown representation preserving formatting

rows: array of array of string or number

Table data as array of arrays (string, number, or null)

One of the following:
string
number
bbox: optional array of BBox { h, w, x, 5 more }

List of bounding boxes

h: number

Height of the bounding box

w: number

Width of the bounding box

x: number

X coordinate of the bounding box

y: number

Y coordinate of the bounding box

confidence: optional number

Confidence score

end_index: optional number

End index in the text

label: optional string

Label for the bounding box

start_index: optional number

Start index in the text

merged_from_pages: optional array of number

List of page numbers with tables that were merged into this table (e.g., [1, 2, 3, 4])

merged_into_page: optional number

Populated when merged into another table. Page number where the full merged table begins (used on empty tables).

parse_concerns: optional array of object { details, type }

Quality concerns detected during table extraction, indicating the table may have issues

details: string

Human-readable details about the concern

type: string

Type of parse concern (e.g. header_value_type_mismatch, inconsistent_row_cell_count)

type: optional "table"

Table item type

ImageItem = object { caption, md, url, 2 more }
caption: string

Image caption

md: string

Markdown representation preserving formatting

url: string

URL to the image

bbox: optional array of BBox { h, w, x, 5 more }

List of bounding boxes

h: number

Height of the bounding box

w: number

Width of the bounding box

x: number

X coordinate of the bounding box

y: number

Y coordinate of the bounding box

confidence: optional number

Confidence score

end_index: optional number

End index in the text

label: optional string

Label for the bounding box

start_index: optional number

Start index in the text

type: optional "image"

Image item type

Markdown representation preserving formatting

Display text of the link

URL of the link

List of bounding boxes

h: number

Height of the bounding box

w: number

Width of the bounding box

x: number

X coordinate of the bounding box

y: number

Y coordinate of the bounding box

confidence: optional number

Confidence score

end_index: optional number

End index in the text

label: optional string

Label for the bounding box

start_index: optional number

Start index in the text

Link item type

Markdown representation preserving formatting

List of bounding boxes

h: number

Height of the bounding box

w: number

Width of the bounding box

x: number

X coordinate of the bounding box

y: number

Y coordinate of the bounding box

confidence: optional number

Confidence score

end_index: optional number

End index in the text

label: optional string

Label for the bounding box

start_index: optional number

Start index in the text

Page footer container

HeaderItem = object { items, md, bbox, type }
items: array of TextItem { md, value, bbox, type } or HeadingItem { level, md, value, 2 more } or ListItem { items, md, ordered, 2 more } or 4 more

List of items within the header

One of the following:
TextItem = object { md, value, bbox, type }
md: string

Markdown representation preserving formatting

value: string

Text content

bbox: optional array of BBox { h, w, x, 5 more }

List of bounding boxes

h: number

Height of the bounding box

w: number

Width of the bounding box

x: number

X coordinate of the bounding box

y: number

Y coordinate of the bounding box

confidence: optional number

Confidence score

end_index: optional number

End index in the text

label: optional string

Label for the bounding box

start_index: optional number

Start index in the text

type: optional "text"

Text item type

HeadingItem = object { level, md, value, 2 more }
level: number

Heading level (1-6)

md: string

Markdown representation preserving formatting

value: string

Heading text content

bbox: optional array of BBox { h, w, x, 5 more }

List of bounding boxes

h: number

Height of the bounding box

w: number

Width of the bounding box

x: number

X coordinate of the bounding box

y: number

Y coordinate of the bounding box

confidence: optional number

Confidence score

end_index: optional number

End index in the text

label: optional string

Label for the bounding box

start_index: optional number

Start index in the text

type: optional "heading"

Heading item type

ListItem = object { items, md, ordered, 2 more }
items: array of TextItem { md, value, bbox, type } or ListItem { items, md, ordered, 2 more }

List of nested text or list items

One of the following:
TextItem = object { md, value, bbox, type }
md: string

Markdown representation preserving formatting

value: string

Text content

bbox: optional array of BBox { h, w, x, 5 more }

List of bounding boxes

h: number

Height of the bounding box

w: number

Width of the bounding box

x: number

X coordinate of the bounding box

y: number

Y coordinate of the bounding box

confidence: optional number

Confidence score

end_index: optional number

End index in the text

label: optional string

Label for the bounding box

start_index: optional number

Start index in the text

type: optional "text"

Text item type

ListItem { items, md, ordered, 2 more }
md: string

Markdown representation preserving formatting

ordered: boolean

Whether the list is ordered or unordered

bbox: optional array of BBox { h, w, x, 5 more }

List of bounding boxes

h: number

Height of the bounding box

w: number

Width of the bounding box

x: number

X coordinate of the bounding box

y: number

Y coordinate of the bounding box

confidence: optional number

Confidence score

end_index: optional number

End index in the text

label: optional string

Label for the bounding box

start_index: optional number

Start index in the text

type: optional "list"

List item type

CodeItem = object { md, value, bbox, 2 more }
md: string

Markdown representation preserving formatting

value: string

Code content

bbox: optional array of BBox { h, w, x, 5 more }

List of bounding boxes

h: number

Height of the bounding box

w: number

Width of the bounding box

x: number

X coordinate of the bounding box

y: number

Y coordinate of the bounding box

confidence: optional number

Confidence score

end_index: optional number

End index in the text

label: optional string

Label for the bounding box

start_index: optional number

Start index in the text

language: optional string

Programming language identifier

type: optional "code"

Code block item type

TableItem = object { csv, html, md, 6 more }
csv: string

CSV representation of the table

html: string

HTML representation of the table

md: string

Markdown representation preserving formatting

rows: array of array of string or number

Table data as array of arrays (string, number, or null)

One of the following:
string
number
bbox: optional array of BBox { h, w, x, 5 more }

List of bounding boxes

h: number

Height of the bounding box

w: number

Width of the bounding box

x: number

X coordinate of the bounding box

y: number

Y coordinate of the bounding box

confidence: optional number

Confidence score

end_index: optional number

End index in the text

label: optional string

Label for the bounding box

start_index: optional number

Start index in the text

merged_from_pages: optional array of number

List of page numbers with tables that were merged into this table (e.g., [1, 2, 3, 4])

merged_into_page: optional number

Populated when merged into another table. Page number where the full merged table begins (used on empty tables).

parse_concerns: optional array of object { details, type }

Quality concerns detected during table extraction, indicating the table may have issues

details: string

Human-readable details about the concern

type: string

Type of parse concern (e.g. header_value_type_mismatch, inconsistent_row_cell_count)

type: optional "table"

Table item type

ImageItem = object { caption, md, url, 2 more }
caption: string

Image caption

md: string

Markdown representation preserving formatting

url: string

URL to the image

bbox: optional array of BBox { h, w, x, 5 more }

List of bounding boxes

h: number

Height of the bounding box

w: number

Width of the bounding box

x: number

X coordinate of the bounding box

y: number

Y coordinate of the bounding box

confidence: optional number

Confidence score

end_index: optional number

End index in the text

label: optional string

Label for the bounding box

start_index: optional number

Start index in the text

type: optional "image"

Image item type

Markdown representation preserving formatting

Display text of the link

URL of the link

List of bounding boxes

h: number

Height of the bounding box

w: number

Width of the bounding box

x: number

X coordinate of the bounding box

y: number

Y coordinate of the bounding box

confidence: optional number

Confidence score

end_index: optional number

End index in the text

label: optional string

Label for the bounding box

start_index: optional number

Start index in the text

Link item type

md: string

Markdown representation preserving formatting

bbox: optional array of BBox { h, w, x, 5 more }

List of bounding boxes

h: number

Height of the bounding box

w: number

Width of the bounding box

x: number

X coordinate of the bounding box

y: number

Y coordinate of the bounding box

confidence: optional number

Confidence score

end_index: optional number

End index in the text

label: optional string

Label for the bounding box

start_index: optional number

Start index in the text

type: optional "header"

Page header container

HeadingItem = object { level, md, value, 2 more }
level: number

Heading level (1-6)

md: string

Markdown representation preserving formatting

value: string

Heading text content

bbox: optional array of BBox { h, w, x, 5 more }

List of bounding boxes

h: number

Height of the bounding box

w: number

Width of the bounding box

x: number

X coordinate of the bounding box

y: number

Y coordinate of the bounding box

confidence: optional number

Confidence score

end_index: optional number

End index in the text

label: optional string

Label for the bounding box

start_index: optional number

Start index in the text

type: optional "heading"

Heading item type
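
The `level` and `value` fields are enough to rebuild a simple document outline. The following is a minimal sketch, assuming the items arrive as decoded JSON dictionaries and that the optional `type` field is populated; the function name `outline` is hypothetical.

```python
from typing import Any, Iterable, List, Mapping


def outline(items: Iterable[Mapping[str, Any]]) -> List[str]:
    """Return one indented line per heading, two spaces per level below 1."""
    lines = []
    for item in items:
        # Assumption: "type" is present even though the model marks it optional.
        if item.get("type") != "heading":
            continue
        level = int(item["level"])  # documented range is 1-6
        lines.append("  " * (level - 1) + item["value"])
    return lines
```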

ImageItem = object { caption, md, url, 2 more }
caption: string

Image caption

md: string

Markdown representation preserving formatting

url: string

URL to the image

bbox: optional array of BBox { h, w, x, 5 more }

List of bounding boxes

h: number

Height of the bounding box

w: number

Width of the bounding box

x: number

X coordinate of the bounding box

y: number

Y coordinate of the bounding box

confidence: optional number

Confidence score

end_index: optional number

End index in the text

label: optional string

Label for the bounding box

start_index: optional number

Start index in the text

type: optional "image"

Image item type

Markdown representation preserving formatting

Display text of the link

URL of the link

List of bounding boxes

h: number

Height of the bounding box

w: number

Width of the bounding box

x: number

X coordinate of the bounding box

y: number

Y coordinate of the bounding box

confidence: optional number

Confidence score

end_index: optional number

End index in the text

label: optional string

Label for the bounding box

start_index: optional number

Start index in the text

Link item type

ListItem = object { items, md, ordered, 2 more }
items: array of TextItem { md, value, bbox, type } or ListItem { items, md, ordered, 2 more }

List of nested text or list items

One of the following:
TextItem = object { md, value, bbox, type }
md: string

Markdown representation preserving formatting

value: string

Text content

bbox: optional array of BBox { h, w, x, 5 more }

List of bounding boxes

h: number

Height of the bounding box

w: number

Width of the bounding box

x: number

X coordinate of the bounding box

y: number

Y coordinate of the bounding box

confidence: optional number

Confidence score

end_index: optional number

End index in the text

label: optional string

Label for the bounding box

start_index: optional number

Start index in the text

type: optional "text"

Text item type

ListItem = object { items, md, ordered, 2 more }
md: string

Markdown representation preserving formatting

ordered: boolean

Whether the list is ordered or unordered

bbox: optional array of BBox { h, w, x, 5 more }

List of bounding boxes

h: number

Height of the bounding box

w: number

Width of the bounding box

x: number

X coordinate of the bounding box

y: number

Y coordinate of the bounding box

confidence: optional number

Confidence score

end_index: optional number

End index in the text

label: optional string

Label for the bounding box

start_index: optional number

Start index in the text

type: optional "list"

List item type
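
Since `items` may contain either TextItem or further ListItem objects, consumers usually need a recursive walk. The sketch below flattens a ListItem into (depth, text) pairs. It assumes decoded JSON dictionaries and distinguishes nested lists by the presence of an `items` key, because `type` is optional; the helper name is hypothetical.

```python
from typing import Any, Iterator, Mapping, Tuple


def flatten_list(item: Mapping[str, Any], depth: int = 0) -> Iterator[Tuple[int, str]]:
    """Yield (depth, text) for every TextItem nested inside a ListItem."""
    for child in item.get("items", []):
        if "items" in child:
            # Nested ListItem: recurse one level deeper.
            yield from flatten_list(child, depth + 1)
        else:
            # TextItem: emit its plain text content.
            yield depth, child["value"]
```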

LlamaParseSupportedFileExtensions = ".pdf" or ".abw" or ".awt" or 141 more

Enum for supported file extensions.

One of the following:
".pdf"
".abw"
".awt"
".cgm"
".cwk"
".doc"
".docm"
".docx"
".dot"
".dotm"
".dotx"
".fodg"
".fodp"
".fopd"
".fodt"
".fb2"
".hwp"
".lwp"
".mcw"
".mw"
".mwd"
".odf"
".odt"
".otg"
".ott"
".pages"
".pbd"
".psw"
".rtf"
".sda"
".sdd"
".sdp"
".sdw"
".sgl"
".std"
".stw"
".sxd"
".sxg"
".sxm"
".sxw"
".uof"
".uop"
".uot"
".vor"
".wpd"
".wps"
".wpt"
".wri"
".wn"
".xml"
".zabw"
".key"
".odp"
".odg"
".otp"
".pot"
".potm"
".potx"
".ppt"
".pptm"
".pptx"
".sti"
".sxi"
".vsd"
".vsdm"
".vsdx"
".vdx"
".bmp"
".gif"
".jpg"
".jpeg"
".png"
".svg"
".tif"
".tiff"
".webp"
".htm"
".html"
".xhtm"
".csv"
".dbf"
".dif"
".et"
".eth"
".fods"
".numbers"
".ods"
".ots"
".prn"
".qpw"
".slk"
".stc"
".sxc"
".sylk"
".tsv"
".uos1"
".uos2"
".uos"
".wb1"
".wb2"
".wb3"
".wk1"
".wk2"
".wk3"
".wk4"
".wks"
".wq1"
".wq2"
".xlr"
".xls"
".xlsb"
".xlsm"
".xlsx"
".xlw"
".azw"
".azw3"
".azw4"
".cb7"
".cbc"
".cbr"
".cbz"
".chm"
".djvu"
".epub"
".fbz"
".htmlz"
".lit"
".lrf"
".md"
".mobi"
".pdb"
".pml"
".prc"
".rb"
".snb"
".tcr"
".txtz"
".m4a"
".mp3"
".mp4"
".mpeg"
".mpga"
".wav"
".webm"
ParsingJob = object { id, status, error_code, error_message }

A parse job (v1).

id: string

Unique parse job identifier

status: StatusEnum

Current job status

One of the following:
"PENDING"
"SUCCESS"
"ERROR"
"PARTIAL_SUCCESS"
"CANCELLED"
error_code: optional string

Machine-readable error code when failed

error_message: optional string

Human-readable error details when failed
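
A small sketch of how a client might turn a ParsingJob into a log line. It assumes that `error_code` and `error_message` accompany the `ERROR` status and that every status other than `PENDING` is final; the reference does not spell out either point, and `describe_job` is a hypothetical helper.

```python
from typing import Any, Mapping


def describe_job(job: Mapping[str, Any]) -> str:
    """Summarise a ParsingJob dictionary as a single line of text."""
    status = job["status"]
    if status == "PENDING":
        return f"{job['id']}: still pending"
    if status == "ERROR":
        # Both error fields are optional, so fall back to placeholders.
        code = job.get("error_code") or "unknown"
        message = job.get("error_message") or "no details"
        return f"{job['id']}: failed [{code}] {message}"
    # SUCCESS, PARTIAL_SUCCESS, CANCELLED (assumed final).
    return f"{job['id']}: finished with status {status}"
```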

ParsingLanguages = "af" or "az" or "bs" or 83 more

Enum for representing the languages supported by the parser.

One of the following:
"af"
"az"
"bs"
"cs"
"cy"
"da"
"de"
"en"
"es"
"et"
"fr"
"ga"
"hr"
"hu"
"id"
"is"
"it"
"ku"
"la"
"lt"
"lv"
"mi"
"ms"
"mt"
"nl"
"no"
"oc"
"pi"
"pl"
"pt"
"ro"
"rs_latin"
"sk"
"sl"
"sq"
"sv"
"sw"
"tl"
"tr"
"uz"
"vi"
"ar"
"fa"
"ug"
"ur"
"bn"
"as"
"mni"
"ru"
"rs_cyrillic"
"be"
"bg"
"uk"
"mn"
"abq"
"ady"
"kbd"
"ava"
"dar"
"inh"
"che"
"lbe"
"lez"
"tab"
"tjk"
"hi"
"mr"
"ne"
"bh"
"mai"
"ang"
"bho"
"mah"
"sck"
"new"
"gom"
"sa"
"bgc"
"th"
"ch_sim"
"ch_tra"
"ja"
"ko"
"ta"
"te"
"kn"
ParsingMode = "parse_page_without_llm" or "parse_page_with_llm" or "parse_page_with_lvm" or 5 more

Enum for representing the mode of parsing to be used.

One of the following:
"parse_page_without_llm"
"parse_page_with_llm"
"parse_page_with_lvm"
"parse_page_with_agent"
"parse_page_with_layout_agent"
"parse_document_with_llm"
"parse_document_with_lvm"
"parse_document_with_agent"
StatusEnum = "PENDING" or "SUCCESS" or "ERROR" or 2 more

Enum for representing the status of a job

One of the following:
"PENDING"
"SUCCESS"
"ERROR"
"PARTIAL_SUCCESS"
"CANCELLED"
TableItem = object { csv, html, md, 6 more }
csv: string

CSV representation of the table

html: string

HTML representation of the table

md: string

Markdown representation preserving formatting

rows: array of array of string or number

Table data as array of arrays (string, number, or null)

One of the following:
string
number
bbox: optional array of BBox { h, w, x, 5 more }

List of bounding boxes

h: number

Height of the bounding box

w: number

Width of the bounding box

x: number

X coordinate of the bounding box

y: number

Y coordinate of the bounding box

confidence: optional number

Confidence score

end_index: optional number

End index in the text

label: optional string

Label for the bounding box

start_index: optional number

Start index in the text

merged_from_pages: optional array of number

List of page numbers with tables that were merged into this table (e.g., [1, 2, 3, 4])

merged_into_page: optional number

Populated when merged into another table. Page number where the full merged table begins (used on empty tables).

parse_concerns: optional array of object { details, type }

Quality concerns detected during table extraction, indicating the table may have issues

details: string

Human-readable details about the concern

type: string

Type of parse concern (e.g. header_value_type_mismatch, inconsistent_row_cell_count)

type: optional "table"

Table item type
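
The `rows` and `parse_concerns` fields can be combined to load a table while flagging questionable extractions. The sketch below treats the first row as the header, which is an assumption; the reference does not say whether `rows` includes a header row. Both function names are hypothetical.

```python
from typing import Any, Dict, List, Mapping


def table_records(table: Mapping[str, Any]) -> List[Dict[Any, Any]]:
    """Convert TableItem rows to dictionaries keyed by the first row's cells."""
    rows = table.get("rows") or []
    if not rows:
        return []
    header, *body = rows
    # zip() stops at the shorter sequence, so ragged rows lose extra cells.
    return [dict(zip(header, row)) for row in body]


def concern_types(table: Mapping[str, Any]) -> List[str]:
    """List the type of each parse concern, or an empty list if none are set."""
    return [concern["type"] for concern in table.get("parse_concerns") or []]
```

A caller might skip or manually review any table for which `concern_types` returns a non-empty list.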

TextItem = object { md, value, bbox, type }
md: string

Markdown representation preserving formatting

value: string

Text content

bbox: optional array of BBox { h, w, x, 5 more }

List of bounding boxes

h: number

Height of the bounding box

w: number

Width of the bounding box

x: number

X coordinate of the bounding box

y: number

Y coordinate of the bounding box

confidence: optional number

Confidence score

end_index: optional number

End index in the text

label: optional string

Label for the bounding box

start_index: optional number

Start index in the text

type: optional "text"

Text item type

ParsingCreateResponse = object { id, project_id, status, 5 more }

A parse job.

id: string

Unique parse job identifier

project_id: string

Project this job belongs to

status: "PENDING" or "RUNNING" or "COMPLETED" or 2 more

Current job status: PENDING, RUNNING, COMPLETED, FAILED, or CANCELLED

One of the following:
"PENDING"
"RUNNING"
"COMPLETED"
"FAILED"
"CANCELLED"
created_at: optional string

Creation datetime

format: date-time
error_message: optional string

Error details when status is FAILED

name: optional string

Optional display name for this parse job

tier: optional string

Parsing tier used for this job

updated_at: optional string

Update datetime

format: date-time
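
Because a newly created job typically starts out unfinished, a client generally polls GET /api/v2/parse/{job_id} until the status settles. The following is a minimal polling sketch, not an official client. `fetch_job` is a hypothetical callable that performs the GET request and returns the job object as a dictionary. Treating `COMPLETED`, `FAILED`, and `CANCELLED` as final states is an assumption based on their names.

```python
import time
from typing import Any, Callable, Dict

# Assumed final states; PENDING and RUNNING are treated as in progress.
TERMINAL_STATUSES = frozenset({"COMPLETED", "FAILED", "CANCELLED"})


def wait_for_job(
    fetch_job: Callable[[str], Dict[str, Any]],
    job_id: str,
    interval: float = 2.0,
    max_attempts: int = 30,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Poll fetch_job(job_id) until the job reaches an assumed final status."""
    for attempt in range(max_attempts):
        job = fetch_job(job_id)
        if job["status"] in TERMINAL_STATUSES:
            return job
        if attempt < max_attempts - 1:
            sleep(interval)  # wait before the next poll
    raise TimeoutError(f"job {job_id} not finished after {max_attempts} polls")
```

Passing `fetch_job` and `sleep` as parameters keeps the loop independent of any HTTP library and easy to exercise without network access.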
ParsingGetResponse = object { job, images_content_metadata, items, 8 more }

Parse result response with job status and optional content or metadata.

The job field is always included. Other fields are included based on expand parameters.

job: object { id, project_id, status, 5 more }

Parse job status and metadata

id: string

Unique parse job identifier

project_id: string

Project this job belongs to

status: "PENDING" or "RUNNING" or "COMPLETED" or 2 more

Current job status: PENDING, RUNNING, COMPLETED, FAILED, or CANCELLED

One of the following:
"PENDING"
"RUNNING"
"COMPLETED"
"FAILED"
"CANCELLED"
created_at: optional string

Creation datetime

format: date-time
error_message: optional string

Error details when status is FAILED

name: optional string

Optional display name for this parse job

tier: optional string

Parsing tier used for this job

updated_at: optional string

Update datetime

format: date-time
images_content_metadata: optional object { images, total_count }

Metadata for all extracted images.

images: array of object { filename, index, bbox, 4 more }

List of image metadata with presigned URLs

filename: string

Image filename (e.g., ‘image_0.png’)

index: number

Index of the image in the extraction order

bbox: optional object { h, w, x, y }

Bounding box for an image on its page.

h: number

Height of the bounding box

w: number

Width of the bounding box

x: number

X coordinate of the bounding box

y: number

Y coordinate of the bounding box

category: optional "screenshot" or "embedded" or "layout"

Image category: ‘screenshot’ (full page), ‘embedded’ (images in document), or ‘layout’ (cropped from layout detection)

One of the following:
"screenshot"
"embedded"
"layout"
content_type: optional string

MIME type of the image

presigned_url: optional string

Presigned URL to download the image

size_bytes: optional number (deprecated)

Deprecated: always returns None. Will be removed in a future release.

total_count: number

Total number of extracted images
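
To show how `images_content_metadata` might be consumed, the sketch below collects download URLs by filename, optionally filtered by `category`. It assumes the object has been decoded into a dictionary; `image_urls` is a hypothetical helper, and entries without a `presigned_url` are skipped because that field is optional.

```python
from typing import Any, Dict, Mapping, Optional


def image_urls(
    metadata: Optional[Mapping[str, Any]],
    category: Optional[str] = None,
) -> Dict[str, str]:
    """Map image filename to presigned URL, optionally for one category only."""
    urls: Dict[str, str] = {}
    for image in (metadata or {}).get("images", []):
        if category is not None and image.get("category") != category:
            continue
        url = image.get("presigned_url")
        if url:  # optional field; skip images without a download link
            urls[image["filename"]] = url
    return urls
```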

items: optional object { pages }

Structured JSON result (if requested)

pages: array of object { items, page_height, page_number, 2 more } or object { error, page_number, success }

List of structured pages or failed page entries

One of the following:
StructuredResultPage = object { items, page_height, page_number, 2 more }
items: array of TextItem { md, value, bbox, type } or HeadingItem { level, md, value, 2 more } or ListItem { items, md, ordered, 2 more } or 6 more

List of structured items on the page

One of the following:
TextItem = object { md, value, bbox, type }
md: string

Markdown representation preserving formatting

value: string

Text content

bbox: optional array of BBox { h, w, x, 5 more }

List of bounding boxes

h: number

Height of the bounding box

w: number

Width of the bounding box

x: number

X coordinate of the bounding box

y: number

Y coordinate of the bounding box

confidence: optional number

Confidence score

end_index: optional number

End index in the text

label: optional string

Label for the bounding box

start_index: optional number

Start index in the text

type: optional "text"

Text item type

HeadingItem = object { level, md, value, 2 more }
level: number

Heading level (1-6)

md: string

Markdown representation preserving formatting

value: string

Heading text content

bbox: optional array of BBox { h, w, x, 5 more }

List of bounding boxes

h: number

Height of the bounding box

w: number

Width of the bounding box

x: number

X coordinate of the bounding box

y: number

Y coordinate of the bounding box

confidence: optional number

Confidence score

end_index: optional number

End index in the text

label: optional string

Label for the bounding box

start_index: optional number

Start index in the text

type: optional "heading"

Heading item type

ListItem = object { items, md, ordered, 2 more }
items: array of TextItem { md, value, bbox, type } or ListItem { items, md, ordered, 2 more }

List of nested text or list items

One of the following:
TextItem = object { md, value, bbox, type }
md: string

Markdown representation preserving formatting

value: string

Text content

bbox: optional array of BBox { h, w, x, 5 more }

List of bounding boxes

h: number

Height of the bounding box

w: number

Width of the bounding box

x: number

X coordinate of the bounding box

y: number

Y coordinate of the bounding box

confidence: optional number

Confidence score

end_index: optional number

End index in the text

label: optional string

Label for the bounding box

start_index: optional number

Start index in the text

type: optional "text"

Text item type

ListItem { items, md, ordered, 2 more }
md: string

Markdown representation preserving formatting

ordered: boolean

Whether the list is ordered or unordered

bbox: optional array of BBox { h, w, x, 5 more }

List of bounding boxes

h: number

Height of the bounding box

w: number

Width of the bounding box

x: number

X coordinate of the bounding box

y: number

Y coordinate of the bounding box

confidence: optional number

Confidence score

end_index: optional number

End index in the text

label: optional string

Label for the bounding box

start_index: optional number

Start index in the text

type: optional "list"

List item type

CodeItem = object { md, value, bbox, 2 more }
md: string

Markdown representation preserving formatting

value: string

Code content

bbox: optional array of BBox { h, w, x, 5 more }

List of bounding boxes

h: number

Height of the bounding box

w: number

Width of the bounding box

x: number

X coordinate of the bounding box

y: number

Y coordinate of the bounding box

confidence: optional number

Confidence score

end_index: optional number

End index in the text

label: optional string

Label for the bounding box

start_index: optional number

Start index in the text

language: optional string

Programming language identifier

type: optional "code"

Code block item type

TableItem = object { csv, html, md, 6 more }
csv: string

CSV representation of the table

html: string

HTML representation of the table

md: string

Markdown representation preserving formatting

rows: array of array of string or number

Table data as array of arrays (string, number, or null)

One of the following:
string
number
bbox: optional array of BBox { h, w, x, 5 more }

List of bounding boxes

h: number

Height of the bounding box

w: number

Width of the bounding box

x: number

X coordinate of the bounding box

y: number

Y coordinate of the bounding box

confidence: optional number

Confidence score

end_index: optional number

End index in the text

label: optional string

Label for the bounding box

start_index: optional number

Start index in the text

merged_from_pages: optional array of number

List of page numbers with tables that were merged into this table (e.g., [1, 2, 3, 4])

merged_into_page: optional number

Populated when merged into another table. Page number where the full merged table begins (used on empty tables).

parse_concerns: optional array of object { details, type }

Quality concerns detected during table extraction, indicating the table may have issues

details: string

Human-readable details about the concern

type: string

Type of parse concern (e.g. header_value_type_mismatch, inconsistent_row_cell_count)

type: optional "table"

Table item type

ImageItem = object { caption, md, url, 2 more }
caption: string

Image caption

md: string

Markdown representation preserving formatting

url: string

URL to the image

bbox: optional array of BBox { h, w, x, 5 more }

List of bounding boxes

h: number

Height of the bounding box

w: number

Width of the bounding box

x: number

X coordinate of the bounding box

y: number

Y coordinate of the bounding box

confidence: optional number

Confidence score

end_index: optional number

End index in the text

label: optional string

Label for the bounding box

start_index: optional number

Start index in the text

type: optional "image"

Image item type

Markdown representation preserving formatting

Display text of the link

URL of the link

List of bounding boxes

h: number

Height of the bounding box

w: number

Width of the bounding box

x: number

X coordinate of the bounding box

y: number

Y coordinate of the bounding box

confidence: optional number

Confidence score

end_index: optional number

End index in the text

label: optional string

Label for the bounding box

start_index: optional number

Start index in the text

Link item type

HeaderItem = object { items, md, bbox, type }
items: array of TextItem { md, value, bbox, type } or HeadingItem { level, md, value, 2 more } or ListItem { items, md, ordered, 2 more } or 4 more

List of items within the header

One of the following:
TextItem = object { md, value, bbox, type }
md: string

Markdown representation preserving formatting

value: string

Text content

bbox: optional array of BBox { h, w, x, 5 more }

List of bounding boxes

h: number

Height of the bounding box

w: number

Width of the bounding box

x: number

X coordinate of the bounding box

y: number

Y coordinate of the bounding box

confidence: optional number

Confidence score

end_index: optional number

End index in the text

label: optional string

Label for the bounding box

start_index: optional number

Start index in the text

type: optional "text"

Text item type

HeadingItem = object { level, md, value, 2 more }
level: number

Heading level (1-6)

md: string

Markdown representation preserving formatting

value: string

Heading text content

bbox: optional array of BBox { h, w, x, 5 more }

List of bounding boxes

h: number

Height of the bounding box

w: number

Width of the bounding box

x: number

X coordinate of the bounding box

y: number

Y coordinate of the bounding box

confidence: optional number

Confidence score

end_index: optional number

End index in the text

label: optional string

Label for the bounding box

start_index: optional number

Start index in the text

type: optional "heading"

Heading item type

ListItem = object { items, md, ordered, 2 more }
items: array of TextItem { md, value, bbox, type } or ListItem { items, md, ordered, 2 more }

List of nested text or list items

One of the following:
TextItem = object { md, value, bbox, type }
md: string

Markdown representation preserving formatting

value: string

Text content

bbox: optional array of BBox { h, w, x, 5 more }

List of bounding boxes

h: number

Height of the bounding box

w: number

Width of the bounding box

x: number

X coordinate of the bounding box

y: number

Y coordinate of the bounding box

confidence: optional number

Confidence score

end_index: optional number

End index in the text

label: optional string

Label for the bounding box

start_index: optional number

Start index in the text

type: optional "text"

Text item type

ListItem { items, md, ordered, 2 more }
md: string

Markdown representation preserving formatting

ordered: boolean

Whether the list is ordered or unordered

bbox: optional array of BBox { h, w, x, 5 more }

List of bounding boxes

h: number

Height of the bounding box

w: number

Width of the bounding box

x: number

X coordinate of the bounding box

y: number

Y coordinate of the bounding box

confidence: optional number

Confidence score

end_index: optional number

End index in the text

label: optional string

Label for the bounding box

start_index: optional number

Start index in the text

type: optional "list"

List item type

CodeItem = object { md, value, bbox, 2 more }
md: string

Markdown representation preserving formatting

value: string

Code content

bbox: optional array of BBox { h, w, x, 5 more }

List of bounding boxes

h: number

Height of the bounding box

w: number

Width of the bounding box

x: number

X coordinate of the bounding box

y: number

Y coordinate of the bounding box

confidence: optional number

Confidence score

end_index: optional number

End index in the text

label: optional string

Label for the bounding box

start_index: optional number

Start index in the text

language: optional string

Programming language identifier

type: optional "code"

Code block item type

TableItem = object { csv, html, md, 6 more }
csv: string

CSV representation of the table

html: string

HTML representation of the table

md: string

Markdown representation preserving formatting

rows: array of array of string or number

Table data as array of arrays (string, number, or null)

One of the following:
string
number
bbox: optional array of BBox { h, w, x, 5 more }

List of bounding boxes

h: number

Height of the bounding box

w: number

Width of the bounding box

x: number

X coordinate of the bounding box

y: number

Y coordinate of the bounding box

confidence: optional number

Confidence score

end_index: optional number

End index in the text

label: optional string

Label for the bounding box

start_index: optional number

Start index in the text

merged_from_pages: optional array of number

List of page numbers with tables that were merged into this table (e.g., [1, 2, 3, 4])

merged_into_page: optional number

Page number where the full merged table begins. Populated only when this table was merged into another table, in which case this table is left empty.

parse_concerns: optional array of object { details, type }

Quality concerns detected during table extraction, indicating the table may have issues

details: string

Human-readable details about the concern

type: string

Type of parse concern (e.g. header_value_type_mismatch, inconsistent_row_cell_count)

type: optional "table"

Table item type

ImageItem = object { caption, md, url, 2 more }
caption: string

Image caption

md: string

Markdown representation preserving formatting

url: string

URL to the image

bbox: optional array of BBox { h, w, x, 5 more }

List of bounding boxes

h: number

Height of the bounding box

w: number

Width of the bounding box

x: number

X coordinate of the bounding box

y: number

Y coordinate of the bounding box

confidence: optional number

Confidence score

end_index: optional number

End index in the text

label: optional string

Label for the bounding box

start_index: optional number

Start index in the text

type: optional "image"

Image item type

Markdown representation preserving formatting

Display text of the link

URL of the link

List of bounding boxes

h: number

Height of the bounding box

w: number

Width of the bounding box

x: number

X coordinate of the bounding box

y: number

Y coordinate of the bounding box

confidence: optional number

Confidence score

end_index: optional number

End index in the text

label: optional string

Label for the bounding box

start_index: optional number

Start index in the text

Link item type

md: string

Markdown representation preserving formatting

bbox: optional array of BBox { h, w, x, 5 more }

List of bounding boxes

h: number

Height of the bounding box

w: number

Width of the bounding box

x: number

X coordinate of the bounding box

y: number

Y coordinate of the bounding box

confidence: optional number

Confidence score

end_index: optional number

End index in the text

label: optional string

Label for the bounding box

start_index: optional number

Start index in the text

type: optional "header"

Page header container

List of items within the footer

One of the following:
TextItem = object { md, value, bbox, type }
md: string

Markdown representation preserving formatting

value: string

Text content

bbox: optional array of BBox { h, w, x, 5 more }

List of bounding boxes

h: number

Height of the bounding box

w: number

Width of the bounding box

x: number

X coordinate of the bounding box

y: number

Y coordinate of the bounding box

confidence: optional number

Confidence score

end_index: optional number

End index in the text

label: optional string

Label for the bounding box

start_index: optional number

Start index in the text

type: optional "text"

Text item type

HeadingItem = object { level, md, value, 2 more }
level: number

Heading level (1-6)

md: string

Markdown representation preserving formatting

value: string

Heading text content

bbox: optional array of BBox { h, w, x, 5 more }

List of bounding boxes

h: number

Height of the bounding box

w: number

Width of the bounding box

x: number

X coordinate of the bounding box

y: number

Y coordinate of the bounding box

confidence: optional number

Confidence score

end_index: optional number

End index in the text

label: optional string

Label for the bounding box

start_index: optional number

Start index in the text

type: optional "heading"

Heading item type

ListItem = object { items, md, ordered, 2 more }
items: array of TextItem { md, value, bbox, type } or ListItem { items, md, ordered, 2 more }

List of nested text or list items

One of the following:
TextItem = object { md, value, bbox, type }
md: string

Markdown representation preserving formatting

value: string

Text content

bbox: optional array of BBox { h, w, x, 5 more }

List of bounding boxes

h: number

Height of the bounding box

w: number

Width of the bounding box

x: number

X coordinate of the bounding box

y: number

Y coordinate of the bounding box

confidence: optional number

Confidence score

end_index: optional number

End index in the text

label: optional string

Label for the bounding box

start_index: optional number

Start index in the text

type: optional "text"

Text item type

ListItem { items, md, ordered, 2 more }
md: string

Markdown representation preserving formatting

ordered: boolean

Whether the list is ordered or unordered

bbox: optional array of BBox { h, w, x, 5 more }

List of bounding boxes

h: number

Height of the bounding box

w: number

Width of the bounding box

x: number

X coordinate of the bounding box

y: number

Y coordinate of the bounding box

confidence: optional number

Confidence score

end_index: optional number

End index in the text

label: optional string

Label for the bounding box

start_index: optional number

Start index in the text

type: optional "list"

List item type

CodeItem = object { md, value, bbox, 2 more }
md: string

Markdown representation preserving formatting

value: string

Code content

bbox: optional array of BBox { h, w, x, 5 more }

List of bounding boxes

h: number

Height of the bounding box

w: number

Width of the bounding box

x: number

X coordinate of the bounding box

y: number

Y coordinate of the bounding box

confidence: optional number

Confidence score

end_index: optional number

End index in the text

label: optional string

Label for the bounding box

start_index: optional number

Start index in the text

language: optional string

Programming language identifier

type: optional "code"

Code block item type

TableItem = object { csv, html, md, 6 more }
csv: string

CSV representation of the table

html: string

HTML representation of the table

md: string

Markdown representation preserving formatting

rows: array of array of string or number

Table data as array of arrays (string, number, or null)

One of the following:
string
number
bbox: optional array of BBox { h, w, x, 5 more }

List of bounding boxes

h: number

Height of the bounding box

w: number

Width of the bounding box

x: number

X coordinate of the bounding box

y: number

Y coordinate of the bounding box

confidence: optional number

Confidence score

end_index: optional number

End index in the text

label: optional string

Label for the bounding box

start_index: optional number

Start index in the text

merged_from_pages: optional array of number

List of page numbers with tables that were merged into this table (e.g., [1, 2, 3, 4])

merged_into_page: optional number

Page number where the full merged table begins. Populated only when this table was merged into another table, in which case this table is left empty.

parse_concerns: optional array of object { details, type }

Quality concerns detected during table extraction, indicating the table may have issues

details: string

Human-readable details about the concern

type: string

Type of parse concern (e.g. header_value_type_mismatch, inconsistent_row_cell_count)

type: optional "table"

Table item type

ImageItem = object { caption, md, url, 2 more }
caption: string

Image caption

md: string

Markdown representation preserving formatting

url: string

URL to the image

bbox: optional array of BBox { h, w, x, 5 more }

List of bounding boxes

h: number

Height of the bounding box

w: number

Width of the bounding box

x: number

X coordinate of the bounding box

y: number

Y coordinate of the bounding box

confidence: optional number

Confidence score

end_index: optional number

End index in the text

label: optional string

Label for the bounding box

start_index: optional number

Start index in the text

type: optional "image"

Image item type

Markdown representation preserving formatting

Display text of the link

URL of the link

List of bounding boxes

h: number

Height of the bounding box

w: number

Width of the bounding box

x: number

X coordinate of the bounding box

y: number

Y coordinate of the bounding box

confidence: optional number

Confidence score

end_index: optional number

End index in the text

label: optional string

Label for the bounding box

start_index: optional number

Start index in the text

Link item type

Markdown representation preserving formatting

List of bounding boxes

h: number

Height of the bounding box

w: number

Width of the bounding box

x: number

X coordinate of the bounding box

y: number

Y coordinate of the bounding box

confidence: optional number

Confidence score

end_index: optional number

End index in the text

label: optional string

Label for the bounding box

start_index: optional number

Start index in the text

Page footer container

page_height: number

Height of the page in points

page_number: number

Page number of the document

page_width: number

Width of the page in points

success: true

Success indicator

FailedStructuredPage = object { error, page_number, success }
error: string

Error message describing the failure

page_number: number

Page number of the document

success: false

Failure indicator
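The TableItem fields described above (parse_concerns, merged_into_page) lend themselves to a quick triage pass before table data is used downstream. The following is a minimal sketch, assuming each table arrives as a plain dict decoded from the JSON response; the helper name triage_table and the sample values are hypothetical and not part of the API.

```python
from typing import Any


def triage_table(item: dict[str, Any]) -> str:
    """Classify a TableItem-shaped dict as 'merged_away', 'needs_review', or 'ok'."""
    # merged_into_page is populated when this table was merged into another table.
    if item.get("merged_into_page") is not None:
        return "merged_away"
    # parse_concerns is optional; a non-empty list signals possible extraction issues.
    if item.get("parse_concerns"):
        return "needs_review"
    return "ok"


# Illustrative input only; the concern details are made up for the example.
table = {
    "type": "table",
    "rows": [["name", "qty"], ["bolt", 4]],
    "parse_concerns": [
        {"type": "inconsistent_row_cell_count", "details": "illustrative only"}
    ],
}
print(triage_table(table))  # needs_review
```

Checking merged_into_page first is a design choice: a table that was merged elsewhere is reported as such, regardless of any concerns attached to it.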

job_metadata: optional map[unknown]

Job execution metadata (if requested)

markdown: optional object { pages }

Markdown result (if requested)

pages: array of object { markdown, page_number, success, 2 more } or object { error, page_number, success }

List of markdown pages or failed page entries

One of the following:
MarkdownResultPage = object { markdown, page_number, success, 2 more }
markdown: string

Markdown content of the page

page_number: number

Page number of the document

success: true

Success indicator

footer: optional string

Footer of the page in markdown

header: optional string

Header of the page in markdown

FailedMarkdownPage = object { error, page_number, success }
error: string

Error message describing the failure

page_number: number

Page number of the document

success: false

Failure indicator
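Because each entry in pages is either a MarkdownResultPage or a FailedMarkdownPage, callers typically branch on success before reading markdown or error. A minimal sketch, assuming the pages have been decoded into plain dicts; join_markdown is a hypothetical helper name and the sample data is illustrative.

```python
from typing import Any


def join_markdown(pages: list[dict[str, Any]]) -> tuple[str, dict[int, str]]:
    """Return (combined markdown, {page_number: error}) for a list of page entries."""
    parts: list[str] = []
    failures: dict[int, str] = {}
    for page in sorted(pages, key=lambda p: p["page_number"]):
        if page["success"]:
            # Successful entries carry the page content in "markdown".
            parts.append(page["markdown"])
        else:
            # Failed entries carry "error" instead of "markdown".
            failures[page["page_number"]] = page["error"]
    return "\n\n".join(parts), failures


pages = [
    {"page_number": 2, "success": False, "error": "illustrative failure"},
    {"page_number": 1, "success": True, "markdown": "# Title"},
    {"page_number": 3, "success": True, "markdown": "Closing text"},
]
text, failures = join_markdown(pages)
print(failures)  # {2: 'illustrative failure'}
```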

markdown_full: optional string

Full raw markdown content (if requested)

metadata: optional object { pages }

Metadata for the parsed document, both page-level and general.

pages: array of object { page_number, confidence, cost_optimized, 5 more }

List of page metadata entries

page_number: number

Page number of the document

confidence: optional number

Confidence score for the page parsing (0-1)

cost_optimized: optional boolean

Whether cost-optimized parsing was used for the page

original_orientation_angle: optional number

Original orientation angle of the page in degrees

printed_page_number: optional string

Printed page number as it appears in the document

slide_section_name: optional string

Section name from presentation slides

speaker_notes: optional string

Speaker notes from presentation slides

triggered_auto_mode: optional boolean

Whether auto mode was triggered for the page
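The per-page confidence value (0-1) can be used to pick pages for manual review. A minimal sketch, assuming the metadata pages are plain dicts; the 0.8 threshold and the helper name low_confidence_pages are arbitrary choices for illustration, not values recommended by the API.

```python
from typing import Any


def low_confidence_pages(pages: list[dict[str, Any]], threshold: float = 0.8) -> list[int]:
    """Return page numbers whose reported confidence is below the threshold."""
    flagged: list[int] = []
    for page in pages:
        confidence = page.get("confidence")
        # confidence is optional, so pages without a score are skipped, not flagged.
        if confidence is not None and confidence < threshold:
            flagged.append(page["page_number"])
    return flagged


pages = [
    {"page_number": 1, "confidence": 0.97},
    {"page_number": 2, "confidence": 0.41},
    {"page_number": 3},  # no confidence reported
]
print(low_confidence_pages(pages))  # [2]
```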

raw_parameters: optional map[unknown]
result_content_metadata: optional map[object { size_bytes, exists, presigned_url } ]

Metadata including size, existence, and presigned URLs for result files

size_bytes: number

Size of the result file in bytes

exists: optional boolean

Whether the result file exists in S3

presigned_url: optional string

Presigned URL to download the result file
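Since exists and presigned_url are both optional, a client may want to filter result_content_metadata down to the entries it can actually download. A minimal sketch under stated assumptions: the map keys shown ("markdown", "text") are hypothetical, a missing exists flag is treated as unknown rather than false, and downloadable_results is an illustrative helper name.

```python
from typing import Any


def downloadable_results(metadata: dict[str, dict[str, Any]]) -> dict[str, str]:
    """Map each result name to its presigned URL, skipping unavailable entries."""
    urls: dict[str, str] = {}
    for name, entry in metadata.items():
        # Assumption: only an explicit exists == False rules an entry out.
        if entry.get("exists") is False:
            continue
        url = entry.get("presigned_url")
        if url:
            urls[name] = url
    return urls


# Hypothetical keys and placeholder URL, for illustration only.
metadata = {
    "markdown": {
        "size_bytes": 2048,
        "exists": True,
        "presigned_url": "https://example.invalid/markdown",
    },
    "text": {"size_bytes": 0, "exists": False},
}
print(downloadable_results(metadata))
```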

text: optional object { pages }

Plain text result (if requested)

pages: array of object { page_number, text }

List of text pages

page_number: number

Page number of the document

text: string

Plain text content of the page

text_full: optional string

Full raw text content (if requested)

ParsingListResponse = object { id, project_id, status, 5 more }

A parse job.

id: string

Unique parse job identifier

project_id: string

Project this job belongs to

status: "PENDING" or "RUNNING" or "COMPLETED" or 2 more

Current job status: PENDING, RUNNING, COMPLETED, FAILED, or CANCELLED

One of the following:
"PENDING"
"RUNNING"
"COMPLETED"
"FAILED"
"CANCELLED"
created_at: optional string

Creation datetime

format: date-time
error_message: optional string

Error details when status is FAILED

name: optional string

Optional display name for this parse job

tier: optional string

Parsing tier used for this job

updated_at: optional string

Update datetime

format: date-time
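The status values above suggest a simple polling loop that stops once a job leaves PENDING or RUNNING. A minimal sketch, assuming COMPLETED, FAILED, and CANCELLED are terminal states; fetch_job is a hypothetical callable standing in for a GET /api/v2/parse/{job_id} request, and the interval and poll limit are arbitrary.

```python
import time
from typing import Any, Callable

# Assumption: these three statuses are final and will not change again.
TERMINAL_STATUSES = {"COMPLETED", "FAILED", "CANCELLED"}


def wait_for_job(
    fetch_job: Callable[[], dict[str, Any]],
    interval_s: float = 2.0,
    max_polls: int = 30,
) -> dict[str, Any]:
    """Call fetch_job until the job reaches a terminal status or polls run out."""
    for _ in range(max_polls):
        job = fetch_job()
        if job["status"] in TERMINAL_STATUSES:
            return job
        time.sleep(interval_s)
    raise TimeoutError(f"job still not finished after {max_polls} polls")


# Fake responses so the sketch runs without network access.
responses = iter(
    [
        {"id": "job-123", "status": "PENDING"},
        {"id": "job-123", "status": "RUNNING"},
        {"id": "job-123", "status": "COMPLETED"},
    ]
)
job = wait_for_job(lambda: next(responses), interval_s=0)
print(job["status"])  # COMPLETED
```

When the returned status is FAILED, error_message is the field to inspect for details.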