LiteParse

CLI Reference

Complete reference for all LiteParse CLI commands and options.

LiteParse provides the lit CLI with three commands: parse, batch-parse, and screenshot.

Parse a single document.

lit parse [options] <file>
Argument     Description
file         Path to the document file

Option                   Description                                Default
-o, --output <file>      Write output to a file instead of stdout
--format <format>        Output format: json or text                text
--ocr-server-url <url>   HTTP OCR server URL                        none (uses Tesseract)
--no-ocr                 Disable OCR entirely
--ocr-language <lang>    OCR language code                          en
--num-workers <n>        Pages to OCR in parallel                   CPU cores - 1
--max-pages <n>          Maximum pages to parse                     10000
--target-pages <pages>   Pages to parse (e.g., "1-5,10")            none (all pages)
--dpi <dpi>              Rendering DPI                              150
--no-precise-bbox        Disable precise bounding boxes
--preserve-small-text    Keep very small text
--config <file>          JSON config file path
-q, --quiet              Suppress progress output
# Basic text parsing
lit parse report.pdf
# JSON output with bounding boxes
lit parse report.pdf --format json -o report.json
# Parse pages 1-5 only, no OCR
lit parse report.pdf --target-pages "1-5" --no-ocr
# High-DPI rendering with French OCR
lit parse report.pdf --dpi 300 --ocr-language fra
# Use an external OCR server
lit parse report.pdf --ocr-server-url http://localhost:8828/ocr
# Pipe output to another tool
lit parse report.pdf -q | wc -l
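
The options above can be combined in a small wrapper when the same invocation is repeated in scripts. The following is a minimal sketch, not part of LiteParse: parse_range is a hypothetical helper name, and it assumes lit is on the PATH. It uses only flags listed in the option table.

```shell
# parse_range FILE PAGES OUT
# Hypothetical helper (not part of LiteParse): parse only the requested
# pages of FILE and write JSON to OUT, with progress output suppressed.
parse_range() {
    if [ "$#" -ne 3 ]; then
        echo "usage: parse_range <file> <pages> <out.json>" >&2
        return 2
    fi
    # --target-pages, --format, -o and -q are taken from the option table.
    lit parse "$1" --target-pages "$2" --format json -o "$3" -q
}
```

For example, parse_range report.pdf "1-5,10" report.json would run lit parse on pages 1 to 5 and page 10 and write the result to report.json.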

Parse multiple documents in a directory.

lit batch-parse [options] <input-dir> <output-dir>
Argument     Description
input-dir    Directory containing documents to parse
output-dir   Directory for output files

Option                   Description                                  Default
--format <format>        Output format: json or text                  text
--ocr-server-url <url>   HTTP OCR server URL                          none (uses Tesseract)
--no-ocr                 Disable OCR entirely
--ocr-language <lang>    OCR language code                            en
--num-workers <n>        Pages to OCR in parallel                     CPU cores - 1
--max-pages <n>          Maximum pages per file                       10000
--dpi <dpi>              Rendering DPI                                150
--no-precise-bbox        Disable precise bounding boxes
--recursive              Search subdirectories
--extension <ext>        Only process this extension (e.g., ".pdf")   none (all supported)
--config <file>          JSON config file path
-q, --quiet              Suppress progress output
# Parse all supported files in a directory
lit batch-parse ./documents ./output
# Recursively parse only PDFs
lit batch-parse ./documents ./output --recursive --extension ".pdf"
# Batch parse with JSON output and no OCR
lit batch-parse ./documents ./output --format json --no-ocr
# Use a config file for consistent settings
lit batch-parse ./documents ./output --config liteparse.config.json
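
In scripts it can help to check the input directory before starting a batch run. The sketch below is illustrative only: batch_parse_pdfs is a hypothetical helper name, and creating the output directory up front is a precaution taken here, not documented LiteParse behavior. It uses only flags listed in the option table.

```shell
# batch_parse_pdfs INPUT_DIR OUTPUT_DIR
# Hypothetical helper (not part of LiteParse): recursively parse every
# PDF under INPUT_DIR into JSON files under OUTPUT_DIR.
batch_parse_pdfs() {
    in_dir=$1
    out_dir=$2
    if [ ! -d "$in_dir" ]; then
        echo "input directory not found: $in_dir" >&2
        return 1
    fi
    # Assumption: create the output directory ourselves, since the
    # reference does not say whether lit creates it.
    mkdir -p "$out_dir" || return 1
    lit batch-parse "$in_dir" "$out_dir" --recursive --extension ".pdf" --format json -q
}
```

Calling batch_parse_pdfs ./documents ./output then behaves like the recursive PDF example above, with JSON output and progress messages suppressed.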

Generate page images from a PDF.

lit screenshot [options] <file>
Argument     Description
file         Path to the PDF file

Option                   Description                                    Default
-o, --output-dir <dir>   Output directory                               ./screenshots
--target-pages <pages>   Pages to screenshot (e.g., "1,3,5" or "1-5")   none (all pages)
--dpi <dpi>              Rendering DPI                                  150
--format <format>        Image format: png or jpg                       png
--config <file>          JSON config file path
-q, --quiet              Suppress progress output
# Screenshot all pages
lit screenshot document.pdf -o ./pages
# First 5 pages at high DPI
lit screenshot document.pdf --target-pages "1-5" --dpi 300 -o ./pages
# JPG format for smaller files
lit screenshot document.pdf --format jpg -o ./pages
# Specific pages only
lit screenshot document.pdf --target-pages "1,5,10" -o ./pages
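
Because --format accepts only png or jpg for screenshots, a wrapper can reject other values before invoking the CLI. This is a minimal sketch under that assumption: screenshot_pages is a hypothetical helper name, and page selection uses the --target-pages option listed in the option table.

```shell
# screenshot_pages FILE PAGES OUT_DIR [FORMAT]
# Hypothetical helper (not part of LiteParse): render the selected pages
# of FILE into OUT_DIR. FORMAT defaults to png, matching the CLI default.
screenshot_pages() {
    file=$1
    pages=$2
    out_dir=$3
    fmt=${4:-png}
    case "$fmt" in
        png|jpg) ;;  # the two image formats listed in the option table
        *)
            echo "unsupported image format: $fmt" >&2
            return 2
            ;;
    esac
    lit screenshot "$file" --target-pages "$pages" --format "$fmt" -o "$out_dir" -q
}
```

For example, screenshot_pages document.pdf "1,5,10" ./pages jpg renders pages 1, 5, and 10 as JPG files into ./pages.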

These options are available on all commands:

Option                   Description
-h, --help               Show help for a command
-V, --version            Show version number