Getting Started
Classify lets you automatically categorize documents into types you define (for example: invoice, receipt, contract) using natural-language rules.
Classification is currently in beta and is subject to breaking changes.
Use Cases
- Use as a pre-processing step:
  - Before extraction: Classify first, then run schema-specific extraction (e.g., invoice vs. contract) with different LlamaExtract agents to improve accuracy and reduce cost.
  - Before parsing: Classify first, then run LlamaParse over labeled files with finely tuned parse settings for each classified category to improve accuracy and reduce cost.
  - Before indexing: Classify first, then send classified files into appropriate LlamaCloud indices with tailored chunking, metadata, and access controls to improve retrieval quality.
- Intake routing for back-office documents: Auto-separate invoices, receipts, purchase orders, and bank statements to the right queues, storage buckets, or approval workflows.
- Dataset curation: Auto-tag large archives into meaningful categories to create labeled subsets for model training.
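As a rough illustration of the intake-routing use case, the sketch below sends already-classified files to a destination based on the predicted type. Everything in it is hypothetical: the `ROUTES` table, the queue names, the `route` function, and the 0.8 confidence threshold are assumptions made for this example and are not part of the Classify API. Only the idea of a predicted type with a confidence score comes from this page.

```python
# Hypothetical routing table: predicted type -> destination queue.
# The queue names are made up for illustration.
ROUTES = {
    "invoice": "accounts-payable",
    "receipt": "expenses",
    "purchase_order": "procurement",
    "bank_statement": "treasury",
}

REVIEW_QUEUE = "manual-review"


def route(predicted_type: str, confidence: float, threshold: float = 0.8) -> str:
    """Return the destination queue for one classified file.

    Low-confidence predictions and unknown types fall back to manual review.
    The threshold value is an assumption, not a recommended setting.
    """
    if confidence < threshold:
        return REVIEW_QUEUE
    return ROUTES.get(predicted_type, REVIEW_QUEUE)


if __name__ == "__main__":
    print(route("invoice", 0.93))
    print(route("receipt", 0.41))
```

Keeping a fallback queue means a file is never silently dropped when the classifier is unsure or returns a type you have not mapped yet.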
Concepts
- Rule: A content-based criterion for a document type. Each rule has:
  - type: the label to assign (e.g., “invoice”).
  - description: a natural-language description of the content that should match this type.
- Parsing configuration (optional): Controls how we parse documents before classification (e.g., language, page limits). Useful for speed/accuracy tradeoffs.
- Results: For each file you get a type (predicted), confidence (0.0–1.0), and reasoning (step-by-step explanation).
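To make the shape of these concepts concrete, here is a minimal sketch that models a rule and a result as plain Python dataclasses. The class names `Rule` and `Result` and the sample descriptions are assumptions for illustration; they are not the SDK's actual types. Only the field names (type, description, confidence, reasoning) and the 0.0–1.0 confidence range are taken from the list above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """Hypothetical model of a rule; not the SDK's own class."""

    type: str         # label to assign, e.g. "invoice"
    description: str  # natural-language description of matching content


@dataclass(frozen=True)
class Result:
    """Hypothetical model of one per-file result; not the SDK's own class."""

    type: str          # predicted label
    confidence: float  # expected to lie in 0.0-1.0
    reasoning: str     # step-by-step explanation

    def __post_init__(self) -> None:
        # Reject values outside the documented confidence range.
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be between 0.0 and 1.0")


rules = [
    Rule(
        type="invoice",
        description="Contains an invoice number, line items, and a total amount due.",
    ),
    Rule(
        type="receipt",
        description="Confirms a completed payment, usually with a merchant name and date.",
    ),
]

example = Result(
    type="invoice",
    confidence=0.92,
    reasoning="Mentions an invoice number and an amount due.",
)
```

Descriptions that name concrete, observable content (such as an invoice number or a total due) are likely easier to match against than vague ones, though the wording here is only an example.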
Typical flow
- Upload your files to LlamaCloud
- Create rules for your target classes
- Create a classify job with the file IDs and rules
- Fetch results and consume the predictions
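The four steps can be sketched end to end with an in-memory stand-in. Note that `FakeClassifyClient` and its methods (`upload_file`, `create_job`, `get_results`) are invented for this example and are not the LlamaCloud SDK or API; the "classification" is a toy keyword match that exists only so the flow runs locally. For the real calls, see the SDK guide and API reference.

```python
import itertools


class FakeClassifyClient:
    """In-memory stand-in for a real client. NOT the LlamaCloud SDK."""

    def __init__(self) -> None:
        self._files: dict[str, str] = {}
        self._jobs: dict[str, list[dict]] = {}
        self._ids = itertools.count(1)

    def upload_file(self, text: str) -> str:
        """Store the file content and return a made-up file ID."""
        file_id = f"file-{next(self._ids)}"
        self._files[file_id] = text
        return file_id

    def create_job(self, file_ids: list[str], rules: list[dict]) -> str:
        """Run the toy classifier over the files and return a made-up job ID."""
        results = []
        for file_id in file_ids:
            text = self._files[file_id].lower()
            # Toy stand-in: pick the first rule whose type name appears in the text.
            match = next((r for r in rules if r["type"] in text), None)
            results.append({
                "file_id": file_id,
                "type": match["type"] if match else None,
                "confidence": 1.0 if match else 0.0,
                "reasoning": "toy keyword match" if match else "no rule matched",
            })
        job_id = f"job-{next(self._ids)}"
        self._jobs[job_id] = results
        return job_id

    def get_results(self, job_id: str) -> list[dict]:
        return self._jobs[job_id]


client = FakeClassifyClient()

# 1. Upload files (plain strings stand in for real documents).
file_ids = [
    client.upload_file("Invoice #123, total due: $40"),
    client.upload_file("Receipt for payment received"),
]

# 2. Create rules for the target classes.
rules = [
    {"type": "invoice", "description": "A bill requesting payment for goods or services."},
    {"type": "receipt", "description": "Proof that a payment has been made."},
]

# 3. Create a classify job with the file IDs and rules.
job_id = client.create_job(file_ids, rules)

# 4. Fetch results and consume the predictions.
for result in client.get_results(job_id):
    print(result["file_id"], result["type"], result["confidence"])
```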
Next steps
- Make sure you have an API key: Get an API key
- Jump straight to the SDK guide to run your first job: Python SDK
- For use with other languages, see our API reference