
6. Adding LlamaParse

Complicated PDFs can be very tricky for LLMs to understand. To help with this, LlamaIndex provides LlamaParse, a hosted service that parses complex documents, including PDFs. To use it, get a LLAMA_CLOUD_API_KEY by signing up for LlamaCloud (it's free for up to 1000 pages/day) and add it to your .env file, just as you did for your OpenAI key:

LLAMA_CLOUD_API_KEY=llx-XXXXXXXXXXXXXXXX
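If you load environment variables with dotenv as in the earlier steps (an assumption about your setup), a quick sanity check can catch a missing key before the first parse call:

// Sketch only: assumes dotenv is already a dependency, as in earlier steps.
import "dotenv/config";

if (!process.env.LLAMA_CLOUD_API_KEY) {
  throw new Error("LLAMA_CLOUD_API_KEY is missing from your .env file");
}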

Then replace SimpleDirectoryReader with LlamaParseReader:

import { LlamaParseReader } from "llamaindex";

const reader = new LlamaParseReader({ resultType: "markdown" });
const documents = await reader.loadData("../data/sf_budget_2023_2024.pdf");
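From here the parsed documents feed into the same embedding and querying flow as before. A minimal sketch of the continuation, assuming the VectorStoreIndex and query-engine setup from the previous steps (the sample question is just an illustration):

import { VectorStoreIndex } from "llamaindex";

// Embed and index the documents returned by LlamaParseReader, then query them.
const index = await VectorStoreIndex.fromDocuments(documents);
const queryEngine = index.asQueryEngine();

const response = await queryEngine.query({
  query: "Which departments saw the largest budget increases?",
});
console.log(response.toString());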

Now you will be able to ask more complicated questions of the same PDF and get better results. You can find this code in our repo.

Next up, let's persist our embedded data in a vector store so we don't have to re-parse every time.