Structured data extraction
Make sure you have installed LlamaIndex.TS and have an OpenAI key. If you haven’t, check out the installation guide.
You can use other LLMs via their APIs; if you would prefer to use local models, check out our local LLM example.
Set up
In a new folder:
```bash
npm init
npm i -D typescript @types/node
npm i @llamaindex/openai zod
```
Extract data
Create the file example.ts (a sketch is shown after the list below). This code will:
- Set up an LLM connection to GPT-4
- Give an example of the data structure we wish to generate
- Prompt the LLM with instructions and the example, plus a sample transcript
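Your example.ts may look different; here is a minimal sketch, assuming the OpenAI class from @llamaindex/openai (reading OPENAI_API_KEY from the environment), its complete method, a gpt-4o model name, and a zod schema used to validate the JSON the model returns:

```typescript
import { OpenAI } from "@llamaindex/openai";
import { z } from "zod";

// Schema mirroring the example output shown further below.
const callSummarySchema = z.object({
  summary: z.string(),
  products: z.array(z.string()),
  rep_name: z.string(),
  prospect_name: z.string(),
  action_items: z.array(z.string()),
});

// An example of the data structure we want the LLM to generate.
const exampleOutput = {
  summary: "One-sentence summary of the call.",
  products: ["Product mentioned on the call"],
  rep_name: "Name of the sales rep",
  prospect_name: "Name of the prospect",
  action_items: ["Agreed follow-up items"],
};

// A sample transcript to extract from (replace with your own).
const transcript =
  "Sarah (XYZ Company): Hi John, I'm calling to introduce the XYZ Widget, " +
  "a tool that automates tasks and improves productivity. " +
  "John: Sounds interesting. Could you send over some case studies and set up a demo?";

async function main() {
  // Assumed: any GPT-4-class model name works here; gpt-4o is a placeholder.
  const llm = new OpenAI({ model: "gpt-4o", temperature: 0 });

  // Prompt the LLM with instructions, the example structure, and the transcript.
  const response = await llm.complete({
    prompt: `Summarize the sales call transcript below as JSON matching this example:
${JSON.stringify(exampleOutput, null, 2)}

Transcript:
${transcript}

Respond with JSON only.`,
  });

  // Validate the model's JSON against the schema before using it.
  const data = callSummarySchema.parse(JSON.parse(response.text));
  console.log(JSON.stringify(data, null, 2));
}

main().catch(console.error);
```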
To run the code:
```bash
npx tsx example.ts
```
You should expect output something like:
{ "summary": "Sarah from XYZ Company called John to introduce the XYZ Widget, a tool designed to automate tasks and improve productivity. John expressed interest and requested case studies and a product demo. Sarah agreed to send the information and follow up to schedule the demo.", "products": ["XYZ Widget"], "rep_name": "Sarah", "prospect_name": "John", "action_items": [ "Send case studies and additional product information to John", "Follow up with John to schedule a product demo" ]}
Using the exec method
Many LLMs do not natively support structured output and often rely exclusively on prompt or context engineering. For these cases, we provide an alternative approach to structured data extraction: the exec method with responseFormat.
For example, in a new folder, install our Anthropic integration and zod v3:
```bash
npm init
npm i -D typescript @types/node
npm i @llamaindex/anthropic zod@3.25.76
```
And then try extracting data with this code:
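The exact code may differ; here is a minimal sketch that assumes the Anthropic class from @llamaindex/anthropic, a placeholder model name, and that exec accepts a messages array plus a zod responseFormat and returns the validated result on an object field:

```typescript
import { Anthropic } from "@llamaindex/anthropic";
import { z } from "zod";

// The shape we want the LLM to return, defined with zod v3.
const bookSchema = z.object({
  title: z.string(),
  author: z.string(),
  year: z.number(),
});

async function main() {
  // Assumed: ANTHROPIC_API_KEY is read from the environment.
  // Replace the model name with the Anthropic model you want to use.
  const llm = new Anthropic({ model: "claude-3-5-sonnet" });

  // Assumed exec signature: a messages array plus a zod schema as responseFormat,
  // returning the validated result on an `object` field.
  const { object } = await llm.exec({
    messages: [
      {
        role: "user",
        content:
          "Extract the title, author, and year of the most famous work by Dante Alighieri.",
      },
    ],
    responseFormat: bookSchema,
  });

  console.log(JSON.stringify(object, null, 2));
}

main().catch(console.error);
```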
The output should look like this:
{ "title": "La Divina Commedia", "author": "Dante Alighieri", "year": 1321}