Metadata Extraction
You can use LLMs to automate metadata extraction with our Metadata Extractor modules.
Our metadata extractor modules include the following “feature extractors”:
- SummaryExtractor: automatically extracts a summary over a set of Nodes
- QuestionsAnsweredExtractor: extracts a set of questions that each Node can answer
- TitleExtractor: extracts a title over the context of each Node by document and combines them
- KeywordExtractor: extracts keywords over the context of each Node
Then you can chain the Metadata Extractors with the IngestionPipeline to extract metadata from a set of documents.
import {
  Document,
  IngestionPipeline,
  TitleExtractor,
  QuestionsAnsweredExtractor,
} from "llamaindex";
import { OpenAI } from "@llamaindex/openai";

async function main() {
  const pipeline = new IngestionPipeline({
    transformations: [
      new TitleExtractor(),
      new QuestionsAnsweredExtractor({
        questions: 5,
      }),
    ],
  });

  const nodes = await pipeline.run({
    documents: [
      new Document({ text: "I am 10 years old. John is 20 years old." }),
    ],
  });

  for (const node of nodes) {
    console.log(node.metadata);
  }
}

main().then(() => console.log("done"));
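
To make the idea of chaining more concrete, here is a minimal, self-contained sketch of what a chain of extractors might do conceptually: each extractor returns one metadata object per node, and the results are merged into the node's existing metadata in order. Everything in it (`SketchNode`, `SketchExtractor`, `firstSentenceTitle`, `longestWords`, `runChain`, and the `title` / `keywords` keys) is hypothetical and written for illustration only. It is not the llamaindex implementation, it does not call an LLM, and the real extractors may use different metadata keys.

```typescript
// Hypothetical stand-ins for illustration; these are NOT the llamaindex classes.
type Metadata = Record<string, unknown>;

interface SketchNode {
  text: string;
  metadata: Metadata;
}

// An extractor returns one metadata object per input node, in the same order.
interface SketchExtractor {
  extract(nodes: SketchNode[]): Promise<Metadata[]>;
}

// Toy title extractor: uses the first sentence instead of asking an LLM.
const firstSentenceTitle: SketchExtractor = {
  async extract(nodes) {
    return nodes.map((n) => ({ title: n.text.split(".")[0].trim() }));
  },
};

// Toy keyword extractor: keeps the `count` longest distinct words.
const longestWords = (count: number): SketchExtractor => ({
  async extract(nodes) {
    return nodes.map((n) => {
      const found: string[] = n.text.toLowerCase().match(/[a-z]+/g) ?? [];
      const words = Array.from(new Set(found));
      // Longest first; ties broken alphabetically so the result is deterministic.
      words.sort((a, b) => b.length - a.length || a.localeCompare(b));
      return { keywords: words.slice(0, count) };
    });
  },
});

// Runs the extractors in order, merging each result into the node metadata.
// Later extractors see the metadata added by earlier ones.
async function runChain(
  nodes: SketchNode[],
  extractors: SketchExtractor[],
): Promise<SketchNode[]> {
  let current = nodes;
  for (const extractor of extractors) {
    const extracted = await extractor.extract(current);
    current = current.map((node, i) => ({
      ...node,
      metadata: { ...node.metadata, ...extracted[i] },
    }));
  }
  return current;
}
```

Calling `runChain(nodes, [firstSentenceTitle, longestWords(2)])` on a node that already carries some metadata returns a copy of that node with the original keys preserved and `title` and `keywords` added alongside them. The merge-in-order design is an assumption of this sketch; it is meant only to show why the order of the transformations can matter.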