Metadata Filtering
Metadata filtering is a way to filter the documents that are returned by a query based on the metadata associated with the documents. This is useful when you want to filter the documents based on some metadata that is not part of the document text.
You can also check our multi-tenancy blog post to see how metadata filtering can be used in a multi-tenant environment. [https://blog.llamaindex.ai/building-multi-tenancy-rag-system-with-llamaindex-0d6ab4e0c44b] (the article uses the Python version of LlamaIndex, but the concepts are the same).
Firstly if you haven’t already, you need to install the llamaindex
package:
npm i llamaindex @llamaindex/openai @llamaindex/chroma
Then you can import the necessary modules from llamaindex
:
import { Document, VectorStoreIndex, storageContextFromDefaults } from "llamaindex";import { ChromaVectorStore } from "@llamaindex/chroma";
const collectionName = "dog_colors";
Creating documents with metadata
Section titled “Creating documents with metadata”You can create documents with metadata using the Document
class:
const docs = [ new Document({ text: "The dog is brown", metadata: { color: "brown", dogId: "1", }, }), new Document({ text: "The dog is red", metadata: { color: "red", dogId: "2", }, }),];
Creating a ChromaDB vector store
Section titled “Creating a ChromaDB vector store”You can create a ChromaVectorStore
to store the documents:
const chromaVS = new ChromaVectorStore({ collectionName });
const storageContext = await storageContextFromDefaults({ vectorStore: chromaVS,});
const index = await VectorStoreIndex.fromDocuments(docs, { storageContext: storageContext,});
Querying the index with metadata filtering
Section titled “Querying the index with metadata filtering”Now you can query the index with metadata filtering using the preFilters
option:
const queryEngine = index.asQueryEngine({ preFilters: { filters: [ { key: "dogId", value: "2", operator: "==", }, ], },});
const response = await queryEngine.query({ query: "What is the color of the dog?",});
console.log(response.toString());
Besides using the equal operator (==
), you can also use a whole set of different operators to filter your documents.
Full Code
Section titled “Full Code”import { Document, VectorStoreIndex, storageContextFromDefaults } from "llamaindex";import { ChromaVectorStore } from "@llamaindex/chroma";
const collectionName = "dog_colors";
async function main() { try { const docs = [ new Document({ text: "The dog is brown", metadata: { color: "brown", dogId: "1", }, }), new Document({ text: "The dog is red", metadata: { color: "red", dogId: "2", }, }), ];
console.log("Creating ChromaDB vector store"); const chromaVS = new ChromaVectorStore({ collectionName }); const ctx = await storageContextFromDefaults({ vectorStore: chromaVS });
console.log("Embedding documents and adding to index"); const index = await VectorStoreIndex.fromDocuments(docs, { storageContext: ctx, });
console.log("Querying index"); const queryEngine = index.asQueryEngine({ preFilters: { filters: [ { key: "dogId", value: "2", operator: "==", }, ], }, }); const response = await queryEngine.query({ query: "What is the color of the dog?", }); console.log(response.toString()); } catch (e) { console.error(e); }}
main();