Custom Model Per Request
There are scenarios, such as a multi-tenant backend API, where each request must be handled with its own model. In such cases, modifying the `Settings` object directly, as shown below, is not recommended:
```typescript
import { Settings } from 'llamaindex';
import { OpenAIEmbedding } from '@llamaindex/embeddings-openai';
import { openai } from '@llamaindex/openai';

Settings.embedModel = new OpenAIEmbedding({ apiKey: 'CLIENT_API_KEY' });
Settings.llm = openai({ apiKey: 'CLIENT_API_KEY', model: 'gpt-4o' });
```
Setting `llm` and `embedModel` directly leads to unpredictable responses because `Settings` is global and mutable: concurrent requests race to modify `Settings.embedModel` and `Settings.llm`, so a request may end up served by a model configured for a different request.
The recommended approach is to use `Settings.withEmbedModel` or `Settings.withLLM`, as follows:
```typescript
import fs from 'node:fs/promises';
import { Document, Settings, VectorStoreIndex } from 'llamaindex';
import { OpenAIEmbedding } from '@llamaindex/embeddings-openai';
import { OpenAI } from '@llamaindex/openai';

const embedModel = new OpenAIEmbedding({
  apiKey: process.env.OPENAI_API_KEY,
});
const llm = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

const llmResponse = await Settings.withEmbedModel(embedModel, async () => {
  return Settings.withLLM(llm, async () => {
    const path = "node_modules/llamaindex/examples/abramov.txt";
    const essay = await fs.readFile(path, "utf-8");

    // Create Document object with essay
    const document = new Document({ text: essay, id_: path });

    // Split text and create embeddings. Store them in a VectorStoreIndex
    const index = await VectorStoreIndex.fromDocuments([document]);

    // Query the index
    const queryEngine = index.asQueryEngine();
    const { message, sourceNodes } = await queryEngine.query({
      query: "What did the author do in college?",
    });

    // Return response with sources
    return message.content;
  });
});
```
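In a multi-tenant server, the same helper can scope a model to a single request handler. The following is a minimal sketch, assuming an Express app and a stub `lookupTenantKey` helper; both are illustrative and not part of LlamaIndex:

```typescript
import express from 'express';
import { Settings } from 'llamaindex';
import { openai } from '@llamaindex/openai';

const app = express();
app.use(express.json());

// Illustrative stub: resolve the caller's API key from the request,
// e.g. from an auth header mapped to a tenant record.
const lookupTenantKey = (req: express.Request): string =>
  req.header('x-tenant-key') ?? '';

app.post('/complete', async (req, res) => {
  const tenantLlm = openai({ apiKey: lookupTenantKey(req), model: 'gpt-4o' });

  // The override is visible only inside this callback's async context,
  // so concurrent requests cannot clobber each other's models.
  const response = await Settings.withLLM(tenantLlm, () =>
    Settings.llm.complete({ prompt: req.body.prompt }),
  );

  res.json({ text: response.text });
});

app.listen(3000);
```

Because the override only applies within the callback's async execution context, concurrent requests never observe each other's models.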
The full example can be found here.