Custom Model Per Request
In some scenarios, such as a multi-tenant backend API, each request may need to be handled with its own model.
In such cases, modifying the `Settings` object directly, as follows, is not recommended:
```typescript
import { Settings } from 'llamaindex';
import { openai } from '@llamaindex/openai';
import { OpenAIEmbedding } from '@llamaindex/embeddings-openai';

Settings.embedModel = new OpenAIEmbedding({ apiKey: 'CLIENT_API_KEY' });
Settings.llm = openai({ apiKey: 'CLIENT_API_KEY', model: 'gpt-4o' });
```

Setting `llm` and `embedModel` directly leads to unpredictable responses, since `Settings` is global and mutable.
Concurrent requests race on this shared state: while one request is awaiting I/O, another can reassign `Settings.embedModel` or `Settings.llm`, so a query may run with a different tenant's model.
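A minimal sketch of the failure mode (illustrative only; the timeout stands in for real I/O such as loading an index):

```typescript
import { Settings } from "llamaindex";
import { openai } from "@llamaindex/openai";

async function handleRequest(tenantApiKey: string): Promise<void> {
  Settings.llm = openai({ apiKey: tenantApiKey, model: "gpt-4o" });
  const mine = Settings.llm;

  // Simulated I/O yields to the event loop, giving a concurrent
  // request the chance to reassign the global Settings.llm.
  await new Promise((resolve) => setTimeout(resolve, 10));

  if (Settings.llm !== mine) {
    console.warn("Settings.llm was overwritten by a concurrent request");
  }
}

// Run both tenant requests concurrently: the first resumes to find
// the second tenant's model installed globally.
await Promise.all([
  handleRequest("TENANT_A_KEY"),
  handleRequest("TENANT_B_KEY"),
]);
```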
The recommended approach is to use `Settings.withEmbedModel` or `Settings.withLLM`, as follows:
```typescript
import fs from "node:fs/promises";
import { Document, Settings, VectorStoreIndex } from "llamaindex";
import { OpenAI } from "@llamaindex/openai";
import { OpenAIEmbedding } from "@llamaindex/embeddings-openai";

const embedModel = new OpenAIEmbedding({ apiKey: process.env.OPENAI_API_KEY });
const llm = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

const llmResponse = await Settings.withEmbedModel(embedModel, async () => {
  return Settings.withLLM(llm, async () => {
    const path = "node_modules/llamaindex/examples/abramov.txt";
    const essay = await fs.readFile(path, "utf-8");

    // Create Document object with essay
    const document = new Document({ text: essay, id_: path });

    // Split text and create embeddings. Store them in a VectorStoreIndex
    const index = await VectorStoreIndex.fromDocuments([document]);

    // Query the index
    const queryEngine = index.asQueryEngine();
    const { message, sourceNodes } = await queryEngine.query({
      query: "What did the author do in college?",
    });

    // Return response with sources
    return message.content;
  });
});
```

The full example can be found here.
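`Settings.withEmbedModel` and `Settings.withLLM` scope the override to the enclosing async call chain, so concurrent requests never observe each other's models and the global defaults stay untouched. In a multi-tenant server, the per-request wiring might look like the following sketch (the Express app and the `getTenantApiKey` helper are hypothetical, not part of the library):

```typescript
import express from "express";
import { Settings } from "llamaindex";
import { OpenAI } from "@llamaindex/openai";
import { OpenAIEmbedding } from "@llamaindex/embeddings-openai";

const app = express();

// Hypothetical helper: resolve the caller's API key, e.g. from a header.
function getTenantApiKey(req: express.Request): string {
  return req.header("x-tenant-api-key") ?? "";
}

app.get("/query", async (req, res) => {
  const apiKey = getTenantApiKey(req);
  const embedModel = new OpenAIEmbedding({ apiKey });
  const llm = new OpenAI({ apiKey });

  // The overrides apply only inside these callbacks, so concurrent
  // requests for other tenants keep their own models.
  const answer = await Settings.withEmbedModel(embedModel, () =>
    Settings.withLLM(llm, async () => {
      // ...build or load an index and query it, as in the example above
      return "...";
    }),
  );

  res.json({ answer });
});

app.listen(3000);
```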