
Moss + LlamaIndex

Moss is a search runtime for voice agents, copilots, and multimodal apps. Built in Rust and WebAssembly, it runs search inside your agent runtime: sub-10ms lookups on always-current data, with zero infrastructure to rebuild.

Install the Moss tool spec along with the LlamaIndex core and OpenAI LLM packages:

%pip install llama-index-tools-moss llama-index-core llama-index-llms-openai

Import the necessary classes and initialize the Moss client with your project credentials. We’ll configure query options to tune the search behavior (top_k for result count, alpha for the semantic/keyword blend).

import os

from llama_index.tools.moss import MossToolSpec, QueryOptions
from inferedge_moss import MossClient, DocumentInfo

# 1. Initialize the Moss client from environment variables
MOSS_PROJECT_KEY = os.getenv("MOSS_PROJECT_KEY")
MOSS_PROJECT_ID = os.getenv("MOSS_PROJECT_ID")
client = MossClient(project_id=MOSS_PROJECT_ID, project_key=MOSS_PROJECT_KEY)

# 2. Configure query settings (optional).
# If skipped, the tool falls back to its own defaults.
query_options = QueryOptions(top_k=3, alpha=0.5)

# 3. Initialize the tool
print("Initializing MossToolSpec...")
moss_tool = MossToolSpec(
    client=client, index_name="moss-cookbook", query_options=query_options
)
Initializing MossToolSpec...
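The alpha parameter controls the semantic/keyword mix. Moss's exact scoring internals aren't documented here, but the usual convention for an alpha-weighted hybrid blend looks like the following sketch (illustrative only, not the Moss implementation):

```python
# Illustrative sketch of an alpha-weighted hybrid score;
# Moss's actual scoring formula may differ.
def hybrid_score(semantic: float, keyword: float, alpha: float = 0.5) -> float:
    """alpha=1.0 weights purely semantic matches, alpha=0.0 purely keyword."""
    return alpha * semantic + (1 - alpha) * keyword

# With alpha=0.5, a doc scoring 0.9 semantically and 0.4 on keywords blends to 0.65.
print(hybrid_score(0.9, 0.4, alpha=0.5))
```

Under this convention, `alpha=0.5` (as configured above) gives semantic and keyword relevance equal weight.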

Create sample documents with DocumentInfo (each with id, text, and metadata). Then call index_docs() to build the Moss index — this creates or replaces the index with your documents and makes them searchable.

docs = [
    DocumentInfo(
        id="123",
        text="LlamaIndex connects your data to LLMs.",
        metadata={"topic": "AI"},
    ),
    DocumentInfo(
        id="456",
        text="Moss provides fast semantic search integration.",
        metadata={"topic": "Search"},
    ),
]
# Index the documents (creates or replaces the "moss-cookbook" index)
await moss_tool.index_docs(docs)
# Load the freshly built index into the tool so queries can run against it
await moss_tool._load_index()
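When indexing larger batches, it can help to sanity-check documents before calling index_docs(). This helper is not part of the Moss SDK, just a small sketch of the idea (shown with plain dicts so it runs standalone):

```python
# Pre-index sanity check (my own helper, not part of the Moss SDK):
# reject empty text and duplicate ids before calling index_docs().
def validate_docs(docs):
    seen = set()
    for doc in docs:
        doc_id, text = doc["id"], doc["text"]  # dicts stand in for DocumentInfo
        if not text.strip():
            raise ValueError(f"document {doc_id!r} has empty text")
        if doc_id in seen:
            raise ValueError(f"duplicate document id: {doc_id!r}")
        seen.add(doc_id)
    return docs

validate_docs([{"id": "123", "text": "hello"}, {"id": "456", "text": "world"}])
```

Catching a duplicate id locally is cheaper than discovering one document silently overwrote another after indexing.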

Now we’ll expose the Moss tool’s methods (query, list_indexes, delete_index) to a ReActAgent. The agent will autonomously decide which tool to call based on the user’s question and the search results it retrieves.

from llama_index.core.agent import ReActAgent
from llama_index.llms.openai import OpenAI

# Convert the tool spec into a list of individual tools
tools = moss_tool.to_tool_list()

# Create an agent (using an OpenAI LLM for demonstration;
# requires OPENAI_API_KEY to be set in the environment)
llm = OpenAI(model="gpt-4.1-mini", api_key=os.getenv("OPENAI_API_KEY"))
agent = ReActAgent(tools=tools, llm=llm, verbose=True)

# Chat with the agent
response = await agent.run(user_msg="What is your return policy?")
print(response)
Generally, a return policy allows customers to return products within a specified period, usually 30 days, for a refund, exchange, or store credit, provided the items are in their original condition and packaging. However, specific return policies can vary by company, so it's best to check the exact terms on the company's website or contact their customer service for detailed information.
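The agent's tool-selection step can be pictured with a toy sketch. This is not LlamaIndex's internals, just an illustration of the ReAct pattern the agent follows: reason about the question, pick a tool, observe its output, then answer:

```python
# Toy ReAct-style step (illustrative only, not LlamaIndex internals).
# The real agent uses the LLM to reason; crude keyword routing stands in here.
def react_step(question: str, tools: dict) -> str:
    if "list" in question.lower() and "index" in question.lower():
        tool_name = "list_indexes"            # Thought: the user asks about indexes
    else:
        tool_name = "query"                   # Thought: search the index
    observation = tools[tool_name](question)  # Action + Observation
    return f"[{tool_name}] {observation}"     # Answer grounded in the observation

tools = {
    "query": lambda q: f"top results for: {q}",
    "list_indexes": lambda q: "indexes: moss-cookbook",
}
print(react_step("What is your return policy?", tools))
```

In the real run above, the verbose=True flag prints each Thought/Action/Observation cycle, so you can watch the agent route the question to the Moss query tool the same way.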