Custom Vector Stores

Configure an index to export raw parsed output instead of writing to a managed vector store, then load the data into your own database.

By default, an Index V2 index parses, chunks, embeds, and writes to a managed vector store automatically. If you’d rather own the vector store (because you’re self-hosting the platform with a database we don’t manage, or you simply want to keep your data in your own infrastructure), you can disable the managed export and download the raw parsed output instead.

In this mode, LlamaCloud still handles directory management, parsing, and sync orchestration. You take over chunking, embedding, and writing to your database.

Create a parse-only index

Pass vector_target="DISABLED" when creating the index. The pipeline falls back to a download-only export, so parsed output is written to object storage instead of a managed vector store.

index = await client.beta.indexes.create(
    source_directory_id=directory.id,
    vector_target="DISABLED",
)

const index = await client.beta.indexes.create({
  source_directory_id: directory.id,
  vector_target: "DISABLED",
});

index, err := client.Beta.Indexes.New(ctx, llamacloud.BetaIndexNewParams{
  SourceDirectoryID: directory.ID,
  VectorTarget:      llamacloud.BetaIndexNewParamsVectorTargetDisabled,
})
if err != nil {
  log.Fatal(err)
}

import ai.llamaindex.llamacloud.models.beta.indexes.IndexCreateParams;
import ai.llamaindex.llamacloud.models.beta.indexes.IndexCreateResponse;

IndexCreateResponse index = client.beta().indexes().create(
    IndexCreateParams.builder()
        .sourceDirectoryId(directory.id())
        .vectorTarget(IndexCreateParams.VectorTarget.DISABLED)
        .build());

INDEX_ID=$(llp beta:indexes create \
  --source-directory-id "$DIRECTORY_ID" \
  --vector-target DISABLED | jq -r '.id')

echo "Index ID: $INDEX_ID"

Wait for the index to reach the ready state, the same way you would for a normal index (see the getting started guide).

Download the parsed output

Each file in the source directory produces a JSON payload containing per-page markdown (header, text, footer), structured items, attachments, and provenance metadata. List the files in the directory and fetch each payload via a presigned URL.

import httpx

# List files in the source directory
files_resp = await client.beta.directories.files.list(
    directory_id=directory.id,
)

# Each directory file ID points at a parsed-output JSON object in storage.
# Resolve its presigned URL and download.
async with httpx.AsyncClient() as http:
    for f in files_resp.items:
        presigned = await client.files.get(f.file_id)
        response = await http.get(presigned.url)
        response.raise_for_status()
        parsed = response.json()
        # parsed["parse"] holds the parsed output payload

const filesResp = await client.beta.directories.files.list(directory.id);

for (const f of filesResp.items) {
  const presigned = await client.files.get(f.file_id);
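  const response = await fetch(presigned.url);
  // fetch does not reject on HTTP error statuses, so check explicitly
  if (!response.ok) {
    throw new Error(`Download failed with status ${response.status}`);
  }
  const parsed = await response.json();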
  // parsed.parse holds the parsed output payload
}

// List files in the source directory
page, err := client.Beta.Directories.Files.List(ctx, directory.ID, llamacloud.BetaDirectoryFileListParams{})
if err != nil {
  log.Fatal(err)
}

// Each directory file ID points at a parsed-output JSON object in storage.
// Resolve its presigned URL and download.
for _, f := range page.Items {
  presigned, err := client.Files.Get(ctx, f.FileID, llamacloud.FileGetParams{})
  if err != nil {
    log.Fatal(err)
  }

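  resp, err := http.Get(presigned.URL)
  if err != nil {
    log.Fatal(err)
  }
  // http.Get does not return an error for non-2xx responses, so check the status
  if resp.StatusCode != http.StatusOK {
    log.Fatalf("download failed: %s", resp.Status)
  }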

  var parsed map[string]any
  err = json.NewDecoder(resp.Body).Decode(&parsed)
  resp.Body.Close()
  if err != nil {
    log.Fatal(err)
  }
  // parsed["parse"] holds the parsed output payload
}

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

import ai.llamaindex.llamacloud.models.beta.directories.files.FileListResponse;
import ai.llamaindex.llamacloud.models.files.PresignedUrl;

HttpClient http = HttpClient.newHttpClient();

// List files in the source directory
for (FileListResponse f : client.beta().directories().files().list(directory.id()).items()) {
    // Each directory file ID points at a parsed-output JSON object in storage.
    // Resolve its presigned URL and download.
    PresignedUrl presigned = client.files().get(f.fileId().orElseThrow());

    HttpResponse<String> response = http.send(
        HttpRequest.newBuilder(URI.create(presigned.url())).build(),
        HttpResponse.BodyHandlers.ofString());
    String parsed = response.body();
    // parsed is the parsed-output JSON; its "parse" field holds the payload
}

# List files in the source directory
FILE_IDS=$(llp beta:directories:files list \
  --directory-id "$DIRECTORY_ID" | jq -r '.file_id')

# Each directory file ID points at a parsed-output JSON object in storage.
# Resolve its presigned URL and download.
for FILE_ID in $FILE_IDS; do
  URL=$(llp files get --file-id "$FILE_ID" | jq -r '.url')
  curl -s "$URL" | jq '.parse'
done

Re-run this loop after each sync to pick up changes. The content_fingerprint_hash and source_modified_at fields on each payload let you skip files that haven’t changed since the last export.
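A minimal sketch of that skip check, assuming both fields are top-level keys of the downloaded payload (the exact location is not specified here) and that you keep a local record of what you exported last time. The names `fingerprint`, `is_unchanged`, and the `seen` mapping are hypothetical helpers, not part of the SDK.

```python
from typing import Any, Dict, Optional, Tuple

# Hypothetical local state: file ID -> fingerprint recorded at the last export.
Fingerprint = Tuple[Optional[str], Optional[str]]


def fingerprint(payload: Dict[str, Any]) -> Fingerprint:
    """Build a comparable fingerprint from a downloaded payload.

    Assumption: both fields live at the top level of the payload.
    """
    return (
        payload.get("content_fingerprint_hash"),
        payload.get("source_modified_at"),
    )


def is_unchanged(
    file_id: str, payload: Dict[str, Any], seen: Dict[str, Fingerprint]
) -> bool:
    """Return True when the payload matches what was exported last time."""
    return file_id in seen and seen[file_id] == fingerprint(payload)
```

In the download loop, you would call `is_unchanged(...)` right after fetching a payload, skip the file when it returns True, and store `seen[file_id] = fingerprint(payload)` only after the export to your database succeeds, so a failed export is retried on the next run.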

Chunk, embed, and push to your store

From there, the steps are the same as any custom RAG pipeline:

Concatenate per-page markdown while tracking page boundaries by character offset.
Chunk the text with your preferred chunker.
Embed the chunks with your preferred embedding model.
Write the chunks to your database, keyed on (parsed_directory_file_id, export_config_id) so re-parsed files fully replace earlier exports rather than duplicating.
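The first two steps and the keying scheme can be sketched as follows. This is an illustration under assumptions, not the reference implementation: it takes the per-page markdown as a plain list of strings (you would extract these from the payload yourself), uses a naive fixed-size character chunker as a stand-in for your preferred chunker, and derives a deterministic chunk ID by hashing the key fields. All function names are hypothetical.

```python
import hashlib
from bisect import bisect_right
from typing import List, Tuple


def concat_pages(pages: List[str], sep: str = "\n\n") -> Tuple[str, List[int]]:
    """Join page markdown and return (text, start offset of each page)."""
    parts: List[str] = []
    starts: List[int] = []
    pos = 0
    for i, page in enumerate(pages):
        if i > 0:
            parts.append(sep)
            pos += len(sep)
        starts.append(pos)
        parts.append(page)
        pos += len(page)
    return "".join(parts), starts


def page_of(offset: int, starts: List[int]) -> int:
    """Return the 0-based page index that contains a character offset.

    Separator characters are attributed to the page that precedes them.
    """
    return bisect_right(starts, offset) - 1


def chunk_spans(text: str, size: int = 1000, overlap: int = 100) -> List[Tuple[int, int]]:
    """Naive fixed-size chunker: overlapping (start, end) character spans."""
    if size <= overlap:
        raise ValueError("size must be larger than overlap")
    spans: List[Tuple[int, int]] = []
    start = 0
    while start < len(text):
        end = min(start + size, len(text))
        spans.append((start, end))
        if end == len(text):
            break
        start = end - overlap
    return spans


def chunk_id(parsed_directory_file_id: str, export_config_id: str, index: int) -> str:
    """Deterministic ID, so re-exporting a file overwrites the same records."""
    key = f"{parsed_directory_file_id}:{export_config_id}:{index}"
    return hashlib.sha256(key.encode("utf-8")).hexdigest()
```

For each span, `page_of(start, starts)` and `page_of(end - 1, starts)` give the first and last page the chunk touches, which you can store as metadata alongside the embedding.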

Reference implementations

The index-v2-data-sinks repo contains complete, runnable reference exporters for several databases:

MongoDB (Atlas vector search)
PostgreSQL + pgvector
Qdrant
Pinecone
Turbopuffer
Azure AI Search

Each exporter handles index/collection provisioning, deterministic IDs for idempotent re-exports, per-file delete-then-insert, and snapshot listing for incremental syncs. The shared Exporter protocol in export/base.py is the contract to implement when adding a new sink:

class Exporter(Protocol):
    async def export(self, outputs, *, project_id, embeddings=None, embedding_model=None): ...
    async def delete_file(self, *, parsed_directory_file_id, export_config_id): ...
    async def delete_files(self, *, parsed_directory_file_ids, export_config_id): ...
    async def list_snapshots(self, *, export_config_id): ...

Point your coding agent at the repo as a worked example, copy a sink you like as a starting point, or implement the protocol against the store of your choice.
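To make the contract concrete, here is a toy in-memory sink with the same four methods. Treat it as a sketch under stated assumptions: the shapes of `outputs`, `embeddings`, and the return value of `list_snapshots` are not described on this page, so this example assumes each output is a dict carrying `parsed_directory_file_id`, `export_config_id`, `content_fingerprint_hash`, and `chunks`, that `embeddings` is a list parallel to `outputs`, and that snapshots map file IDs to fingerprints. Check `export/base.py` in the repo for the real types.

```python
import asyncio
from typing import Any, Dict, Iterable, List, Optional, Tuple

Key = Tuple[str, str]  # (parsed_directory_file_id, export_config_id)


class InMemoryExporter:
    """Toy sink mirroring the four-method Exporter shape (assumed data shapes)."""

    def __init__(self) -> None:
        self.rows: Dict[Key, Dict[str, Any]] = {}

    async def export(self, outputs: Iterable[Dict[str, Any]], *, project_id: str,
                     embeddings: Optional[List[Any]] = None,
                     embedding_model: Optional[str] = None) -> None:
        for i, out in enumerate(outputs):
            key = (out["parsed_directory_file_id"], out["export_config_id"])
            # Assigning to the key replaces any earlier export of the same file.
            self.rows[key] = {
                "project_id": project_id,
                "fingerprint": out.get("content_fingerprint_hash"),
                "chunks": list(out.get("chunks", [])),
                "embeddings": embeddings[i] if embeddings is not None else None,
                "embedding_model": embedding_model,
            }

    async def delete_file(self, *, parsed_directory_file_id: str,
                          export_config_id: str) -> None:
        self.rows.pop((parsed_directory_file_id, export_config_id), None)

    async def delete_files(self, *, parsed_directory_file_ids: Iterable[str],
                           export_config_id: str) -> None:
        for file_id in parsed_directory_file_ids:
            await self.delete_file(parsed_directory_file_id=file_id,
                                   export_config_id=export_config_id)

    async def list_snapshots(self, *, export_config_id: str) -> Dict[str, Optional[str]]:
        # Assumed return shape: file ID -> fingerprint stored at export time.
        return {fid: row["fingerprint"]
                for (fid, cfg), row in self.rows.items()
                if cfg == export_config_id}


async def demo() -> Dict[str, Optional[str]]:
    sink = InMemoryExporter()
    out = {"parsed_directory_file_id": "f1", "export_config_id": "c1",
           "content_fingerprint_hash": "h1", "chunks": ["first version"]}
    await sink.export([out], project_id="p1")
    # Re-exporting the same file replaces the earlier rows instead of duplicating.
    await sink.export([{**out, "content_fingerprint_hash": "h2"}], project_id="p1")
    return await sink.list_snapshots(export_config_id="c1")


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

A real sink would swap the dictionary for database calls while keeping the same replace-by-key behavior.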

Caveats

Retrieval, chat, and file operations on client.beta.* query the managed vector store. With vector_target="DISABLED", those endpoints have nothing to query, so you must serve retrieval from your own database.
You’re responsible for cleaning up records in your database when files are removed from the source directory. The list_snapshots() pattern in the reference implementations shows how to diff against the current directory state.
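The cleanup diff amounts to a set difference between the file IDs stored in your database and the file IDs currently in the source directory. A minimal sketch, assuming you have already collected both ID lists (for example from a snapshot listing and from the directory file listing); `stale_file_ids` is a hypothetical helper name.

```python
from typing import Iterable, Set


def stale_file_ids(stored_ids: Iterable[str], current_ids: Iterable[str]) -> Set[str]:
    """Return IDs that exist in your database but no longer in the directory."""
    return set(stored_ids) - set(current_ids)
```

The resulting set is what you would delete from your store after each sync, for instance by passing it to a bulk delete such as the `delete_files` method of an exporter.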
