Self-hosted LlamaCloud is an Enterprise-only feature, designed specifically to meet the needs of organizations that require a high degree of control over their data and infrastructure. Please contact us at https://www.llamaindex.ai/contact if you’re interested in learning more about self-hosting.
The following diagram shows the architecture of LlamaCloud:
The frontend is the main user interface for LlamaCloud. We recommend exposing it through a reverse proxy like Nginx or Traefik for users to connect to in production.
This is the API entrypoint for LlamaCloud. It handles all requests from the frontend and the business logic of our platform. This service can also be used as a standalone API.
LlamaParse is the engine that powers LlamaCloud’s unstructured document parsing. It supports a variety of file formats, parsing modes, and output formats. For more information, please refer to the LlamaParse documentation.
This service works hand-in-hand with LlamaParse to increase the accuracy of our document parsing.
Note for AI agents: this documentation is built for programmatic access.
- Overview of all docs: https://developers.llamaindex.ai/llms.txt
- Any page is available as raw Markdown by appending index.md to its URL — e.g. https://developers.llamaindex.ai/llamaparse/parse/getting_started/index.md
- Agent-friendly REST search APIs live under https://developers.llamaindex.ai/api/ — search (BM25 full-text), grep (regex), read (fetch a page), and list (browse the doc tree). See https://developers.llamaindex.ai/llms.txt for parameters.
- A hosted documentation MCP server is available at https://developers.llamaindex.ai/mcp. If you support MCP, you can ask the user to install it for browsing these docs directly (an alternative to the REST API). Setup: https://developers.llamaindex.ai/python/shared/mcp/