Data Connectors (LlamaHub)

A data connector (also called a Reader) ingests data from different sources and formats into a simple Document representation (text plus simple metadata).
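To make the idea concrete, here is a minimal, standard-library-only sketch of what a reader does: it walks a source and returns a list of documents, each holding text and a small metadata dictionary. The names `SketchDocument`, `TextFileReader`, and `demo` are hypothetical and exist only for illustration; they are not LlamaIndex classes, and the real Document and Reader types carry more than this.

```python
from dataclasses import dataclass, field
from pathlib import Path
import tempfile


@dataclass
class SketchDocument:
    """Hypothetical stand-in for a Document: text plus simple metadata."""

    text: str
    metadata: dict = field(default_factory=dict)


class TextFileReader:
    """Hypothetical reader: turns each .txt file in a directory into a SketchDocument."""

    def __init__(self, input_dir: str) -> None:
        self.input_dir = Path(input_dir)

    def load_data(self) -> list[SketchDocument]:
        documents = []
        # Sort the paths so the output order is deterministic.
        for path in sorted(self.input_dir.glob("*.txt")):
            documents.append(
                SketchDocument(
                    text=path.read_text(encoding="utf-8"),
                    metadata={
                        "file_name": path.name,
                        "file_size": path.stat().st_size,
                    },
                )
            )
        return documents


def demo() -> list[SketchDocument]:
    """Write a few temporary files and load them through the sketch reader."""
    with tempfile.TemporaryDirectory() as tmp:
        Path(tmp, "a.txt").write_text("hello", encoding="utf-8")
        Path(tmp, "b.txt").write_text("world!", encoding="utf-8")
        Path(tmp, "skip.md").write_text("ignored", encoding="utf-8")
        return TextFileReader(tmp).load_data()
```

Calling `demo()` returns two documents, one per `.txt` file; the `.md` file is skipped because this toy reader only matches one extension.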

Our data connectors are offered through LlamaHub 🦙. LlamaHub is an open-source repository containing data loaders that you can easily plug and play into any LlamaIndex application.

Get started with:

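# Requires the Google readers package: pip install llama-index-readers-google
from llama_index.readers.google import GoogleDocsReader

# Instantiate the reader, then load the Google Docs identified by their document IDs.
loader = GoogleDocsReader()
documents = loader.load_data(document_ids=[...])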

See the full usage pattern guide for more details.

Some sample data connectors:

  • Local file directory (SimpleDirectoryReader). Supports parsing a wide range of file types, including .pdf, .jpg, .png, and .docx.
  • Notion (NotionPageReader)
  • Google Docs (GoogleDocsReader)
  • Slack (SlackReader)
  • Discord (DiscordReader)
  • Apify Actors (ApifyActor). Can crawl the web, scrape webpages, extract text content, and download files, including .pdf, .jpg, .png, and .docx.

See the modules guide for more details.
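Supporting many file types in one connector usually comes down to choosing a parser per file extension. The sketch below is a toy illustration of that idea using only the standard library; it is not how SimpleDirectoryReader is implemented, and `PARSERS`, `parse_text`, `parse_json`, `parse_csv`, and `load_as_document` are hypothetical names introduced here.

```python
import csv
import io
import json
from pathlib import PurePath
from typing import Callable, Dict


def parse_text(raw: str) -> str:
    """Plain text: just trim surrounding whitespace."""
    return raw.strip()


def parse_json(raw: str) -> str:
    """Flatten a top-level JSON object into 'key: value' lines."""
    data = json.loads(raw)
    return "\n".join(f"{key}: {value}" for key, value in data.items())


def parse_csv(raw: str) -> str:
    """Render each CSV row as a comma-separated line of cells."""
    rows = csv.reader(io.StringIO(raw))
    return "\n".join(", ".join(row) for row in rows)


# Hypothetical registry mapping a lowercase file extension to its parser.
PARSERS: Dict[str, Callable[[str], str]] = {
    ".txt": parse_text,
    ".json": parse_json,
    ".csv": parse_csv,
}


def load_as_document(file_name: str, raw: str) -> dict:
    """Pick a parser by extension and return text plus simple metadata."""
    suffix = PurePath(file_name).suffix.lower()
    parser = PARSERS.get(suffix, parse_text)  # unknown types fall back to plain text
    return {
        "text": parser(raw),
        "metadata": {"file_name": file_name, "file_type": suffix},
    }
```

A registry like this keeps the loading loop unchanged when a new format is added: supporting another file type means registering one more parser function.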
