
Workflows API

LlamaDeploy runs your LlamaIndex workflows locally and in the cloud. Author your workflows, then export them so the app server can discover and expose them as HTTP APIs.

If you’re new to LlamaIndex workflows, start here: LlamaIndex Workflows v1.

Key ideas:

  • Define workflows by subclassing Workflow and composing @step functions
  • Use StartEvent to pass inputs and StopEvent to end the run
  • Optionally define typed events and manage state/context across steps

```python
from llama_index.core.workflow import Workflow, step, StartEvent, StopEvent


class JokeFlow(Workflow):
    @step
    async def generate(self, ev: StartEvent) -> StopEvent:
        topic = ev.topic
        return StopEvent(result=f"Joke about {topic}")


workflow = JokeFlow(timeout=60)
```
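
You can exercise a workflow directly before deploying anything. A minimal sketch using the instance above; keyword arguments to .run() populate the StartEvent:

```python
import asyncio

from my_package.jokes import workflow  # assumed module holding the instance above


async def main() -> None:
    # Keyword args to .run() become attributes on the StartEvent (ev.topic)
    result = await workflow.run(topic="pirates")
    print(result)  # -> "Joke about pirates"


asyncio.run(main())
```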

Export workflows so LlamaDeploy can serve them


You can expose workflows in two ways: map individual workflows, or register them in a WorkflowServer and export that as a single app entrypoint.

Option A: Map individual workflows (simple)


Define workflow instances in your code, then reference them from config. See examples in the Deployment Config Reference.
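
Option A just needs a module-level instance the config can import. A sketch of my_package/jokes.py; the import-path form in the comment is an assumption, and the authoritative syntax lives in the Deployment Config Reference:

```python
# my_package/jokes.py
from my_package.flows import JokeFlow  # hypothetical: wherever your Workflow subclass lives

# Module-level instance; the deployment config references it by import path
# (something like "my_package.jokes:workflow" — treat this as illustrative;
# see the Deployment Config Reference for the exact syntax)
workflow = JokeFlow(timeout=60)
```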

Option B: Bundle with WorkflowServer (advanced)


Aggregate multiple workflows and export a single WorkflowServer instance.

my_package/app.py
```python
from workflows.server import WorkflowServer

from my_package.jokes import workflow as joke_workflow
from my_package.other import workflow as other_workflow

app = WorkflowServer()
app.add_workflow("joke", joke_workflow)
app.add_workflow("other", other_workflow)
```

Reference the app entrypoint in your config (this option is mutually exclusive with mapping individual workflows); see the Deployment Config Reference.

How serving works (locally and in the cloud)

  • llamactl serve discovers your config (pyproject.toml or llama_deploy.yaml). See llamactl serve.
  • The app server loads your workflows (or your WorkflowServer).
  • HTTP routes are exposed under /deployments/{name} for your app name (see the quick check after this list)
  • If you configured a UI, it runs alongside your API (proxy in dev, static in preview). See UI build and dev integration.
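
Once llamactl serve is running, you can sanity-check a deployment over HTTP. A sketch assuming the app server listens on http://localhost:4501 (use the address llamactl serve prints) and a deployment named my-app:

```python
import httpx

BASE = "http://localhost:4501"  # assumed port; use the address llamactl serve prints
NAME = "my-app"                 # placeholder deployment name

# App server health, namespaced under the deployment prefix
print(httpx.get(f"{BASE}/deployments/{NAME}/health").status_code)  # 200 when up
```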

To persist structured outputs from your workflows (extractions, events, etc.), see the Agent Data Overview and the language guides for TypeScript and Python.

When using a WorkflowServer, the app server exposes these routes (prefixed by /deployments/{name}):

  • GET /workflows: List registered workflow names
  • POST /workflows/{workflowName}/run: Run and wait (see the example after this list); body supports:
    • start_event: JSON string of a serialized StartEvent
    • context: object for workflow context
    • kwargs: object of keyword args passed to .run()
  • POST /workflows/{workflowName}/run-nowait: Start async; returns handler_id (end-to-end example after the notes below)
  • GET /results/{handler_id}: Fetch result if finished; 202 if not ready
  • GET /events/{handler_id}?sse=true|false: Stream events as SSE (text/event-stream) or NDJSON (application/x-ndjson)
  • GET /health: App server health
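
For instance, a blocking run that forwards keyword arguments to the workflow's .run(). A sketch assuming the app server at http://localhost:4501 (use the address llamactl serve prints), a deployment named my-app, and a workflow registered as "joke":

```python
import httpx

BASE = "http://localhost:4501/deployments/my-app"  # assumed address and deployment name

# Run and wait; "kwargs" are forwarded to the workflow's .run()
resp = httpx.post(
    f"{BASE}/workflows/joke/run",
    json={"kwargs": {"topic": "pirates"}},
    timeout=60.0,
)
resp.raise_for_status()
print(resp.json())
```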

Notes:

  • start_event uses LlamaIndex’s JSON serializer; pass a JSON string of the serialized event when you need a custom StartEvent
  • Event streams include serializer metadata (qualified name and value) to reconstruct types
  • All endpoints are namespaced per deployment at /deployments/{name}
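
Putting the async routes together: start a run, stream its events as NDJSON, then poll for the result. A sketch under the same assumptions as above; the handler_id key in the run-nowait response is inferred from the route description:

```python
import time

import httpx

BASE = "http://localhost:4501/deployments/my-app"  # assumed address and deployment name

# Start without waiting; the response carries the handler_id
resp = httpx.post(
    f"{BASE}/workflows/joke/run-nowait",
    json={"kwargs": {"topic": "pirates"}},
)
handler_id = resp.json()["handler_id"]  # assumed response key, per the route above

# Stream events as NDJSON (sse=false); each line is a serialized event
with httpx.stream("GET", f"{BASE}/events/{handler_id}", params={"sse": "false"}) as events:
    for line in events.iter_lines():
        if line:
            print("event:", line)

# Fetch the result; 202 means the run hasn't finished yet
result = httpx.get(f"{BASE}/results/{handler_id}")
while result.status_code == 202:
    time.sleep(0.5)
    result = httpx.get(f"{BASE}/results/{handler_id}")
print(result.json())
```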

Next: Learn how to integrate and build your UI in UI build and dev integration.