Using the LlamaExtract Web UI
To get started, head to cloud.llamaindex.ai. Log in with the method of your choice.
We support login using OAuth 2.0 (Google, GitHub, Microsoft) and email.
You should now see our welcome screen.
Creating an Extraction Agent
An Extraction Agent is a reusable configuration for extracting data from a specific type of content. This includes the schema you want to extract and other settings that affect the extraction process.
To get to extraction, click "Extraction (beta)" on the homepage or in the sidebar.
You will now have an option to create a new Extraction Agent or see existing ones if previously created. Give your agent a name that does not conflict with existing ones and click "Create". This will take you to
Defining the Extraction Schema
The schema is the core of your extraction agent. It defines the structure of the data you want to extract. We recommend starting with a simple schema and then iteratively improving it.
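As a rough illustration of starting simple and iterating, the sketch below writes a deliberately small JSON schema as a Python dictionary and then adds one field in a second pass. The field names (`name`, `email`, `skills`) are hypothetical and are not taken from any LlamaExtract template.

```python
import json

# Hypothetical first-pass schema: two flat string fields only.
schema_v1 = {
    "type": "object",
    "properties": {
        "name": {"type": "string", "description": "Full name of the candidate"},
        "email": {"type": "string", "description": "Primary contact email"},
    },
    "required": ["name"],
}

# Second iteration: deep-copy the first schema (JSON round-trip) and add an array field.
schema_v2 = json.loads(json.dumps(schema_v1))
schema_v2["properties"]["skills"] = {
    "type": "array",
    "items": {"type": "string"},
    "description": "Skills listed in the document",
}

# Print the result as JSON text.
print(json.dumps(schema_v2, indent=2))
```

Keeping each iteration as its own dictionary makes it easy to compare which change improved or degraded the extraction results.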
Using the Schema Builder
The simplest way to define a schema is to use the Schema Builder. The Schema Builder supports only a subset of the JSON Schema specification, but that subset is sufficient for a wide range of use cases. For example, the Schema Builder allows you to define nested objects and arrays.
To get a sense for how a complex schema can be defined, you can use one of the pre-defined templates for extraction. Refer to Schema Design Tips for tips on designing a schema for your use case.
Click on the "Template" dropdown and select the Technical Resume template:
Notice how location is a nested object within the Basics section.
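To make the nesting concrete, here is a minimal sketch of how a nested object and an array look in JSON schema form, written as a Python dictionary. It is not the actual content of the Technical Resume template: the fields under `basics` and `location` (`name`, `city`, `country`) and the `skills` array are assumptions chosen for illustration. The helper `field_paths` is also hypothetical and simply lists every field as a dotted path.

```python
# Hypothetical schema: "location" is an object nested inside "basics".
resume_schema = {
    "type": "object",
    "properties": {
        "basics": {
            "type": "object",
            "properties": {
                "name": {"type": "string"},
                "location": {
                    "type": "object",
                    "properties": {
                        "city": {"type": "string"},
                        "country": {"type": "string"},
                    },
                },
            },
        },
        "skills": {"type": "array", "items": {"type": "string"}},
    },
}


def field_paths(node, prefix=""):
    """Yield a dotted path for every property, descending into nested objects."""
    for key, sub in node.get("properties", {}).items():
        path = f"{prefix}{key}"
        yield path
        if sub.get("type") == "object":
            yield from field_paths(sub, prefix=path + ".")


print(list(field_paths(resume_schema)))
# ['basics', 'basics.name', 'basics.location',
#  'basics.location.city', 'basics.location.country', 'skills']
```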
Using the Raw Editor
There are also cases where the Schema Builder is not sufficient (e.g. Union and Enum types are not supported in the Schema Builder), or you already have a JSON schema that you want to use. In these cases, you can simply paste your schema into the Raw Editor.
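As a sketch of what such a schema might look like, the example below uses the standard JSON Schema keywords `enum` (a fixed set of allowed values) and `anyOf` (a union of types) and prints JSON text that could be pasted into an editor. The field names `employment_type` and `salary` are hypothetical, and this assumes the Raw Editor accepts these keywords in this form; check the LlamaExtract schema documentation for the exact subset it supports.

```python
import json

# Hypothetical schema using an enum and a union; not taken from the product docs.
schema = {
    "type": "object",
    "properties": {
        # Enum: the value must be one of a fixed list of strings.
        "employment_type": {
            "type": "string",
            "enum": ["full_time", "part_time", "contract"],
        },
        # Union: the value may be a number or a string (e.g. "negotiable").
        "salary": {
            "anyOf": [{"type": "number"}, {"type": "string"}],
        },
    },
}

# Serialize to JSON text for pasting.
raw_json = json.dumps(schema, indent=2)
print(raw_json)
```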
Saving the Extraction Agent
To save the configuration of your Extraction Agent, use the "Publish Configuration" button at the bottom of the "Agent Configuration" pane. This converts the schema into a standardized format, saves the current state of the Extraction Agent, and makes it available to the Python SDK. Changes that you make without publishing are used only when you run extractions with the "Run Extraction" button.
Every time you publish a configuration, the Python SDK immediately starts using the new schema and settings for the Extraction Agent.
Restoring a previous version of the Extraction Agent Configuration
Every run of an Extraction Agent stores the configuration that was used. If you want to use a schema or setting from a previous run, go to the "Extraction Result" tab, identify the specific run, click the "eye" icon in the "Actions" column, and click "Edit" in the detail view that opens. This will bring you back to the Agent Configuration section, populated with the schema and settings from the previous run.
Other Settings
Refer to Options for other Extraction Agent options that affect the extraction process.
Running an Extraction
Once you are satisfied with your schema, upload a document and click "Run Extraction". This can take from a few seconds to several minutes depending on the size of the document and the complexity of the schema.
Once the extraction is complete, you should be able to see the results in the middle pane.
The first run on a given file will take additional time since we parse and cache the document. This might be noticeable for larger documents. Subsequent runs on the same file, as you iterate on the schema, should be faster.
Viewing Past Extractions
You can also view past extractions for your agent by clicking on the "Extraction Results" tab. This will show you all the extractions that have been run using this agent. You can view the schema and settings used for an extraction and edit them to run a new extraction.
Next steps
The web UI makes it easy to test and iterate on your schema. Once you're happy with a schema, you can run extractions at scale via the Python client.