LandingAI ADE Plugin

A FiftyOne plugin that provides operators for parsing, extracting, and splitting documents using LandingAI's Agentic Document Extraction (ADE) API. It converts PDFs, images, spreadsheets, and Office files into structured Markdown, with spatial bounding box grounding stored as native FiftyOne Detections.

Installation

fiftyone plugins download https://github.com/landing-ai/ade-51

Install the required dependencies:

fiftyone plugins requirements @landingai/ade --install

Configuration

Set your LandingAI API key as the VISION_AGENT_API_KEY environment variable or add it to FiftyOne secrets. You can obtain an API key from the LandingAI dashboard.

export VISION_AGENT_API_KEY="your-api-key-here"

If you keep credentials in a .env file, add the following line to it and launch FiftyOne from the directory that contains the file:

VISION_AGENT_API_KEY="your-api-key-here"

If your organization has Zero Data Retention (ZDR) enabled, the plugin uses that account-level setting automatically. Password-protected parsing is supported through the optional Document password input on operators that parse files.
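
Before launching the App, it can help to confirm that the key is visible to the Python process. The following is a minimal sketch that only inspects the VISION_AGENT_API_KEY environment variable; the helper name has_ade_api_key is hypothetical, and the check does not cover keys stored in FiftyOne secrets.

```python
import os


def has_ade_api_key(environ=None):
    """Return True if VISION_AGENT_API_KEY is set to a non-empty value.

    `environ` defaults to os.environ; passing a dict makes the check testable.
    This helper is illustrative and is not part of the plugin.
    """
    env = os.environ if environ is None else environ
    return bool(env.get("VISION_AGENT_API_KEY", "").strip())


if __name__ == "__main__":
    print("API key configured:", has_ade_api_key())
```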

Usage

  1. Launch the App with a dataset that contains documents or images:
import fiftyone as fo
from fiftyone.utils.huggingface import load_from_hub

dataset = load_from_hub(
    "Voxel51/scanned_receipts",
    overwrite=True,
    persistent=True,
    name="scanned_receipts",
    max_samples=5,
)

session = fo.launch_app(dataset)
  2. Press ` or click Browse operations to open the operator list.

  3. Search for ADE to see all available operators.

Operators

ade_parse_document

Parse documents into structured Markdown with spatial bounding box grounding.

Calls the ADE synchronous Parse API. Each page element (text block, table, figure, logo) becomes a fo.Detection with normalized coordinates visible in the FiftyOne App grid and modal.
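
FiftyOne stores a detection's bounding box as [top-left-x, top-left-y, width, height], with every value relative to the image size in [0, 1]. The sketch below shows one way a corner-style box could be converted to that layout. It assumes the API reports normalized left/top/right/bottom corners; the function name is hypothetical and the plugin's actual conversion may differ.

```python
def ltrb_to_fiftyone_box(left, top, right, bottom):
    """Convert normalized corner coordinates to [x, y, width, height].

    Assumption: inputs are already relative to the page size (0 to 1).
    Values are clamped to [0, 1] so a slightly out-of-range box stays valid.
    """
    left, top = max(0.0, left), max(0.0, top)
    right, bottom = min(1.0, right), min(1.0, bottom)
    width = max(0.0, right - left)
    height = max(0.0, bottom - top)
    return [left, top, width, height]


if __name__ == "__main__":
    # A box covering the middle band of the page
    print(ltrb_to_fiftyone_box(0.25, 0.25, 0.75, 0.5))
```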

How to use:

  1. Open the operator list and search for ADE: Parse Document
  2. Choose a model, output field names, and whether to store grounding
  3. Click Execute

Options:

| Option | Description | Default |
| --- | --- | --- |
| Model | dpt-2-latest (full, 3 credits/page) or dpt-2-mini-latest (simple docs, 1.5 credits/page) | dpt-2-latest |
| Region | us or eu endpoint | us |
| Document password | Password for password-protected files; requires a ZDR-enabled account | empty |
| Output field (Markdown) | Field where the parsed Markdown text is stored | ade_parse |
| Store spatial grounding | Store element bounding boxes as fo.Detections | True |
| Output field (Grounding) | Field where grounding detections are stored | ade_grounding |

What gets stored on each sample:

| Field | Type | Content |
| --- | --- | --- |
| ade_parse | StringField | Full parsed Markdown |
| ade_grounding | Detections | One detection per document element with label (type), bounding box, chunk_id, and page |
| ade_parse_metadata | DictField | page_count, credit_usage, filename, duration_ms, version |

ade_extract_fields

Extract typed named fields from documents using a form-based schema.

Define each field with a name, description, and type (string, number, or boolean). Values are stored as flat, properly-typed top-level sample fields so FiftyOne shows the right filter widget in the App sidebar — range slider for numbers, toggle for booleans, text search for strings.

When Parse document first is enabled, the parsed Markdown and grounding boxes can be saved to sample fields. A later run can then disable Parse document first and read from those fields, so you can try a different schema without paying parse credits again.

How to use:

  1. Open the operator list and search for ADE: Extract Fields
  2. Choose whether to parse documents first or use an existing Markdown field
  3. Define your fields in the form (name + description + type per row)
  4. Click Execute

Options:

| Option | Description | Default |
| --- | --- | --- |
| Parse document first | Call Parse API before extracting | True |
| Document password | Password for password-protected files when parsing first; requires a ZDR-enabled account | empty |
| Save Markdown to field | Persist parsed Markdown for future runs (only when parsing first) | ade_parse |
| Save grounding to field | Persist grounding detections from the parse step (only when parsing first) | ade_grounding |
| Existing Markdown field | Field to read from when not parsing first | ade_parse |
| Grounding field | Detection field used for bbox correlation when not parsing first | ade_grounding |
| Model | Parse model (only used when parsing first) | dpt-2-latest |
| Extraction model | Model version for the Extract API | extract-latest |
| Region | us or eu endpoint | us |
| Fields to extract | Form rows of name + description + type | Invoice example fields |
| Output field prefix | Prefix for all stored fields | ade_extraction |

Default schema fields:

| Name | Description | Type |
| --- | --- | --- |
| invoice_number | The unique invoice identifier | string |
| vendor_name | Name of the vendor or supplier | string |
| total_amount | Total amount due including taxes | number |
| invoice_date | Date the invoice was issued | string |
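
To make the form-based schema concrete, here is a minimal sketch that turns the default rows into a JSON Schema style object. It assumes the Extract API accepts a schema of this shape; the function name rows_to_json_schema and the exact payload layout are illustrative and are not taken from the plugin's source.

```python
import json

# The three types the form offers, per the description above
ALLOWED_TYPES = {"string", "number", "boolean"}

DEFAULT_ROWS = [
    {"name": "invoice_number", "description": "The unique invoice identifier", "type": "string"},
    {"name": "vendor_name", "description": "Name of the vendor or supplier", "type": "string"},
    {"name": "total_amount", "description": "Total amount due including taxes", "type": "number"},
    {"name": "invoice_date", "description": "Date the invoice was issued", "type": "string"},
]


def rows_to_json_schema(rows):
    """Build a flat object schema with one property per form row."""
    properties = {}
    for row in rows:
        if row["type"] not in ALLOWED_TYPES:
            raise ValueError(f"Unsupported type for {row['name']!r}: {row['type']!r}")
        properties[row["name"]] = {
            "type": row["type"],
            "description": row["description"],
        }
    return {"type": "object", "properties": properties}


if __name__ == "__main__":
    print(json.dumps(rows_to_json_schema(DEFAULT_ROWS), indent=2))
```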

What gets stored on each sample:

| Field | Type | Content |
| --- | --- | --- |
| ade_extraction_{field_name} | StringField / FloatField / BooleanField | One field per schema entry, typed to match |
| ade_extraction_grounding | Detections | Bounding boxes correlating each extracted value to its document location |
| ade_extraction_meta | DictField | credit_usage, version, fallback_model_version, schema_violation_error, warnings |
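
The naming and typing rule in the table above can be sketched as a small lookup. This only illustrates which sample fields to expect for a given schema and prefix; the helper name planned_sample_fields is hypothetical, and the field types are given as plain strings rather than FiftyOne classes so the snippet runs without FiftyOne installed.

```python
# Schema type -> FiftyOne field type, as listed in the table above
FIELD_TYPE_BY_SCHEMA_TYPE = {
    "string": "StringField",
    "number": "FloatField",
    "boolean": "BooleanField",
}


def planned_sample_fields(rows, prefix="ade_extraction"):
    """Map each schema row to its '<prefix>_<name>' field and field type."""
    return {
        f"{prefix}_{row['name']}": FIELD_TYPE_BY_SCHEMA_TYPE[row["type"]]
        for row in rows
    }


if __name__ == "__main__":
    rows = [
        {"name": "vendor_name", "type": "string"},
        {"name": "total_amount", "type": "number"},
    ]
    for field, field_type in planned_sample_fields(rows).items():
        print(field, field_type)
```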

When Parse document first is enabled, these additional fields are written if their save fields are non-empty:

| Field | Type | Content |
| --- | --- | --- |
| Save Markdown to field | StringField | Parsed Markdown (reusable for future Extract or Split runs) |
| {save_parse_field}_metadata | DictField | page_count, credit_usage, filename, duration_ms, version |
| Save grounding to field | Detections | Grounding boxes for the parse chunks |

ade_split_document

Classify and split multi-document files by document type.

Identify and separate bundled documents (e.g., a PDF that mixes invoices, contracts, and receipts) by providing a list of document types to look for.

Note: The ADE Split API is currently in preview and is not recommended for production workloads. Results may vary.

How to use:

  1. Open the operator list and search for ADE: Split / Classify Document
  2. Define your document types in the form (name + description per row)
  3. Click Execute

Options:

| Option | Description | Default |
| --- | --- | --- |
| Parse document first | Call Parse API before splitting | True |
| Document password | Password for password-protected files when parsing first; requires a ZDR-enabled account | empty |
| Existing Markdown field | Field to read from when not parsing first | ade_parse |
| Model | Parse model (only used when parsing first) | dpt-2-latest |
| Split model | Model version for the Split API | split-latest |
| Region | us or eu endpoint | us |
| Document types to classify | Form rows of name + description + optional identifier (max 19) | Invoice / Contract / Receipt |
| Output field | Field where split results are stored | ade_splits |

What gets stored on each sample:

| Field | Type | Content |
| --- | --- | --- |
| ade_splits | ListField | List of {classification, identifier, pages, page_count, markdown_preview} per split |
| ade_splits_count | IntField | Number of splits found |
| ade_splits_type | Classification | Primary document type (first split's classification) |
| ade_splits_all_types | ListField | Unique document types found across all splits |
| ade_splits_metadata | DictField | credit_usage, filename, page_count, duration_ms, job_id, version |
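
The summary fields in the table above are derived from the list of splits. A minimal sketch of that derivation follows; the function name summarize_splits is hypothetical, and keeping the unique types in order of first appearance is an assumption, since the ordering is not stated here.

```python
def summarize_splits(splits):
    """Derive count, primary type, and unique types from a list of split dicts."""
    unique_types = []
    for split in splits:
        classification = split["classification"]
        if classification not in unique_types:
            unique_types.append(classification)
    return {
        "ade_splits_count": len(splits),
        # Primary type is the first split's classification, or None if empty
        "ade_splits_type": splits[0]["classification"] if splits else None,
        "ade_splits_all_types": unique_types,
    }


if __name__ == "__main__":
    example = [
        {"classification": "Invoice", "pages": [0, 1]},
        {"classification": "Receipt", "pages": [2]},
        {"classification": "Invoice", "pages": [3]},
    ]
    print(summarize_splits(example))
```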

Models

| Model | Credits/page | Best for |
| --- | --- | --- |
| dpt-2-latest | 3 | Complex docs, scanned PDFs, tables, non-English, figures |
| dpt-2-mini-latest | 1.5 | Simple digital docs, invoices, forms; not for scanned or complex tables |
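
As a rough planning aid, parse cost can be estimated as page count times the per-page rate from the table above. This is only a sketch: the helper name is hypothetical, it covers parse credits only, and the credit_usage value stored in the metadata fields remains the authoritative figure.

```python
# Per-page parse rates taken from the Models table
CREDITS_PER_PAGE = {
    "dpt-2-latest": 3.0,
    "dpt-2-mini-latest": 1.5,
}


def estimate_parse_credits(page_count, model="dpt-2-latest"):
    """Estimate parse credits for `page_count` pages with the given model."""
    if page_count < 0:
        raise ValueError("page_count must be non-negative")
    return page_count * CREDITS_PER_PAGE[model]


if __name__ == "__main__":
    for model in CREDITS_PER_PAGE:
        print(model, estimate_parse_credits(10, model))
```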

Supported File Types

| Category | Extensions |
| --- | --- |
| Documents | .pdf .docx .doc .odt |
| Images | .png .jpg .jpeg .bmp .tiff .tif .webp .gif .apng .dcx .dds .dib .gd .icns .jp2 .pcx .ppm .psd .tga |
| Spreadsheets | .xlsx .csv |
| Presentations | .ppt .pptx |

Samples with unsupported extensions are silently skipped.
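
Because unsupported samples are skipped without a message, it can be useful to check file paths up front. The sketch below partitions paths by extension using the list above. The helper names are hypothetical, and treating extensions case-insensitively is an assumption about how the plugin matches them.

```python
from pathlib import Path

# Extensions copied from the Supported File Types table
SUPPORTED_EXTENSIONS = {
    ".pdf", ".docx", ".doc", ".odt",
    ".png", ".jpg", ".jpeg", ".bmp", ".tiff", ".tif", ".webp", ".gif",
    ".apng", ".dcx", ".dds", ".dib", ".gd", ".icns", ".jp2", ".pcx",
    ".ppm", ".psd", ".tga",
    ".xlsx", ".csv",
    ".ppt", ".pptx",
}


def is_supported(filepath):
    """Return True if the file extension appears in the supported list."""
    return Path(filepath).suffix.lower() in SUPPORTED_EXTENSIONS


def partition_by_support(filepaths):
    """Split paths into (supported, unsupported) lists, preserving order."""
    supported, unsupported = [], []
    for path in filepaths:
        (supported if is_supported(path) else unsupported).append(path)
    return supported, unsupported


if __name__ == "__main__":
    ok, skipped = partition_by_support(["a/receipt.PDF", "b/notes.txt", "c/scan.tif"])
    print("would be processed:", ok)
    print("would be skipped:", skipped)
```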

