[Hold][WIP] Self-hosted plugins overview and tutorial #569

Draft: wants to merge 5 commits into base: main (showing changes from 4 commits)
7 changes: 7 additions & 0 deletions docs.json
@@ -482,6 +482,13 @@
"self-hosted/gcp/overview",
"self-hosted/gcp/onboard"
]
},
{
"group": "Plugins",
"pages": [
"self-hosted/plugins/overview",
"self-hosted/plugins/tutorial"
]
}
]
},
Binary file added img/ui/Plugins-DAG.png
39 changes: 39 additions & 0 deletions self-hosted/plugins/overview.mdx
@@ -0,0 +1,39 @@
---
title: Unstructured self-hosted plugins overview
sidebarTitle: Overview
---

In Unstructured, a _plugin_ is a self-contained unit of code that can be used to add, change, or use data within the context of an Unstructured ETL+ workflow. Every
node in a workflow is itself a plugin. You can also create your own plugins to extend your organization's workflow capabilities.

Developing, deploying, and running your own custom plugins is available only for
deployments of the [Unstructured user interface](/ui/overview) (UI) on
infrastructure that you maintain in your
[Amazon Web Services (AWS)](/self-hosted/aws/overview), [Azure](/self-hosted/azure/overview), or
[Google Cloud Platform (GCP)](/self-hosted/gcp/overview) account.

If you do not already have a self-hosted deployment of the Unstructured UI,
contact your Unstructured sales representative, email Unstructured Sales at [[email protected]](mailto:[email protected]), or fill out the
[contact form](https://unstructured.io/contact) on the Unstructured website, and a member of the Unstructured sales or support teams
will get back to you as soon as possible to discuss self-hosting options.

## Concepts

Plugins are straightforward: each one accepts a named input and emits a named output. The following diagram illustrates this concept:

![Conceptual programmatic flow of plugins](/img/ui/Plugins-DAG.png)

In the preceding diagram:

- The blue boxes represent the default plugins that come with Unstructured.
- The yellow circles describe what each default plugin does.
- The green box represents the indexer that gathers all of the source files.
- The red box represents the destination location.
- The arrows represent the flow of data between the plugins.
- The words within the arrows represent the programmatic names of the inputs and outputs of the plugins. For example,
  the **Partitioner** plugin accepts its input, represented by the programmatic name `doc_path`, from the previous plugin.
  The **Partitioner** plugin emits its output, represented by the programmatic name `element_dicts`, to the next plugin.
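
The named-input, named-output idea can be sketched in a few lines of Python. This is only a conceptual illustration and not the Unstructured plugin API: the `Plugin` class, the `run_chain` helper, and the `counter` plugin are hypothetical names invented for this sketch. Only the names `doc_path` and `element_dicts` come from the diagram.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List


@dataclass
class Plugin:
    """Hypothetical stand-in for a plugin: one named input, one named output."""
    name: str
    input_name: str
    output_name: str
    run: Callable[[Any], Any]


def run_chain(plugins: List[Plugin], initial: Dict[str, Any]) -> Dict[str, Any]:
    """Run plugins in order, passing values between them by name."""
    data = dict(initial)
    for plugin in plugins:
        if plugin.input_name not in data:
            raise KeyError(f"{plugin.name} expects input '{plugin.input_name}'")
        # Store the result under the plugin's output name for the next plugin.
        data[plugin.output_name] = plugin.run(data[plugin.input_name])
    return data


# Toy partitioner (assumption): wraps the path in a single element dict
# instead of reading and partitioning a real file.
partitioner = Plugin(
    name="partitioner",
    input_name="doc_path",
    output_name="element_dicts",
    run=lambda path: [{"type": "Title", "text": path}],
)

# Hypothetical downstream plugin that consumes element_dicts.
counter = Plugin(
    name="counter",
    input_name="element_dicts",
    output_name="element_count",
    run=len,
)

result = run_chain([partitioner, counter], {"doc_path": "example.pdf"})
print(result["element_count"])
```

Because each plugin declares the name it reads and the name it writes, a plugin can be inserted anywhere in the chain as long as an earlier plugin emits the name it expects.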

## Getting started

To get started with developing, deploying, and running your own custom plugins, try out the [tutorial](/self-hosted/plugins/tutorial).