Commit 345f917

Merge branch 'Unstructured-IO:main' into main
2 parents 0344ee9 + 928e192 commit 345f917

684 files changed, with 14,792 additions and 6,468 deletions.


README.md

Lines changed: 2 additions & 2 deletions
@@ -17,8 +17,8 @@ Everyone is welcome to contribute, and you can do it in a couple of ways:
 
 To contribute changes to the documentation:
 
-1. Fork the repo.
-2. Create a new branch with a descriptive name.
+1. If you're not a member of Unstructured team, start by forking the repo. If you are part of the team, you can clone this repo.
+2. Create a new branch with a descriptive name.
 3. Add your changes. DO NOT MANUALLY EDIT THE CONTENTS OF THE `snippets/destination_connectors` and `snippets/source_connectors` FOLDERS. These are auto-generated from tested code examples.
 4. Check for grammatical and technical correctness of the changes.
 5. Preview your changes locally. [See how below](#previewing-documentation-changes-locally).

api-reference/api-services/accessing-unstructured-api.mdx

Lines changed: 23 additions & 14 deletions
@@ -2,24 +2,33 @@
 title: Overview
 ---
 
-Whether you're using the free Unstructured API, the SaaS Unstructured API, Unstructured API on Azure/AWS or your local
-deployment of Unstructured API, you have several methods of accessing it. The functionality is the same across different methods.
+To process an individual file, you can choose from several available methods, including a direct `POST` request, Python code, and JavaScript/TypeScript code.
+Whether you're using the Free Unstructured API, the Unstructured Serverless API, the Unstructured API on Azure/AWS, or your local deployment of the Unstructured API, the functionality is the same.
 
 Choose your preferred method:
 
-* [Sending POST requests to Unstructured API](/api-reference/api-services/post-requests)
-* [Using Python SDK](/api-reference/api-services/python-sdk)
-* [Using JavaScript SDK](/api-reference/api-services/javascript-sdk)
-* [Using API by calling `partition_via_api()` from Python code](/api-reference/api-services/partition-via-api)
+* [Use the Unstructured Python SDK](/api-reference/api-services/sdk-python)
+* [Use the Unstructured JavaScript/TypeScript SDK](/api-reference/api-services/sdk-jsts)
+* [Use the Unstructured open source Python library](/api-reference/api-services/partition-via-api)
+* [Make a direct POST request](/api-reference/api-services/post-requests)
 
 The API parameters for all these methods are documented on the [API parameters](/api-reference/api-services/api-parameters) page.
 
-If you'd like to try out the Unstructured API interactively, you can do so via the [Swagger UI](https://api.unstructured.io/general/docs#/default/pipeline_1_general_v0_general_post).
-Make sure to have your API key on hand.
+import UseIngestInstead from '/snippets/general-shared-text/use-ingest-instead.mdx';
 
-1. Go to [Swagger UI](https://api.unstructured.io/general/docs#/default/pipeline_1_general_v0_general_post).
-2. Click "Try it out" for interactive testing.
-3. Enter your API key in the "unstructured-api-key" field.
-4. Choose your parameters in the "Request body".
-5. Click "execute" to send the request.
-6. Download or view the JSON output.
+<UseIngestInstead />
+
+If you'd like to try out the Unstructured API interactively by using the Free Unstructured API to process a single file, you can do so by using the [Swagger UI](https://api.unstructured.io/general/docs#/default/pipeline_1_general_v0_general_post).
+
+1. Go to the [Swagger UI](https://api.unstructured.io/general/docs#/default/pipeline_1_general_v0_general_post).
+2. For **Servers**, select **https://api.unstructured.io - Hosted API Free**.
+2. Click **Authorize**.
+3. In the **Available authorizations** dialog box, for **Value**, enter your Free Unstructured API key. [Get a Free Unstructured API key](/api-reference/api-services/free-api#get-an-api-key).
+4. Click **Authorize**.
+5. Click **Close**.
+6. Expand the **POST** section.
+7. In the **Request body** section, next to **files**, click **Choose File**.
+8. Browse to and select a file for the Free Unstructured API to process.
+9. Enter any other settings as desired. [Learn how](/api-reference/api-services/api-parameters).
+10. At the end of the list of settings, click **Execute**.
+11. See the results in the **Responses** section below the **Execute** button.
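
Reviewer note: the Swagger UI steps in this hunk amount to a multipart `POST` that carries the API key in the `unstructured-api-key` header and the document in the `files` field. The following is a minimal Python sketch of that request, not the documented client. It assumes the endpoint path `/general/v0/general` (inferred from the Swagger operation name `pipeline_1_general_v0_general_post`, not stated in the diff), and `build_partition_request` is a hypothetical helper name. The request is only constructed here, never sent.

```python
import uuid
import urllib.request

# Assumed endpoint, inferred from the Swagger operation name in the hunk above.
API_URL = "https://api.unstructured.io/general/v0/general"


def build_partition_request(filename: str, content: bytes, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a multipart POST with one file in the `files` field."""
    boundary = uuid.uuid4().hex
    body = b"".join([
        f"--{boundary}\r\n".encode(),
        f'Content-Disposition: form-data; name="files"; filename="{filename}"\r\n'.encode(),
        b"Content-Type: application/octet-stream\r\n\r\n",
        content,
        f"\r\n--{boundary}--\r\n".encode(),
    ])
    req = urllib.request.Request(API_URL, data=body, method="POST")
    # Header name taken from the "unstructured-api-key" field mentioned in the removed steps.
    req.add_header("unstructured-api-key", api_key)
    req.add_header("accept", "application/json")
    req.add_header("Content-Type", f"multipart/form-data; boundary={boundary}")
    return req
```

To actually send it, pass the returned object to `urllib.request.urlopen` with a valid key; other settings from the API parameters page would be added as further form fields.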

api-reference/api-services/api-parameters.mdx

Lines changed: 53 additions & 32 deletions
Large diffs are not rendered by default.

api-reference/api-services/api-validation-errors.mdx

Lines changed: 1 addition & 1 deletion
@@ -1,5 +1,5 @@
 ---
-title: API Validation Errors
+title: API validation errors
 description: This section details the structure of HTTP validation errors returned by the API.
 ---
 

api-reference/api-services/aws.mdx

Lines changed: 145 additions & 148 deletions
Large diffs are not rendered by default.

api-reference/api-services/azure.mdx

Lines changed: 155 additions & 101 deletions
Large diffs are not rendered by default.

api-reference/api-services/chunking.mdx

Lines changed: 11 additions & 2 deletions
@@ -19,10 +19,13 @@ text element that was too big to fit in one chunk and required splitting.
 * `Table`: A table element is not combined with other elements and if it fits within `max_characters` it will remain as is.
 * `TableChunk`: large tables that exceed `max_characters` chunk size are split into special `TableChunk` elements.
 
+import SharedChunkingStrategyBasic from '/snippets/concepts/chunking-strategy-basic.mdx';
 
-import SharedChunkingStrategies from '/snippets/concepts/chunking-strategies.mdx';
+<SharedChunkingStrategyBasic/>
 
-<SharedChunkingStrategies/>
+import SharedChunkingStrategyByTitle from '/snippets/concepts/chunking-strategy-by-title.mdx';
+
+<SharedChunkingStrategyByTitle/>
 
 ### "by_page" chunking strategy
 
@@ -46,3 +49,9 @@ guarantee that two elements with low similarity will not be combined in a single
 You can control the level of topic similarity you require for elements to have by setting the `similarity_threshold` parameter.
 `similarity_threshold` expects a value between 0.0 and 1.0 specifying the minimum similarity text in consecutive elements
 must have to be included in the same chunk. The default is 0.5.
+
+###
+
+## Learn more
+
+<Icon icon="blog" />&nbsp;&nbsp;[Chunking for RAG: best practices](https://unstructured.io/blog/chunking-for-rag-best-practices)
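
Reviewer note: the `similarity_threshold` context lines in this hunk describe merging consecutive elements only when their similarity meets a minimum. The toy Python sketch below illustrates that rule and nothing more. It is not the library's implementation: word-overlap (Jaccard) similarity is swapped in as a stand-in for whatever similarity measure the API uses, `max_characters` is ignored, and both function names are hypothetical.

```python
def jaccard(a: str, b: str) -> float:
    """Word-overlap similarity in [0.0, 1.0]; a stand-in measure, assumed for illustration."""
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    if not words_a or not words_b:
        return 0.0
    return len(words_a & words_b) / len(words_a | words_b)


def chunk_by_similarity(elements: list[str], similarity_threshold: float = 0.5) -> list[list[str]]:
    """Group consecutive elements; start a new chunk when similarity drops below the threshold."""
    chunks: list[list[str]] = []
    for text in elements:
        # Compare against the last element of the current chunk only (consecutive elements).
        if chunks and jaccard(chunks[-1][-1], text) >= similarity_threshold:
            chunks[-1].append(text)
        else:
            chunks.append([text])
    return chunks
```

With the default of 0.5, `["cats purr loudly", "cats purr softly", "stocks fell today"]` groups the first two elements and isolates the third; raising the threshold to 0.6 keeps all three apart.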
