This PR enables registering new file types dynamically.
It does so through two primary functions:
1. `unstructured.file_utils.model.create_file_type` registers a new
`FileType` enum member, which lets the rest of unstructured recognize
a new type of file.
2. `unstructured.file_utils.model.register_partitioner` is a decorator that
registers a partitioner function to run for a given file type.
---------
Co-authored-by: Roman Isecke <[email protected]>
CHANGELOG.md (+4 -1)
@@ -1,7 +1,10 @@
-## 0.16.24-dev4
+## 0.16.24-dev5
 
 ### Enhancements
 
+- **Support dynamic partitioner file type registration**. Use `create_file_type` to create new file type that can be handled
+  in unstructured and `register_partitioner` to enable registering your own partitioner for any file type.
+
 - **`extract_image_block_types` now also works for CamelCase elemenet type names**. Previously `NarrativeText` and similar CamelCase element types can't be extracted using the mentioned parameter in `partition`. Now figures for those elements can be extracted like `Image` and `Table` elements