Releases
0.9.1
Enhancements
Adds --partition-pdf-infer-table-structure to unstructured-ingest.
Enable partition_html to skip headers and footers with the skip_headers_and_footers flag.
Update partition_doc and partition_docx to track emphasized texts in the output
Adds post-processing function filter_element_types
Set the default strategy for partitioning images to hi_res
Add page break parameter section in API documentation to sync with change in Prod API
Update partition_html to track emphasized texts in the output
Update XMLDocument._read_xml to create <p> tag element for the text enclosed in the <pre> tag
Add parameter include_tail_text to _construct_text to enable or skip tail text inclusion
Add Notion connector
Features
Fixes
Remove unused _partition_via_api function
Fixed emoji bug in partition_xlsx.
Pass file_filename metadata when partitioning file object
Skip ingest test on missing Slack token
Add Dropbox variables to CI environments
Remove default encoding for ingest
Adds new element type EmailAddress for recognizing email addresses in the text
Simplifies min_partition logic; makes partitions falling below the min_partition less likely.
Fix bug where the ingest test check for the number of files fails in the smoke test
Fix unstructured-ingest entrypoint failure