Commit ab88e20

chore: bump unstructured-inference 0.7.36 (#3275)
### Summary
- bump unstructured-inference to `0.7.36`, which fixes a `ValueError` when converting cells to HTML in the table processing subpipeline
- cut a release for `0.14.8`

Co-authored-by: Matt Robinson
1 parent ce591e2 commit ab88e20

48 files changed, +125 -105 lines changed

.github/workflows/docker-publish.yml

+1
@@ -47,6 +47,7 @@ jobs:
 password: ${{ secrets.QUAY_IO_ROBOT_TOKEN }}
 - name: Build images
 run: |
+ARCH=$(cut -d "/" -f2 <<< ${{ matrix.docker-platform }})
 DOCKER_BUILDKIT=1 docker buildx build --platform=${{ matrix.docker-platform }} --load \
 -f Dockerfile \
 --build-arg PIP_VERSION=$PIP_VERSION \
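
Note on the added `ARCH=...` line: `cut -d "/" -f2` keeps the second `/`-separated field of the platform string. A minimal Python sketch of the same extraction follows. The function name `arch_from_platform` and the sample value `linux/amd64` are illustrative assumptions; the actual matrix values are not shown in this diff.

```python
def arch_from_platform(platform: str) -> str:
    """Return the second "/"-separated field, like `cut -d "/" -f2`."""
    fields = platform.split("/")
    # Without the -s flag, cut prints the whole line when the delimiter
    # is absent; mirror that behavior here.
    if len(fields) < 2:
        return platform
    return fields[1]


if __name__ == "__main__":
    # Hypothetical platform value, not taken from the workflow file.
    print(arch_from_platform("linux/amd64"))
```

A three-part value such as `linux/arm/v7` would yield only `arm` under this rule, which may or may not be what the workflow intends.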

CHANGELOG.md

+2-1
@@ -1,4 +1,4 @@
-## 0.14.8-dev4
+## 0.14.8
 
 ### Enhancements
 
@@ -8,6 +8,7 @@
 
 ### Fixes
 
+* **Bump unstructured-inference==0.7.36** Fix `ValueError` when converting cells to html.
 * **`partition()` now forwards `strategy` arg to `partition_docx()`, `partition_ppt()`, and `partition_pptx()`.** A `strategy` argument passed to `partition()` (or the default value "auto" assigned by `partition()`) is now forwarded to `partition_docx()`, `partition_ppt()`, and `partition_pptx()` when those filetypes are detected.
 
 * **Fix missing sensitive field markers** for embedders

requirements/base.txt

+1-1
@@ -103,7 +103,7 @@ unstructured-client==0.18.0
 # via
 # -c ././deps/constraints.txt
 # -r ./base.in
-urllib3==1.26.18
+urllib3==1.26.19
 # via
 # -c ././deps/constraints.txt
 # requests

requirements/deps/constraints.txt

+4
@@ -62,3 +62,7 @@ wrapt>=1.14.0
 
 # NOTE(robinson): for compatiblity with voyage embeddings
 langsmith==0.1.62
+
+# NOTE(robinson): choma was pinned to importlib-metadata>=7.1.0 but 7.1.0 was installed
+# instead of 7.2.0. Need to investigate
+importlib-metadata==7.1.0
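
Note on the new `importlib-metadata==7.1.0` constraint: the compiled files later in this diff gain a `-c ././deps/constraints.txt` annotation for that package, which suggests the pin is being applied. One way to check that compiled requirements agree with exact pins in a constraints file is sketched below. This is an assumption-laden illustration, not part of the repository: `parse_pins` and `pin_mismatches` are hypothetical helpers, and only `name==version` lines are considered.

```python
from __future__ import annotations

import re


def parse_pins(text: str) -> dict[str, str]:
    """Collect exact `name==version` pins, ignoring comments and other specifiers."""
    pins: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing or full-line comments
        match = re.fullmatch(r"([A-Za-z0-9._-]+)==(\S+)", line)
        if match:
            pins[match.group(1).lower()] = match.group(2)
    return pins


def pin_mismatches(constraints: str, compiled: str) -> dict[str, tuple[str, str]]:
    """Map package name to (constrained, compiled) where the two versions differ."""
    wanted = parse_pins(constraints)
    actual = parse_pins(compiled)
    return {
        name: (wanted[name], actual[name])
        for name in wanted
        if name in actual and actual[name] != wanted[name]
    }


if __name__ == "__main__":
    constraints = "wrapt>=1.14.0\nlangsmith==0.1.62\nimportlib-metadata==7.1.0\n"
    compiled = "importlib-metadata==7.1.0\n# via markdown\nmarkdown==3.6\nzipp==3.19.2\n"
    print(pin_mismatches(constraints, compiled))
```

Range specifiers such as `wrapt>=1.14.0` are deliberately skipped here; handling them would need a real version parser.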

requirements/dev.txt

+5-4
@@ -82,7 +82,7 @@ executing==2.0.1
 # via stack-data
 fastjsonschema==2.20.0
 # via nbformat
-filelock==3.15.1
+filelock==3.15.4
 # via virtualenv
 fqdn==1.5.1
 # via jsonschema
@@ -104,6 +104,7 @@ idna==3.7
 # requests
 importlib-metadata==7.1.0
 # via
+# -c ././deps/constraints.txt
 # build
 # jupyter-client
 # jupyter-lsp
@@ -265,7 +266,7 @@ prompt-toolkit==3.0.47
 # via
 # ipython
 # jupyter-console
-psutil==5.9.8
+psutil==6.0.0
 # via ipykernel
 ptyprocess==0.7.0
 # via
@@ -400,13 +401,13 @@ typing-extensions==4.12.2
 # ipython
 uri-template==1.3.0
 # via jsonschema
-urllib3==1.26.18
+urllib3==1.26.19
 # via
 # -c ././deps/constraints.txt
 # -c ./base.txt
 # -c ./test.txt
 # requests
-virtualenv==20.26.2
+virtualenv==20.26.3
 # via pre-commit
 wcwidth==0.2.13
 # via prompt-toolkit

requirements/extra-markdown.txt

+3-1
@@ -5,7 +5,9 @@
 # pip-compile ./extra-markdown.in
 #
 importlib-metadata==7.1.0
-# via markdown
+# via
+# -c ././deps/constraints.txt
+# markdown
 markdown==3.6
 # via -r ./extra-markdown.in
 zipp==3.19.2

requirements/extra-paddleocr.txt

+8-6
@@ -53,14 +53,16 @@ idna==3.7
 # via
 # -c ./base.txt
 # requests
-imageio==2.34.1
+imageio==2.34.2
 # via
 # imgaug
 # scikit-image
 imgaug==0.4.0
 # via unstructured-paddleocr
 importlib-metadata==7.1.0
-# via flask
+# via
+# -c ././deps/constraints.txt
+# flask
 importlib-resources==6.4.0
 # via matplotlib
 itsdangerous==2.2.0
@@ -151,7 +153,7 @@ protobuf==4.23.4
 # via
 # -c ././deps/constraints.txt
 # visualdl
-psutil==5.9.8
+psutil==6.0.0
 # via visualdl
 pyclipper==1.3.0.post5
 # via unstructured-paddleocr
@@ -181,7 +183,7 @@ requests==2.32.3
 # -c ./base.txt
 # premailer
 # visualdl
-scikit-image==0.22.0
+scikit-image==0.24.0
 # via
 # imgaug
 # unstructured-paddleocr
@@ -202,7 +204,7 @@ six==1.16.0
 # imgaug
 # python-dateutil
 # visualdl
-tifffile==2024.5.22
+tifffile==2024.6.18
 # via scikit-image
 tqdm==4.66.4
 # via
@@ -212,7 +214,7 @@ tzdata==2024.1
 # via pandas
 unstructured-paddleocr==2.6.1.3
 # via -r ./extra-paddleocr.in
-urllib3==1.26.18
+urllib3==1.26.19
 # via
 # -c ././deps/constraints.txt
 # -c ./base.txt

requirements/extra-pdf-image.in

+1-1
@@ -12,7 +12,7 @@ google-cloud-vision
 effdet
 # Do not move to constraints.in, otherwise unstructured-inference will not be upgraded
 # when unstructured library is.
-unstructured-inference==0.7.35
+unstructured-inference==0.7.36
 # unstructured fork of pytesseract that provides an interface to allow for multiple output formats
 # from one tesseract call
 unstructured.pytesseract>=0.3.12

requirements/extra-pdf-image.txt

+6-6
@@ -32,7 +32,7 @@ deprecated==1.2.14
 # via pikepdf
 effdet==0.4.1
 # via -r ./extra-pdf-image.in
-filelock==3.15.1
+filelock==3.15.4
 # via
 # huggingface-hub
 # torch
@@ -167,9 +167,9 @@ pillow==10.3.0
 # unstructured-pytesseract
 pillow-heif==0.16.0
 # via -r ./extra-pdf-image.in
-portalocker==2.8.2
+portalocker==2.10.0
 # via iopath
-proto-plus==1.23.0
+proto-plus==1.24.0
 # via
 # google-api-core
 # google-cloud-vision
@@ -253,7 +253,7 @@ sympy==1.12.1
 # via
 # onnxruntime
 # torch
-timm==1.0.3
+timm==1.0.7
 # via
 # effdet
 # unstructured-inference
@@ -287,13 +287,13 @@ typing-extensions==4.12.2
 # torch
 tzdata==2024.1
 # via pandas
-unstructured-inference==0.7.35
+unstructured-inference==0.7.36
 # via -r ./extra-pdf-image.in
 unstructured-pytesseract==0.3.12
 # via
 # -c ././deps/constraints.txt
 # -r ./extra-pdf-image.in
-urllib3==1.26.18
+urllib3==1.26.19
 # via
 # -c ././deps/constraints.txt
 # -c ./base.txt

requirements/huggingface.txt

+2-2
@@ -17,7 +17,7 @@ click==8.1.7
 # via
 # -c ./base.txt
 # sacremoses
-filelock==3.15.1
+filelock==3.15.4
 # via
 # huggingface-hub
 # torch
@@ -107,7 +107,7 @@ typing-extensions==4.12.2
 # -c ./base.txt
 # huggingface-hub
 # torch
-urllib3==1.26.18
+urllib3==1.26.19
 # via
 # -c ././deps/constraints.txt
 # -c ./base.txt

requirements/ingest/airtable.txt

+1-1
@@ -37,7 +37,7 @@ typing-extensions==4.12.2
 # pyairtable
 # pydantic
 # pydantic-core
-urllib3==1.26.18
+urllib3==1.26.19
 # via
 # -c ./ingest/../base.txt
 # -c ./ingest/../deps/constraints.txt

requirements/ingest/astra.txt

+2-2
@@ -8,7 +8,7 @@ anyio==3.7.1
 # via
 # -c ./ingest/../deps/constraints.txt
 # httpx
-astrapy==1.2.1
+astrapy==1.3.0
 # via -r ./ingest/astra.in
 bson==0.5.10
 # via astrapy
@@ -85,7 +85,7 @@ sniffio==1.3.1
 # httpx
 toml==0.10.2
 # via astrapy
-urllib3==1.26.18
+urllib3==1.26.19
 # via
 # -c ./ingest/../base.txt
 # -c ./ingest/../deps/constraints.txt

requirements/ingest/azure-cognitive-search.txt

+1-1
@@ -38,7 +38,7 @@ typing-extensions==4.12.2
 # via
 # -c ./ingest/../base.txt
 # azure-core
-urllib3==1.26.18
+urllib3==1.26.19
 # via
 # -c ./ingest/../base.txt
 # -c ./ingest/../deps/constraints.txt

requirements/ingest/azure.txt

+6-10
@@ -21,7 +21,7 @@ azure-core==1.30.2
 # azure-storage-blob
 azure-datalake-store==0.0.53
 # via adlfs
-azure-identity==1.16.1
+azure-identity==1.17.1
 # via adlfs
 azure-storage-blob==12.20.0
 # via adlfs
@@ -60,23 +60,18 @@ idna==3.7
 # yarl
 isodate==0.6.1
 # via azure-storage-blob
-msal==1.28.1
+msal==1.29.0
 # via
 # azure-datalake-store
 # azure-identity
 # msal-extensions
-msal-extensions==1.1.0
+msal-extensions==1.2.0
 # via azure-identity
 multidict==6.0.5
 # via
 # aiohttp
 # yarl
-packaging==23.2
-# via
-# -c ./ingest/../base.txt
-# -c ./ingest/../deps/constraints.txt
-# msal-extensions
-portalocker==2.8.2
+portalocker==2.10.0
 # via msal-extensions
 pycparser==2.22
 # via cffi
@@ -97,8 +92,9 @@ typing-extensions==4.12.2
 # via
 # -c ./ingest/../base.txt
 # azure-core
+# azure-identity
 # azure-storage-blob
-urllib3==1.26.18
+urllib3==1.26.19
 # via
 # -c ./ingest/../base.txt
 # -c ./ingest/../deps/constraints.txt

requirements/ingest/box.txt

+1-1
@@ -51,7 +51,7 @@ six==1.16.0
 # via
 # -c ./ingest/../base.txt
 # python-dateutil
-urllib3==1.26.18
+urllib3==1.26.19
 # via
 # -c ./ingest/../base.txt
 # -c ./ingest/../deps/constraints.txt

requirements/ingest/chroma.txt

+18-5
@@ -9,6 +9,7 @@ annotated-types==0.7.0
 anyio==3.7.1
 # via
 # -c ./ingest/../deps/constraints.txt
+# httpx
 # starlette
 # watchfiles
 asgiref==3.8.1
@@ -27,6 +28,8 @@ certifi==2024.6.2
 # via
 # -c ./ingest/../base.txt
 # -c ./ingest/../deps/constraints.txt
+# httpcore
+# httpx
 # kubernetes
 # requests
 charset-normalizer==3.3.2
@@ -35,7 +38,7 @@ charset-normalizer==3.3.2
 # requests
 chroma-hnswlib==0.7.3
 # via chromadb
-chromadb==0.5.0
+chromadb==0.5.3
 # via -r ./ingest/chroma.in
 click==8.1.7
 # via
@@ -52,7 +55,7 @@ exceptiongroup==1.2.1
 # via anyio
 fastapi==0.110.3
 # via chromadb
-filelock==3.15.1
+filelock==3.15.4
 # via huggingface-hub
 flatbuffers==24.3.25
 # via onnxruntime
@@ -69,9 +72,15 @@ grpcio==1.64.1
 # chromadb
 # opentelemetry-exporter-otlp-proto-grpc
 h11==0.14.0
-# via uvicorn
+# via
+# httpcore
+# uvicorn
+httpcore==1.0.5
+# via httpx
 httptools==0.6.1
 # via uvicorn
+httpx==0.27.0
+# via chromadb
 huggingface-hub==0.23.4
 # via tokenizers
 humanfriendly==10.0
@@ -80,9 +89,11 @@ idna==3.7
 # via
 # -c ./ingest/../base.txt
 # anyio
+# httpx
 # requests
 importlib-metadata==7.1.0
 # via
+# -c ./ingest/../deps/constraints.txt
 # -r ./ingest/chroma.in
 # build
 # opentelemetry-api
@@ -214,7 +225,9 @@ six==1.16.0
 # posthog
 # python-dateutil
 sniffio==1.3.1
-# via anyio
+# via
+# anyio
+# httpx
 starlette==0.37.2
 # via fastapi
 sympy==1.12.1
@@ -247,7 +260,7 @@ typing-extensions==4.12.2
 # starlette
 # typer
 # uvicorn
-urllib3==1.26.18
+urllib3==1.26.19
 # via
 # -c ./ingest/../base.txt
 # -c ./ingest/../deps/constraints.txt
