Commit 18c73ca
fix: Do not hardcode file extension on temp files (#435)
This is a minor fix to improve our logging. When we buffer a file-like
input to disk in `process_data_with_model`, we always use the name
`document.pdf`. This confused me when I found this in our logs:
```
2025-06-30 17:02:01,906 unstructured_inference INFO Reading image file: /var/folders/5k/frv076q97yl0ywybmzydhbsr0000gn/T/tmpc0uq7zde/document.pdf ...
2025-06-30 17:02:01,951 unstructured_api ERROR cannot identify image file '/private/var/folders/5k/frv076q97yl0ywybmzydhbsr0000gn/T/tmpc0uq7zde/document.pdf'
```
The file at this path can be either a PDF or an image, so let's just drop
the extension to save ourselves some confusion.
Also added a comment so we don't forget why it's using a temp dir, not a
temp file.

1 parent 3abe07a commit 18c73ca
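For readers without the diff at hand, the change described above might look roughly like the sketch below. It is an illustration, not the actual code from `process_data_with_model`: the helper name `buffer_to_temp_path`, its `handler` parameter, and the bare file name `document` are assumptions. One commonly cited reason to prefer a temp dir over `tempfile.NamedTemporaryFile` is that such a file cannot always be reopened by name on Windows while it is still open; whether that is the reason recorded in the new comment is also an assumption.

```python
import os
import tempfile
from typing import BinaryIO, Callable, TypeVar

T = TypeVar("T")


def buffer_to_temp_path(file: BinaryIO, handler: Callable[[str], T]) -> T:
    """Hypothetical helper: buffer a file-like object to disk and hand its path to `handler`."""
    # Use a temporary directory rather than a NamedTemporaryFile so that the
    # handler can reopen the file by path (assumed motivation, see lead-in).
    with tempfile.TemporaryDirectory() as tmp_dir_path:
        # No extension: the input may be a PDF or an image, so a fixed
        # ".pdf" suffix would make log messages misleading.
        tmp_file_path = os.path.join(tmp_dir_path, "document")
        with open(tmp_file_path, "wb") as tmp_file:
            tmp_file.write(file.read())
        # The directory and its contents are removed when the block exits.
        return handler(tmp_file_path)
```

With this shape, a log line such as `Reading image file: .../document ...` no longer implies a file type that the input may not have.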
File tree
3 files changed, +12 -5 lines changed
- unstructured_inference
  - inference