Commit 2082d4f
authored
adding logic to trim pages that are too large to process (#211)
### Notes
Improves client logic when a PDF page is very long: trims the x/y
coordinates down to a reasonable size (hi-res only). Note: this does not
affect output of text: the reader is still able to process the entire
page for text.
### Testing
Manually tested changes on large file. Added integration test verifying
large pages now process successfully.1 parent a9b7b0b commit 2082d4f
File tree
3 files changed
+46
-0
lines changed- _sample_docs
- _test_unstructured_client/integration
- src/unstructured_client/_hooks/custom
3 files changed
+46
-0
lines changedBinary file not shown.
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
185 | 185 | | |
186 | 186 | | |
187 | 187 | | |
| 188 | + | |
| 189 | + | |
| 190 | + | |
| 191 | + | |
| 192 | + | |
| 193 | + | |
| 194 | + | |
| 195 | + | |
| 196 | + | |
| 197 | + | |
| 198 | + | |
188 | 199 | | |
| 200 | + | |
| 201 | + | |
| 202 | + | |
189 | 203 | | |
190 | 204 | | |
191 | 205 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
50 | 50 | | |
51 | 51 | | |
52 | 52 | | |
| 53 | + | |
| 54 | + | |
53 | 55 | | |
54 | 56 | | |
55 | 57 | | |
| |||
334 | 336 | | |
335 | 337 | | |
336 | 338 | | |
| 339 | + | |
| 340 | + | |
337 | 341 | | |
338 | 342 | | |
339 | 343 | | |
| |||
423 | 427 | | |
424 | 428 | | |
425 | 429 | | |
| 430 | + | |
| 431 | + | |
| 432 | + | |
| 433 | + | |
| 434 | + | |
| 435 | + | |
| 436 | + | |
| 437 | + | |
| 438 | + | |
| 439 | + | |
| 440 | + | |
| 441 | + | |
| 442 | + | |
| 443 | + | |
| 444 | + | |
| 445 | + | |
| 446 | + | |
| 447 | + | |
| 448 | + | |
| 449 | + | |
| 450 | + | |
| 451 | + | |
| 452 | + | |
| 453 | + | |
| 454 | + | |
| 455 | + | |
| 456 | + | |
| 457 | + | |
426 | 458 | | |
427 | 459 | | |
428 | 460 | | |
| |||
0 commit comments