Commit f8bf10d
enhancement: add text sources (#449)
Added `IsExtracted` for tracking source of element text.
<!-- CURSOR_SUMMARY -->
---
> [!NOTE]
> Introduces `IsExtracted` across `TextRegion`/`TextRegions`/`LayoutElement(s)` with refactored auto-initialization, updates slicing/concat/cleaning/equality to carry the flag and sources, and bumps version to 1.1.0.
>
> - **Core models**:
>   - Add `IsExtracted` enum; extend `TextRegion` with `is_extracted` and `from_coords(..., is_extracted=...)`.
>   - Refactor `TextRegions.__post_init__` to auto-initialize optional arrays from scalar fields (`source` → `sources`, `is_extracted` → `is_extracted_array`).
>   - Ensure slicing, iteration, and `from_list` preserve `sources` and `is_extracted_array`.
> - **Layout elements**:
>   - Propagate `is_extracted` through `LayoutElement`/`LayoutElements` (`to_dict`, `from_region`, `from_coords`, `from_list`, `concatenate`, `iter_elements`, `slice`, `__eq__`).
>   - Include `is_extracted_array` in cleaning utilities (`clean_layoutelements*`) and concatenation outputs.
> - **Tests**:
>   - Update/expand tests to validate `sources` and `is_extracted` propagation, slicing, `from_list`, and inheritance behavior.
> - **Release**:
>   - Bump version to `1.1.0` and update `CHANGELOG.md`.
>
> <sup>Written by [Cursor Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit 03af14e. This will update automatically on new commits. Configure [here](https://cursor.com/dashboard?tab=bugbot).</sup>
<!-- /CURSOR_SUMMARY -->

1 parent e6c8d19 commit f8bf10d
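
The auto-initialization and propagation behavior described in the summary above can be illustrated with a small standalone sketch. It does not import `unstructured_inference`, and it is not the library's implementation: `RegionsSketch`, the enum member names and values, and the `"pdfminer"`/`"ocr"` source strings are all assumptions made for illustration, and plain Python lists stand in for whatever array type the real classes use.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import Enum
from typing import Optional


class IsExtracted(Enum):
    # Member names and values are assumptions made for this sketch.
    TRUE = "true"
    FALSE = "false"
    PARTIAL = "partial"


@dataclass
class RegionsSketch:
    """Hypothetical stand-in for a collection such as `TextRegions`."""

    texts: list[str]
    # Scalar defaults, applied to every region when no per-region list is given.
    source: Optional[str] = None
    is_extracted: Optional[IsExtracted] = None
    # Optional per-region lists; filled in by __post_init__ when omitted.
    sources: Optional[list[Optional[str]]] = None
    is_extracted_array: Optional[list[Optional[IsExtracted]]] = None

    def __post_init__(self) -> None:
        n = len(self.texts)
        # Broadcast each scalar field into its per-region counterpart.
        if self.sources is None:
            self.sources = [self.source] * n
        if self.is_extracted_array is None:
            self.is_extracted_array = [self.is_extracted] * n
        if len(self.sources) != n or len(self.is_extracted_array) != n:
            raise ValueError("per-region lists must match the number of regions")

    def slice(self, indices: list[int]) -> RegionsSketch:
        """Return a subset that keeps each region's source and flag."""
        return RegionsSketch(
            texts=[self.texts[i] for i in indices],
            sources=[self.sources[i] for i in indices],
            is_extracted_array=[self.is_extracted_array[i] for i in indices],
        )

    @classmethod
    def concatenate(cls, groups: list[RegionsSketch]) -> RegionsSketch:
        """Join several collections, carrying the per-region metadata along."""
        return cls(
            texts=[t for g in groups for t in g.texts],
            sources=[s for g in groups for s in g.sources],
            is_extracted_array=[f for g in groups for f in g.is_extracted_array],
        )


# Two groups built from scalar fields only; the per-region lists are derived.
pdf_regions = RegionsSketch(["Title", "Body"], source="pdfminer", is_extracted=IsExtracted.TRUE)
ocr_regions = RegionsSketch(["Scanned caption"], source="ocr", is_extracted=IsExtracted.FALSE)

merged = RegionsSketch.concatenate([pdf_regions, ocr_regions])
subset = merged.slice([0, 2])
print(subset.sources)
print([flag.value for flag in subset.is_extracted_array])
```

Deriving the per-region lists once in `__post_init__` means every later operation (slicing, concatenation) only has to handle the list form, which is one plausible reason to centralize the initialization there.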
File tree
8 files changed: +299 -52 lines changed
- test_unstructured_inference
  - inference
- unstructured_inference
  - inference
Lines changed: 31 additions & 1 deletion