Feat: Extract Engine Upgrade #196

Open
rsteele5 wants to merge 8 commits into g5 from feat/extract_engine_upgrade

@rsteele5 commented Jan 2, 2026

Goals:

  • Add ProcessingVisitors to the ComponentRegistry and use them in DocumentExtractEngine
    • The visitor runtime config can be stored in base-config.yml (.../config/extract/base/base-config.yml) under processing.visitors as a string list, and can be overridden by any child config, like all other properties.
  • Add a one-to-many relationship between a field and its derived fields
    • If a field's transform function returns a list for one or more of the derived fields, a field is created for every element in the list (e.g., a service line number corresponding to multiple remark codes).
  • Check and update the documentation. See the guide and ask the team.
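
To make the one-to-many goal concrete, here is a minimal sketch of how a transform result might be expanded into fields. It is an illustration under assumptions, not the code in this PR: the `Field` dataclass, the `expand_derived_fields` helper, and the sample names and values are all hypothetical.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Field:
    """Hypothetical stand-in for the engine's field type."""
    name: str
    value: Any


def expand_derived_fields(derived: dict[str, Any]) -> list[Field]:
    """Create one Field per scalar value, and one Field per element of a list value."""
    fields: list[Field] = []
    for name, value in derived.items():
        # A list result fans out into several fields; a scalar yields exactly one.
        values = value if isinstance(value, list) else [value]
        fields.extend(Field(name, v) for v in values)
    return fields


# Hypothetical transform output: one service line with two remark codes.
expanded = expand_derived_fields(
    {"service_line_number": 3, "remark_code": ["N130", "M15"]}
)
```

In this sketch the scalar `service_line_number` produces a single field, while the two-element `remark_code` list produces two fields that share a name.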

FIX: Span, Cut-point, and Region issues #189

  • Resolves an edge-case bug that occurs when the start and stop lines are equal but the start and end pages are not.
    • If left unresolved, any span that meets these conditions incorrectly generates a Span of height 1 on the start page.
  • fix: Multiple regions within a MatchSection
  • fix: Region accepted as "in scope" if "Fully-contained"
  • fix: Final Stop selector now includes the last line instead of using it as a cut-point
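
The page/line edge case in the first item can be illustrated with a short sketch. The `Position` type and both helper functions are hypothetical and only show the general idea: a span should be treated as a single line only when both the page and the line match, not when the line numbers alone happen to be equal.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Position:
    """Hypothetical (page, line) coordinate of a span boundary."""
    page: int
    line: int


def is_single_line_by_line_only(start: Position, stop: Position) -> bool:
    # Problematic check: ignores the page, so a multi-page span whose
    # start and stop lines coincide is mistaken for a height-1 span.
    return start.line == stop.line


def is_single_line(start: Position, stop: Position) -> bool:
    # Safer check: dataclass equality compares both page and line.
    return start == stop


# Same line number, different pages: this span crosses three pages.
start, stop = Position(page=1, line=10), Position(page=3, line=10)
line_only_result = is_single_line_by_line_only(start, stop)
page_aware_result = is_single_line(start, stop)
```

With these inputs the line-only check reports a single-line span, while the page-aware check does not.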

rsteele added 8 commits November 18, 2025 17:11
* This also fixes tables that span multiple pages within a match section
* Consolidate Field conversion mapping
* fix: Empty sections do not make empty remit elements
…tExtractEngine

* Extract rendering_visitor.py into a separate repo.
* Visitor config is stored in base-config.yml under processing.visitors as a string list.
* Update docs
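
As a rough illustration of the visitor configuration described in these commits, the sketch below resolves a `processing.visitors` string list against a registry after applying a child-config override. Everything here is assumed for the example: the `VISITOR_REGISTRY` dict, the `register` decorator, the visitor classes, the merge semantics (a child list replaces the base list), and the configs being plain dicts as if already parsed from YAML.

```python
from typing import Any, Callable

# Hypothetical registry mapping config names to visitor factories.
VISITOR_REGISTRY: dict[str, Callable[[], Any]] = {}


def register(name: str):
    """Register a visitor class under the name used in the config."""
    def decorator(cls):
        VISITOR_REGISTRY[name] = cls
        return cls
    return decorator


@register("normalize")
class NormalizeVisitor:
    pass


@register("render")
class RenderVisitor:
    pass


def merge_config(base: dict, child: dict) -> dict:
    """Deep-merge dicts; any non-dict child value (including a list) replaces the base value."""
    merged = dict(base)
    for key, value in child.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = value
    return merged


def build_visitors(config: dict) -> list[Any]:
    """Instantiate visitors in the order listed under processing.visitors."""
    names = config.get("processing", {}).get("visitors", [])
    return [VISITOR_REGISTRY[name]() for name in names]


base_config = {"processing": {"visitors": ["normalize", "render"]}}
child_config = {"processing": {"visitors": ["normalize"]}}
visitors = build_visitors(merge_config(base_config, child_config))
```

Replacing the list wholesale, rather than appending to it, keeps the child config in full control of which visitors run and in what order; the actual engine may resolve overrides differently.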