Open
Description
As a developer, I want to prototype a solution using Python-based PDF parsing and HTML conversion libraries, so that I can compare its performance and output consistency with those of the Ruby-based approaches.
Details on decision narratives here (including a mock letter and an examination of 961 letter content).
Hypothetical implementation here.
Acceptance Criteria
- The service accepts a PDF file as input.
- Identify and select Python libraries for:
  - Text extraction.
  - Heading detection based on font size.
  - List and table extraction.
  - Image extraction with alt attributes.
- Convert extracted content into structured, accessible HTML.
- The output is evaluated for consistency across multiple decision narrative PDFs.
- Compare this prototype to the others and make a recommendation.
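
To make the "heading detection based on font size" and "structured, accessible HTML" criteria more concrete, here is a minimal sketch of one possible approach. It is not tied to any particular PDF library: it assumes that whichever library is selected can return each line of text together with its font size. The `Span` record, the function names, and the "most common size is body text" heuristic are all hypothetical choices for illustration, and list, table, and image handling are left out.

```python
from collections import Counter
from dataclasses import dataclass
from html import escape


@dataclass
class Span:
    """Hypothetical record: one extracted line of text and its font size."""
    text: str
    font_size: float


def heading_levels(spans):
    """Map each font size larger than the body size to a heading level."""
    if not spans:
        return {}
    # Assumption: the body font size is the one covering the most characters.
    chars_per_size = Counter()
    for span in spans:
        chars_per_size[span.font_size] += len(span.text)
    body_size = chars_per_size.most_common(1)[0][0]
    # Largest size becomes h1, the next largest h2, and so on.
    larger = sorted(
        {s.font_size for s in spans if s.font_size > body_size}, reverse=True
    )
    # HTML defines heading levels only down to h6, so cap the level there.
    return {size: min(rank, 6) for rank, size in enumerate(larger, start=1)}


def spans_to_html(spans):
    """Render spans as escaped <hN> or <p> elements, one per line."""
    levels = heading_levels(spans)
    lines = []
    for span in spans:
        text = escape(span.text.strip())
        if not text:
            continue  # skip whitespace-only spans
        level = levels.get(span.font_size)
        tag = f"h{level}" if level else "p"
        lines.append(f"<{tag}>{text}</{tag}>")
    return "\n".join(lines)


# Made-up sample input, standing in for the output of a real PDF library.
sample = [
    Span("Sample Title", 18.0),
    Span("Sample Section", 14.0),
    Span("Body text for the sample section, with A & B escaped.", 11.0),
]
print(spans_to_html(sample))
```

Keeping the heading logic separate from the extraction step would let the same function be run against output from each candidate library, which may help when checking consistency across multiple decision narrative PDFs.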