Extract Alt Text and Handwritten Text from Images Using ML Model

We're building a smarter, memory-efficient document processing pipeline with the following goals:

- **Extract alt text and handwritten content** from images in documents using lightweight ML models.
- Use **lighter vision-language models** like `Janus-1.3B`, `Mistral-small`, or `Gemma 3B` (quantized where possible).
- Integrate **QLoRA fine-tuning** if a suitable dataset is available.
- Explore **reasoning-based summarization** pipelines:
  - Generate a full summary of each PDF or scanned document with an AI model, with the goal of losing no information from the source.
  - Pass the complete summary to a lightweight model as its context.
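
The summarization idea above could be prototyped roughly as follows. This is only a sketch under stated assumptions: the function names (`chunk_pages`, `summarize_document`, `build_prompt`), the character-based chunk limit, and the prompt layout are all hypothetical, and the `summarize` callable stands in for whichever model ends up being chosen.

```python
from typing import Callable, List

# Hypothetical type: any function that maps a chunk of text to its summary.
# In practice this would wrap a call to the chosen summarization model.
Summarizer = Callable[[str], str]


def chunk_pages(pages: List[str], max_chars: int) -> List[str]:
    """Group consecutive pages into chunks of at most max_chars characters.

    A single page longer than max_chars becomes its own oversized chunk
    rather than being split, so no text is dropped.
    """
    chunks: List[str] = []
    current = ""
    for page in pages:
        # +1 accounts for the newline used to join pages.
        if current and len(current) + len(page) + 1 > max_chars:
            chunks.append(current)
            current = page
        else:
            current = f"{current}\n{page}" if current else page
    if current:
        chunks.append(current)
    return chunks


def summarize_document(pages: List[str], summarize: Summarizer,
                       max_chars: int = 4000) -> str:
    """Summarize every chunk and join the partial summaries in page order."""
    return "\n".join(summarize(chunk) for chunk in chunk_pages(pages, max_chars))


def build_prompt(summary: str, question: str) -> str:
    """Place the whole summary in the prompt as context (assumed layout)."""
    return f"Context:\n{summary}\n\nQuestion: {question}\nAnswer:"
```

Keeping the summarizer as a plain callable means the chunking and prompt assembly can be tested with a stub before any model is wired in. Whether the result actually meets the no-loss goal depends entirely on the model behind `summarize`.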

**Pipeline Goals:**
- Alt text extraction using image captioning models.
- Handwriting extraction via OCR or handwriting recognition (HWR), e.g., TrOCR or PaddleOCR.
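
One way the two pipeline goals might fit together is sketched below. It is a minimal outline, not a settled design: `ImageResult`, `process_images`, and the two callables are hypothetical names, and the real captioning and handwriting backends (for example wrappers around an image captioning model and TrOCR or PaddleOCR) would have to be supplied behind them.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional, Tuple

# Hypothetical interface: a backend takes raw image bytes and returns text.
ImageToText = Callable[[bytes], str]


@dataclass
class ImageResult:
    image_id: str
    alt_text: str
    handwritten_text: Optional[str]  # None when no handwriting was recognized


def process_images(images: Iterable[Tuple[str, bytes]],
                   caption: ImageToText,
                   recognize_handwriting: ImageToText) -> List[ImageResult]:
    """Run captioning and handwriting recognition on every image.

    Assumption: the handwriting backend returns an empty (or whitespace-only)
    string when it finds no text, which is recorded here as None.
    """
    results: List[ImageResult] = []
    for image_id, data in images:
        alt_text = caption(data).strip()
        handwritten = recognize_handwriting(data).strip()
        results.append(ImageResult(image_id, alt_text, handwritten or None))
    return results


if __name__ == "__main__":
    # Stub backends for illustration only; real models would replace these.
    fake_caption = lambda data: "A scanned page with a diagram"
    fake_hwr = lambda data: "meeting notes" if data.startswith(b"HW") else ""

    for result in process_images([("img-1", b"HW..."), ("img-2", b"PNG...")],
                                 fake_caption, fake_hwr):
        print(result)
```

Because the backends are passed in as functions, a quantized model can be swapped for a larger one without touching the loop, and the models can be loaded once outside `process_images` to keep memory use predictable.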

