Is your feature request related to a problem? Please describe.
Newer LLMs are multimodal and could use that capability to better understand documents.
Describe the solution you'd like
I'd like to be able to extract information from scanned documents and from documents that contain tables and images.