A Streamlit app that extracts text, tables, and images from PDFs and organizes the output page by page.
- Extracts text (.txt), images (.jpg), and tables (.csv)
- Saves per-page output into folders under output/
- Displays extracted data in Streamlit UI
- Generates and downloads metadata.json
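
The per-page layout and metadata.json described above could be sketched roughly as follows. This is a minimal illustration using only the standard library, not the app's actual code: the helper names `save_page` and `write_metadata`, the `page_<n>` folder naming, the file names, and the metadata fields are all assumptions. It takes text and tables that have already been extracted and leaves out the PDF parsing and image handling.

```python
import csv
import json
from pathlib import Path


def save_page(output_dir, page_number, text, tables):
    """Write one page's text and tables to disk and return a metadata entry.

    Hypothetical helper: the page_<n>/ folder and file names are assumptions.
    """
    page_dir = Path(output_dir) / f"page_{page_number}"
    page_dir.mkdir(parents=True, exist_ok=True)

    # The page text goes into a single .txt file.
    (page_dir / "text.txt").write_text(text, encoding="utf-8")

    # Each table (a list of rows) becomes its own .csv file.
    table_files = []
    for index, rows in enumerate(tables, start=1):
        table_path = page_dir / f"table_{index}.csv"
        with table_path.open("w", newline="", encoding="utf-8") as handle:
            csv.writer(handle).writerows(rows)
        table_files.append(table_path.name)

    return {"page": page_number, "text_file": "text.txt", "tables": table_files}


def write_metadata(output_dir, entries):
    """Collect the per-page entries into output_dir/metadata.json."""
    metadata_path = Path(output_dir) / "metadata.json"
    metadata_path.write_text(
        json.dumps({"pages": entries}, indent=2), encoding="utf-8"
    )
    return metadata_path
```

Calling `save_page("output", 1, page_text, page_tables)` once per page and passing the collected entries to `write_metadata("output", entries)` would produce one folder per page plus a single metadata.json at the top of output/.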
pip install -r requirements.txt
streamlit run app.py