Extract embedded images and text from PDFs, and convert PDF pages to images, using Mojo.
Requirements:
- Mojo
- Magic environment with:
  - pdfplumber
  - pdf2image
# Install dependencies
magic add "pdfplumber"
magic add "pdf2image"
# Run the tool (processes ./extract/target.pdf by default)
mojo main.mojo
# Or specify a different PDF
mojo main.mojo --pdf=/path/to/your.pdf
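The `--pdf` option and its default can be pictured with the short Python sketch below. It is only an illustration: the source of `main.mojo` is not shown here, so the function name `parse_args` and the use of `argparse` are assumptions, and only the flag name and default path come from the commands above.

```python
import argparse

# Default path taken from the usage notes above.
DEFAULT_PDF = "./extract/target.pdf"


def parse_args(argv=None):
    """Parse the command line; parse_args is a hypothetical helper name."""
    parser = argparse.ArgumentParser(
        description="Extract images and text from a PDF."
    )
    # argparse accepts both --pdf=/path/file.pdf and --pdf /path/file.pdf
    parser.add_argument("--pdf", default=DEFAULT_PDF,
                        help="path to the PDF to process")
    return parser.parse_args(argv)
```

Calling `parse_args([])` falls back to the default path, while `parse_args(["--pdf=/path/to/your.pdf"])` selects the given file.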
If you are using magic, run the tool inside a magic shell:
magic shell
mojo main.mojo
# Leave the magic shell when finished
exit
- Extracted images will be in extracted_images/
- Converted page images will be in converted_images/
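
As a rough picture of how the two output directories might be filled, here is a Python sketch that calls pdfplumber and pdf2image directly. It is not the code in `main.mojo`, which is not shown here: the function names, the file naming scheme (`page_1.png`, `page_1_img_1.png`), and the DPI and resolution values are all assumptions. Only the directory names come from the list above. Note that pdf2image needs the poppler utilities installed, and that this sketch saves a rendering of each image's region rather than the original embedded image data.

```python
from pathlib import Path

# Directory names from the output list above.
EXTRACTED_DIR = Path("extracted_images")
CONVERTED_DIR = Path("converted_images")


def output_path(out_dir, page_number, index=None):
    """Build a PNG path; the naming scheme is hypothetical."""
    stem = f"page_{page_number}"
    if index is not None:
        stem += f"_img_{index}"
    return Path(out_dir) / f"{stem}.png"


def convert_pages(pdf_path, out_dir=CONVERTED_DIR, dpi=200):
    """Render every page to a PNG with pdf2image (requires poppler)."""
    from pdf2image import convert_from_path

    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    saved = []
    pages = convert_from_path(str(pdf_path), dpi=dpi)
    for number, image in enumerate(pages, start=1):
        target = output_path(out_dir, number)
        image.save(target, "PNG")
        saved.append(target)
    return saved


def extract_text_and_images(pdf_path, out_dir=EXTRACTED_DIR, resolution=150):
    """Collect page text and save a rendering of each embedded image region."""
    import pdfplumber

    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    texts = []
    with pdfplumber.open(str(pdf_path)) as pdf:
        for number, page in enumerate(pdf.pages, start=1):
            # extract_text() can return None for pages without text
            texts.append(page.extract_text() or "")
            px0, ptop, px1, pbottom = page.bbox
            for index, img in enumerate(page.images, start=1):
                # Clamp the image box to the page so crop() does not reject it
                x0, top = max(img["x0"], px0), max(img["top"], ptop)
                x1, bottom = min(img["x1"], px1), min(img["bottom"], pbottom)
                if x1 <= x0 or bottom <= top:
                    continue  # skip boxes that lie outside the page
                region = page.crop((x0, top, x1, bottom))
                target = output_path(out_dir, number, index)
                region.to_image(resolution=resolution).save(str(target))
    return texts
```

Both libraries are imported inside the functions so the helpers can be loaded even before the dependencies are installed.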