A collection of command-line utilities in Python for working with images, videos, and documents.
Each utility module has its own requirements. Install dependencies for the specific module you want to use:
cd image_utils
pip install -r requirements.txt

cd video_utils
pip install -r requirements.txt

cd doc_to_md
pip install -r requirements.txt

Note: Video utilities require FFmpeg to be installed on your system and accessible from the command line. PDF converters require Poppler (pdftotext) on Windows.
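The external tools mentioned in the note can be checked before running anything. The following is a minimal sketch, not part of this project: the helper name `missing_tools` is hypothetical, and it assumes the executables are named `ffmpeg` and `pdftotext` on your PATH.

```python
import shutil


def missing_tools(tools=("ffmpeg", "pdftotext")):
    """Return the names from `tools` that cannot be found on PATH.

    Hypothetical helper for illustration; the tool names are assumptions
    based on the FFmpeg and Poppler requirements described in this README.
    """
    return [name for name in tools if shutil.which(name) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All external tools found.")
```

If a tool is reported as missing, install it and make sure its folder is on your PATH before using the video or PDF utilities.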
Find and move duplicate images:
cd image_utils
python move_duplicates.py "C:\Path\To\Images"

Convert images to WebP:
cd image_utils
python image_converter.py "C:\Path\To\Images" -q 85

Remove small images:
cd image_utils
python -c "from image_utils.image_cleaner import remove_small_images; remove_small_images('C:\\Path\\To\\Images')"

See image_utils/README.md for detailed documentation.
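To give a rough idea of what duplicate detection can involve, here is a small sketch that groups byte-identical files by their SHA-256 digest. This is an assumption for illustration only: `move_duplicates.py` may use a different method (for example, perceptual hashing), and the function name `find_exact_duplicates` is hypothetical.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def find_exact_duplicates(folder):
    """Group files directly inside `folder` by the SHA-256 of their bytes.

    Illustrative only: this finds byte-identical files and is not
    necessarily how move_duplicates.py detects duplicates.
    """
    groups = defaultdict(list)
    for path in sorted(Path(folder).iterdir()):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            groups[digest].append(path)
    # Keep only digests shared by more than one file.
    return [paths for paths in groups.values() if len(paths) > 1]
```

Each returned group lists files with identical content; a tool could keep the first file in a group and move the rest to a separate folder.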
Find and move duplicate videos:
cd video_utils
python -c "from video_utils.video_duplicates import move_duplicate_videos; move_duplicate_videos('C:\\Path\\To\\Videos')"

Optimize videos:
python main.py --input_dir input_videos

See video_utils/README.md for detailed documentation.
Convert Word documents to Markdown:
cd doc_to_md
python convert_word_to_md.py "C:\Path\To\Document.docx" "C:\Path\To\Output"

Convert PowerPoint presentations to Markdown:
cd doc_to_md
python convert_pptx_to_md.py "C:\Path\To\Presentation.pptx" "C:\Path\To\Output"

Convert PDF files to Markdown:
cd doc_to_md
python convert_pdf_to_md.py "C:\Path\To\Document.pdf" "C:\Path\To\Output"

Convert Excel files to Markdown:
cd doc_to_md
python convert_xlsx_to_md.py "C:\Path\To\Spreadsheet.xlsx" "C:\Path\To\Output"

See doc_to_md/README.md for detailed documentation.
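The converters above are shown with a single input file. If you have a folder of documents, a small wrapper can call the script once per file. The sketch below assumes it is run from inside doc_to_md and that `convert_word_to_md.py` takes an input file followed by an output folder, as in the command shown above; the helper names `build_commands` and `convert_all` are hypothetical.

```python
import subprocess
import sys
from pathlib import Path


def build_commands(input_dir, output_dir, script="convert_word_to_md.py"):
    """Build one command line per .docx file found directly in `input_dir`.

    Assumes the script accepts `<input file> <output folder>`, matching the
    usage shown in this README.
    """
    return [
        [sys.executable, script, str(doc), str(output_dir)]
        for doc in sorted(Path(input_dir).glob("*.docx"))
    ]


def convert_all(input_dir, output_dir):
    """Run the converter for every document; stop at the first failure."""
    for cmd in build_commands(input_dir, output_dir):
        subprocess.run(cmd, check=True)
```

For example, `convert_all(r"C:\Path\To\Documents", r"C:\Path\To\Output")` would convert each Word file in turn. The same pattern should work for the other converters by passing a different `script` name and adjusting the glob pattern.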
This project is licensed under the MIT License - see the LICENSE file for details.