Closed
Description
The `PaliGemma` module is currently incomplete and not ready for use. We aim to refactor the `PaliGemma` codebase to align with the structure and functionality of the `Florence_2` module. This involves updating the code structure, implementing a command-line interface (CLI) using `typer`, documenting the code thoroughly, and ensuring the module is fully operational.
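As a rough illustration of the CLI piece, here is a minimal sketch of what a `typer`-based entrypoint could look like. The command names (`train`, `evaluate`) and every option are hypothetical placeholders, not the actual interface of the Florence-2 module; the bodies only echo their arguments where a real implementation would call into the training code.

```python
import typer
from typer.testing import CliRunner

# Hypothetical entrypoint; command and option names are assumptions
# made for illustration, not the project's real CLI.
app = typer.Typer(help="Illustrative PaliGemma CLI sketch.")


@app.command()
def train(
    dataset: str = typer.Option(..., help="Path to a dataset directory."),
    epochs: int = typer.Option(10, help="Number of training epochs."),
) -> None:
    """Placeholder: a real command would invoke the training loop."""
    typer.echo(f"train dataset={dataset} epochs={epochs}")


@app.command()
def evaluate(
    dataset: str = typer.Option(..., help="Path to a dataset directory."),
    checkpoint: str = typer.Option(..., help="Path to a saved checkpoint."),
) -> None:
    """Placeholder: a real command would run evaluation."""
    typer.echo(f"evaluate dataset={dataset} checkpoint={checkpoint}")


# Exercise the CLI in-process, as one might in a unit test.
runner = CliRunner()
result = runner.invoke(app, ["train", "--dataset", "data", "--epochs", "3"])
```

In a real `entrypoint.py`, calling `app()` under an `if __name__ == "__main__":` guard would expose the same commands on the command line.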
Acceptance Criteria
- Code Restructuring:
  - Organize the PaliGemma codebase to mirror the Florence-2 module structure, making appropriate changes where necessary.
  - Create necessary submodules such as `core.py`, `entrypoint.py`, `checkpoints.py`, etc.
- Implement CLI with `typer`.
- Update training and evaluation logic. Refactor training loops, data loaders, and model preparation to match Florence-2 tracking. Due to model-specific requirements, custom data loaders may be needed. Make sure they accept data in `JSONL` format.
- Add Documentation:
  - Write comprehensive docstrings for all user-facing classes, functions, and methods.
  - Ensure code is well-commented and follows PEP 8 guidelines.
- Validate the training and evaluation processes end-to-end. Provide a notebook showing a usage example.