- Python 3.11+
- Docker & Docker Compose
- Git
```bash
# Clone repository
git clone https://github.com/nolancacheux/AI-Product-Photo-Detector.git
cd AI-Product-Photo-Detector
```
```bash
# With uv (recommended)
uv venv
source .venv/bin/activate
uv pip install -e ".[dev,ui]"

# Or with pip
python -m venv .venv
source .venv/bin/activate
pip install -e ".[dev,ui]"
```
```bash
# Install pre-commit hooks
pre-commit install
```

We use Ruff for linting and formatting:
```bash
# Check linting
ruff check src/ tests/

# Format code
ruff format src/ tests/
```

We use mypy for static type checking:
```bash
mypy src/
```

Pre-commit runs automatically on `git commit`. To run manually:

```bash
pre-commit run --all-files
```
```bash
# Run all tests
pytest

# Run with coverage
pytest --cov=src --cov-report=html

# Run specific test file
pytest tests/test_model.py

# Run specific test
pytest tests/test_model.py::TestAIImageDetector::test_model_creation
```

- Place tests in the `tests/` directory
- Use `pytest` fixtures for shared setup
- Aim for >80% code coverage
- Test both success and error cases
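As a sketch of these guidelines, a test module might use a shared fixture and cover both a success and an error case. `DummyDetector` and its interface below are illustrative placeholders, not the project's actual API:

```python
import pytest


class DummyDetector:
    """Stand-in for a model wrapper; predict returns a label and a score."""

    def predict(self, image_bytes: bytes) -> dict:
        if not image_bytes:
            raise ValueError("empty image input")
        return {"label": "ai_generated", "score": 0.97}


@pytest.fixture
def detector() -> DummyDetector:
    # Shared setup: each test receives a fresh detector instance
    return DummyDetector()


def test_predict_returns_label_and_score(detector):
    # Success case: a non-empty payload yields a label and a bounded score
    result = detector.predict(b"\x89PNG")
    assert set(result) == {"label", "score"}
    assert 0.0 <= result["score"] <= 1.0


def test_predict_rejects_empty_input(detector):
    # Error case: an empty payload should raise, not return a prediction
    with pytest.raises(ValueError):
        detector.predict(b"")
```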
- `feature/description` - New features
- `fix/description` - Bug fixes
- `docs/description` - Documentation
- `refactor/description` - Code refactoring
Follow Conventional Commits:

```
type(scope): description

[optional body]

[optional footer]
```
Types:
- `feat`: New feature
- `fix`: Bug fix
- `docs`: Documentation
- `style`: Formatting
- `refactor`: Code restructuring
- `test`: Tests
- `chore`: Maintenance
Examples:
```
feat(api): add batch prediction endpoint
fix(model): handle empty image input
docs(readme): update installation instructions
test(api): add tests for health endpoint
```
- Create a feature branch from `main`
- Make your changes with atomic commits
- Ensure tests pass: `pytest`
- Ensure linting passes: `ruff check`
- Update documentation if needed
- Submit a PR with a clear description
```
AI-Product-Photo-Detector/
├── .github/workflows/ # CI/CD pipelines (ci, cd, model-training, pr-preview)
├── src/
│ ├── data/ # Data download and validation
│ │ └── validate.py # Dataset validation
│ ├── inference/ # API server
│ │ ├── routes/ # API route handlers
│ │ │ ├── v1/ # Versioned API routes
│ │ │ ├── info.py # Info/health endpoints
│ │ │ ├── monitoring.py # Monitoring endpoints
│ │ │ └── predict.py # Prediction endpoints
│ │ ├── api.py # FastAPI application
│ │ ├── predictor.py # Model loading and inference
│ │ ├── explainer.py # Grad-CAM heatmap generation
│ │ ├── confidence.py # Confidence calibration
│ │ ├── auth.py # API key authentication
│ │ ├── validation.py # Input validation
│ │ ├── schemas.py # Pydantic models
│ │ ├── shadow.py # Shadow model comparison
│ │ ├── state.py # Application state
│ │ └── rate_limit.py # Rate limiting
│ ├── training/ # Model training
│ │ ├── train.py # Training loop with MLflow
│ │ ├── model.py # EfficientNet-B0 architecture
│ │ ├── dataset.py # PyTorch dataset
│ │ ├── augmentation.py # Data augmentation
│ │ ├── gcs.py # GCS integration
│ │ └── vertex_submit.py # Vertex AI job submission
│ ├── pipelines/ # Pipeline orchestration
│ │ ├── training_pipeline.py # End-to-end training pipeline
│ │ └── evaluate.py # Model evaluation pipeline
│ ├── monitoring/ # Observability
│ │ ├── drift.py # Data/model drift detection
│ │ └── metrics.py # Prometheus metrics
│ ├── ui/ # Streamlit web interface
│ │ └── app.py # Streamlit application
│ └── utils/ # Shared utilities
│ ├── config.py # Configuration management
│ ├── constants.py # Project constants
│ ├── logger.py # Structured logging
│ └── model_loader.py # Model loading utilities
├── tests/ # Unit and integration tests
├── configs/ # Configuration files
│ ├── train_config.yaml # Training hyperparameters
│ ├── inference_config.yaml # Inference settings
│ ├── pipeline_config.yaml # Pipeline configuration
│ ├── prometheus.yml # Prometheus scrape config
│ └── grafana/ # Grafana dashboards
├── docker/ # Dockerfiles (API, training, UI)
├── terraform/ # Infrastructure as Code
│ ├── modules/ # Reusable Terraform modules
│ │ ├── cloud-run/ # Cloud Run service
│ │ ├── storage/ # GCS buckets
│ │ ├── registry/ # Artifact Registry
│ │ ├── iam/ # Service accounts and roles
│ │ └── monitoring/ # Uptime checks and alerts
│ └── environments/ # Environment-specific configs (dev, prod)
├── scripts/ # Data download utilities
├── notebooks/ # Jupyter notebooks (Colab training)
├── dvc.yaml # DVC pipeline definition
├── docker-compose.yml # Local development stack
├── docker-compose.dev.yml # Dev-specific overrides
├── docker-compose.prod.yml # Prod-specific overrides
├── Makefile # Development commands
└── pyproject.toml # Python dependencies
```
```bash
make help        # List all commands
make dev         # Install dev dependencies + pre-commit
make lint        # Ruff + mypy
make format      # Auto-format code
make test        # Run pytest with coverage
make data        # Download CIFAKE dataset
make train       # Train model
make serve       # Start API (dev mode)
make docker-up   # Start full stack (API + UI + MLflow)
make deploy      # Trigger Cloud Run deploy via GitHub Actions
```

Dataset files are tracked with DVC. Never commit raw data to Git.
```bash
# Pull existing data
dvc pull

# After adding new data
dvc add data/processed
git add data/processed.dvc
git commit -m "data: update processed dataset"
dvc push
```
```bash
# Build images
make docker-build

# Run full stack
make docker-up

# Check logs
make docker-logs

# Tear down
make docker-down
```

The `terraform/` directory provisions GCP resources using a modular structure. See `INFRASTRUCTURE.md` for full details.
```bash
# Choose environment
cd terraform/environments/dev   # or prod

# Configure
cp terraform.tfvars.example terraform.tfvars   # Edit with your project ID

# Deploy
terraform init
terraform plan
terraform apply
```

- Update `README.md` for user-facing changes
- Update `docs/ARCHITECTURE.md` for system design changes
- Add docstrings to all public functions
- Use Google-style docstrings
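For reference, a minimal Google-style docstring might look like the following. The function itself is a hypothetical example, not part of the codebase:

```python
def normalize_score(raw: float, lo: float = 0.0, hi: float = 1.0) -> float:
    """Clamp a raw model score into a closed interval.

    Args:
        raw: Unbounded score from the model head.
        lo: Lower bound of the output range.
        hi: Upper bound of the output range.

    Returns:
        The score clamped to ``[lo, hi]``.

    Raises:
        ValueError: If ``lo`` is greater than ``hi``.
    """
    if lo > hi:
        raise ValueError("lo must not exceed hi")
    return max(lo, min(hi, raw))
```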
- Update the version in `pyproject.toml` and `src/__init__.py`
- Update `CHANGELOG.md` with the new version and changes
- Create a release tag: `git tag v1.x.x`
- Push the tag: `git push origin v1.x.x`
- The CI/CD pipeline automatically builds and deploys