appr-photos

Adversary-Adaptive Representation Learning for Privacy-Preserving Image Tasks

CS 750/850 Course Project — University of New Hampshire

Overview

This project learns image representations that remain predictive of utility labels (for example, emotion or category) while suppressing sensitive biometric attributes (identity, gender, age).

Architecture

Image → CNN Feature Extractor
      → Privacy Filter (Conv1D + VIB)
      → Task Model (Attention Pooling + Classifier)  [utility]
      → GRL → Multi-Head Adversary                   [privacy]

Training objective: min_{θ,φ} max_{ψ} L_utility - λ · L_privacy
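
One way to read this objective, assuming θ denotes the privacy filter, φ the task model, ψ the adversary, and L_privacy the adversary's classification loss (the README does not spell out this mapping): the adversary is trained to reduce L_privacy, while the filter is trained to increase it. A gradient reversal layer (GRL) is the identity in the forward pass and multiplies the incoming gradient by -λ in the backward pass, so both goals can be served by one backward pass. A sketch of the resulting updates, with a hypothetical learning rate η and ignoring any constant scaling of the adversary's step:

```latex
\begin{aligned}
\psi   &\leftarrow \psi   - \eta \, \nabla_{\psi} L_{\text{privacy}}
       && \text{(adversary: predict the sensitive attributes better)} \\
\phi   &\leftarrow \phi   - \eta \, \nabla_{\phi} L_{\text{utility}}
       && \text{(task model: predict the utility label better)} \\
\theta &\leftarrow \theta - \eta \left( \nabla_{\theta} L_{\text{utility}}
          - \lambda \, \nabla_{\theta} L_{\text{privacy}} \right)
       && \text{(filter: the sign flip on } L_{\text{privacy}} \text{ comes from the GRL)}
\end{aligned}
```

Under this reading, λ = 0 reduces the filter update to plain utility training, and larger λ trades utility for stronger suppression of the sensitive attributes.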

Environment Setup

bash scripts/setup_env.sh appr-photos 3.10 auto
conda activate appr-photos

The auto backend uses CPU wheels on Linux and standard PyTorch wheels on Apple Silicon. Pass cpu explicitly when you want a small CPU-only environment:

bash scripts/setup_env.sh appr-photos 3.10 cpu
conda activate appr-photos

Use a CUDA wheel tag only when you want an accelerated PyTorch install:

bash scripts/setup_env.sh appr-photos 3.10 cuda:cu128
conda activate appr-photos

Dataset Download

Recommended practical dataset: CelebA.

This repo downloads CelebA through torchvision's official dataset integration and builds a repo-compatible metadata.csv automatically. The default setup uses:

utility label: Smiling vs not_smiling
privacy labels: identity and gender

bash scripts/download_data.sh celeba data/raw/celeba

This writes the prepared dataset to data/raw/celeba and creates metadata.csv. For a detailed setup, dataset, training, and comparison guide, see docs/setup_dataset_runbook.md. For a direct training command checklist, see docs/training_commands.md.

Custom Dataset Preparation

Organize images under data/raw/celeba:

data/raw/celeba/
  <class_name>/
    <speaker_id>/image_001.jpg
    <speaker_id>/image_002.jpg

Or place metadata.csv in data/raw/celeba/ with fields such as: filename,utility_label,speaker_id,gender,age.

Verification and metadata generation:

python scripts/prepare_datasets.py --verify --root data/raw/celeba
python scripts/prepare_datasets.py --build-metadata --root data/raw/celeba
python scripts/prepare_datasets.py --stats --root data/raw/celeba
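
To make the relationship between the directory layout and metadata.csv concrete, here is a minimal sketch that walks <class_name>/<speaker_id>/<image> and emits one row per image. It is not the logic of scripts/prepare_datasets.py, which is not shown here; the function names build_metadata_rows and write_metadata and the set of accepted file suffixes are assumptions made for illustration. The gender and age columns are left out because they cannot be inferred from the folder names.

```python
import csv
from pathlib import Path

# Assumed set of image suffixes; adjust to whatever the dataset contains.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def build_metadata_rows(root):
    """Walk <root>/<class_name>/<speaker_id>/<image> and return one dict per image."""
    root = Path(root)
    rows = []
    for path in sorted(root.glob("*/*/*")):
        if not path.is_file() or path.suffix.lower() not in IMAGE_SUFFIXES:
            continue  # skip sub-directories and non-image files
        class_name, speaker_id = path.parts[-3], path.parts[-2]
        rows.append({
            "filename": path.relative_to(root).as_posix(),
            "utility_label": class_name,
            "speaker_id": speaker_id,
        })
    return rows


def write_metadata(rows, csv_path):
    """Write the rows with a header line matching the first three documented fields."""
    with open(csv_path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=["filename", "utility_label", "speaker_id"])
        writer.writeheader()
        writer.writerows(rows)
```

For example, write_metadata(build_metadata_rows("data/raw/celeba"), "data/raw/celeba/metadata.csv") would produce a file whose header is filename,utility_label,speaker_id.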

Training

# CelebA baseline training
python scripts/train.py --config configs/experiment/celeba_baseline.yaml

# Larger-batch config for machines with enough accelerator or CPU memory
python scripts/train.py --config configs/experiment/celeba_accelerated.yaml

# Precompute features for faster training loops (optional)
python scripts/precompute_features.py --config configs/experiment/celeba_baseline.yaml
python scripts/train.py --config configs/experiment/celeba_cached.yaml

# Override config values from CLI
python scripts/train.py --config configs/experiment/celeba_baseline.yaml training.num_epochs=20 training.lambda_privacy=0.05

The CelebA configs use smiling classification as the utility task and identity/gender as the baseline privacy targets.
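
The key=value overrides in the last training command address nested config entries by dotted path. The sketch below shows one plausible way such overrides could be merged into a nested dictionary; it is not taken from scripts/train.py, and the function name apply_overrides and the starting values in config are hypothetical.

```python
import ast


def apply_overrides(config, overrides):
    """Apply 'a.b.c=value' strings to a nested dict in place and return it."""
    for item in overrides:
        dotted_key, _, raw_value = item.partition("=")
        try:
            # Turn "20" into an int, "0.05" into a float, "True" into a bool, etc.
            value = ast.literal_eval(raw_value)
        except (ValueError, SyntaxError):
            # Anything that is not a Python literal stays a plain string.
            value = raw_value
        *parents, leaf = dotted_key.split(".")
        node = config
        for key in parents:
            node = node.setdefault(key, {})  # create intermediate sections on demand
        node[leaf] = value
    return config


# Hypothetical starting values, standing in for what the YAML file would provide.
config = {"training": {"num_epochs": 10, "lambda_privacy": 0.1}}
apply_overrides(config, ["training.num_epochs=20", "training.lambda_privacy=0.05"])
print(config)  # {'training': {'num_epochs': 20, 'lambda_privacy': 0.05}}
```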

Multi-Attribute Comparison

Run the full comparison with separate utility runs and one combined multi-utility run:

python scripts/run_celeba_attribute_comparison.py \
  --mode both \
  --epochs 10 \
  --batch-size 128 \
  --num-workers 8

This uses CelebA-provided utility labels Smiling, Mouth_Slightly_Open, Eyeglasses, Wearing_Hat, and Blurry, with privacy heads for speaker_id, gender, and young.

Evaluation

python scripts/evaluate.py --checkpoint outputs/celeba_baseline/checkpoints/best_model.pt

Lambda Sweep (Pareto Frontier)

python scripts/sweep_lambda.py --config configs/experiment/celeba_baseline.yaml --epochs 20
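
A lambda sweep yields one (utility, privacy leakage) pair per value of λ, and the Pareto frontier keeps only the runs that no other run beats on both axes. The following is a minimal sketch of that filtering step, not the code in scripts/sweep_lambda.py; the function name pareto_front, the tuple layout, and the numbers are made up for illustration.

```python
def pareto_front(points):
    """Return the points that are not dominated by any other point.

    Each point is (lambda, utility, leakage): utility is higher-is-better,
    leakage (for example adversary accuracy) is lower-is-better.
    """
    front = []
    for p in points:
        dominated = any(
            q[1] >= p[1] and q[2] <= p[2] and (q[1] > p[1] or q[2] < p[2])
            for q in points
        )
        if not dominated:
            front.append(p)
    # Order the frontier from most private to least private.
    return sorted(front, key=lambda p: p[2])


# Made-up numbers for illustration only, not results from this project.
sweep = [
    (0.00, 0.90, 0.80),
    (0.05, 0.89, 0.60),
    (0.10, 0.85, 0.65),  # dominated by the lambda=0.05 run
    (0.50, 0.80, 0.40),
]
front = pareto_front(sweep)
```

Here the λ=0.10 run is dropped because the λ=0.05 run has both higher utility and lower leakage; the remaining three runs form the frontier.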

Visualization

python scripts/visualize.py --checkpoint outputs/celeba_baseline/checkpoints/best_model.pt

For report-ready figures from a trained run:

python scripts/generate_report_figures.py \
  --checkpoint outputs/celeba_baseline/checkpoints/best_model.pt \
  --output_dir outputs/report_figures

Tests

pytest tests/ -v

Evaluation Metrics

Utility (higher is better): UAR, Weighted Accuracy, Macro F1
Privacy: identity accuracy, gender accuracy, and MI(Z; S) (lower is more private); de-identification rate (higher is more private)
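
As a reference for two of the utility metrics, here is a small dependency-free sketch, assuming UAR stands for unweighted average recall (the mean of per-class recall) and Macro F1 is the unweighted mean of per-class F1. The function names are hypothetical and this is not the implementation under src/aapr/training.

```python
def per_class_counts(y_true, y_pred):
    """Count true positives, false positives, and false negatives per class."""
    classes = sorted(set(y_true) | set(y_pred))
    counts = {c: {"tp": 0, "fp": 0, "fn": 0} for c in classes}
    for t, p in zip(y_true, y_pred):
        if t == p:
            counts[t]["tp"] += 1
        else:
            counts[t]["fn"] += 1  # missed an instance of the true class
            counts[p]["fp"] += 1  # wrongly predicted this class
    return counts


def uar(y_true, y_pred):
    """Mean of per-class recall over the classes that occur in y_true."""
    counts = per_class_counts(y_true, y_pred)
    recalls = [
        c["tp"] / (c["tp"] + c["fn"])
        for c in counts.values()
        if c["tp"] + c["fn"] > 0
    ]
    return sum(recalls) / len(recalls)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1, with F1 = 2TP / (2TP + FP + FN)."""
    counts = per_class_counts(y_true, y_pred)
    f1s = []
    for c in counts.values():
        denom = 2 * c["tp"] + c["fp"] + c["fn"]
        f1s.append(2 * c["tp"] / denom if denom else 0.0)
    return sum(f1s) / len(f1s)
```

Because both metrics average over classes rather than over samples, a model that always predicts the majority class scores poorly on them even when its plain accuracy looks high.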

Project Structure

src/aapr/
├── data/          # Photo dataset loaders and split/collation utils
├── features/      # CNN image feature extractor + feature cache
├── models/        # Privacy filter, task model, adversary, GRL
├── training/      # Adversarial trainer, losses, schedulers, metrics
├── evaluation/    # Evaluator, cross-dataset, Pareto analysis
├── visualization/ # Embeddings, training curves, Pareto plots
└── utils/         # Config, logging, seed, device detection
