Differentiable synthesis of DJ scratching. A hybrid DDSP system that combines a fixed differentiable resampler (variable-rate playback) with a learned residual generator (vinyl texture), plus VelocityNet for estimating playback velocity from audio.
Paper: DDScratch: Differentiable Synthesis of DJ Scratching (DAFx26)
Synthesis (inference): source audio + velocity + fader → resampler → clean audio + residual → scratch output
Analysis (velocity estimation): source audio + scratch recording → VelocityNet → estimated velocity + fader
The resampler has no learnable parameters — it performs variable-rate interpolation of the source. The residual generator (~200K params) learns setup-specific vinyl texture from ~30 real recordings in under 2 minutes. VelocityNet (1.7M params) is a 1D U-Net with mel spectrogram similarity alignment, trained once on 10K synthetic examples.
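To make the fixed resampler concrete, here is a minimal differentiable variable-rate resampler in PyTorch. This is an illustrative sketch under assumed conventions (1-D tensors, per-sample velocity and fader gain), not the repo's actual `resample.py`:

```python
import torch

def resample_sketch(source, velocity, fader):
    """Variable-rate playback sketch: integrate per-sample velocity into a
    read position, then linearly interpolate the source there. Illustrative
    only -- the repo's resample.py may differ in API and detail."""
    n = source.shape[-1]
    # Playhead position in source samples: velocity 1.0 is normal speed,
    # negative values play backwards (the scratch).
    pos = torch.cumsum(velocity, dim=-1).clamp(0.0, n - 1)
    idx = pos.floor().long().clamp(max=n - 2)
    frac = pos - idx.float()
    # Linear interpolation is differentiable w.r.t. velocity through `frac`,
    # which is what lets gradients flow through the resampler.
    clean = source[idx] * (1.0 - frac) + source[idx + 1] * frac
    # Crossfader gain (0 = closed, 1 = open) gates the output rhythmically.
    return clean * fader
```

Because every operation except the integer floor carries a gradient, a spectral loss on the output can backpropagate into the velocity signal, which is what makes the hybrid system trainable end to end.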
Requires Python 3.13+ and uv.
```
git clone <repo-url>
cd ddscratch
uv sync
```

Generate the synthetic velocity training set:

```
uv run python scripts/generate_velocity_dataset.py \
    --output-dir data/velocity_train \
    --num-examples 10000
```

VelocityNet is trained in two stages. Stage 1 — bootstrap similarity features (no direction augmentation):

```
uv run python scripts/train_velocity_net.py \
    --data-dir data/velocity_train \
    --epochs 50 --batch-size 8 --lr 1e-3 \
    --save-dir checkpoints/velnet_s1 \
    --no-direction-augment
```

Stage 2 — learn direction discrimination:

```
uv run python scripts/train_velocity_net.py \
    --data-dir data/velocity_train \
    --epochs 50 --batch-size 8 --lr 1e-3 \
    --save-dir checkpoints/velnet_s2 \
    --resume checkpoints/velnet_s1/velnet_epoch050.pt \
    --weights-only
```

Estimate velocities for the real recordings:

```
uv run python scripts/apply_velocity_net.py \
    --checkpoint checkpoints/velnet_s2/velnet_epoch050.pt
```

Build the real training split from the estimated velocities:

```
uv run python scripts/build_real_dataset.py \
    --velocity-dir data/real_velocity_neural \
    --output-dir data/split/my_train \
    --fader-source energy \
    --split-file data/split.json \
    --split train
```

Train the DDSP scratch model:

```
uv run python -m ddscratch.train \
    --data-dir data/split/my_train \
    --epochs 200 --batch-size 8 \
    --save-dir checkpoints/ddsp
```

Evaluate on the validation split:

```
uv run python scripts/eval_ddsp.py \
    --checkpoint checkpoints/ddsp/checkpoint_epoch200.pt \
    --data-dir data/split/my_val \
    --save-audio
```

Run the test suite:

```
uv run pytest
```

Pre-trained checkpoints:

| File | Description | Size |
|---|---|---|
| `checkpoints/velocity_net_v16/velnet_epoch050.pt` | VelocityNet with mel similarity (best) | 21 MB |
| `checkpoints/split/neural_efader_v16/checkpoint_epoch200.pt` | DDSP model trained with v16 velocities | 16 MB |
| `checkpoints/split/neural_efader_v10pg/checkpoint_epoch200.pt` | DDSP model trained with v10pg velocities | 16 MB |
Load a checkpoint:

```python
import torch

from ddscratch.velocity_net import VelocityNet

model = VelocityNet()
ckpt = torch.load(
    "checkpoints/velocity_net_v16/velnet_epoch050.pt",
    map_location="cpu",
)
model.load_state_dict(ckpt["model_state_dict"])
model.eval()
```

Repository layout:

```
ddscratch/
├── ddscratch/
│   ├── model.py           # ScratchModel (DiffResampler + ResidualGenerator)
│   ├── velocity_net.py    # VelocityNet (1D U-Net + MelSimilarityAlignment)
│   ├── train.py           # DDSP training loop
│   ├── data.py            # Dataset and dataloader
│   ├── losses.py          # Multi-scale spectral loss
│   ├── trajectory.py      # Procedural scratch trajectory generator
│   ├── dataset.py         # Synthetic dataset generation
│   ├── velocity_data.py   # Velocity training dataset
│   └── resample.py        # Differentiable resampler
├── scripts/
│   ├── train_velocity_net.py
│   ├── apply_velocity_net.py
│   ├── build_real_dataset.py
│   ├── generate_velocity_dataset.py
│   ├── eval_ddsp.py
│   └── optuna_ddsp.py
├── paper/
│   ├── DDScratch.tex      # DAFx26 paper
│   └── slides.tex         # Presentation slides
├── checkpoints/           # Pre-trained models (see above)
└── tests/
```
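DDSP training is driven by the multi-scale spectral loss in `losses.py`. A minimal sketch of such a loss is shown below; the FFT sizes and the log-magnitude term are illustrative DDSP-style choices, not necessarily the repo's exact implementation:

```python
import torch

def multiscale_spectral_loss(pred, target, fft_sizes=(2048, 1024, 512, 256)):
    """L1 distance between magnitude spectrograms at several FFT sizes,
    plus a log-magnitude term. Hyperparameters here are illustrative."""
    loss = 0.0
    for n_fft in fft_sizes:
        window = torch.hann_window(n_fft, device=pred.device)

        def mag(x):
            # Magnitude spectrogram at the current resolution.
            return torch.stft(
                x, n_fft, hop_length=n_fft // 4, window=window,
                return_complex=True,
            ).abs()

        p, t = mag(pred), mag(target)
        # Linear-magnitude term captures overall energy placement;
        # log-magnitude term emphasizes quieter spectral detail.
        loss = loss + (p - t).abs().mean()
        loss = loss + (torch.log(p + 1e-7) - torch.log(t + 1e-7)).abs().mean()
    return loss
```

Comparing magnitudes at several resolutions trades off time and frequency precision, which suits the fast transients of scratching better than any single FFT size.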
Citation:

```bibtex
@inproceedings{ddscratch2026,
  title     = {{DDScratch}: Differentiable Synthesis of {DJ} Scratching},
  author    = {Lai, Po-Hsuan},
  booktitle = {Proceedings of the 29th International Conference on Digital Audio Effects (DAFx26)},
  year      = {2026},
}
```

TBD