Fix per-example loss scaling for mixed-length batches by taivu1998 · Pull Request #572 · aqlaboratory/openfold

taivu1998 · 2026-05-10T11:40:38Z

Summary

Keep each AlphaFold loss component as a per-example tensor inside AlphaFoldLoss until final aggregation.
Apply sqrt(min(seq_length, crop_len)) independently per local batch example before taking the final mean.
Add focused regression coverage for per-example reduction paths and mixed-length aggregation.

Root Cause

Issue #517 points out that the loss scale was computed from the average sequence length of the local batch, then applied after the component losses had already been averaged. For a local batch whose examples have different sequence lengths, every example is therefore scaled by the same batch-wide factor, which matches its own sqrt(min(seq_length, crop_len)) only by coincidence.
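The gap between the two orderings can be seen with a small worked example. This is a minimal sketch in plain Python rather than the tensor code in openfold/utils/loss.py. The function names and the numbers are hypothetical, chosen only to illustrate the behavior described above.

```python
import math


def batch_mean_scaled_loss(losses, seq_lengths, crop_len):
    """Old ordering (as described in the root cause): average the losses,
    then apply one factor computed from the batch-average sequence length."""
    mean_len = sum(seq_lengths) / len(seq_lengths)
    mean_loss = sum(losses) / len(losses)
    return mean_loss * math.sqrt(min(mean_len, crop_len))


def per_example_scaled_loss(losses, seq_lengths, crop_len):
    """New ordering: scale each example by sqrt(min(seq_length, crop_len)),
    then take the mean over the local batch."""
    scaled = [
        loss * math.sqrt(min(length, crop_len))
        for loss, length in zip(losses, seq_lengths)
    ]
    return sum(scaled) / len(scaled)


# Hypothetical mixed-length local batch of two examples.
losses = [1.0, 3.0]
seq_lengths = [64, 256]
crop_len = 256

old = batch_mean_scaled_loss(losses, seq_lengths, crop_len)   # 2.0 * sqrt(160), about 25.30
new = per_example_scaled_loss(losses, seq_lengths, crop_len)  # (1*8 + 3*16) / 2 = 28.0
print(old, new)
```

When every example in the batch has the same length, the two functions agree, which is why the problem only shows up for mixed-length batches.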

Changes

Added lightweight loss reduction helpers so existing public loss functions still default to scalar means while AlphaFoldLoss can request reduction="none".
Updated distogram, experimentally resolved, FAPE/backbone, pLDDT, masked MSA, supervised chi, TM, and violation losses to preserve per-example values when requested.
Changed AlphaFoldLoss to validate per-example component shapes, aggregate weighted per-example losses, apply per-example sequence-length scaling, and then mean over the local batch.
Fixed violation clash normalization to use per-example atom counts instead of one local-batch-wide atom count.
Added tests for the new reduction contract and for the exact mixed-length scaling regression.
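
The reduction contract and the aggregation order listed above can be sketched as follows. This is an illustrative outline in plain Python, not the PR's implementation. The helper names (reduce_loss, check_per_example, aggregate_losses), the component names, and the weights are all assumptions made for the example.

```python
import math


def reduce_loss(per_example, reduction="mean"):
    """Hypothetical reduction helper: scalar mean by default, or the
    untouched per-example values when reduction="none"."""
    if reduction == "none":
        return list(per_example)
    if reduction == "mean":
        return sum(per_example) / len(per_example)
    raise ValueError(f"unsupported reduction: {reduction!r}")


def check_per_example(name, values, batch_size):
    """Reject a component that does not carry exactly one value per example."""
    if len(values) != batch_size:
        raise ValueError(
            f"{name}: expected {batch_size} per-example values, got {len(values)}"
        )


def aggregate_losses(components, weights, seq_lengths, crop_len):
    """Weighted per-example sum, per-example length scaling, then batch mean."""
    batch_size = len(seq_lengths)
    totals = [0.0] * batch_size
    for name, values in components.items():
        check_per_example(name, values, batch_size)
        for i, value in enumerate(values):
            totals[i] += weights[name] * value
    # Scale each example by its own sqrt(min(seq_length, crop_len)).
    scaled = [
        total * math.sqrt(min(length, crop_len))
        for total, length in zip(totals, seq_lengths)
    ]
    return reduce_loss(scaled, reduction="mean")


# Hypothetical per-example component losses for a batch of two.
components = {"fape": [1.0, 2.0], "distogram": [0.5, 1.5]}
weights = {"fape": 1.0, "distogram": 0.5}
total = aggregate_losses(components, weights, seq_lengths=[64, 256], crop_len=256)
print(total)
```

Keeping the default at "mean" means callers of the individual loss functions see no change, while the aggregator opts into per-example values explicitly.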

Fixes #517.

Validation

python -m py_compile openfold/utils/loss.py tests/test_loss.py
git diff --check
Dependency-light smoke validation covering the changed helpers, component reduction="none" paths, violation atom normalization, and full AlphaFoldLoss mixed-length aggregation.

Notes

The full targeted pytest command could not run in this local checkout because the available Python environment is incomplete. uv run pytest ... fails on invalid llvmlite egg metadata in the local Anaconda install, and python -m pytest ... fails during collection because ml_collections is missing.

Fix per-example loss scaling

a4bc32e

taivu1998 marked this pull request as ready for review May 11, 2026 03:44

taivu1998 wants to merge 1 commit into aqlaboratory:main from taivu1998:tdv/issue-517-loss-scaling
