Search before asking
- I have searched the RF-DETR issues and found no similar feature requests.
Description
When using pretrain_weights for transfer learning, epoch 0 evaluates the pretrained model
before any training occurs. BestMetricHolder and EarlyStoppingCallback both treat this
evaluation as a valid candidate for "best model".
If the target dataset is harder or smaller than the pretraining dataset, the pretrained mAP
at epoch 0 may be higher than the mAP of any subsequent trained epoch, causing the saved
checkpoint_best_total.pth to always be the untrained pretrained weights and early stopping
to trigger prematurely.
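A minimal sketch of the failure mode. This is a simplified, hypothetical stand-in for the tracker in util/utils.py, not the actual RF-DETR implementation:

```python
class BestMetricSingle:
    """Simplified best-metric tracker illustrating the issue."""

    def __init__(self, init_res=0.0):
        self.best_res = init_res
        self.best_epoch = -1

    def update(self, new_res, epoch):
        # Epoch 0 (the pretrained-weights evaluation) is treated
        # like any other epoch, so it can become the "best".
        if new_res > self.best_res:
            self.best_res = new_res
            self.best_epoch = epoch
            return True
        return False


tracker = BestMetricSingle()
# Epoch 0: pretrained COCO weights score high on the new dataset...
tracker.update(0.84, 0)  # True -> checkpoint_best_total.pth saved
# ...then mAP dips while the model adapts to the new domain.
tracker.update(0.41, 1)  # False
tracker.update(0.58, 2)  # False -> "best" stays the untrained model
```

Unless a later epoch beats 0.84, the saved "best" checkpoint is the untrained pretrained model, and a patience-based early stop counts every one of those epochs against the run.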
Proposed solution: Add a skip_best_epochs (or warmup_best_epochs) parameter that
excludes the first N epochs from best-model tracking and early stopping evaluation.
For example, skip_best_epochs=3 would:
- Prevent `BestMetricSingle.update()` from recording a new best during epochs 0–2
- Prevent `EarlyStoppingCallback` from counting patience during epochs 0–2
- Reset baselines at epoch `skip_best_epochs` so the first eligible epoch becomes the initial reference point
Note: the existing warmup_epochs parameter only affects the LR schedule (linear warmup
before cosine annealing) and does not influence best-model selection or early stopping.
Use case
Industrial transfer learning: we fine-tune RF-DETR (pretrained on COCO) on a domain-specific
dataset with fewer classes and different image characteristics. The pretrained model achieves
~0.84 mAP on epoch 0 (before any training), but training on the new domain initially drops
mAP before recovering. Without skipping early epochs, the "best" checkpoint is always the
untrained pretrained model, and early stopping halts training before the model can adapt.
This affects anyone using pretrain_weights for transfer learning on datasets where the
pretrained model's initial evaluation score is artificially high relative to the training
trajectory. A skip_best_epochs parameter would let users control when best-model tracking
begins, ensuring the saved checkpoint reflects actual training progress.
Additional
Proposed solution:
Add a skip_best_epochs (or warmup_best_epochs) integer parameter to RFDETRConfig
(default 0 for backward compatibility). During training:
- `BestMetricSingle.update(new_res, epoch)` returns `False` for `epoch < skip_best_epochs`, so no checkpoint is saved as "best" during warmup
- `EarlyStoppingCallback.update(epoch)` skips patience counting for `epoch < skip_best_epochs`
- At `epoch == skip_best_epochs`, `best_res` is reset to `init_res` (0.0 or -1.0) so the
first eligible epoch establishes a fair baseline from actual training, not from the pretrained
evaluation
This would be a small change in BestMetricSingle.update() (util/utils.py) and
EarlyStoppingCallback.update() (util/early_stopping.py), plus the config parameter.
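A rough sketch of what the change could look like. All class names, signatures, and defaults below are assumptions based on this issue's description, not the actual RF-DETR code:

```python
class BestMetricSingle:
    """Best-metric tracker with a proposed skip_best_epochs warmup window."""

    def __init__(self, init_res=0.0, skip_best_epochs=0):
        self.init_res = init_res
        self.best_res = init_res
        self.best_epoch = -1
        self.skip_best_epochs = skip_best_epochs

    def update(self, new_res, epoch):
        if epoch < self.skip_best_epochs:
            return False  # ignore warmup evaluations entirely
        if epoch == self.skip_best_epochs:
            # Reset the baseline so the first eligible epoch is compared
            # against init_res, not against the pretrained evaluation.
            self.best_res = self.init_res
        if new_res > self.best_res:
            self.best_res = new_res
            self.best_epoch = epoch
            return True
        return False


class EarlyStoppingCallback:
    """Early stopping that skips patience counting during warmup epochs."""

    def __init__(self, patience=5, skip_best_epochs=0):
        self.patience = patience
        self.skip_best_epochs = skip_best_epochs
        self.best = None
        self.counter = 0

    def update(self, value, epoch):
        if epoch < self.skip_best_epochs:
            return False  # no patience counting during warmup
        if self.best is None or value > self.best:
            self.best = value
            self.counter = 0
        else:
            self.counter += 1
        return self.counter >= self.patience  # True -> stop training


# With skip_best_epochs=3, the epoch-0 pretrained evaluation is ignored
# and epoch 3 establishes the baseline from actual training:
tracker = BestMetricSingle(skip_best_epochs=3)
tracker.update(0.84, 0)  # False: pretrained evaluation ignored
tracker.update(0.58, 3)  # True: first eligible epoch becomes the best
```

A `skip_best_epochs=0` default keeps current behavior, so existing configs are unaffected.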
Note: `freeze_encoder` also does not address this; the issue is in metric tracking, not in
weight updates.
Are you willing to submit a PR?
- Yes I'd like to help by submitting a PR!