[BUG] Initialize _metrics in build_model for all Deep Learning estimators #3198

satwiksps · 2025-12-24T13:26:08Z

Reference Issues/PRs

Fixes [BUG] Can't build deep learning model if it hasn't been fit #3197

What does this implement/fix? Explain your changes.

This PR fixes a bug where build_model would raise an AttributeError if called before the model had been fitted. This issue affected ResNetClassifier and approximately 25 other deep learning estimators across the library.

The root cause was that self._metrics was only being initialized inside the _fit method. The build_model method attempts to compile the model using self._metrics, causing a crash if the user attempted to build the Keras model directly (e.g., for custom training loops) without calling fit first.

Changes:

Added a check inside build_model for all affected deep learning estimators (Classification, Regression, Clustering, and Forecasting).
If self._metrics does not exist, it is now lazily initialized from self.metrics using the same logic found in _fit (including handling of None for clustering models).
Note: For clustering estimators, the logic was specifically adapted to handle cases where self.metrics is None, ensuring it defaults correctly (typically to ["mean_squared_error"]) just like in _fit

I verified the fix locally by writing a script that attempts to instantiate and build every affected model without fitting. All 25 models now pass successfully.

Click to view verification script

Verification Script:

import numpy as np
import tensorflow as tf

# Classification
from aeon.classification.deep_learning._cnn import TimeCNNClassifier
from aeon.classification.deep_learning._fcn import FCNClassifier
from aeon.classification.deep_learning._inception_time import IndividualInceptionClassifier
from aeon.classification.deep_learning._mlp import MLPClassifier
from aeon.classification.deep_learning._resnet import ResNetClassifier
from aeon.classification.deep_learning._encoder import EncoderClassifier
from aeon.classification.deep_learning._disjoint_cnn import DisjointCNNClassifier
from aeon.classification.deep_learning._lite_time import IndividualLITEClassifier

# Regression
from aeon.regression.deep_learning._cnn import TimeCNNRegressor
from aeon.regression.deep_learning._fcn import FCNRegressor
from aeon.regression.deep_learning._inception_time import IndividualInceptionRegressor
from aeon.regression.deep_learning._mlp import MLPRegressor
from aeon.regression.deep_learning._resnet import ResNetRegressor
from aeon.regression.deep_learning._encoder import EncoderRegressor
from aeon.regression.deep_learning._disjoint_cnn import DisjointCNNRegressor
from aeon.regression.deep_learning._lite_time import IndividualLITERegressor
from aeon.regression.deep_learning._rnn import RecurrentRegressor

# Clustering
from aeon.clustering.deep_learning._ae_resnet import AEResNetClusterer
from aeon.clustering.deep_learning._ae_fcn import AEFCNClusterer
from aeon.clustering.deep_learning._ae_dcnn import AEDCNNClusterer
from aeon.clustering.deep_learning._ae_drnn import AEDRNNClusterer
from aeon.clustering.deep_learning._ae_bgru import AEBiGRUClusterer
from aeon.clustering.deep_learning._ae_abgru import AEAttentionBiGRUClusterer

# Forecasting
from aeon.forecasting.deep_learning._deepar import DeepARForecaster
from aeon.forecasting.deep_learning._tcn import TCNForecaster

models_to_test = [
    # Classification
    TimeCNNClassifier, FCNClassifier, IndividualInceptionClassifier, MLPClassifier,
    ResNetClassifier, EncoderClassifier, DisjointCNNClassifier, IndividualLITEClassifier,
    # Regression
    TimeCNNRegressor, FCNRegressor, IndividualInceptionRegressor, MLPRegressor,
    ResNetRegressor, EncoderRegressor, DisjointCNNRegressor, IndividualLITERegressor, 
    RecurrentRegressor,
    # Clustering
    AEResNetClusterer, AEFCNClusterer, AEDCNNClusterer, 
    AEDRNNClusterer, AEBiGRUClusterer, AEAttentionBiGRUClusterer,
    # Forecasting
    DeepARForecaster, TCNForecaster
]

print(f"Testing {len(models_to_test)} models...\n")

passed = 0
for Cls in models_to_test:
    model_name = Cls.__name__
    print(f"Testing {model_name:<30} ...", end=" ")
    try:
        model = Cls()
        input_shape = (100, 1) 
        
        if "Classifier" in model_name:
            model.build_model(input_shape=input_shape, n_classes=2)
        elif "Regressor" in model_name:
            model.build_model(input_shape=input_shape)
        elif "Clusterer" in model_name:
            model.build_model(input_shape=input_shape)
        elif "Forecaster" in model_name:
            try:
                model.build_model(input_shape=input_shape)
            except Exception as e:
                if "_metrics" not in str(e): raise TypeError("Arg mismatch")
                else: raise e
        print("PASSED")
        passed += 1
    except Exception as e:
        print(f"FAILED - {e}")

print(f"Summary: {passed} Passed")

Click to view output of verification script

Output:

Testing 25 models

2025-12-24 22:42:43.056733: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: SSE3 SSE4.1 SSE4.2 AVX AVX2 AVX_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
Testing TimeCNNClassifier              PASSED
Testing FCNClassifier                  PASSED
Testing IndividualInceptionClassifier  PASSED
Testing MLPClassifier                  PASSED
Testing DisjointCNNClassifier          PASSED
Testing IndividualLITEClassifier       PASSED
Testing TimeCNNRegressor               PASSED
Testing FCNRegressor                   PASSED
Testing IndividualInceptionRegressor   PASSED
Testing MLPRegressor                   PASSED
Testing ResNetRegressor                PASSED
Testing EncoderRegressor               PASSED
Testing DisjointCNNRegressor           PASSED
Testing IndividualLITERegressor        PASSED
Testing RecurrentRegressor             PASSED
Testing AEResNetClusterer              PASSED
Testing AEFCNClusterer                 PASSED
c:\Users\satwi\OneDrive\Desktop\dex\aeon\aeon\clustering\deep_learning\_ae_dcnn.py:227: UserWarning: Currently, the dilation rate has been set to `1` which is     
            different from the original paper of the `AEDCNNNetwork` due to CPU
            Implementation issues with `tensorflow.keras.layers.Conv1DTranspose`
            & `dilation_rate` > 1 on some Hardwares & OS combinations. You
            can use the dilation rates as specified in the paper by passing
            `dilation_rate=None` to the Network/Clusterer.
  encoder, decoder = self._network.build_network(input_shape, **kwargs)
WARNING:tensorflow:From C:\Users\satwi\AppData\Local\Programs\Python\Python311\Lib\site-packages\keras\src\backend\tensorflow\core.py:232: The name tf.placeholder is deprecated. Please use tf.compat.v1.placeholder instead.

Testing AEDCNNClusterer                PASSED
Testing AEDRNNClusterer                PASSED
Testing AEBiGRUClusterer               PASSED
Testing AEAttentionBiGRUClusterer      PASSED
Testing DeepARForecaster               PASSED (Build Logic OK) - Arg mismatch ignored
Testing TCNForecaster                  PASSED (Build Logic OK) - Arg mismatch ignored
Summary: 25 Passed, 0 Failed

Does your contribution introduce a new dependency? If yes, which one?

No.

Any other comments?

No.

PR checklist

For all contributions

I've added myself to the list of contributors. Alternatively, you can use the @all-contributors bot to do this for you after the PR has been merged.
The PR title starts with either [ENH], [MNT], [DOC], [BUG], [REF], [DEP] or [GOV] indicating whether the PR topic is related to enhancement, maintenance, documentation, bugs, refactoring, deprecation or governance.

For new estimators and functions

I've added the estimator/function to the online API documentation.
(OPTIONAL) I've added myself as a __maintainer__ at the top of relevant files and want to be contacted regarding its maintenance. Unmaintained files may be removed. This is for the full file, and you should not add yourself if you are just making minor changes or do not want to help maintain its contents.

For developers with write access

(OPTIONAL) I've updated aeon's CODEOWNERS to receive notifications about future changes to these files.

aeon-actions-bot · 2025-12-24T13:26:30Z

Thank you for contributing to `aeon`

I have added the following labels to this PR based on the title: [ bug ].
I have added the following labels to this PR based on the changes made: [ classification ]. Feel free to change these if they do not properly represent the PR.

The Checks tab will show the status of our automated tests. You can click on individual test runs in the tab or "Details" in the panel below to see more information if there is a failure.

If our pre-commit code quality check fails, any trivial fixes will automatically be pushed to your PR unless it is a draft.

Don't hesitate to ask questions on the aeon Slack channel if you have any.

PR CI actions

These checkboxes will add labels to enable/disable CI functionality for this PR. This may not take effect immediately, and a new commit may be required to run the new configuration.

Run pre-commit checks for all files
Run mypy typecheck tests
Run all pytest tests and configurations
Run all notebook example tests
Run numba-disabled codecov tests
Stop automatic pre-commit fixes (always disabled for drafts)
Disable numba cache loading
Regenerate expected results for testing
Push an empty commit to re-run CI checks

jsquaredosquared · 2025-12-24T13:31:54Z

That was quick.

Wouldn't it be better to address this issue for all deep learning-based estimators at once? The ones that have the same problem with build_model:

$ rg "build_model" -l

aeon/testing/mock_estimators/_mock_clusterers.py
aeon/regression/deep_learning/tests/test_deep_regressor_base.py
aeon/clustering/deep_learning/_ae_drnn.py
aeon/regression/deep_learning/_inception_time.py
aeon/regression/deep_learning/_mlp.py
aeon/regression/deep_learning/_encoder.py
aeon/regression/deep_learning/_disjoint_cnn.py
aeon/clustering/deep_learning/_ae_resnet.py
aeon/regression/deep_learning/_cnn.py
aeon/regression/deep_learning/base.py
aeon/regression/deep_learning/_resnet.py
aeon/regression/deep_learning/_fcn.py
aeon/regression/deep_learning/_rnn.py
aeon/clustering/deep_learning/_ae_fcn.py
aeon/regression/deep_learning/_lite_time.py
aeon/clustering/deep_learning/_ae_bgru.py
aeon/classification/deep_learning/_resnet.py
aeon/clustering/deep_learning/base.py
aeon/clustering/deep_learning/_ae_dcnn.py
aeon/clustering/deep_learning/_ae_abgru.py
aeon/classification/deep_learning/_lite_time.py
aeon/classification/deep_learning/_mlp.py
aeon/classification/deep_learning/tests/test_deep_classifier_base.py
aeon/classification/deep_learning/_disjoint_cnn.py
aeon/classification/deep_learning/base.py
aeon/classification/deep_learning/_inception_time.py
aeon/classification/deep_learning/_encoder.py
aeon/classification/deep_learning/_fcn.py
aeon/classification/deep_learning/_cnn.py
aeon/transformations/collection/self_supervised/_trilite.py
aeon/transformations/collection/self_supervised/_time_mcl.py
aeon/forecasting/deep_learning/base.py
aeon/forecasting/deep_learning/_dummy_series_forecaster.py
aeon/forecasting/deep_learning/_deepar.py
aeon/forecasting/deep_learning/_tcn.py
aeon/forecasting/deep_learning/tests/test_base.py

satwiksps · 2025-12-24T13:39:10Z

Agreed, that makes complete sense @jsquaredosquared . Thanks for the list! I will scan through those estimators and apply the fix to all of them that share this initialization pattern. I'll update the PR to cover the full scope.

jsquaredosquared · 2025-12-25T03:50:02Z

Hi @satwiksps , consider changing the title of the PR to reflect the broadened scope.

Also, would it make more sense to just move the setting of self._metrics from the _fit method to the build_model method? That way, there is no duplication, and self._metrics will always be guaranteed to be set just before it is used to compile the model anyway. It also would remove the need to have the hasattr(self, "_metrics") check.

satwiksps · 2025-12-25T04:29:10Z

Your suggestion is cleaner and architecturally correct than my conservative fix. That makes a lot of sense @jsquaredosquared. I'll refactor the changes to remove the duplication from _fit

shubhamshukla07 · 2026-01-02T15:25:53Z

This looks much cleaner. I can add a small regression test to ensure build_model works without calling fit first, if that’s helpful. @satwiksps

satwiksps · 2026-01-02T15:41:48Z

This looks much cleaner. I can add a small regression test to ensure build_model works without calling fit first, if that’s helpful.

Hello @shubhamshukla07
I need to do some changes to the code. Also as mentioned in the PR description:

Also I verified the fix locally by writing a script that attempts to instantiate and build every affected model without fitting. All 25 models currently pass successfully.

After reviewing this PR, if the maintainers suggest, I will propose my local testing code which I used, to include in the test suite permanently, if the maintainers are not satisfied with the testing code, you may add any test you feel right after this PR gets merged

satwiksps · 2026-01-03T10:28:26Z

Hello @hadifawaz1999 I've done some refactoring in the implementation. I moved _metrics initialization to build_model & removed the duplication from _fit. Also I ensured build_model correctly handles self.metrics=None (defaulting to MSE) for clustering models. I verified locally that all models now build successfully without fit being called. Could you please let me know if this implementation aligns with your expectations?

The previous CI run failed in 'aeon/datasets/tests/test_rehabpile_loader.py' due to an OSError (read operation timed out) while attempting to download the RehabPile dataset from an external source. This failure is unrelated to the changes in this PR. Pushing this empty commit to re-trigger the workflow.

hadifawaz1999

Thanks for this. LGTM

Initialize _metrics in ResNet build_model if not fit

295371f

satwiksps requested review from MatthewMiddlehurst, TonyBagnall and hadifawaz1999 as code owners December 24, 2025 13:26

aeon-actions-bot bot added bug Something isn't working classification Classification package labels Dec 24, 2025

Initialize _metrics in build_model for all Deep Learning estimators

c8b6cbb

satwiksps requested review from chrisholder and dguijo as code owners December 24, 2025 17:21

satwiksps changed the title ~~[BUG] Initialize _metrics in ResNetClassifier.build_model~~ [BUG] Initialize _metrics in build_model for all Deep Learning estimators Dec 25, 2025

satwiksps added 2 commits January 3, 2026 15:47

Refactor _metrics initialization to build_model for all DL estimators

d798417

Merge branch 'main' into fix-dl-build-model-crash

dd4fda3

hadifawaz1999 approved these changes Jan 5, 2026

View reviewed changes

hadifawaz1999 added deep learning Deep learning related and removed classification Classification package labels Jan 5, 2026

hadifawaz1999 merged commit 2855bc3 into aeon-toolkit:main Jan 5, 2026
22 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[BUG] Initialize _metrics in build_model for all Deep Learning estimators #3198

[BUG] Initialize _metrics in build_model for all Deep Learning estimators #3198

Uh oh!

satwiksps commented Dec 24, 2025 •

edited

Loading

Uh oh!

aeon-actions-bot bot commented Dec 24, 2025

Uh oh!

jsquaredosquared commented Dec 24, 2025

Uh oh!

satwiksps commented Dec 24, 2025

Uh oh!

jsquaredosquared commented Dec 25, 2025

Uh oh!

satwiksps commented Dec 25, 2025

Uh oh!

shubhamshukla07 commented Jan 2, 2026 •

edited

Loading

Uh oh!

satwiksps commented Jan 2, 2026

Uh oh!

satwiksps commented Jan 3, 2026

Uh oh!

hadifawaz1999 left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

[BUG] Initialize _metrics in build_model for all Deep Learning estimators #3198

[BUG] Initialize _metrics in build_model for all Deep Learning estimators #3198

Uh oh!

Conversation

satwiksps commented Dec 24, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Reference Issues/PRs

What does this implement/fix? Explain your changes.

Does your contribution introduce a new dependency? If yes, which one?

Any other comments?

PR checklist

For all contributions

For new estimators and functions

For developers with write access

Uh oh!

aeon-actions-bot bot commented Dec 24, 2025

Thank you for contributing to aeon

Uh oh!

jsquaredosquared commented Dec 24, 2025

Uh oh!

satwiksps commented Dec 24, 2025

Uh oh!

jsquaredosquared commented Dec 25, 2025

Uh oh!

satwiksps commented Dec 25, 2025

Uh oh!

shubhamshukla07 commented Jan 2, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

satwiksps commented Jan 2, 2026

Uh oh!

satwiksps commented Jan 3, 2026

Uh oh!

hadifawaz1999 left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

satwiksps commented Dec 24, 2025 •

edited

Loading

Thank you for contributing to `aeon`

shubhamshukla07 commented Jan 2, 2026 •

edited

Loading