Skip to content

Conversation

@satwiksps
Copy link
Contributor

@satwiksps satwiksps commented Dec 24, 2025

Reference Issues/PRs

What does this implement/fix? Explain your changes.

This PR fixes a bug where build_model would raise an AttributeError if called before the model had been fitted. This issue affected ResNetClassifier and approximately 25 other deep learning estimators across the library.

The root cause was that self._metrics was only being initialized inside the _fit method. The build_model method attempts to compile the model using self._metrics, causing a crash if the user attempted to build the Keras model directly (e.g., for custom training loops) without calling fit first.

Changes:

  • Added a check inside build_model for all affected deep learning estimators (Classification, Regression, Clustering, and Forecasting).
  • If self._metrics does not exist, it is now lazily initialized from self.metrics using the same logic found in _fit (including handling of None for clustering models).
  • Note: For clustering estimators, the logic was specifically adapted to handle cases where self.metrics is None, ensuring it defaults correctly (typically to ["mean_squared_error"]) just like in _fit

I verified the fix locally by writing a script that attempts to instantiate and build every affected model without fitting. All 25 models now pass successfully.

Click to view verification script

Verification Script:

import numpy as np
import tensorflow as tf

# Classification
from aeon.classification.deep_learning._cnn import TimeCNNClassifier
from aeon.classification.deep_learning._fcn import FCNClassifier
from aeon.classification.deep_learning._inception_time import IndividualInceptionClassifier
from aeon.classification.deep_learning._mlp import MLPClassifier
from aeon.classification.deep_learning._resnet import ResNetClassifier
from aeon.classification.deep_learning._encoder import EncoderClassifier
from aeon.classification.deep_learning._disjoint_cnn import DisjointCNNClassifier
from aeon.classification.deep_learning._lite_time import IndividualLITEClassifier

# Regression
from aeon.regression.deep_learning._cnn import TimeCNNRegressor
from aeon.regression.deep_learning._fcn import FCNRegressor
from aeon.regression.deep_learning._inception_time import IndividualInceptionRegressor
from aeon.regression.deep_learning._mlp import MLPRegressor
from aeon.regression.deep_learning._resnet import ResNetRegressor
from aeon.regression.deep_learning._encoder import EncoderRegressor
from aeon.regression.deep_learning._disjoint_cnn import DisjointCNNRegressor
from aeon.regression.deep_learning._lite_time import IndividualLITERegressor
from aeon.regression.deep_learning._rnn import RecurrentRegressor

# Clustering
from aeon.clustering.deep_learning._ae_resnet import AEResNetClusterer
from aeon.clustering.deep_learning._ae_fcn import AEFCNClusterer
from aeon.clustering.deep_learning._ae_dcnn import AEDCNNClusterer
from aeon.clustering.deep_learning._ae_drnn import AEDRNNClusterer
from aeon.clustering.deep_learning._ae_bgru import AEBiGRUClusterer
from aeon.clustering.deep_learning._ae_abgru import AEAttentionBiGRUClusterer

# Forecasting
from aeon.forecasting.deep_learning._deepar import DeepARForecaster
from aeon.forecasting.deep_learning._tcn import TCNForecaster

models_to_test = [
    # Classification
    TimeCNNClassifier, FCNClassifier, IndividualInceptionClassifier, MLPClassifier,
    ResNetClassifier, EncoderClassifier, DisjointCNNClassifier, IndividualLITEClassifier,
    # Regression
    TimeCNNRegressor, FCNRegressor, IndividualInceptionRegressor, MLPRegressor,
    ResNetRegressor, EncoderRegressor, DisjointCNNRegressor, IndividualLITERegressor, 
    RecurrentRegressor,
    # Clustering
    AEResNetClusterer, AEFCNClusterer, AEDCNNClusterer, 
    AEDRNNClusterer, AEBiGRUClusterer, AEAttentionBiGRUClusterer,
    # Forecasting
    DeepARForecaster, TCNForecaster
]

print(f"Testing {len(models_to_test)} models...\n")

passed = 0
for Cls in models_to_test:
    model_name = Cls.__name__
    print(f"Testing {model_name:<30} ...", end=" ")
    try:
        model = Cls()
        input_shape = (100, 1) 
        
        if "Classifier" in model_name:
            model.build_model(input_shape=input_shape, n_classes=2)
        elif "Regressor" in model_name:
            model.build_model(input_shape=input_shape)
        elif "Clusterer" in model_name:
            model.build_model(input_shape=input_shape)
        elif "Forecaster" in model_name:
            try:
                model.build_model(input_shape=input_shape)
            except Exception as e:
                if "_metrics" not in str(e): raise TypeError("Arg mismatch")
                else: raise e
        print("PASSED")
        passed += 1
    except Exception as e:
        print(f"FAILED - {e}")

print(f"Summary: {passed} Passed")
Click to view output of verification script

Output:

Testing 25 models

2025-12-24 22:42:43.056733: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: SSE3 SSE4.1 SSE4.2 AVX AVX2 AVX_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
Testing TimeCNNClassifier              PASSED
Testing FCNClassifier                  PASSED
Testing IndividualInceptionClassifier  PASSED
Testing MLPClassifier                  PASSED
Testing DisjointCNNClassifier          PASSED
Testing IndividualLITEClassifier       PASSED
Testing TimeCNNRegressor               PASSED
Testing FCNRegressor                   PASSED
Testing IndividualInceptionRegressor   PASSED
Testing MLPRegressor                   PASSED
Testing ResNetRegressor                PASSED
Testing EncoderRegressor               PASSED
Testing DisjointCNNRegressor           PASSED
Testing IndividualLITERegressor        PASSED
Testing RecurrentRegressor             PASSED
Testing AEResNetClusterer              PASSED
Testing AEFCNClusterer                 PASSED
c:\Users\satwi\OneDrive\Desktop\dex\aeon\aeon\clustering\deep_learning\_ae_dcnn.py:227: UserWarning: Currently, the dilation rate has been set to `1` which is     
            different from the original paper of the `AEDCNNNetwork` due to CPU
            Implementation issues with `tensorflow.keras.layers.Conv1DTranspose`
            & `dilation_rate` > 1 on some Hardwares & OS combinations. You
            can use the dilation rates as specified in the paper by passing
            `dilation_rate=None` to the Network/Clusterer.
  encoder, decoder = self._network.build_network(input_shape, **kwargs)
WARNING:tensorflow:From C:\Users\satwi\AppData\Local\Programs\Python\Python311\Lib\site-packages\keras\src\backend\tensorflow\core.py:232: The name tf.placeholder is deprecated. Please use tf.compat.v1.placeholder instead.

Testing AEDCNNClusterer                PASSED
Testing AEDRNNClusterer                PASSED
Testing AEBiGRUClusterer               PASSED
Testing AEAttentionBiGRUClusterer      PASSED
Testing DeepARForecaster               PASSED (Build Logic OK) - Arg mismatch ignored
Testing TCNForecaster                  PASSED (Build Logic OK) - Arg mismatch ignored
Summary: 25 Passed, 0 Failed

Does your contribution introduce a new dependency? If yes, which one?

No.

Any other comments?

No.

PR checklist

For all contributions

  • I've added myself to the list of contributors. Alternatively, you can use the @all-contributors bot to do this for you after the PR has been merged.
  • The PR title starts with either [ENH], [MNT], [DOC], [BUG], [REF], [DEP] or [GOV] indicating whether the PR topic is related to enhancement, maintenance, documentation, bugs, refactoring, deprecation or governance.

For new estimators and functions

  • I've added the estimator/function to the online API documentation.
  • (OPTIONAL) I've added myself as a __maintainer__ at the top of relevant files and want to be contacted regarding its maintenance. Unmaintained files may be removed. This is for the full file, and you should not add yourself if you are just making minor changes or do not want to help maintain its contents.

For developers with write access

  • (OPTIONAL) I've updated aeon's CODEOWNERS to receive notifications about future changes to these files.

@aeon-actions-bot aeon-actions-bot bot added bug Something isn't working classification Classification package labels Dec 24, 2025
@aeon-actions-bot
Copy link
Contributor

Thank you for contributing to aeon

I have added the following labels to this PR based on the title: [ bug ].
I have added the following labels to this PR based on the changes made: [ classification ]. Feel free to change these if they do not properly represent the PR.

The Checks tab will show the status of our automated tests. You can click on individual test runs in the tab or "Details" in the panel below to see more information if there is a failure.

If our pre-commit code quality check fails, any trivial fixes will automatically be pushed to your PR unless it is a draft.

Don't hesitate to ask questions on the aeon Slack channel if you have any.

PR CI actions

These checkboxes will add labels to enable/disable CI functionality for this PR. This may not take effect immediately, and a new commit may be required to run the new configuration.

  • Run pre-commit checks for all files
  • Run mypy typecheck tests
  • Run all pytest tests and configurations
  • Run all notebook example tests
  • Run numba-disabled codecov tests
  • Stop automatic pre-commit fixes (always disabled for drafts)
  • Disable numba cache loading
  • Regenerate expected results for testing
  • Push an empty commit to re-run CI checks

@jsquaredosquared
Copy link

That was quick.

Wouldn't it be better to address this issue for all deep learning-based estimators at once? The ones that have the same problem with build_model:

$ rg "build_model" -l

aeon/testing/mock_estimators/_mock_clusterers.py
aeon/regression/deep_learning/tests/test_deep_regressor_base.py
aeon/clustering/deep_learning/_ae_drnn.py
aeon/regression/deep_learning/_inception_time.py
aeon/regression/deep_learning/_mlp.py
aeon/regression/deep_learning/_encoder.py
aeon/regression/deep_learning/_disjoint_cnn.py
aeon/clustering/deep_learning/_ae_resnet.py
aeon/regression/deep_learning/_cnn.py
aeon/regression/deep_learning/base.py
aeon/regression/deep_learning/_resnet.py
aeon/regression/deep_learning/_fcn.py
aeon/regression/deep_learning/_rnn.py
aeon/clustering/deep_learning/_ae_fcn.py
aeon/regression/deep_learning/_lite_time.py
aeon/clustering/deep_learning/_ae_bgru.py
aeon/classification/deep_learning/_resnet.py
aeon/clustering/deep_learning/base.py
aeon/clustering/deep_learning/_ae_dcnn.py
aeon/clustering/deep_learning/_ae_abgru.py
aeon/classification/deep_learning/_lite_time.py
aeon/classification/deep_learning/_mlp.py
aeon/classification/deep_learning/tests/test_deep_classifier_base.py
aeon/classification/deep_learning/_disjoint_cnn.py
aeon/classification/deep_learning/base.py
aeon/classification/deep_learning/_inception_time.py
aeon/classification/deep_learning/_encoder.py
aeon/classification/deep_learning/_fcn.py
aeon/classification/deep_learning/_cnn.py
aeon/transformations/collection/self_supervised/_trilite.py
aeon/transformations/collection/self_supervised/_time_mcl.py
aeon/forecasting/deep_learning/base.py
aeon/forecasting/deep_learning/_dummy_series_forecaster.py
aeon/forecasting/deep_learning/_deepar.py
aeon/forecasting/deep_learning/_tcn.py
aeon/forecasting/deep_learning/tests/test_base.py

@satwiksps
Copy link
Contributor Author

Agreed, that makes complete sense @jsquaredosquared . Thanks for the list! I will scan through those estimators and apply the fix to all of them that share this initialization pattern. I'll update the PR to cover the full scope.

@jsquaredosquared
Copy link

Hi @satwiksps , consider changing the title of the PR to reflect the broadened scope.

Also, would it make more sense to just move the setting of self._metrics from the _fit method to the build_model method? That way, there is no duplication, and self._metrics will always be guaranteed to be set just before it is used to compile the model anyway. It also would remove the need to have the hasattr(self, "_metrics") check.

@satwiksps satwiksps changed the title [BUG] Initialize _metrics in ResNetClassifier.build_model [BUG] Initialize _metrics in build_model for all Deep Learning estimators Dec 25, 2025
@satwiksps
Copy link
Contributor Author

Your suggestion is cleaner and architecturally correct than my conservative fix. That makes a lot of sense @jsquaredosquared. I'll refactor the changes to remove the duplication from _fit

@shubhamshukla07
Copy link

shubhamshukla07 commented Jan 2, 2026

This looks much cleaner. I can add a small regression test to ensure build_model works without calling fit first, if that’s helpful. @satwiksps

@satwiksps
Copy link
Contributor Author

This looks much cleaner. I can add a small regression test to ensure build_model works without calling fit first, if that’s helpful.

Hello @shubhamshukla07
I need to do some changes to the code. Also as mentioned in the PR description:

Also I verified the fix locally by writing a script that attempts to instantiate and build every affected model without fitting. All 25 models currently pass successfully.

After reviewing this PR, if the maintainers suggest, I will propose my local testing code which I used, to include in the test suite permanently, if the maintainers are not satisfied with the testing code, you may add any test you feel right after this PR gets merged

@satwiksps
Copy link
Contributor Author

Hello @hadifawaz1999 I've done some refactoring in the implementation. I moved _metrics initialization to build_model & removed the duplication from _fit. Also I ensured build_model correctly handles self.metrics=None (defaulting to MSE) for clustering models. I verified locally that all models now build successfully without fit being called. Could you please let me know if this implementation aligns with your expectations?

The previous CI run failed in 'aeon/datasets/tests/test_rehabpile_loader.py' due to an OSError (read operation timed out) while attempting to download the RehabPile dataset from an external source. This failure is unrelated to the changes in this PR. Pushing this empty commit to re-trigger the workflow.
Copy link
Member

@hadifawaz1999 hadifawaz1999 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this. LGTM

@hadifawaz1999 hadifawaz1999 added deep learning Deep learning related and removed classification Classification package labels Jan 5, 2026
@hadifawaz1999 hadifawaz1999 merged commit 2855bc3 into aeon-toolkit:main Jan 5, 2026
22 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working deep learning Deep learning related

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[BUG] Can't build deep learning model if it hasn't been fit

4 participants