⚡ perf(cli): faster CLI startup via lazy imports by samet-akcay · Pull Request #3535 · open-edge-platform/anomalib

samet-akcay · 2026-04-14T12:08:15Z

Summary

Reduces anomalib --help startup from ~4.0s → ~0.15s (26× faster) by eliminating unnecessary eager imports of torch, lightning, all 23 models, all 15+ datasets, and optional logger backends during CLI initialization.

Problem

Every CLI invocation — even anomalib --help — eagerly imported the entire anomalib stack:

torch + lightning (~2.8s alone)
All 23 model classes (each importing torch.nn, etc.)
All 15+ dataset/datamodule classes
Optional logger backends (Comet, MLflow, TensorBoard, WandB)
Full Engine class with all its dependencies

This made even trivial CLI operations painfully slow.

Solution

Three-phase lazy loading strategy:

Phase 1: Lazy CLI imports

cli/cli.py: Removed all top-level heavy imports (Trainer, torch, Engine, AnomalibModule, AnomalibDataModule). Imports are deferred to the functions that actually need them.
cli/pipelines.py: Pipeline registry loaded lazily via __getattr__.
cli/utils/help_formatter.py: Engine import deferred with cached _get_docstring_usage().

Phase 2: Deferred subcommand parser construction

cli/cli.py: Added _sniff_subcommand() that detects which subcommand the user invoked from sys.argv, then only builds the full ArgumentParser for that single subcommand — skipping expensive parser construction for all unused subcommands.

Phase 3: Lazy `__init__.py` re-exports

models/__init__.py: 23 model classes → _MODEL_CLASS_MAP + __getattr__ on-demand loading.
data/__init__.py: Datamodules, datasets, and data-format enums loaded lazily. Dataclasses (ImageItem, VideoItem, etc.) kept as eager imports to avoid circular import issues.
engine/__init__.py: Engine, XPUAccelerator, SingleXPUStrategy → lazy __getattr__.
loggers/__init__.py: Comet, MLflow, TensorBoard, WandB loggers → lazy __getattr__.
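
As a rough sketch of the map-plus-`__getattr__` pattern listed above: the example below builds a throwaway module object so it can run standalone, and uses standard-library classes as stand-ins for model classes. The names `make_lazy_module`, `_LAZY_MAP`, and `demo_models` are hypothetical. In a real package the `__getattr__` function would sit at the top level of `__init__.py` and cache into `globals()`.

```python
import importlib
import types

# Hypothetical map: exported name -> module that defines it.
# Standard-library modules stand in for heavy model modules.
_LAZY_MAP = {"Fraction": "fractions", "Decimal": "decimal"}


def make_lazy_module(name: str, lazy_map: dict[str, str]) -> types.ModuleType:
    """Create a module whose attributes are imported on first access."""
    module = types.ModuleType(name)

    def __getattr__(attr: str) -> object:
        # Called only when normal attribute lookup on the module fails.
        if attr in lazy_map:
            value = getattr(importlib.import_module(lazy_map[attr]), attr)
            vars(module)[attr] = value  # cache so later lookups skip this hook
            return value
        msg = f"module {name!r} has no attribute {attr!r}"
        raise AttributeError(msg)

    module.__getattr__ = __getattr__
    module.__all__ = sorted(lazy_map)
    return module


demo = make_lazy_module("demo_models", _LAZY_MAP)
resolved_before = "Fraction" in vars(demo)  # nothing loaded yet
fraction_cls = demo.Fraction  # triggers the lazy import
resolved_after = "Fraction" in vars(demo)  # now cached on the module
```

Caching the resolved object means the hook runs at most once per name; subsequent accesses are ordinary attribute lookups.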

Compatibility

Added _ActionSubCommands import shim for jsonargparse >=4.47 (class moved to jsonargparse._subcommands).

Benchmarks

| Command | Before | After | Speedup |
|---|---|---|---|
| `anomalib --help` | 4.0s | 0.15s | 26× |
| `anomalib install --help` | 4.0s | 0.15s | 26× |
| `anomalib train --help` | 5.2s | 3.9s | 25% faster |

The remaining 3.9s for train --help is dominated by the irreducible torch + lightning import cost (~2.8s floor), since these must be loaded to construct the training argument parser.

Example

# Before: every command paid the full import tax
$ time anomalib --help    # 4.0s
$ time anomalib install   # 4.0s

# After: only pay for what you use
$ time anomalib --help    # 0.15s
$ time anomalib install   # 0.15s
$ time anomalib train --help  # 3.9s (torch+lightning required here)

Copilot

Pull request overview

This PR refactors Anomalib’s CLI and public package __init__.py re-exports to be lazily imported, significantly reducing startup time for lightweight CLI calls (notably anomalib --help) by avoiding eager imports of heavy dependencies (Torch/Lightning, models, datasets, loggers, engine) unless a subcommand actually needs them.

Changes:

Implement lazy-loading via module-level __getattr__ in anomalib.models, anomalib.data, anomalib.engine, and anomalib.loggers.
Speed up CLI initialization by deferring heavy imports and by only fully building the selected subcommand’s parser arguments.
Make CLI help/docstring behavior lazy by deferring Engine method references used for help text.

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 7 comments.


| File | Description |
|---|---|
| `src/anomalib/cli/cli.py` | Adds subcommand “sniffing” + moves heavy imports into subcommand-specific paths to reduce CLI startup overhead. |
| `src/anomalib/cli/pipelines.py` | Lazily initializes the pipeline registry and avoids importing pipelines during basic CLI/help flows. |
| `src/anomalib/cli/utils/help_formatter.py` | Lazily builds the Engine-method mapping used for docstring-derived usage/help panels. |
| `src/anomalib/loggers/__init__.py` | Switches optional logger integrations to lazy attribute loading. |
| `src/anomalib/models/__init__.py` | Replaces eager model imports with a class→module map and lazy `__getattr__` loading. |
| `src/anomalib/data/__init__.py` | Keeps lightweight dataclasses eager but makes datasets/datamodules/enums lazy via `__getattr__`. |
| `src/anomalib/engine/__init__.py` | Lazily re-exports `Engine`, `XPUAccelerator`, and `SingleXPUStrategy`. |

Comments suppressed due to low confidence (1)

src/anomalib/data/__init__.py:144

get_datamodule no longer has the DictConfig | ListConfig | dict type annotation even though the docstring still claims it supports ListConfig. With the current implementation, passing a ListConfig will fail at config_.class_path because a ListConfig is neither converted nor handled specially. Either add explicit ListConfig handling (and keep the typed signature via TYPE_CHECKING imports) or remove ListConfig from the documented/typed contract.

def get_datamodule(config) -> AnomalibDataModule:
    """Get Anomaly Datamodule from config.

    Args:
        config: Configuration for the anomaly model. Can be either:

Copilot · 2026-04-14T12:14:56Z

+        for token in tokens:
+            if not token.startswith("-"):
+                return token


_sniff_subcommand treats the value passed to global options (e.g. -c/--config) as the subcommand because it returns the first token that doesn't start with -. This breaks valid invocations such as anomalib --config config.yaml train or anomalib -c config.yaml train: it takes config.yaml for the subcommand and skips building the real subcommand parser, likely causing config-file parsing/validation failures. Update the sniffing logic to skip option values for known global flags (at least -c/--config and --config=...), or fall back to building the full parser when a config file is provided.

Suggested change

-        for token in tokens:
-            if not token.startswith("-"):
-                return token
+        index = 0
+        while index < len(tokens):
+            token = tokens[index]
+            if token in {"-c", "--config"}:
+                index += 2
+                continue
+            if token.startswith("--config="):
+                index += 1
+                continue
+            if not token.startswith("-"):
+                return token
+            index += 1

Copilot · 2026-04-14T12:14:56Z

+
 from anomalib import __version__
-from anomalib.cli.pipelines import PIPELINE_REGISTRY, pipeline_subcommands, run_pipeline
 from anomalib.cli.utils.help_formatter import CustomHelpFormatter, get_short_docstring


get_short_docstring is imported but never used in this module. Please remove the unused import to avoid unnecessary coupling and keep the lazy-import goal consistent.

Suggested change

-from anomalib.cli.utils.help_formatter import CustomHelpFormatter, get_short_docstring
+from anomalib.cli.utils.help_formatter import CustomHelpFormatter

Copilot · 2026-04-14T12:14:57Z

 __all__ = ["configure_logger"]

-try:
-    from .comet import AnomalibCometLogger  # noqa: F401
-    from .mlflow import AnomalibMLFlowLogger  # noqa: F401
-    from .tensorboard import AnomalibTensorBoardLogger  # noqa: F401
-    from .wandb import AnomalibWandbLogger  # noqa: F401
-
-    __all__.extend(
-        [
-            "AnomalibCometLogger",
-            "AnomalibTensorBoardLogger",
-            "AnomalibWandbLogger",
-            "AnomalibMLFlowLogger",
-        ],
-    )
-except ImportError:
-    print("To use any logger install it using `anomalib install -v`")
+_LOGGER_NAMES = {


The module docstring advertises importing Anomalib*Logger classes from anomalib.loggers, but __all__ only exposes configure_logger. This changes star-import/introspection behavior compared to the previous eager-import implementation and is inconsistent with other lazy __init__.py modules here (e.g., anomalib.models, anomalib.data, anomalib.engine). Consider adding the lazy-exported logger class names to __all__ (even if they remain lazily loaded via __getattr__).

Copilot · 2026-04-14T12:14:57Z

+        import importlib
+
+        module = importlib.import_module(_LOGGER_NAMES[name], __name__)
+        return getattr(module, name)


__getattr__ returns the requested logger class but does not cache it in globals(). This causes repeated attribute access (or repeated from anomalib.loggers import ... in different modules) to re-import the same submodule. Consider caching the resolved object in globals()[name] (as done in anomalib.engine.__getattr__ and anomalib.models.__getattr__).

Suggested change

-        return getattr(module, name)
+        logger_class = getattr(module, name)
+        globals()[name] = logger_class
+        return logger_class

Copilot · 2026-04-14T12:14:57Z

    """
+    from anomalib.utils.path import convert_to_snake_case, convert_to_title_case
+
+    _import_all_models()
+


list_models docstring describes supported case values as snake_case/original with default snake_case, but the implementation accepts only {snake, pascal, title} with default snake. Please update the docstring so the documented API matches the runtime behavior.

Copilot · 2026-04-14T12:14:58Z

+def get_model(model, *args, **kwdargs) -> AnomalibModule:
    """Get an anomaly detection model instance.

    This function instantiates an anomaly detection model based on the provided


get_model lost its input type annotation (was DictConfig | str | dict | Namespace). Since this is a public API, dropping the signature typing reduces IDE/type-checker usefulness. Consider keeping the annotation using from __future__ import annotations and/or TYPE_CHECKING imports so you can stay lazy at runtime while preserving type information.
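
One way this suggestion could look, as a minimal sketch: `fractions.Fraction` stands in for a heavy type such as DictConfig, and `to_ratio` is a hypothetical function, not anomalib's `get_model`. String annotations are used here; `from __future__ import annotations` at the top of the module would achieve the same without the quotes.

```python
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Seen by static type checkers only; this branch never runs at runtime.
    from fractions import Fraction  # stand-in for a heavy type such as DictConfig


def to_ratio(value: "Fraction | str | int") -> "Fraction":
    """Convert value to a Fraction, importing the class only when called."""
    from fractions import Fraction  # deferred runtime import

    return Fraction(value)


ratio = to_ratio("3/4")
```

The annotations stay visible to IDEs and type checkers, while importing the module that defines `to_ratio` does not import the annotated type.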

Copilot · 2026-04-14T12:14:58Z

+_PIPELINE_REGISTRY: dict[str, type[Pipeline]] | None | str = "uninitialized"
+
+_PIPELINE_DESCRIPTIONS: dict[str, str] = {


Using the string literal sentinel 'uninitialized' for _PIPELINE_REGISTRY forces the variable type to include str and is easy to accidentally collide with. A dedicated sentinel object (e.g., a private object() instance) would avoid widening the type and make the state machine clearer.
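
A minimal sketch of the dedicated-sentinel idea, assuming a loader that may legitimately return None (for example, when pipelines are not installed). `_UNSET`, `_load_registry`, and `load_attempts` are hypothetical names used only for illustration.

```python
_UNSET = object()  # private sentinel meaning "loading has not been attempted yet"
_registry = _UNSET
load_attempts = 0


def _load_registry():
    """Hypothetical loader; returns None when the optional feature is unavailable."""
    global load_attempts
    load_attempts += 1
    return None


def get_registry():
    """Load the registry on first use and remember the result, even if it is None."""
    global _registry
    if _registry is _UNSET:  # identity check, so no real value can collide with it
        _registry = _load_registry()
    return _registry


first = get_registry()
second = get_registry()
```

Because the sentinel is compared by identity, a None result is distinguishable from "not loaded yet", and the loader runs only once. If the precise static type matters, a single-member enum is another common way to express the same state.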

Replace eager imports of Comet, MLflow, TensorBoard, and WandB loggers with __getattr__-based lazy loading. These backends pull in heavy optional dependencies that are unnecessary for basic CLI operations.

Phase 1: Replace all top-level heavy imports (torch, lightning, Engine, AnomalibModule, AnomalibDataModule, pipeline registry) with deferred imports inside the functions that need them.

Phase 2: Add _sniff_subcommand() to detect which subcommand the user invoked and only build the full ArgumentParser for that single subcommand, skipping expensive parser construction for unused ones.

Also adds a compatibility shim for jsonargparse >=4.47 where _ActionSubCommands moved to a new module path.

Replace eager imports in package __init__.py files with __getattr__-based lazy loading:
- models: 23 model classes loaded on-demand via _MODEL_CLASS_MAP
- data: datamodules/datasets/enums loaded lazily; dataclasses kept eager to avoid circular imports (video.py imports VideoItem at module level)
- engine: Engine, XPUAccelerator, SingleXPUStrategy loaded on-demand

This eliminates the cascade where importing any anomalib subpackage would trigger torch + lightning + all models + all datasets.

…parse 1.7 compat

module_available() from lightning_utilities actually imports the target module, adding ~3s for lightning.pytorch and ~3.7s for anomalib.pipelines. Replace with importlib.util.find_spec() which only checks if a module is findable without importing it.

Also defer PIPELINE_REGISTRY import to only trigger when the selected subcommand is a pipeline command, preventing eager Benchmark import on every CLI invocation.

Add _format_usage() shim in CustomHelpFormatter to normalize tuples to lists, fixing a crash between rich-argparse >= 1.7 (passes actions as tuple) and jsonargparse (only handles list/dict). Tested with jsonargparse 4.48.0 + rich-argparse 1.7.2.
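
The availability check described in this commit could be sketched as below. `is_findable` is a hypothetical helper name, and the probed module names are placeholders. Note that for dotted names such as `lightning.pytorch`, `importlib.util.find_spec()` imports the parent package in order to locate the submodule, so only the leaf module is guaranteed to stay unimported.

```python
import importlib.util
import sys


def is_findable(module_name: str) -> bool:
    """Report whether module_name could be imported, without importing it."""
    if module_name in sys.modules:
        return True  # already imported, so it is certainly available
    try:
        return importlib.util.find_spec(module_name) is not None
    except (ImportError, ValueError):
        # find_spec raises ModuleNotFoundError when a parent package is missing;
        # ValueError is caught defensively for malformed module state.
        return False


has_json = is_findable("json")
has_missing = is_findable("surely_not_installed_pkg_xyz")
```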

- Fix _sniff_subcommand to skip -c/--config option values
- Remove unused get_short_docstring import
- Add lazy logger class names to __all__
- Add globals() caching in loggers __getattr__
- Fix list_models docstring to match snake/pascal/title params
- Restore type annotation on get_model via TYPE_CHECKING
- Replace fragile string sentinel with object() in pipelines

Copilot

Pull request overview

Copilot reviewed 7 out of 7 changed files in this pull request and generated 5 comments.

Comments suppressed due to low confidence (1)

src/anomalib/data/__init__.py:151

get_datamodule lost its type annotation (config: DictConfig | ListConfig | dict) even though the docstring still documents those accepted types. To preserve typing/IDE support without reintroducing eager imports, consider restoring the annotated signature using from typing import TYPE_CHECKING + if TYPE_CHECKING: from omegaconf import DictConfig, ListConfig (and/or typing.Any as needed).

def get_datamodule(config) -> AnomalibDataModule:
    """Get Anomaly Datamodule from config.

    Args:
        config: Configuration for the anomaly model. Can be either:

Copilot · 2026-04-14T13:37:40Z

+__all__ = [
+    "configure_logger",
+    "AnomalibCometLogger",
+    "AnomalibMLFlowLogger",
+    "AnomalibTensorBoardLogger",
+    "AnomalibWandbLogger",
+]


__all__ now always includes the optional logger classes. This makes from anomalib.loggers import * (and some doc tools that iterate __all__) eagerly access/import optional logger modules and can raise ImportError if optional deps like matplotlib/vendor SDKs are missing. Consider keeping __all__ limited to always-available symbols (e.g. configure_logger) and leaving optional logger names out of __all__ (or only adding them conditionally / under TYPE_CHECKING).

Suggested change

-__all__ = [
-    "configure_logger",
-    "AnomalibCometLogger",
-    "AnomalibMLFlowLogger",
-    "AnomalibTensorBoardLogger",
-    "AnomalibWandbLogger",
-]
+__all__ = ["configure_logger"]
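
To illustrate the concern behind this suggestion, the sketch below shows that a star-import looks up every name listed in `__all__`, which triggers the module-level `__getattr__` for lazily exported names. The module `demo_lazy_loggers` and the class name `OptionalLogger` are hypothetical, and the ImportError simulates a missing optional dependency.

```python
import sys
import types

accessed: list[str] = []  # names that reached the lazy hook
demo = types.ModuleType("demo_lazy_loggers")


def _lazy_getattr(name: str) -> object:
    accessed.append(name)
    if name == "OptionalLogger":
        # Simulates a lazily loaded class whose optional dependency is missing.
        msg = "optional dependency is not installed"
        raise ImportError(msg)
    raise AttributeError(name)


demo.__getattr__ = _lazy_getattr
demo.configure_logger = lambda: "configured"
demo.__all__ = ["configure_logger", "OptionalLogger"]
sys.modules[demo.__name__] = demo  # make the module importable by name

star_import_error = None
try:
    exec("from demo_lazy_loggers import *", {})
except ImportError as error:
    star_import_error = error
```

With `OptionalLogger` left out of `__all__`, the same star-import would succeed and the lazy hook would never be asked for it.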

Copilot · 2026-04-14T13:37:41Z

+    "ImageDataFormat": ".datamodules.image",
+    "Kaputt": ".datamodules.image",
+    "Kolektor": ".datamodules.image",
+    "MVTec": ".datamodules.image",


_LAZY_IMPORTS maps "MVTec" to .datamodules.image, but anomalib.data.datamodules.image does not define a MVTec symbol (it defines MVTecAD, MVTecAD2, etc.). As a result, from anomalib.data import MVTec (and __all__ exporting it) will raise AttributeError. Please remove this export or add a real alias (e.g. MVTec = MVTecAD) in the target module and keep __all__ consistent.

Suggested change

-    "MVTec": ".datamodules.image",

Copilot · 2026-04-14T13:37:41Z

+    from anomalib.utils.path import convert_to_snake_case, convert_to_title_case
+
+    _import_all_models()
+
    if case not in {"snake", "pascal", "title"}:


list_models() currently calls _import_all_models(), which forces importing every model module (and their heavy / optional dependencies) just to produce a name list. This undermines the lazy-import goal and can also break in minimal installs if any model has extra optional deps. Since you already have _MODEL_CLASS_MAP, list_models() can derive the snake/title/pascal names from the map keys without importing the model implementations.
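
A rough sketch of deriving the names from the map keys alone. The map entries and the helpers `_pascal_to_snake` and `list_model_names` are illustrative assumptions; anomalib's own conversion utilities may produce different names for some classes.

```python
import re

# Hypothetical excerpt of a class -> module map; the entries are illustrative.
_MODEL_CLASS_MAP = {
    "Padim": ".image.padim",
    "Patchcore": ".image.patchcore",
    "EfficientAd": ".image.efficient_ad",
}


def _pascal_to_snake(name: str) -> str:
    """Insert an underscore before every interior capital, then lowercase."""
    return re.sub(r"(?<!^)(?=[A-Z])", "_", name).lower()


def list_model_names(case: str = "snake") -> set[str]:
    """Return model names in the requested case without importing any model."""
    if case not in {"snake", "pascal", "title"}:
        msg = f"Unsupported case: {case}"
        raise ValueError(msg)
    if case == "pascal":
        return set(_MODEL_CLASS_MAP)
    snake_names = {_pascal_to_snake(name) for name in _MODEL_CLASS_MAP}
    if case == "snake":
        return snake_names
    return {name.replace("_", " ").title() for name in snake_names}


snake = list_model_names("snake")
```

Only string manipulation on the keys is involved, so no model module (or its optional dependencies) is imported.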

Copilot · 2026-04-14T13:37:41Z

+    from anomalib.utils.path import convert_snake_to_pascal_case
+
+    _import_all_models()
+
    logger.info("Loading the model.")


_get_model_class_by_name() calls _import_all_models() and then scans AnomalibModule.__subclasses__(), which eagerly imports every model even when the user requested a single one. This can be very expensive and may fail if unrelated models have optional dependencies. Consider resolving the normalized name against _MODEL_CLASS_MAP (case-insensitive) and importing only that one module/class, falling back to the current error if no match exists.
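
The single-class resolution suggested here might look like the following sketch. Standard-library classes stand in for model classes, and `_CLASS_MAP`, `_normalize`, and `get_class_by_name` are hypothetical names; the real lookup would use the package's own map and relative imports.

```python
import importlib

# Hypothetical map; standard-library classes stand in for model classes.
_CLASS_MAP = {
    "OrderedDict": "collections",
    "ChainMap": "collections",
    "Fraction": "fractions",
}


def _normalize(name: str) -> str:
    """Compare names case-insensitively and ignore underscores."""
    return name.replace("_", "").lower()


def get_class_by_name(name: str) -> type:
    """Import and return only the class that matches name."""
    lookup = {_normalize(key): key for key in _CLASS_MAP}
    class_name = lookup.get(_normalize(name))
    if class_name is None:
        msg = f"Could not find the model {name}. Available: {sorted(_CLASS_MAP)}"
        raise ValueError(msg)
    module = importlib.import_module(_CLASS_MAP[class_name])
    return getattr(module, class_name)


ordered_dict_cls = get_class_by_name("ordered_dict")
```

A request for one class imports exactly one module, so an unrelated entry with a missing optional dependency cannot break the lookup.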

Copilot · 2026-04-14T13:37:41Z

+    # Flags whose next token is a value, not a subcommand.
+    _OPTIONS_WITH_VALUE = frozenset({"-c", "--config"})
+
+    @staticmethod
+    def _sniff_subcommand(args: Sequence[str] | None) -> str | None:
+        """Peek at args to identify the subcommand without full parsing."""


New _sniff_subcommand() + conditional argument registration changes how parsers are constructed and is now critical to CLI correctness (e.g. -c/--config before/after the subcommand, --help behavior, and ensuring only the selected subcommand gets heavy argument construction). There are CLI integration tests, but there doesn’t appear to be focused coverage for these parsing edge cases; adding a small unit test matrix around _sniff_subcommand and subcommand parser construction would help prevent regressions.

Copilot AI review requested due to automatic review settings April 14, 2026 12:08

Copilot started reviewing on behalf of samet-akcay April 14, 2026 12:08

samet-akcay changed the title ~~⚡ perf(cli): dramatically faster CLI startup via lazy imports~~ ⚡ perf(cli): faster CLI startup via lazy imports Apr 14, 2026


samet-akcay added 5 commits April 14, 2026 15:15

⚡ perf(loggers): lazy-load optional logger backends

80350ad


Copilot AI review requested due to automatic review settings April 14, 2026 13:29

samet-akcay force-pushed the perf/lazy-cli-imports branch from f94c5f5 to 8575531 Compare April 14, 2026 13:29

Copilot started reviewing on behalf of samet-akcay April 14, 2026 13:30


