
Conversation

@RkGrit (Contributor) commented Nov 24, 2025

This PR significantly improves model storage, loading, and inference pipeline management for better extensibility, efficiency, and ease of use. The changes include refactoring model storage to support a wider range of models, streamlining the model loading process, and introducing a unified inference pipeline. Together, these improvements optimize model management, reduce memory usage, and enhance the overall inference workflow.

  • Model Storage Refactoring

    • Extended Support for Models: The system now supports not only built-in models like TimerXL and Sundial but also allows the integration of fine-tuned and user-defined models.
    • Unified Model Management: A new model management system enables model registration, deletion, and loading from both local paths and Hugging Face.
    • Code Optimization: Redundant code from previous versions has been removed, and hard-coded model management has been replaced by a more flexible approach that integrates seamlessly with the Hugging Face Transformers ecosystem.
  • Model Loading Refactoring

    • Simplified Model Loading: The previous custom loading logic with complex if...else... conditions has been replaced by a unified model loading interface, simplifying the process.
    • Automatic Model Type Detection: The system now automatically detects the model type and selects the appropriate loading method, supporting models from Transformers, sktime, and PyTorch.
    • Lazy Loading: The PR introduces lazy loading for Python modules, eliminating the need to load multiple modules at startup, reducing initialization time and memory consumption.
  • Inference Pipeline Addition

    • Unified Inference Workflow: The introduction of the Inference Pipeline encapsulates the entire model inference process, offering a standardized interface for preprocessing, inference, and post-processing.
    • Support for Multiple Tasks: The pipeline is versatile, supporting various inference tasks such as prediction, classification, and dialogue-based tasks.
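The lazy-loading idea described above can be illustrated with a small sketch. This is hypothetical code, not the PR's implementation; the `LazyModule` class name is invented here for illustration:

```python
import importlib


class LazyModule:
    """Defer importing a module until one of its attributes is first accessed."""

    def __init__(self, name: str):
        self._name = name
        self._module = None

    def __getattr__(self, attr):
        # Import happens on first attribute access, not at startup.
        if self._module is None:
            self._module = importlib.import_module(self._name)
        return getattr(self._module, attr)


# The heavy dependency is only imported when actually used.
json_lazy = LazyModule("json")
print(json_lazy.dumps({"model": "sundial"}))  # prints {"model": "sundial"}
```

With this pattern, a process that never touches a given model backend never pays the import cost for it, which is what reduces initialization time and memory.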

@RkGrit force-pushed the model_management_v2 branch from ff6fa25 to 2ade43f on November 24, 2025 at 14:34
@RkGrit force-pushed the model_management_v2 branch from 2ade43f to 18cccc3 on November 24, 2025 at 14:37
Copilot AI left a comment

Pull request overview

This PR significantly refactors the model storage, loading, and inference pipeline architecture to improve extensibility and maintainability. The changes transition from hard-coded model management to a flexible, discovery-based system that seamlessly integrates with HuggingFace Transformers and sktime.

Key Changes

  • Unified Model Storage: Introduced a category-based storage system (builtin/user_defined/finetune) with automatic model discovery and lazy registration for Transformers models
  • Simplified Model Loading: Replaced complex conditional logic with a unified ModelLoader class that automatically detects model types (Transformers, sktime, PyTorch) and applies appropriate loading strategies
  • Inference Pipeline Framework: Created a modular pipeline architecture with base classes (BasicPipeline, ForecastPipeline) and model-specific implementations for timerxl and sundial models
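The automatic type detection in the second bullet could look roughly like this. This is an illustrative sketch under assumed file-layout conventions; the marker filenames below are guesses, not the PR's actual detection rules:

```python
from pathlib import Path


class ModelLoader:
    """Pick a loading strategy based on the files present in a model directory."""

    def detect_model_type(self, model_dir: str) -> str:
        d = Path(model_dir)
        if (d / "config.json").exists():
            # HuggingFace Transformers checkpoints ship a config.json.
            return "transformers"
        if (d / "model.skt").exists():
            # Hypothetical marker for a serialized sktime model.
            return "sktime"
        if (d / "weights.pt").exists():
            # Plain PyTorch state-dict weights.
            return "pytorch"
        raise ValueError(f"Cannot detect model type in {model_dir}")
```

A loader like this replaces per-model if/else branches with a single convention-based lookup, so new model formats only need a new detection rule plus a loading strategy.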

Reviewed changes

Copilot reviewed 35 out of 37 changed files in this pull request and generated 12 comments.

| File | Description |
| --- | --- |
| model_storage.py | Complete rewrite to support category-based storage with discovery, registration, and lazy Transformers model registration |
| model_loader.py | New unified loader supporting Transformers, sktime, and PyTorch models with automatic type detection |
| model_info.py | Simplified model info with updated enums, removed complex type detection, added REPO_ID_MAP for HF downloads |
| model_enums.py | Refactored enums: removed BuiltInModelType, updated ModelCategory values, added UriType |
| handler.py | Updated to use get_model_manager() singleton, inlined validation logic, improved error messages |
| model_manager.py | Refactored to use composition (storage + loader), added discovery on init, new query methods |
| inference_manager.py | Simplified strategy pattern, removed model-specific inference classes, uses new pipeline system |
| pipeline/*.py | New modular pipeline architecture with basic, forecast, classification, and chat pipelines |
| sktime/*.py | New sktime model support with configuration system and model factory |
| poetry.lock | Added platform-specific greenlet and statsmodels wheels for Python 3.14 |


"covariance_type",
"diag",
"str",
choices=["sperical", "diag", "full", "tied"],
Copy link

Copilot AI Nov 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo in choice value "sperical" should be "spherical". This will cause validation failures when users try to use the correct spelling.

Suggested change
choices=["sperical", "diag", "full", "tied"],
choices=["spherical", "diag", "full", "tied"],

Copilot uses AI. Check for mistakes.
```python
def estimate_pool_size(device: torch.device, model_id: str) -> int:
    model_info = BUILT_IN_LTSM_MAP.get(model_id, None)
    if model_info is None or model_info.model_type not in MODEL_MEM_USAGE_MAP:
        model_info = get_model_manager.get_model_info(model_id)
```

Copilot AI commented Nov 25, 2025:

Missing call parentheses. `get_model_manager` is a function and is referenced without being called; it should be `get_model_manager().get_model_info(model_id)`.

Suggested change:

```diff
-        model_info = get_model_manager.get_model_info(model_id)
+        model_info = get_model_manager().get_model_info(model_id)
```
```python
def _process_model_directory(
    self, model_dir: Path, model_id: str, category: ModelCategory
):
    """Handling the discovery logic for a single model directory."""
```

Copilot AI commented Nov 25, 2025:

[nitpick] Missing period at end of comment. The comment should end with a period for consistency with documentation standards.
Comment on lines 19 to +36:

```python
import concurrent.futures
import json
import os
import shutil
from collections.abc import Callable
from typing import Dict
from typing import List, Optional

import torch
from torch import nn
from huggingface_hub import hf_hub_download, snapshot_download
from transformers import AutoConfig, AutoModelForCausalLM

from iotdb.ainode.core.config import AINodeDescriptor
from iotdb.ainode.core.constant import (
    MODEL_CONFIG_FILE_IN_JSON,
    MODEL_WEIGHTS_FILE_IN_PT,
    TSStatusCode,
)
from iotdb.ainode.core.exception import (
    BuiltInModelDeletionError,
    ModelNotExistError,
    UnsupportedError,
)
from iotdb.ainode.core.constant import TSStatusCode
from iotdb.ainode.core.exception import BuiltInModelDeletionError
from iotdb.ainode.core.log import Logger
from iotdb.ainode.core.model.built_in_model_factory import (
    download_built_in_ltsm_from_hf_if_necessary,
    fetch_built_in_model,
)
from iotdb.ainode.core.model.model_enums import (
    BuiltInModelType,
    ModelCategory,
    ModelFileType,
    ModelStates,
)
from iotdb.ainode.core.model.model_factory import fetch_model_by_uri
from iotdb.ainode.core.model.model_enums import REPO_ID_MAP, ModelCategory, ModelStates
from iotdb.ainode.core.model.model_info import (
    BUILT_IN_LTSM_MAP,
    BUILT_IN_MACHINE_LEARNING_MODEL_MAP,
    BUILTIN_HF_TRANSFORMERS_MODEL_MAP,
    BUILTIN_SKTIME_MODEL_MAP,
    ModelInfo,
    get_built_in_model_type,
    get_model_file_type,
)
from iotdb.ainode.core.model.uri_utils import get_model_register_strategy
from iotdb.ainode.core.model.utils import *
```

Copilot AI commented Nov 25, 2025:

Missing import statement. The code uses the json module on line 206 but doesn't import it at the top of the file. Add `import json` to the imports section.
Comment on lines +145 to +146:

```python
"pipeline": AttributeConfig(
    "pipeline", "last", "str", choices=["last", "mean"]
```

Copilot AI commented Nov 25, 2025:

The attribute name "pipeline" should likely be "strategy" to match the NaiveForecaster parameter. According to sktime documentation, the NaiveForecaster parameter for choosing between "last" and "mean" is called "strategy", not "pipeline".

Suggested change:

```diff
-"pipeline": AttributeConfig(
-    "pipeline", "last", "str", choices=["last", "mean"]
+"strategy": AttributeConfig(
+    "strategy", "last", "str", choices=["last", "mean"]
```
```python
    ModelFileType,
    ModelStates,
)
from typing import Dict, List, Optional, Tuple
```

Copilot AI commented Nov 25, 2025:

Import of 'List' is not used. Import of 'Tuple' is not used.

Suggested change:

```diff
-from typing import Dict, List, Optional, Tuple
+from typing import Dict, Optional
```
```python
# under the License.
#

import pandas as pd
```

Copilot AI commented Nov 25, 2025:

Import of 'pd' is not used.

Suggested change:

```diff
-import pandas as pd
```
```python
import torch

from iotdb.ainode.core.inference.pipeline.basic_pipeline import ForecastPipeline
from iotdb.ainode.core.util.serde import convert_to_binary
```

Copilot AI commented Nov 25, 2025:

Import of 'convert_to_binary' is not used.

Suggested change:

```diff
-from iotdb.ainode.core.util.serde import convert_to_binary
```
```python
# under the License.
#

import pandas as pd
```

Copilot AI commented Nov 25, 2025:

Import of 'pd' is not used.

Suggested change:

```diff
-import pandas as pd
```
```python
import torch

from iotdb.ainode.core.inference.pipeline.basic_pipeline import ForecastPipeline
from iotdb.ainode.core.util.serde import convert_to_binary
```

Copilot AI commented Nov 25, 2025:

Import of 'convert_to_binary' is not used.

Suggested change:

```diff
-from iotdb.ainode.core.util.serde import convert_to_binary
```
@SpriCoder (Contributor) left a comment:

Great job. Please take a look at the CI.

@CRZbulabula (Contributor) left a comment:

PTAL.

Comment on lines +24 to +27:

```python
if model_id == "timerxl":
    return TimerxlPipeline(model_id, device=device)
elif model_id == "sundial":
    return SundialPipeline(model_id, device=device)
```

Under the current implementation, integrating a new model means adding another if-else branch here? Isn't that too much?
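One common way to address this concern is a registry keyed by model_id, so new models register themselves instead of extending an if-else chain. This is a sketch of the idea only, not the repository's code; the registry and class names are hypothetical:

```python
_PIPELINE_REGISTRY = {}


def register_pipeline(model_id):
    """Class decorator mapping a model_id to its pipeline class."""
    def wrap(cls):
        _PIPELINE_REGISTRY[model_id] = cls
        return cls
    return wrap


@register_pipeline("timerxl")
class TimerxlPipeline:
    def __init__(self, model_id, device=None):
        self.model_id, self.device = model_id, device


@register_pipeline("sundial")
class SundialPipeline:
    def __init__(self, model_id, device=None):
        self.model_id, self.device = model_id, device


def build_pipeline(model_id, device=None):
    # Dispatch through the registry; adding a model needs no change here.
    try:
        return _PIPELINE_REGISTRY[model_id](model_id, device=device)
    except KeyError:
        raise ValueError(f"No pipeline registered for {model_id!r}") from None
```

With this pattern, integrating a new model is one decorated class rather than an edit to a central factory function.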

@CRZbulabula (Contributor) left a comment:

PTAL.

Comment on lines +119 to +124:

```python
batch_output = self._inference_pipeline.infer(
    batch_inputs,
    predict_length=requests[0].max_new_tokens,
    # num_samples=10,
    revin=True,
)
```

In the current implementation, no parameters can be passed through to the inference_pipeline; we should discuss this further.
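One option for the concern above is to collect per-request inference options and forward them as keyword arguments instead of hard-coding them at the call site. This is an illustrative sketch; the request fields shown are assumptions, not the PR's actual schema:

```python
def run_inference(pipeline, batch_inputs, requests):
    """Forward inference options taken from the first request in the batch."""
    opts = {
        "predict_length": requests[0].max_new_tokens,
        # Fall back to a default when the request doesn't specify revin.
        "revin": getattr(requests[0], "revin", True),
    }
    # Optional sampling parameters are forwarded only when present.
    num_samples = getattr(requests[0], "num_samples", None)
    if num_samples is not None:
        opts["num_samples"] = num_samples
    return pipeline.infer(batch_inputs, **opts)
```

This keeps the pipeline interface generic (`infer(inputs, **options)`) while letting each request carry its own parameters.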
