🚀 A Library for High-Dimensional Time Series Forecasting [Paper Page]
A comprehensive, production-ready framework for high-dimensional time series forecasting, with support for 20+ state-of-the-art models, distributed training, and automated hyperparameter optimization.
- 📊 High-Dimensional: Optimized for datasets with thousands of dimensions
- 🤖 20+ SOTA Models: Latest time series forecasting models (2017-2024) with unified interface
- 🚀 Distributed Training: Built-in multi-GPU support with HuggingFace Accelerate
- 🔍 AutoML: Automated hyperparameter search with multi-horizon evaluation
| Model | Year | Paper | Description |
|---|---|---|---|
| UCast | 2025 | Learning Latent Hierarchical Channel Structure | High-dimensional forecasting |
| Model | Year | Paper | Description |
|---|---|---|---|
| Transformer | 2017 | Attention Is All You Need | Original transformer architecture |
| Informer | 2021 | Beyond Efficient Transformer | ProbSparse attention mechanism |
| Autoformer | 2021 | Decomposition Transformers | Auto-correlation mechanism |
| Pyraformer | 2021 | Pyramidal Attention | Low-complexity attention |
| FEDformer | 2022 | Frequency Enhanced Decomposed | Frequency domain modeling |
| Nonstationary Transformer | 2022 | Non-stationary Transformers | Handles non-stationarity |
| ETSformer | 2022 | Exponential Smoothing Transformers | ETS-based transformers |
| Crossformer | 2023 | Cross-Dimension Dependency | Cross-dimensional attention |
| PatchTST | 2023 | A Time Series is Worth 64 Words | Patch-based transformers |
| iTransformer | 2024 | Inverted Transformers | Channel-attention design |
| Model | Year | Paper | Description |
|---|---|---|---|
| MICN | 2023 | Multi-scale Local and Global Context | Isometric convolution |
| TimesNet | 2023 | Temporal 2D-Variation Modeling | 2D temporal modeling |
| ModernTCN | 2024 | Modern Temporal Convolutional Networks | Enhanced TCN architecture |
| DLinear | 2023 | Are Transformers Effective? | Simple linear baseline |
| TSMixer | 2023 | All-MLP Architecture | MLP-based mixing |
| FreTS | 2023 | Simple yet Effective Approach | Frequency representation |
| TiDE | 2023 | Time-series Dense Encoder | Dense encoder design |
| SegRNN | 2023 | Segment Recurrent Neural Network | Segment-based RNN |
| LightTS | 2023 | Lightweight Time Series | Efficient forecasting |
Our framework supports the Time-HD benchmark datasets through HuggingFace Datasets, as well as standard benchmarks:
- ETT (ETTh1, ETTh2, ETTm1, ETTm2) - Electricity transformer temperature
- Weather - Multi-variate weather forecasting
- Traffic - Road traffic flow
- ECL - Electricity consuming load
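For orientation, a quick way to check the size and dimensionality of one of these datasets before training is to load its CSV with pandas. This is only a sketch: the local path `./dataset/ETTh1.csv` is an assumed example location, not a path the library guarantees.

```python
import pandas as pd

# Assumed local CSV location; adjust to wherever your copy of the dataset lives.
df = pd.read_csv("./dataset/ETTh1.csv", parse_dates=["date"])

# First column is the timestamp; the remaining columns are the series dimensions.
print(df.shape)        # (num_timesteps, 1 + num_features)
print(df.columns[:5])  # 'date' followed by feature columns
```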
# Clone the repository
git clone https://github.com/LingFengGold/Time-HD-Lib
cd Time-HD-Lib
# Method 1: Using pip
pip install -r requirements.txt
# Method 2: Using conda (recommended)
conda env create -f environment.yaml
conda activate tsf
# Install optional dependencies for full functionality
pip install pandas torchinfo einops reformer-pytorch

To access the Time-HD benchmark dataset, follow these steps:
a. Create a Hugging Face account, if you do not already have one.
b. Visit the dataset page:
https://huggingface.co/datasets/Time-HD-Anonymous/High_Dimensional_Time_Series
c. Click "Agree and access repository". You must be logged in to complete this step.
d. Create a new Access Token. The token type should be "write".
e. Authenticate on your local machine by running:
huggingface-cli login

and enter the token you generated above.
f. Then, you can manually download all the datasets by running:

python download_dataset.py

The supported high-dimensional time series datasets are summarized in Table 2 above. Besides these, we also support standard datasets such as ECL, ETTh1, ETTh2, ETTm1, ETTm2, Weather, and Traffic.
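If you prefer a programmatic alternative to the CLI login plus download_dataset.py flow, the sketch below uses the huggingface_hub client directly. It assumes huggingface_hub is installed and that your account has already accepted the dataset's access terms; download_dataset.py remains the supported path.

```python
from huggingface_hub import login, snapshot_download

# Prompts for the access token created in the steps above
# (equivalent to running `huggingface-cli login`).
login()

# Mirror the gated dataset repository into a local directory.
snapshot_download(
    repo_id="Time-HD-Anonymous/High_Dimensional_Time_Series",
    repo_type="dataset",
    local_dir="./dataset",  # assumed target directory
)
```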
# 🖥️ Single GPU training
accelerate launch --num_processes=1 run.py --model UCast --data "Measles" --gpu 0
# 🚀 Multi-GPU training (auto-detect all GPUs)
accelerate launch run.py --model UCast --data "Measles"
# 🎯 Specific GPU selection (e.g. 4 GPUs, id: 0,2,3,7)
accelerate launch --num_processes=4 run.py --model UCast --data "Measles" --gpu 0,2,3,7
# 📋 List available models
accelerate launch run.py --list-models
# ℹ️ Show framework information
python run.py --info

# 🔍 Automated hyperparameter search
accelerate launch run.py --model UCast --data "Measles" --hyper_parameter_searching
accelerate launch --num_processes=1 run.py --model UCast --data "Measles" --gpu 0 --hyper_parameter_searching
accelerate launch --num_processes=4 run.py --model UCast --data "Measles" --gpu 0,2,3,7 --hyper_parameter_searching

Create dataset-specific configurations in configs/:
# configs/UCast.yaml
Measles:
  enc_in: 1161
  train_epochs: 10
  alpha: 0.01
  seq_len_factor: 4
  learning_rate: 0.001

Air_Quality:
  enc_in: 2994
  train_epochs: 15
  alpha: 0.1
  seq_len_factor: 5
  learning_rate: 0.0001

Define search spaces in config_hp/:
# config_hp/UCast.yaml
learning_rate: [0.001, 0.0001]
seq_len_factor: [4, 5]
d_model: [256, 512]
alpha: [0.01, 0.1]

📁 Time-HD-Lib Framework
├── 🚀 run.py # Main entry point with GPU management
├── 🏗️ core/ # Core framework components
│ ├── 📝 config/ # Configuration management system
│ │ ├── base.py # Base configuration classes
│ │ ├── manager.py # Configuration manager
│ │ └── model_configs.py # Model-specific configs
│ ├── 📊 registry/ # Model/dataset registration
│ │ ├── __init__.py # Registry decorators
│ │ └── model_registry.py # Model registration system
│ ├── 🤖 models/ # Model management and loading
│ │ ├── model_manager.py # Dynamic model loading
│ │ └── __init__.py # Model manager interface
│ ├── 📊 data/ # Self-contained data pipeline
│ │ ├── data_provider.py # Main data provider
│ │ ├── data_factory.py # Dataset factory
│ │ └── data_loader.py # Custom dataset classes
│ ├── 🧪 experiments/ # Experiment orchestration
│ │ ├── base_experiment.py # Base experiment class
│ │ └── long_term_forecasting.py # Forecasting experiments
│ ├── ⚙️ execution/ # Execution engine
│ │ └── runner.py # Experiment runners
│ ├── 🛠️ utils/ # Self-contained utilities
│ │ ├── tools.py # Training utilities
│ │ ├── metrics.py # Evaluation metrics
│ │ ├── timefeatures.py # Time feature extraction
│ │ ├── augmentation.py # Data augmentation
│ │ ├── masked_attention.py # Attention mechanisms
│ │ └── masking.py # Masking utilities
│ ├── 🔌 plugins/ # Plugin system for extensibility
│ └── 💻 cli/ # Command-line interface
│ └── argument_parser.py # Comprehensive CLI parser
├── 🤖 models/ # Model implementations with @register_model
│ ├── UCast.py # High-dimensional specialist
│ ├── TimesNet.py # 2D temporal modeling
│ ├── iTransformer.py # Inverted transformer
│ ├── ModernTCN.py # Modern TCN
│ └── ... # 16+ other models
├── 🗂️ configs/ # Model-dataset configurations
├── 🔍 config_hp/ # Hyperparameter search configs
├── 🧱 layers/ # Neural network building blocks
└── 📊 results/ # Experiment outputs and logs
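To make the registry layer above more concrete, here is a minimal, generic sketch of the decorator-based registration pattern that core/registry exposes via @register_model. The actual implementation lives in core/registry/model_registry.py; this simplified version is only illustrative, and the ToyLinear entry is a hypothetical example.

```python
# Illustrative sketch of a decorator-based model registry (not the library's actual code).
MODEL_REGISTRY = {}

def register_model(name, paper=None, year=None):
    """Return a class decorator that records the model class plus its metadata."""
    def decorator(cls):
        MODEL_REGISTRY[name] = {"cls": cls, "paper": paper, "year": year}
        return cls
    return decorator

@register_model("ToyLinear", paper="Example Paper", year=2024)
class Model:  # hypothetical example entry
    pass

print(MODEL_REGISTRY["ToyLinear"]["paper"])  # -> "Example Paper"
```

Registering a class this way is what lets run.py discover models by name (e.g. --model UCast) without hard-coded imports.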
Create YAML configuration files for each model in the configs/ directory:
# configs/YourModel.yaml
Measles:
  enc_in: 1161
  train_epochs: 10
  learning_rate: 0.001
  d_model: 512
  batch_size: 16
  seq_len_factor: 4

Edit configs/pred_len_config.yaml to set default prediction lengths for datasets:
# configs/pred_len_config.yaml
Measles: [7]    # Use the first value as default
Temp: [168]

# Use all available GPUs
accelerate launch run.py --model UCast --data "Measles"

# Use GPUs 0,2,3,7
accelerate launch --num_processes=4 run.py --model UCast --data "Measles" --gpu 0,2,3,7

# Single GPU training
accelerate launch --num_processes=1 run.py --model UCast --data "Measles" --gpu 0

# Multi-node training
accelerate launch --multi_gpu --main_process_port 29500 run.py --model UCast --data "Measles"

The framework automatically finds the maximum available batch size during hyperparameter searching:
# Start from batch size 64; it is automatically reduced to 32, 16, 8, 4, 2, 1 on OOM
accelerate launch run.py --model UCast --data "Measles" --batch_size 64 --hyper_parameter_searching

Manual batch size control:
# configs/UCast.yaml
Measles:
  batch_size: 16   # Set a smaller batch size for high-dimensional data
Wiki-20k:
  batch_size: 8    # Use an even smaller batch size for ultra-high-dimensional data

Mixed precision training:

accelerate launch --mixed_precision fp16 run.py --model UCast --data "Measles"

# Run predefined batch experiments
python run.py --batch

from core.config import ConfigManager
from core.execution.runner import BatchRunner

# Create batch experiments
config_manager = ConfigManager()
batch_runner = BatchRunner(config_manager)

# Add experiments
models = ['UCast', 'TimesNet', 'iTransformer']
datasets = ['Measles', 'SIRS', 'ETTh1']

for model in models:
    for dataset in datasets:
        batch_runner.add_experiment(
            model=model,
            data=dataset,
            is_training=True
        )

# Run batch experiments
results = batch_runner.run_batch()

# config_hp/UCast.yaml
learning_rate: [0.001, 0.0001, 0.00001]
seq_len_factor: [3, 4, 5]
d_model: [256, 512, 1024]
alpha: [0.01, 0.1, 1.0]
batch_size: [8, 16, 32]

# configs/pred_len_config.yaml
Measles: [7, 14, 21] # These 3 values will be tested during hyperparameter search
ETTh1: [96, 192, 336] # Multiple prediction lengths for traditional datasets
"Air Quality": [28, 56] # Suitable prediction lengths for high-dimensional data# Single GPU hyperparameter search
accelerate launch --num_processes=1 run.py --model UCast --data "Measles" --hyper_parameter_searching
# Multi-GPU hyperparameter search
accelerate launch --num_processes=4 run.py --model UCast --data "Measles" --gpu 0,2,3,7 --hyper_parameter_searching
# Specify log directory
accelerate launch run.py --model UCast --data "Measles" --hyper_parameter_searching --hp_log_dir ./my_hp_logs/

# Results are saved in hp_logs/ directory
hp_logs/
└── UCast_Measles_20241201_143022/
    ├── best_result.json    # Best configuration and results
    ├── hp_summary.json     # Summary of all configurations
    ├── results.csv         # CSV format results
    └── result_*.json       # Detailed results for each configuration

Create a new model file in the models/ directory:
# models/YourNewModel.py
import torch
import torch.nn as nn
from core.registry import register_model

@register_model("YourNewModel", paper="Your Paper Title", year=2024)
class Model(nn.Module):  # Class name must be 'Model'
    def __init__(self, configs):
        super().__init__()
        self.configs = configs

        # Get parameters from configs
        self.seq_len = configs.seq_len
        self.pred_len = configs.pred_len
        self.enc_in = configs.enc_in
        self.d_model = configs.d_model

        # Implement your model architecture
        self.encoder = nn.Linear(self.enc_in, self.d_model)
        self.decoder = nn.Linear(self.d_model, self.enc_in)

    def forward(self, x_enc, x_mark_enc, x_dec, x_mark_dec):
        # x_enc: [batch_size, seq_len, enc_in]
        # Return: [batch_size, pred_len, enc_in]
        # Implement forward propagation
        encoded = self.encoder(x_enc)
        # ... Your model logic ...
        output = self.decoder(encoded)
        return output

# configs/YourNewModel.yaml
Measles:
  enc_in: 1161
  train_epochs: 10
  learning_rate: 0.001
  d_model: 512
  batch_size: 16
  seq_len_factor: 4
  # Add model-specific parameters
  your_param: 0.1

ETTh1:
  enc_in: 7
  train_epochs: 15
  learning_rate: 0.0001
  d_model: 256

# config_hp/YourNewModel.yaml
learning_rate: [0.001, 0.0001]
d_model: [256, 512]
your_param: [0.1, 0.5, 1.0]
seq_len_factor: [3, 4, 5]

# Test if model is correctly registered
python run.py --list-models
# Quick validation training
accelerate launch --num_processes=1 run.py --model YourNewModel --data "Measles" --train_epochs 1
# Full training
accelerate launch run.py --model YourNewModel --data "Measles"
# Hyperparameter search
accelerate launch run.py --model YourNewModel --data "Measles" --hyper_parameter_searching

📊 Standard Dataset Format
Time-HD-Lib expects datasets to follow a standardized format:
- 📅 Date Column: First column named 'date' containing timestamps
- 📈 Feature Columns: Remaining columns represent different features/dimensions
- ⏰ Row Structure: Each row represents one time step/timestamp
- 📋 Column Order: ['date', 'feature_0', 'feature_1', ..., 'feature_n']
Example Dataset Structure:
date feature_0 feature_1 feature_2 ... feature_499
0 2020-01-01 00:00:00 0.234 -1.456 0.789 ... 2.341
1 2020-01-01 01:00:00 -0.567 0.891 -0.234 ... -1.234
2 2020-01-01 02:00:00 1.234 -0.567 1.456 ... 0.567
... ... ... ... ... ... ...
9999 2021-02-23 07:00:00 0.123 1.789 -0.987 ... 1.567
🔧 Format Requirements:
- Time Column: Must be named 'date' and contain valid timestamps
- Feature Naming: Can use any naming convention (e.g., feature_0, sensor_1, temperature)
- Data Types: Numeric values for features, datetime for the date column
- Missing Values: Handle NaN values before uploading (interpolate or remove)
- Frequency: Consistent time intervals (hourly, daily, etc.)
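As a concrete illustration of the requirements above, the following minimal pandas sketch builds a small hourly dataset in the expected layout and writes it to CSV. The column count, file name, and random values are arbitrary examples, not values the framework expects.

```python
import numpy as np
import pandas as pd

# Hourly timestamps plus synthetic feature columns (arbitrary example sizes).
dates = pd.date_range("2020-01-01", periods=10000, freq="h")
features = {f"feature_{i}": np.random.randn(len(dates)) for i in range(500)}

# 'date' as the first column, features after it.
df = pd.DataFrame({"date": dates, **features})

# No NaNs and a consistent hourly frequency, per the requirements above.
assert not df.isna().any().any()
df.to_csv("your_dataset.csv", index=False)
```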
Step 2: Upload to HuggingFace (https://huggingface.co/datasets/Time-HD-Anonymous/High_Dimensional_Time_Series)
# core/data/data_loader.py - Add new dataset class
class Dataset_YourDataset(Dataset):
    def __init__(self, args, root_path, flag='train', size=None,
                 features='S', data_path='your_dataset.csv',
                 target='feature_0', scale=True, timeenc=0, freq='h'):
        # Implement data loading logic
        # Can load from HuggingFace or local CSV
        if args.use_hf_datasets:
            from datasets import load_dataset
            hf_dataset = load_dataset("your-username/your-dataset-name")
            self.data_x = hf_dataset[flag].to_pandas()
        else:
            # Load from local
            df_raw = pd.read_csv(os.path.join(root_path, data_path))
            self.data_x = df_raw
        # Implement the rest of data processing logic...

# core/data/data_factory.py
data_dict = {
    'ETTh1': Dataset_ETT_hour,
    'ETTh2': Dataset_ETT_hour,
    'ETTm1': Dataset_ETT_minute,
    'ETTm2': Dataset_ETT_minute,
    'custom': Dataset_Custom,
    'your_dataset': Dataset_YourDataset,  # Add new dataset
}

# configs/pred_len_config.yaml
your_dataset: [24, 48, 96]    # Set default prediction lengths

# configs/UCast.yaml (or other model configurations)
your_dataset:
  enc_in: 500           # Number of features in your dataset
  train_epochs: 10
  learning_rate: 0.001
  seq_len_factor: 4

# Test data loading
accelerate launch --num_processes=1 run.py --model UCast --data your_dataset --train_epochs 1
# Full training
accelerate launch run.py --model UCast --data your_dataset
# Hyperparameter search
accelerate launch run.py --model UCast --data your_dataset --hyper_parameter_searching

Time-HD-Lib/
├── 📊 results/ # Main experiment results
│ └── long_term_forecast_{model}_{dataset}_slxxx_plxxx/
│ ├── metrics.npy # Final test metrics [mae, mse, rmse, mape, mspe]
│ ├── pred.npy # Model predictions [batch, pred_len, features]
│ └── true.npy # Ground truth values [batch, pred_len, features]
│
├── 🎯 test_results/ # Visualization and detailed analysis
│ └── long_term_forecast_{model}_{dataset}_slxxx_plxxx/
│ ├── 0.pdf # Prediction plots for feature 0
│ ├── 20.pdf # Prediction plots for feature 20
│ └── ... # Additional feature visualizations
│
└── 🔍 hp_logs/ # Hyperparameter search results
└── {model}_{dataset}_{timestamp}/
├── best_result.json # Best configuration and performance metrics
├── hp_summary.json # Summary of all tested configurations
└── results.csv # All results in tabular format
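To inspect a finished run programmatically, a small sketch like the one below can read the saved arrays. The experiment folder name is a placeholder (substitute your actual run directory), and the unpacking order follows the [mae, mse, rmse, mape, mspe] layout noted above.

```python
import numpy as np

# Placeholder run directory; substitute the actual experiment folder name.
run_dir = "results/long_term_forecast_UCast_Measles_slxxx_plxxx"

# Assumes metrics.npy stores the five values in the order documented above.
mae, mse, rmse, mape, mspe = np.load(f"{run_dir}/metrics.npy")

preds = np.load(f"{run_dir}/pred.npy")   # [batch, pred_len, features]
trues = np.load(f"{run_dir}/true.npy")   # [batch, pred_len, features]
print(f"MAE={mae:.4f}  MSE={mse:.4f}  pred shape={preds.shape}")
```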
If you use Time-HD-Lib or Time-HD benchmark in your research, please cite:
@article{ucast_2024,
  title   = {Are We Overlooking the Dimensions? Learning Latent Hierarchical Channel Structure for High-Dimensional Time Series Forecasting},
  author  = {Juntong Ni and Shiyu Wang and Zewen Liu and Xiaoming Shi and Xinyue Zhong and Zhou Ye and Wei Jin},
  journal = {In Submission},
  year    = {2025}
}

This project is licensed under the MIT License; see the LICENSE file for details.
- Time-Series-Library - Foundation and inspiration (GitHub)
- HuggingFace Accelerate - Distributed training infrastructure
- PyTorch Ecosystem - Deep learning framework
- Time Series Research Community - For advancing the field
We welcome contributions! Please see our Contributing Guide for details.
- 🍴 Fork the repository
- 🌿 Create a feature branch (git checkout -b feature/amazing-feature)
- 💻 Make your changes and add tests
- ✅ Ensure all tests pass (python -m pytest tests/)
- 📝 Update documentation if needed
- 🚀 Submit a pull request
- 📧 Issues: GitHub Issues
- 💬 Discussions: GitHub Discussions
🚀 Ready to forecast the future with high-dimensional time series? Get started today!



