
Commit 140472d

Merge pull request #40 from Zihang-Xu-2002/ltsm-stack
README Update
2 parents 456e2d6 + ae55cd4 commit 140472d

File tree: 3 files changed (+238 / -61 lines)

README.md

Lines changed: 212 additions & 41 deletions
@@ -1,74 +1,245 @@
-# Understanding Different Design Choices in Training Large Time Series Models
-<img width="700" height="290" src="./imgs/ltsm_model.png">
+# LTSM-Bundle: A Toolbox and Benchmark on Large Language Models for Time Series Forecasting
+
+<div align="center">
+<img src="./imgs/ltsm_model.png" width="700" height="290" alt="LTSM Model">
+</div>

[![Test](https://github.com/daochenzha/ltsm/actions/workflows/test.yml/badge.svg)](https://github.com/daochenzha/ltsm/actions/workflows/test.yml)

-This work investigates the transition from traditional Time Series Forecasting (TSF) to Large Time Series Models (LTSMs), leveraging universal transformer-based models. Training LTSMs on diverse time series data introduces challenges due to varying frequencies, dimensions, and patterns. We explore various design choices for LTSMs, including pre-processing, model configurations, and dataset setups. We introduce **Time Series Prompt**, a statistical prompting strategy, and $\texttt{LTSM-bundle}$, which encapsulates the most effective design practices identified. $\texttt{LTSM-bundle}$ is developed by [Data Lab at Rice University](https://cs.rice.edu/~xh37/).
+> Empowering forecasts with precision and efficiency.
+
+## Table of Contents
+
+* [Overview](#overview)
+* [Why LTSM-bundle](#why-ltsm-bundle)
+* [Features](#features)
+* [Installation](#installation)
+* [Quick Start](#quick-start)
+* [Project Structure](#project-structure)
+* [Datasets and Prompts](#datasets-and-prompts)
+* [Model Access](#model-access)
+* [Cite This Work](#cite-this-work)
+* [License](#license)
+* [Acknowledgments](#acknowledgments)
+
+---

-## Resources
-:star2: Please star our repo to follow the latest updates on LTSM-bundle!
+## Overview

-:mega: We have released our [paper](https://arxiv.org/abs/2406.14045) and source code of LTSM-bundle-v1.0!
+This work investigates the transition from traditional Time Series Forecasting (TSF) to Large Time Series Models (LTSMs), leveraging large transformer-based models like GPT. Training LTSMs on diverse time series data introduces challenges due to varying frequencies, dimensions, and patterns.

-:books: Follow our latest [English Tutorial](https://github.com/daochenzha/ltsm/tree/main/tutorial) or [中文教程](https://zhuanlan.zhihu.com/p/708804309) to costomize your LTSM!
+We explore multiple design choices, including pre-processing strategies, tokenization, model architectures, and dataset setups. We introduce:

-:earth_americas: For more information, please visit:
-* Paper: [https://arxiv.org/abs/2406.14045](https://arxiv.org/abs/2406.14045)
-* Blog: [Time Series Are Not That Different for LLMs](https://towardsdatascience.com/time-series-are-not-that-different-for-llms-56435dc7d2b1)
-* Tutorial: [Build your own LTSM-bundle](https://github.com/daochenzha/ltsm/tree/main/tutorial)
-* Chinese Tutorial: [https://zhuanlan.zhihu.com/p/708804309](https://zhuanlan.zhihu.com/p/708804309)
-* Do you want to learn more about data pipeline search? Please check out our [data-centric AI survey](https://arxiv.org/abs/2303.10158) and [data-centric AI resources](https://github.com/daochenzha/data-centric-AI) !
+* **Time Series Prompt**: A statistical prompting strategy
+* **LTSM-bundle**: A toolkit encapsulating effective design practices
+
+The project is developed by the [Data Lab at Rice University](https://cs.rice.edu/~xh37/).
+
+---

## Why LTSM-bundle?
-The LTSM-bundle package leverages the HuggingFace transformers toolkit, offering flexibility to switch between different advanced language models as the backbone. It is easy to tailor the general LTSMs to their specific time series forecasting needs by selecting the most suitable language model from a wide array of options. The flexibility enhances the adaptability of the package across different industries and data types, ensuring optimal performance in diverse scenarios.
+
+The LTSM-bundle leverages HuggingFace transformers, allowing flexible integration of large-scale pre-trained language models for time series tasks. Users can customize the pipeline to fit specific forecasting needs with minimal overhead, making it adaptable across various domains and industries.
+
+Key highlights:
+
+* Plug-and-play with GPT-style backbones
+* Modular pipeline for easy experimentation
+* Support for statistical and text prompts
+
+---
+
+## Features
+
+| Category | Highlights |
+| ----------------- | ------------------------------------------------------------------------------------------------------------------- |
+| ⚙️ Architecture | Modular design, GPT-style transformers for time series |
+| 📝 Prompting | Time Series Prompt & Text Prompt support |
+| ⚡️ Performance | GPU acceleration, optimized pipelines |
+| 🔧 Integrations | LoRA support, JSON/CSV-based dataset and prompt interfaces |
+| 🔬 Testing | Unit and integration tests, GitHub Actions CI |
+| 📊 Data | Built-in data loaders, scalers, and tokenizers |
+| 📂 Documentation | Tutorials in [English](https://github.com/daochenzha/ltsm/tree/main/tutorial) and [Chinese](https://zhuanlan.zhihu.com/p/708804309) |
+
+---

## Installation
-```
+
+We recommend using Conda:
+
+```bash
conda create -n ltsm python=3.8.0
conda activate ltsm
-git clone [email protected]:daochenzha/ltsm.git
-cd ltsm
-pip3 install -e .
-pip3 install -r requirements.txt
```

-## Quick Exploration on LTSM-bundle
+Then install the package:

-Training on **[Time Series Prompt]** and **[Linear Tokenization]**
```bash
-bash scripts/train_ltsm_csv.sh
+git clone https://github.com/datamllab/ltsm.git
+cd ltsm
+pip install -e .
+pip install -r requirements.txt
```

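A quick sanity check that the editable install worked (assuming the package imports as `ltsm`, the module name used in the inference example below):

```bash
python -c "import ltsm; print('ltsm imported from', ltsm.__file__)"
```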
-Training on **[Text Prompt]** and **[Linear Tokenization]**
-```bash
-bash scripts/train_ltsm_textprompt_csv.sh
+---
+
+## 🔧 Training Examples
+<!-- Joshua, please help with this part -->
+```Python
```

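The Training Examples block above is still an empty placeholder in this commit. Purely as an illustration of what could go there, the sketch below fine-tunes the same `LTSMConfig` / `ltsm_model.LTSM` classes used in the inference example further down; the config path, window and horizon lengths, and hyperparameters are assumptions, not the repository's actual training pipeline (which lives in `ltsm/data_pipeline` and the training scripts).

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

from ltsm.models import LTSMConfig, ltsm_model

# Build the model from a config file, mirroring the inference example.
model_config = LTSMConfig()
model_config.load("config.json")  # path is an assumption
model = ltsm_model.LTSM(model_config)

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = model.to(device).train()

# Toy data standing in for real series: 336-step input windows, 96-step targets
# (both lengths are assumptions; a real run would use ltsm.data_provider).
x = torch.randn(256, 336)
y = torch.randn(256, 96)
loader = DataLoader(TensorDataset(x, y), batch_size=32, shuffle=True)

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
loss_fn = torch.nn.MSELoss()

for epoch in range(3):
    for inputs, targets in loader:
        inputs, targets = inputs.to(device), targets.to(device)
        optimizer.zero_grad()
        preds = model(inputs)  # assumes the forward pass maps input windows to forecasts
        loss = loss_fn(preds, targets)
        loss.backward()
        optimizer.step()
    print(f"epoch {epoch}: loss {loss.item():.4f}")
```

In a real run the windows, scaling, and prompts would presumably come from `ltsm.data_provider` and `ltsm.prompt_reader`, as listed in the Project Structure section.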
-Training on **[Time Series Prompt]** and **[Time Series Tokenization]**
-```bash
-bash scripts/train_ltsm_tokenizer_csv.sh
+
+## 🔍 Inference Examples
+
+```Python
+import os
+import torch
+import pandas as pd
+from huggingface_hub import hf_hub_download
+from safetensors.torch import load_file
+from ltsm.models import LTSMConfig, ltsm_model
+
+# Download model config and weights from Hugging Face
+config_path = hf_hub_download("LSC2204/LTSM-bundle", "config.json")
+weights_path = hf_hub_download("LSC2204/LTSM-bundle", "model.safetensors")
+
+# Load model and weights
+model_config = LTSMConfig()
+model_config.load(config_path)
+model = ltsm_model.LTSM(model_config)
+
+state_dict = load_file(weights_path)
+model.load_state_dict(state_dict)
+
+# Move model to device
+device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
+model = model.to(device).eval()
+
+# Load your dataset (e.g., weather)
+df_weather = pd.read_csv("/path/to/dataset.csv")
+print("Loaded data shape:", df_weather.shape)
+
+# Load prompts per feature
+feature_prompts = {}
+prompt_dir = "/path/to/prompts/"
+for feature, filename in {
+    "T (degC)": "weather_T (degC)_prompt.pth.tar",
+    "rain (mm)": "weather_rain (mm)_prompt.pth.tar"
+}.items():
+    prompt_tensor = torch.load(os.path.join(prompt_dir, filename))
+    feature_prompts[feature] = prompt_tensor.squeeze(0).float().to(device)
+
+# Predict (custom code here depending on your model usage)
+# For example:
+with torch.no_grad():
+    inputs = feature_prompts["T (degC)"].unsqueeze(0)
+    preds = model(inputs)
+print("Prediction output shape:", preds.shape)
+```
+
+---
+
+## Project Structure
+
+```text
+└── ltsm-package/
+├── datasets
+│   └── README.md
+├── imgs
+│   ├── ltsm_model.png
+│   ├── prompt_csv_tsne.png
+│   └── stat_prompt.png
+├── ltsm
+│   ├── common # Base classes
+│   ├── data_pipeline # Model lifecycle management and training pipeline
+│   ├── data_provider # Dataset construction
+│   ├── data_reader # Read input data from various formats (CSV, JSON, etc.)
+│   ├── evaluate_pipeline # Evaluation workflow for model performance
+│   ├── layers # Custom neural network components
+│   ├── models # Implementations: LTSM, DLinear, Informer, PatchTST
+│   ├── prompt_reader # Prompt generation and formatting
+│   ├── sk_interface # Scikit-learn style interface
+│   └── utils # Shared helper functions
+├── multi_agents_pipeline # Multi-agent time series reasoning framework
+│   ├── Readme.md
+│   ├── agents # Agent definitions: Planner, QA, TS, Reward
+│   ├── llm-server.py # Local LLM server interface
+│   ├── ltsm_inference.py # Inference script using LTSM pipeline
+│   ├── main.py # Pipeline entry point
+│   └── model_config.yaml # Configuration file for models and agents
+├── requirements.txt
+├── setup.py
+├── tests # Unit tests for LTSM modules
+│   ├── common
+│   ├── data_pipeline
+│   ├── data_provider
+│   ├── data_reader
+│   ├── evaluate_pipeline
+│   ├── models
+│   └── test_scripts
+└── tutorial
+    └── README.md
```

-## Datasets and Time Series Prompts
-Download the datasets
+---
+
+## Datasets and Prompts
+
+Download datasets:
+
```bash
cd datasets
-download: https://drive.google.com/drive/folders/1hLFbz0FRxdiDCzgFYtKCOPJYSBVvwW9P
+# Google Drive link:
+https://drive.google.com/drive/folders/1hLFbz0FRxdiDCzgFYtKCOPJYSBVvwW9P
```

-Download the time series prompts
+Download time series prompts:
+
```bash
-cd prompt_bank/propmt_data_csv
-download: https://drive.google.com/drive/folders/1hLFbz0FRxdiDCzgFYtKCOPJYSBVvwW9P
+cd prompt_bank/prompt_data_csv
+# Same Google Drive link applies
```

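If you prefer the command line, the shared folder can also be fetched with the `gdown` package; `gdown` is not part of this repository's tooling, and the folder link below is simply the one given above.

```bash
pip install gdown
# Download the shared Google Drive folder (datasets and prompts) into the current directory
gdown --folder https://drive.google.com/drive/folders/1hLFbz0FRxdiDCzgFYtKCOPJYSBVvwW9P
```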
+---
+
+## Model Access
+
+You can find our trained LTSM models on Hugging Face:
+
+➡️ [https://huggingface.co/LSC2204/LTSM-bundle](https://huggingface.co/LSC2204/LTSM-bundle)
+
+---
+
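Besides the per-file `hf_hub_download` calls in the inference example above, the whole checkpoint can be mirrored locally with `huggingface_hub`'s `snapshot_download`:

```python
from huggingface_hub import snapshot_download

# Downloads config.json, model.safetensors, etc. into the local HF cache
local_dir = snapshot_download("LSC2204/LTSM-bundle")
print("Model files cached at:", local_dir)
```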
## Cite This Work
-If you find this work useful, you may cite this work:
-```
-@article{ltsm-bundle,
-  title={Understanding Different Design Choices in Training Large Time Series Models},
-  author={Chuang*, Yu-Neng and Li*, Songchen and Yuan*, Jiayi and Wang*, Guanchu and Lai*, Kwei-Herng and Yu, Leisheng and Ding, Sirui and Chang, Chia-Yuan and Tan, Qiaoyu and Zha, Daochen and Hu, Xia},
-  journal={arXiv preprint arXiv:2406.14045},
-  year={2024}
+
+If you find this work useful, please cite:
+
+```bibtex
+@misc{chuang2025ltsmbundletoolboxbenchmarklarge,
+  title={LTSM-Bundle: A Toolbox and Benchmark on Large Language Models for Time Series Forecasting},
+  author={Yu-Neng Chuang and Songchen Li and Jiayi Yuan and Guanchu Wang and Kwei-Herng Lai and Songyuan Sui and Leisheng Yu and Sirui Ding and Chia-Yuan Chang and Qiaoyu Tan and Daochen Zha and Xia Hu},
+  year={2025},
+  eprint={2406.14045},
+  archivePrefix={arXiv},
+  primaryClass={cs.LG},
+  url={https://arxiv.org/abs/2406.14045},
}
```
+
+---
+
+## License
+
+This project is licensed under the MIT License. See the [LICENSE](https://choosealicense.com/licenses/mit/) file for details.
+
+---
+
+## Acknowledgments
+
+We thank all contributors and collaborators involved in the LTSM project. Special thanks to the Data Lab at Rice University and the open-source community for enabling fast prototyping and reproducible research.
+
+---
+
+<div align="right">
+  <a href="#top">⬆️ Back to Top</a>
+</div>

multi_agents_pipeline/Agents.jpg (49.8 KB)

multi_agents_pipeline/Readme.md

Lines changed: 26 additions & 20 deletions
@@ -1,9 +1,10 @@

# Quick Command
-The command `CUDA_VISIBLE_DEVICES=1,2,3 uvicorn llm-server:app --port <port number> --reload` should be run in the `multi_agents_pipeline` directory. e.g. `CUDA_VISIBLE_DEVICES=1,2,3 uvicorn llm-server:app --reload` will run the FastAPI app on http://127.0.0.1:8000/.

-`lsof -i :8000` can be used to check the running local LLM.
+## Run the local LLM Server
+Run `CUDA_VISIBLE_DEVICES=1,2,3 uvicorn llm-server:app --port <port number> --reload` from the `multi_agents_pipeline` directory; for example, `CUDA_VISIBLE_DEVICES=2,3,4 uvicorn llm-server:app` will serve the FastAPI app at http://127.0.0.1:8000/.

+## Run the Pipeline
To execute the full pipeline, go to the `multi_agents_pipeline` folder and run `python main.py`.

> To use Llama-3-8B-Instruct, please make sure transformers >= 4.40 is installed!
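Putting the two steps together, a typical session might look like the following; the GPU ids and port are illustrative, and `lsof -i :8000` is the liveness check mentioned in the earlier revision of this file.

```bash
cd multi_agents_pipeline

# Start the local LLM server (uvicorn defaults to port 8000 when --port is omitted)
CUDA_VISIBLE_DEVICES=2,3,4 uvicorn llm-server:app --port 8000 --reload

# In another shell: confirm the server is listening
lsof -i :8000

# Run the multi-agent pipeline against the running server
python main.py
```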
@@ -30,43 +31,48 @@ class TSMessage(BaseModel):
    filepath: str  # TODO: Support more possible types
    task_type: Optional[str] = None
    description: Optional[str] = None
+
+class TSTaskMessage(BaseModel):
+    """
+    Passed to the Planner.
+
+    This message contains a text prompt and the filepath to the data file.
+    """
+    description: str
+    filepath: str
```
+| **Agent** | **Publishes** | **Subscribes** |
+|------------------|--------------------------------------------------------|--------------------------------------------------------|
+| **Planner** | `Planner-QA` (`TextMessage`) <br> `Planner-TS` (`TSMessage`) | `TSTaskMessage` |
+| **TS Agent** | `TS-Info` (`TSMessage`) | `Planner-TS` (`TSMessage`) <br> `Reward-TS` (`TSMessage`) |
+| **QA Agent** | `QA-Response` (`TextMessage`) | `Planner-QA` (`TextMessage`) <br> `TS-Info` (`TSMessage`) <br> `Reward-QA` (`TextMessage`) |
+| **Reward Agent** | `Reward-QA` (`TextMessage`) <br> `Reward-TS` (`TSMessage`) | `TS-Info` (`TSMessage`) <br> `QA-Response` (`TextMessage`) |

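To make the message flow concrete, here is a small self-contained sketch of the two schemas above and how a user request might fan out from the Planner; the `forecasting` task type and the printed "publish" step are assumptions, only the field names and topic names come from this Readme.

```python
from typing import Optional
from pydantic import BaseModel

class TSMessage(BaseModel):
    filepath: str
    task_type: Optional[str] = None
    description: Optional[str] = None

class TSTaskMessage(BaseModel):
    description: str
    filepath: str

# A user request enters the pipeline as a TSTaskMessage ...
task = TSTaskMessage(
    description="Forecast the next 96 steps of temperature.",
    filepath="datasets/weather.csv",
)

# ... and the Planner fans it out, e.g. as a Planner-TS message for the TS Agent.
planner_ts = TSMessage(
    filepath=task.filepath,
    task_type="forecasting",
    description=task.description,
)
print(planner_ts.model_dump_json())  # pydantic v2; use .json() with pydantic v1
```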
-! To Discuss ! : Planner publishes messages with topic "Planner-QA"(`TextMessage`), "Planner-TS"(`TSMessage`)

-TS Agent publishes messages with topic "TS-Info"(`TSMessage`), subscribes "Planner-TS"(`TSMessage`) and "Reward-TS"(`TSMessage`),

-QA Agent publishes messages with topic "QA-Response"(`TextMessage`), subscribes ""Planner-QA"(`TextMessage`)", "TS-Info"(`TSMessage`) and "Reward-QA"(`TextMessage`),
+# Agents

-Reward Agent publishes messages with topic "Reward-QA"(`TextMessage`), , "Reward-TS"(`TSMessage`), and subscribes "TS-Info"(`TSMessage`), "QA-Response"(`TextMessage`).
+![](./Agents.jpg)

-# Agents
+## Planner
+
+Receives a TSTaskMessage from the user, then generates a TS task and a QA task to be sent to the TS Agent and the QA Agent.

## TS Agent

-Handle TSMessage, use LTSM to inference
+Handles TSMessage and uses time series models (e.g., LTSM) or chat models (e.g., ChatGPT) to extract features from the time series.

## QA Agent

-Combine TS Info and Planner-QA, get the response of LLM
+Combines TS Info and Planner-QA, gets the response of the LLM, and publishes it as the QA-Response.

## Reward Agent

-gather output of TS Agent and QA Agent
-
-
-# Question:
-1. context buffer size : decided by the reward agent?
+Gathers the outputs of the TS Agent and the QA Agent, and sends feedback to the TS and QA agents if the evaluation score is lower than a threshold.

2. should reward give signal to planner to handle the next query?
6273

63-
3. dataset selection (Task selection) : forecasting ? classification ? Answer: Time Reasoning
6474

6575

66-
TODO List (April 7, 2025)
67-
- performance of the framework
68-
- remove TS, test the performance of
69-
- use different TS models based on the task
7076

7177

7278

0 commit comments
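The Reward Agent's thresholded feedback loop described above could be sketched as follows; the scoring stub and the threshold value are assumptions, and only the `Reward-TS` / `Reward-QA` topic names come from the pub/sub table.

```python
def evaluate(ts_info: dict, qa_response: str) -> float:
    """Hypothetical scorer; in the real pipeline an LLM or metric would judge the answer."""
    return 0.5 if "unknown" in qa_response.lower() else 0.9

FEEDBACK_THRESHOLD = 0.7  # assumed value

def reward_step(ts_info: dict, qa_response: str, publish) -> float:
    """Gather TS-Info and QA-Response, score them, and loop back if the score is low."""
    score = evaluate(ts_info, qa_response)
    if score < FEEDBACK_THRESHOLD:
        publish("Reward-TS", ts_info)       # topic names from the table above
        publish("Reward-QA", qa_response)
    return score

# Minimal usage with a print-based publisher
score = reward_step({"trend": "rising"}, "The temperature trend is unknown.",
                    lambda topic, msg: print("publish:", topic, msg))
print("score =", score)
```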

Comments
 (0)