An end-to-end, experiment-oriented machine learning pipeline for spatial-temporal data
make install
or
python3 -m venv venv
source ./venv/bin/activate
pip3 install -Ur requirements.txt
make test
or
source ./venv/bin/activate
python3 user_main.py --model GWN --window_size 40 --horizon 10
- Define a PyTorch model class.
import torch.nn as nn  # nn.Module lives in torch.nn, not in the top-level torch package

class MyModel(nn.Module):
    ...
- Define a model manager class.
Methods like `preprocess`, `train_model`, and `test_model` must be implemented, with the option of overriding the default `run_pipeline` method.
These should make use of the model instance stored under `self.model`.
from stdnn.models.manager import STModelManager
class MyModelManager(STModelManager):
    def __init__(self):
        super().__init__()

    def preprocess(self, ...):
        ...

    def train_model(self, ...):
        ...

    def test_model(self, ...):
        ...
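The relationship between the three required methods and `run_pipeline` can be pictured with a small stand-in base class. The sketch below does not use `stdnn`: `SketchManager`, `EchoManager`, the method signatures, and the order in which the stages are chained are all assumptions made for illustration, so check `STModelManager` itself for the real default behaviour.

```python
class SketchManager:
    """Hypothetical stand-in for a manager base class (not stdnn's STModelManager)."""

    def __init__(self):
        self.model = None  # the model instance the stage methods operate on

    def preprocess(self, **params):
        raise NotImplementedError

    def train_model(self, data, **params):
        raise NotImplementedError

    def test_model(self, data, **params):
        raise NotImplementedError

    def run_pipeline(self, config):
        # Assumed default: run the three stages in order, each with its own params.
        data = self.preprocess(**config["preprocess"]["params"])
        self.train_model(data, **config["train"]["params"])
        return self.test_model(data, **config["test"]["params"])


class EchoManager(SketchManager):
    """Toy subclass: the 'model' is simply the mean of the training data."""

    def preprocess(self, values, scale=1.0):
        return [v * scale for v in values]

    def train_model(self, data, epochs=1):
        self.model = sum(data) / len(data)

    def test_model(self, data, tolerance=0.0):
        # Count the points that lie within `tolerance` of the fitted mean.
        return sum(abs(v - self.model) <= tolerance for v in data)


config = {
    "preprocess": {"params": {"values": [1, 2, 3], "scale": 2.0}},
    "train": {"params": {"epochs": 1}},
    "test": {"params": {"tolerance": 2.0}},
}
manager = EchoManager()
hits = manager.run_pipeline(config)
```

A subclass that needs a different flow (for example, skipping training when loading a checkpoint) would override `run_pipeline` rather than the individual stage methods.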
- Configure the hyperparameters.
These should use the `ConfigSpace` package.
Each hyperparameter name must match a parameter specified in the pipeline/model configuration.
The `meta` argument of each hyperparameter must be set to specify the pipeline stage in which the hyperparameter is used (model, preprocess, train, or test).
import ConfigSpace as CS
import ConfigSpace.hyperparameters as CSH
# Hyper parameter configuration
cs = CS.ConfigurationSpace()
my_hyperparameter = CSH.UniformFloatHyperparameter(
    '...', lower=..., upper=..., meta={"config": "..."}
)
my_hyperparameter_2 = CSH.UniformFloatHyperparameter(
    '...', lower=..., upper=..., meta={"config": "..."}
)
# Here two float hyperparameters are added to the config space
cs.add_hyperparameters([my_hyperparameter, my_hyperparameter_2])
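To make the role of `meta` concrete, here is a minimal sketch of how hyperparameters could be grouped by the pipeline stage they target. It deliberately avoids importing `ConfigSpace`: `SketchHyperparameter`, `group_by_stage`, and the names `learning_rate` and `dropout` are hypothetical, and the `"config"` key is taken from the snippet above. How `stdnn` actually reads `meta` is not shown here.

```python
from dataclasses import dataclass, field

# The four stages named in the text above.
STAGES = {"model", "preprocess", "train", "test"}


@dataclass
class SketchHyperparameter:
    """Hypothetical stand-in holding only the fields this sketch needs."""
    name: str
    meta: dict = field(default_factory=dict)


def group_by_stage(hyperparameters):
    """Map each stage to the names of the hyperparameters that target it."""
    grouped = {stage: [] for stage in sorted(STAGES)}
    for hp in hyperparameters:
        stage = hp.meta.get("config")
        if stage not in STAGES:
            raise ValueError(
                f"{hp.name}: meta['config'] must be one of {sorted(STAGES)}"
            )
        grouped[stage].append(hp.name)
    return grouped


grouped = group_by_stage([
    SketchHyperparameter("learning_rate", meta={"config": "train"}),
    SketchHyperparameter("dropout", meta={"config": "model"}),
])
```

Validating the stage name up front means a misspelled `meta` value fails immediately instead of silently leaving a hyperparameter unused.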
- Configure the pipeline and model.
Each `params` dictionary is unpacked into the corresponding pipeline method at runtime.
These dictionaries should also include the previously specified hyperparameters with default values (the defaults are replaced by the values under test during each experiment).
# Pipeline and model configuration
pipeline_config = {
    "model": {
        "meta": {
            "type": MyModel,
            "manager": MyModelManager
        },
        "params": {...}  # passed to model constructor
    },
    "preprocess": {
        "params": {...}  # passed to preprocess method
    },
    "train": {
        "params": {...}  # passed to train_model method
    },
    "test": {
        "params": {...}  # passed to test_model method
    }
}
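The two mechanisms described for this step, replacing default values and unpacking `params` into a method, can be sketched in plain Python. Everything named here (`apply_hyperparameters`, the free function `train_model`, `learning_rate`, `epochs`) is a hypothetical illustration and not part of `stdnn`; it assumes only that each hyperparameter is tied to one stage.

```python
import copy


def apply_hyperparameters(pipeline_config, values, stages):
    """Return a copy of pipeline_config with trial values written over the defaults.

    values: hyperparameter name -> value for this experiment
    stages: hyperparameter name -> stage whose params hold it
    """
    config = copy.deepcopy(pipeline_config)  # leave the defaults untouched
    for name, value in values.items():
        params = config[stages[name]]["params"]
        if name not in params:
            raise KeyError(f"{name!r} has no default in the {stages[name]!r} params")
        params[name] = value
    return config


def train_model(learning_rate, epochs):
    # Stand-in for a manager's train_model method.
    return f"lr={learning_rate}, epochs={epochs}"


defaults = {
    "train": {"params": {"learning_rate": 0.001, "epochs": 5}},
}
trial = apply_hyperparameters(
    defaults, {"learning_rate": 0.01}, {"learning_rate": "train"}
)
# The params dictionary is unpacked into keyword arguments.
summary = train_model(**trial["train"]["params"])
```

Because the dictionary keys become keyword arguments, every key in `params` has to match a parameter name of the method it is passed to, which is also why a hyperparameter name must match a configured parameter.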
- Configure the experiments.
# Experiment configuration
experiment_config = {
    "config_space": cs,
    # specifies how many values of each hyperparameter to try
    "grid": dict(
        my_hyperparameter=...,
        my_hyperparameter_2=...
    ),
    # the number of repeat runs of each experiment
    "runs": ...
}
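To get a feel for how `grid` and `runs` determine the amount of work, the sketch below expands a grid by hand. It assumes evenly spaced values between each hyperparameter's lower and upper bound and one experiment per combination; the library may space or combine values differently. `linspace`, `expand_grid`, and the hyperparameter names are hypothetical.

```python
import itertools


def linspace(lower, upper, steps):
    """Return `steps` evenly spaced values from lower to upper, inclusive."""
    if steps == 1:
        return [lower]
    width = (upper - lower) / (steps - 1)
    return [lower + i * width for i in range(steps)]


def expand_grid(bounds, grid):
    """Build one configuration per combination of grid values."""
    names = list(grid)
    axes = [linspace(*bounds[name], grid[name]) for name in names]
    return [dict(zip(names, point)) for point in itertools.product(*axes)]


bounds = {"learning_rate": (0.0, 1.0), "dropout": (0.0, 0.5)}  # (lower, upper)
grid = {"learning_rate": 3, "dropout": 2}  # number of values per hyperparameter
runs = 4  # repeat runs of each experiment

experiments = expand_grid(bounds, grid)
total_runs = len(experiments) * runs
```

Under these assumptions the number of experiments is the product of the grid sizes (here 3 × 2 = 6), and the total number of pipeline runs is that product multiplied by `runs` (here 24), so grid sizes are worth keeping small.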
- Run the experiments.
from stdnn.experiments.experiment import ExperimentManager, ExperimentConfigManager
# Pass config dictionaries to manager
config = ExperimentConfigManager(
    pipeline_config,
    experiment_config
)
experiment_manager = ExperimentManager(config)
# Run experiment
raw_results = experiment_manager.run_experiments()
- Organise and plot the results.
from stdnn.reporting.plotter import Plotter
results = {
    label: result.aggregate(...).get_dataframes()
    for label, result in raw_results.get_results().items()
}
Plotter.plot_lines(..., dataframes_dict=results, ...)