Description
I would like to use HyperDrive to optimize a full pipeline, that is, to tune the hyperparameters of different steps jointly. I raised this at https://github.com/MicrosoftDocs/azure-docs/issues/77227 and was advised to open an issue here as well.
For example, I have a pipeline defined as:
[prepare_data]
|
v
[extract_features]
|
v
[train_model]
I can use HyperDrive to tune the hyperparameters of my ML model in the train_model step based on some metric, say validation loss. What I would like to do is tune the hyperparameters in the train_model step together with the hyperparameters in the pre-processing steps (e.g., extract_features). For example, I would like to find the sequence length in extract_features that gives the lowest loss in model training.
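To make the goal concrete, here is a minimal sketch in plain Python of what I mean by a joint search. It is not Azure ML code: the step functions, the parameter names (sequence_length, learning_rate), the toy loss, and the random-search loop are all hypothetical stand-ins for my real pipeline. The point is only that a single search space covers parameters belonging to different steps, and every trial runs the whole pipeline.

```python
import random

# Hypothetical stand-ins for the real pipeline steps; none of this is Azure ML API.
def extract_features(data, sequence_length):
    """Split the data into non-overlapping windows of `sequence_length` items."""
    return [data[i:i + sequence_length]
            for i in range(0, len(data) - sequence_length + 1, sequence_length)]

def train_model(features, learning_rate):
    """Return a toy 'validation loss' that depends on both steps' parameters."""
    window_penalty = abs(len(features[0]) - 8) / 8   # toy optimum: windows of length 8
    lr_penalty = abs(learning_rate - 0.01) * 10      # toy optimum: learning rate 0.01
    return window_penalty + lr_penalty

def run_pipeline(params, data):
    """Run every step with one sampled parameter set and return the final metric."""
    features = extract_features(data, params["sequence_length"])
    return train_model(features, params["learning_rate"])

# One search space covering parameters that belong to different steps.
search_space = {
    "sequence_length": [4, 8, 16],       # used by extract_features
    "learning_rate": [0.001, 0.01, 0.1], # used by train_model
}

def random_search(space, data, n_trials, seed=0):
    """Sample all parameters together and keep the set with the lowest loss."""
    rng = random.Random(seed)
    best_params, best_loss = None, float("inf")
    for _ in range(n_trials):
        params = {name: rng.choice(values) for name, values in space.items()}
        loss = run_pipeline(params, data)
        if loss < best_loss:
            best_params, best_loss = params, loss
    return best_params, best_loss

data = list(range(64))
best_params, best_loss = random_search(search_space, data, n_trials=20)
print(best_params, best_loss)
```

In the real setup, I would expect HyperDrive to play the role of random_search, with the pipeline parameters taking the place of the params dictionary.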
HyperDriveConfig does accept a pipeline argument, which seems to be exactly what I am looking for. Unfortunately, I cannot find much information on how to use this parameter.
I tried to submit a HyperDrive run as:
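from azureml.core import Experiment
from azureml.train.hyperdrive import HyperDriveConfig

# ws is my existing Workspace object; pipeline is described below.
hd_config = HyperDriveConfig(
    hyperparameter_sampling=...,
    policy=...,
    primary_metric_name=...,
    primary_metric_goal=...,
    max_total_runs=...,
    max_duration_minutes=...,
    max_concurrent_runs=...,
    pipeline=pipeline,
)
exp = Experiment(workspace=ws, name="test")
hd_run = exp.submit(hd_config)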
where pipeline is one of my published pipelines in the workspace; it accepts PipelineParameters to tune. However, I get the error:
Exception has occurred: AttributeError
'PublishedPipeline' object has no attribute 'graph'
How could I proceed?