This repository has been archived by the owner on Nov 3, 2023. It is now read-only.

Ray lightning opens a new mlflow run #225

Open
@AugustoPeres

Description

I have a training script that uses Ray, PyTorch Lightning, and MLflow.
When I train with ray_lightning, it seems to open another MLflow run:

First in my script I have the code:

import mlflow

def _log_parameters(**kwargs):
    # Log every keyword argument as a parameter of the active MLflow run.
    for key, value in kwargs.items():
        mlflow.log_param(str(key), value)

def main():
    mlflow.start_run()
    _log_parameters(
        dim_model=FLAGS.dim_model,
        learning_rate=FLAGS.learning_rate,
        # ... some other parameters coming from flags
    )

I then move on to training with ray:

    ray.init(address='auto')
    plugin = RayStrategy(num_workers=FLAGS.num_workers,
                         num_cpus_per_worker=FLAGS.num_cpus_per_worker,
                         use_gpu=FLAGS.use_gpu)
    trainer = pl.Trainer(max_epochs=FLAGS.max_epochs,
                         strategy=plugin,
                         logger=False,
                         callbacks=all_callbacks,
                         precision=int(FLAGS.precision))
    trainer.fit(model, training_data_loader, validation_data_loader)

The problem is that all parameters logged with _log_parameters appear in one run, while all the metrics logged by the callbacks appear in a different run.

If I train without Ray, everything works as expected. I do not understand why Ray is opening another run. Is there a way to prevent this?
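One likely explanation, offered as an assumption and not confirmed in this thread: MLflow's active run is state local to the process that called mlflow.start_run(), and Ray workers are separate processes. A callback that calls mlflow.log_metric inside a worker finds no active run there, and the fluent API then creates a new one. Below is a minimal sketch of one possible workaround, which logs against an explicit run ID through MlflowClient instead of relying on the active run. The class name RunBoundMetricLogger and its client argument are hypothetical and are not part of ray_lightning or MLflow.

```python
class RunBoundMetricLogger:
    """Hypothetical helper: logs metrics to one fixed MLflow run.

    It never touches the process-local "active run", so it should behave
    the same in the driver process and in a Ray worker process.
    """

    def __init__(self, run_id, tracking_uri, client=None):
        self.run_id = run_id              # ID of the run opened in the driver
        self.tracking_uri = tracking_uri  # must be reachable from the workers
        self._client = client             # optional injected client (for tests)

    def _get_client(self):
        # Create the client lazily, so that it is built inside whichever
        # process actually logs (for example, a Ray worker).
        if self._client is None:
            from mlflow.tracking import MlflowClient
            self._client = MlflowClient(tracking_uri=self.tracking_uri)
        return self._client

    def log_metrics(self, metrics, step):
        client = self._get_client()
        for key, value in metrics.items():
            # float() also converts scalar tensors to plain Python numbers.
            client.log_metric(self.run_id, key, float(value), step=step)
```

In the script above, this would mean keeping the result of mlflow.start_run() (for example run = mlflow.start_run()), building the helper with run.info.run_id and mlflow.get_tracking_uri(), and calling log_metrics(trainer.callback_metrics, trainer.global_step) from a callback hook such as on_validation_end. This assumes the tracking URI points to a store that every worker can reach. A local file store on the driver machine would not be visible to workers on other nodes.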
