This repository has been archived by the owner on Nov 3, 2023. It is now read-only.

Deterministic mode is not set on remote worker #217

Open
wants to merge 2 commits into base: main

Conversation

MarkusSpanring
Contributor

Fix related to #213

This PR should also fix an unreachable code segment introduced in my previous PR #208

@@ -316,18 +320,20 @@ def _collect_rank_zero_results(self, trainer: "pl.Trainer",
This function is run on the worker process.
"""
rank_zero_debug("Finalizing the Ray launcher environment.")
if trainer.strategy.global_rank != 0:
Contributor Author

Is it safe to use trainer.strategy instead of self._strategy?

# Set operations to deterministic in this worker when required
if trainer._accelerator_connector.deterministic:
trainer._accelerator_connector._init_deterministic(True)

results = function(*args, **kwargs)

if trainer is not None:
Contributor Author

Why is this check needed? Is there a case when trainer can be None?

results = self._collect_rank_zero_results(trainer, results)

if results is None:
trainer._teardown()
Contributor Author

Why do we need to tear down the trainer only when local_rank or global_rank is != 0?
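
For reference, the worker-side setup that the deterministic block in the diff enables roughly corresponds to the plain-PyTorch flags below. This is only a sketch of what deterministic mode is generally understood to configure; the actual internals of _accelerator_connector._init_deterministic may differ between Lightning versions.

```python
import os

import torch


def enable_deterministic_mode() -> None:
    """Sketch of the worker-side setup that deterministic mode implies.

    Not the Lightning implementation; shown only to clarify what the
    _init_deterministic(True) call in the diff is expected to achieve.
    """
    # Use deterministic implementations of PyTorch operations where available.
    torch.use_deterministic_algorithms(True)
    # Disable cuDNN autotuning and request deterministic cuDNN kernels.
    torch.backends.cudnn.benchmark = False
    torch.backends.cudnn.deterministic = True
    # Some CUDA ops (e.g. cuBLAS matmuls) additionally require this workspace
    # setting when deterministic algorithms are enabled.
    os.environ.setdefault("CUBLAS_WORKSPACE_CONFIG", ":4096:8")
```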

@MarkusSpanring MarkusSpanring marked this pull request as ready for review September 23, 2022 10:03
@MarkusSpanring
Contributor Author

@JiahaoYao if you have time, could you check if _init_deterministic(True) is sufficient to replicate Trainer(deterministic=True) on all workers?
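
One way to sanity-check this would be to query the relevant flags from inside each worker process and compare them with what the driver expects. A minimal sketch, assuming an initialized Ray cluster; report_deterministic_flags is a hypothetical helper for illustration, not part of ray_lightning:

```python
import ray
import torch

ray.init()  # assumes a local or already-configured Ray cluster


@ray.remote
def report_deterministic_flags() -> dict:
    """Return the deterministic-related settings as seen by one Ray worker."""
    return {
        "deterministic_algorithms": torch.are_deterministic_algorithms_enabled(),
        "cudnn_deterministic": torch.backends.cudnn.deterministic,
        "cudnn_benchmark": torch.backends.cudnn.benchmark,
    }


# Compare what a few workers report against what the driver expects.
print(ray.get([report_deterministic_flags.remote() for _ in range(4)]))
```

Note that a bare Ray task like this only shows the default state of a fresh worker process; to actually confirm the fix, the same checks would need to run inside the function executed by the launcher, after _init_deterministic(True) has been called.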
