…ker import for types
…20251024_eval_doc
- config.launch.launch(follow=False)
+ config.launch.launch(
+     follow=False, torchrun=True
+ )  # always run with torchrun so you can run distributed scripts optionally on single gpu
Bug: Optional Launch Config Causes Errors
The launch field in OlmoEarthExperimentConfig is now optional, but the launch and launch_prep functions don't account for this. They attempt to access attributes or call methods on config.launch directly, which results in an AttributeError if no launch configuration is provided.
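One possible way to handle the optional field is to fail fast with a clear message before touching config.launch. The sketch below is only illustrative: LaunchConfig and ExperimentConfig are hypothetical stand-ins for the real classes (including OlmoEarthExperimentConfig), and only the launch(follow=False, torchrun=True) call is taken from the diff. The fix actually adopted in the PR may look different.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LaunchConfig:
    """Hypothetical stand-in for the real launch configuration."""

    name: str

    def launch(self, follow: bool = False, torchrun: bool = False) -> str:
        # The real method would submit a job; here we only describe the call.
        return f"launched {self.name} (follow={follow}, torchrun={torchrun})"


@dataclass
class ExperimentConfig:
    """Hypothetical stand-in for OlmoEarthExperimentConfig."""

    launch: Optional[LaunchConfig] = None  # optional, as described in the comment


def launch(config: ExperimentConfig) -> str:
    # Guard against the optional field before dereferencing it, so the caller
    # sees an explicit error instead of an AttributeError on None.
    if config.launch is None:
        raise ValueError("launch requires a launch configuration, but config.launch is None")
    return config.launch.launch(follow=False, torchrun=True)
```

The same guard would apply to launch_prep, which the comment says accesses config.launch in the same way.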
  if is_running_in_beaker() and beaker_user is None:
      raise ValueError(
-         "Failed to get Beaker username. Make sure you are authenticated with Beaker."
+         "Failed to get Beaker username. Make sure you are authenticated with Beaker if you are not running on a local cluster."
Bug: Beaker Username Check Fails
The check for a missing Beaker username when running in Beaker is ineffective. beaker_user is assigned ANONYMOUS_USER whenever get_beaker_username() returns None, so it is never None by the time the check runs and the beaker_user is None condition is always false. As a result, the intended ValueError is never raised for unauthenticated Beaker users.
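A minimal sketch of one way to restore the check: validate the raw username first, and only then fall back to the anonymous user. The function resolve_beaker_user and its parameters are hypothetical, and the value of ANONYMOUS_USER is a placeholder; the real code calls get_beaker_username() and is_running_in_beaker() directly and may be structured differently.

```python
from typing import Optional

ANONYMOUS_USER = "anonymous"  # placeholder value; the real constant may differ


def resolve_beaker_user(username: Optional[str], running_in_beaker: bool) -> str:
    """Hypothetical helper: check the raw username before applying the fallback."""
    # The check must see the raw value. Once the fallback has been applied,
    # the username can no longer be None and the check would never fire.
    if running_in_beaker and username is None:
        raise ValueError(
            "Failed to get Beaker username. Make sure you are authenticated "
            "with Beaker if you are not running on a local cluster."
        )
    return username if username is not None else ANONYMOUS_USER
```

With this ordering, a missing username outside Beaker still resolves to the anonymous user, while a missing username inside Beaker raises.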
  num_workers=0,
  pooling_type=PoolingType.MEAN,
  norm_stats_from_pretrained=True,
  eval_interval=Duration.steps(2),
Bug: Frequent Evaluation Interval in Production
The eval_interval for "m-eurosat" is set to Duration.steps(2), which means evaluation runs every 2 steps. This is extremely frequent and is likely a debug value that was accidentally left in production code. Other tasks use 4000 or 20000 steps, so "m-eurosat" should probably use a comparable interval.
PR #408 has been merged in