Skip outcomes with no observations in TorchAdapter dataset construction (#5077)

ItsMrLin · meta-codesync[bot] · commit 43cf13010d97 · 2026-03-19T18:18:42.000-07:00
Summary:
Pull Request resolved: #5077

When tracking metrics have data in only a subset of trials, `_convert_experiment_data` can produce empty `SupervisedDataset`s (0 rows) after NaN filtering. This causes `Standardize` in BoTorch to crash with `ValueError: Can't standardize with no observations`. This is not generation-blocking when the empty metric is only a tracking metric (not in the optimization config).

The fix skips such outcomes with a warning instead of producing empty datasets that crash model fitting.

Also fixes a latent `IndexError: list index out of range` in the `candidate_metadata` reordering logic: when outcomes are skipped, the metadata list was shorter than the outcomes list, causing misaligned indexing. Changed `candidate_metadata` from a positional list to a dict keyed by outcome name.

Meta: this unblocks Ax experiment `ifu_rbvm_session_proxy_pts`

Reviewed By: Balandat

Differential Revision: D97369642

fbshipit-source-id: 105942828b7bde6fc105eefc73ef82032ab94f59
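
The skip-with-warning behavior described in the summary can be illustrated with a minimal sketch. This is not the code in `ax/adapter/torch.py`; `build_datasets` and its plain-list "datasets" are hypothetical stand-ins, assuming only that an outcome whose values are all NaN should be dropped with a warning rather than passed on as an empty dataset.

```python
import math
import warnings


def build_datasets(observations: dict[str, list[float]]) -> dict[str, list[float]]:
    """Hypothetical stand-in for per-outcome dataset construction.

    Drops NaN values for each outcome and skips (with a warning) any
    outcome that has no observations left after filtering.
    """
    datasets: dict[str, list[float]] = {}
    for outcome, values in observations.items():
        kept = [v for v in values if not math.isnan(v)]
        if not kept:
            # An empty dataset would fail later during model fitting,
            # so skip the outcome instead of emitting it.
            warnings.warn(
                f"Skipping outcome {outcome!r}: no observations after NaN filtering.",
                stacklevel=2,
            )
            continue
        datasets[outcome] = kept
    return datasets
```

In this sketch, a call such as `build_datasets({"objective": [1.0, float("nan")], "tracking": [float("nan")]})` keeps only `"objective"` and emits one warning for `"tracking"`.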
diff --git a/ax/adapter/torch.py b/ax/adapter/torch.py
@@ -447,7 +447,7 @@ def _convert_experiment_data(
         ).to_numpy()
         metadata = mean_and_params["metadata"]
         datasets: list[SupervisedDataset] = []
-        candidate_metadata = []
+        candidate_metadata: dict[str, list[dict[str, Any] | None]] = {}
         for outcome in outcomes:
             outcome_col_name = (
                 outcome + "_metric" if outcome in duplicated_names else outcome
@@ -513,7 +513,7 @@ def _convert_experiment_data(
                     group_indices=group_indices,
                 )
             datasets.append(dataset)
-            candidate_metadata.append(metadata.loc[to_keep].to_list())
+            candidate_metadata[outcome] = metadata.loc[to_keep].to_list()
 
         # If the search space digest specifies a task feature,
         # convert the datasets into MultiTaskDataset.
@@ -542,6 +542,10 @@ def _convert_experiment_data(
                 )
                 for dataset in datasets
             ]
+        # Build the list of outcomes actually present in datasets (some may
+        # have been skipped above due to all-NaN observations).
+        included_outcomes = [name for d in datasets for name in d.outcome_names]
+
         # Check if there is a `parameter_decomposition` experiment property to
         # decide whether it is a contextual experiment.
         if self._experiment_properties.get("parameter_decomposition", None) is not None:
@@ -551,7 +555,7 @@ def _convert_experiment_data(
                 list[SupervisedDataset],
                 process_contextual_datasets(
                     datasets=datasets,
-                    outcomes=outcomes,
+                    outcomes=included_outcomes,
                     parameter_decomposition=self._experiment_properties[
                         "parameter_decomposition"
                     ],
@@ -561,15 +565,14 @@ def _convert_experiment_data(
                 ),
             )
 
-        # Get the order of outcomes
-        ordered_outcomes = []
-        for d in datasets:
-            ordered_outcomes.extend(d.outcome_names)
+        # Get the order of outcomes (may differ from included_outcomes
+        # after contextual/multi-task dataset transformations).
+        ordered_outcomes = [name for d in datasets for name in d.outcome_names]
         # Re-order candidate metadata
         if not metadata.isnull().all():
             ordered_metadata = []
             for outcome in ordered_outcomes:
-                ordered_metadata.append(candidate_metadata[outcomes.index(outcome)])
+                ordered_metadata.append(candidate_metadata[outcome])
         else:
             ordered_metadata = None
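
For reviewers, the following standalone sketch shows why positional indexing breaks once an outcome is skipped, and why keying by outcome name avoids the problem. The helper names (`reorder_by_position`, `reorder_by_name`) and the sample data are hypothetical and only mirror the shape of the reordering logic in this diff.

```python
def reorder_by_position(outcomes, ordered_outcomes, metadata_list):
    # Old approach: assumes metadata_list has one entry per item in `outcomes`.
    return [metadata_list[outcomes.index(o)] for o in ordered_outcomes]


def reorder_by_name(ordered_outcomes, metadata_by_name):
    # New approach: look up each outcome's metadata directly by name.
    return [metadata_by_name[o] for o in ordered_outcomes]


# Three outcomes were requested, but "m2" was skipped (no observations).
outcomes = ["m1", "m2", "m3"]
ordered_outcomes = ["m3", "m1"]

# Positional list only has entries for the outcomes that were kept.
metadata_list = [[{"trial": 0}], [{"trial": 1}]]  # for "m1" and "m3"
metadata_by_name = {"m1": [{"trial": 0}], "m3": [{"trial": 1}]}

try:
    reorder_by_position(outcomes, ordered_outcomes, metadata_list)
    positional_failed = False
except IndexError:
    # outcomes.index("m3") is 2, but metadata_list has only 2 entries.
    positional_failed = True

by_name = reorder_by_name(ordered_outcomes, metadata_by_name)
```

Note that the positional version can also fail silently: if only the first outcome is skipped, a later outcome's index may still be in range but point at another outcome's metadata.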