Commit 43cf130

ItsMrLin authored and meta-codesync[bot] committed
Skip outcomes with no observations in TorchAdapter dataset construction (#5077)
Summary:
Pull Request resolved: #5077

When tracking metrics have data in only a subset of trials, `_convert_experiment_data` can produce empty `SupervisedDataset`s (0 rows) after NaN filtering. This causes `Standardize` in BoTorch to crash with `ValueError: Can't standardize with no observations`. Such a failure should not block generation when the empty metric is only a tracking metric (not in the optimization config).

The fix skips such outcomes with a warning instead of producing empty datasets that crash model fitting.

Also fixes a latent `IndexError: list index out of range` in the `candidate_metadata` reordering logic: when outcomes are skipped, the metadata list is shorter than the outcomes list, so positional indexing is misaligned. Changed `candidate_metadata` from a positional list to a dict keyed by outcome name.

Meta: this unblocks Ax experiment `ifu_rbvm_session_proxy_pts`

Reviewed By: Balandat

Differential Revision: D97369642

fbshipit-source-id: 105942828b7bde6fc105eefc73ef82032ab94f59
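The skip-with-warning behavior described above can be sketched roughly as follows. This is an illustration only, not the Ax implementation: `build_datasets` and the plain-dict "datasets" are hypothetical stand-ins for the loop in `_convert_experiment_data` and for `SupervisedDataset`.

```python
import math
import warnings


def build_datasets(observations: dict[str, list[float]]) -> dict[str, list[float]]:
    """Hypothetical stand-in for the per-outcome dataset construction loop.

    Returns a mapping of outcome name -> non-NaN observations and skips
    any outcome that has nothing left after NaN filtering.
    """
    datasets: dict[str, list[float]] = {}
    for outcome, values in observations.items():
        kept = [v for v in values if not math.isnan(v)]
        if not kept:
            # Warn and skip rather than emit an empty dataset that would
            # later fail when the model tries to standardize it.
            warnings.warn(
                f"Skipping outcome {outcome!r}: no observations.", stacklevel=2
            )
            continue
        datasets[outcome] = kept
    return datasets


observations = {
    "objective": [0.1, 0.4, float("nan")],
    "tracking_metric": [float("nan"), float("nan"), float("nan")],
}
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    datasets = build_datasets(observations)
```

Here only `objective` survives, and a single warning is recorded for `tracking_metric`.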
1 parent 446ddb8 commit 43cf130

1 file changed

Lines changed: 11 additions & 8 deletions

ax/adapter/torch.py

@@ -447,7 +447,7 @@ def _convert_experiment_data(
         ).to_numpy()
         metadata = mean_and_params["metadata"]
         datasets: list[SupervisedDataset] = []
-        candidate_metadata = []
+        candidate_metadata: dict[str, list[dict[str, Any] | None]] = {}
         for outcome in outcomes:
             outcome_col_name = (
                 outcome + "_metric" if outcome in duplicated_names else outcome
@@ -513,7 +513,7 @@ def _convert_experiment_data(
                     group_indices=group_indices,
                 )
             datasets.append(dataset)
-            candidate_metadata.append(metadata.loc[to_keep].to_list())
+            candidate_metadata[outcome] = metadata.loc[to_keep].to_list()
 
         # If the search space digest specifies a task feature,
         # convert the datasets into MultiTaskDataset.
@@ -542,6 +542,10 @@ def _convert_experiment_data(
                 )
                 for dataset in datasets
             ]
+        # Build the list of outcomes actually present in datasets (some may
+        # have been skipped above due to all-NaN observations).
+        included_outcomes = [name for d in datasets for name in d.outcome_names]
+
         # Check if there is a `parameter_decomposition` experiment property to
         # decide whether it is a contextual experiment.
         if self._experiment_properties.get("parameter_decomposition", None) is not None:
@@ -551,7 +555,7 @@ def _convert_experiment_data(
                 list[SupervisedDataset],
                 process_contextual_datasets(
                     datasets=datasets,
-                    outcomes=outcomes,
+                    outcomes=included_outcomes,
                     parameter_decomposition=self._experiment_properties[
                         "parameter_decomposition"
                     ],
@@ -561,15 +565,14 @@ def _convert_experiment_data(
                 ),
             )
 
-        # Get the order of outcomes
-        ordered_outcomes = []
-        for d in datasets:
-            ordered_outcomes.extend(d.outcome_names)
+        # Get the order of outcomes (may differ from included_outcomes
+        # after contextual/multi-task dataset transformations).
+        ordered_outcomes = [name for d in datasets for name in d.outcome_names]
         # Re-order candidate metadata
         if not metadata.isnull().all():
             ordered_metadata = []
             for outcome in ordered_outcomes:
-                ordered_metadata.append(candidate_metadata[outcomes.index(outcome)])
+                ordered_metadata.append(candidate_metadata[outcome])
         else:
             ordered_metadata = None

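To see why the last hunk switches from positional indexing to a name-keyed lookup, the following toy reproduction may help. The outcome names and metadata values are made up for illustration; only the indexing pattern mirrors the diff.

```python
# All requested outcomes; assume "m2" was skipped during dataset construction.
outcomes = ["m1", "m2", "m3"]
# Order of outcomes in the final datasets (may be permuted downstream).
ordered_outcomes = ["m3", "m1"]

# Old approach: one list entry per *kept* outcome, looked up by the
# outcome's position in the full `outcomes` list.
positional = [["meta_m1"], ["meta_m3"]]
try:
    old_result = [positional[outcomes.index(name)] for name in ordered_outcomes]
except IndexError:
    # outcomes.index("m3") is 2, but `positional` only has 2 entries.
    old_result = None

# New approach: metadata keyed by outcome name, so skipped outcomes
# cannot shift the lookup.
by_name = {"m1": ["meta_m1"], "m3": ["meta_m3"]}
new_result = [by_name[name] for name in ordered_outcomes]
```

With the positional list the lookup raises `IndexError`, while the dict lookup returns the metadata in the dataset order.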