Hard-fail config validation when bottom-up pipeline is selected with zero / disconnected skeleton edges

# sleap-nn issue draft — Validation for bottom-up + zero/disconnected edges

**Target repo:** `talmolab/sleap-nn`
**Suggested labels:** `enhancement`
**Cross-ref:** Migrated from [talmolab/sleap#1247](https://github.com/talmolab/sleap/issues/1247)

---

## Title

`Hard-fail config validation when bottom-up pipeline is selected with zero / disconnected skeleton edges`

## Body

### Problem

The `bottomup` and `multi_class_bottomup` pipelines compute Part Affinity Fields (PAFs) over skeleton **edges**. When the user's skeleton has no usable edges, training fails with a cryptic error or, worse, silently constructs a model with a zero-channel PAF head. Three sub-cases should all be caught up front:

1. **Single-node skeleton** (`len(skeleton.nodes) == 1`, therefore `len(edges) == 0`).
2. **Multi-node skeleton with zero edges declared** (`len(edges) == 0`).
3. **Multi-node skeleton with disconnected components** (the edge graph has > 1 connected component, so PAF-based grouping cannot assemble a full instance).

Originally reported in [sleap#1247](https://github.com/talmolab/sleap/issues/1247) for the legacy TF backend; the same structural gap exists in sleap-nn.

### Where it crashes today

If a user manually selects `bottomup` with a zero-edge skeleton, the failure points are:

- `sleap_nn/data/custom_datasets.py:838` — `self.edge_inds = labels[0].skeletons[0].edge_inds` returns an empty tensor.
- `sleap_nn/data/edge_maps.py:242` — `source_inds = edge_inds[:, 0].to(torch.int32)` fails on a rank-1 empty tensor with a low-level shape error.
- `sleap_nn/architectures/heads.py:333` — `PartAffinityFieldsHead.channels` returns `0`, producing a degenerate head.

The disconnected-graph case (sub-case 3) currently has **no validation at all**: training would run to completion and produce a low-quality model without any warning or error.

### Existing precedent (steering, not validation)

The config recommender already detects sub-case 2 and steers users toward `centroid`:

```python
# sleap_nn/config_generator/recommender.py:140-153
if stats.num_edges == 0:
    warnings.append("No edges in skeleton - bottom-up requires edges for PAFs")
    ...
    return PipelineRecommendation(
        recommended="centroid",
        reason="No skeleton edges available for bottom-up",
        ...
    )
```

But the recommender only fires from the TUI / config-generation flow. A user who hand-writes a training config, or whose frontend bypasses the recommender, still hits the crash.

### Proposed change

Add a semantic check to `verify_training_cfg()` (currently at `sleap_nn/config/training_job_config.py:114-125`), which today validates only the schema and required fields:

```python
def verify_training_cfg(cfg: DictConfig) -> DictConfig:
    ...  # existing body elided; `config` is assumed to be bound here
    check_must_be_set(config)
    check_pipeline_skeleton_compatibility(config)   # ← new
    return config
```

`check_pipeline_skeleton_compatibility` should:

- If `head_configs.bottomup` (or `multi_class_bottomup`) is set, inspect the skeleton edges from `data_config.skeletons` (or the loaded labels).
- **Sub-case 1 (single node):** Raise with a message like:
  > `Bottom-up training requires a multi-node skeleton with edges. Your skeleton has 1 node — use 'single_instance' (single animal per frame) or 'centroid' (top-down for multiple instances of a single landmark) instead.`
- **Sub-case 2 (zero edges, multi-node):** Raise with a message recommending either adding edges to the skeleton or switching to `centroid` (top-down).
- **Sub-case 3 (disconnected components):** Raise with a message identifying the disconnected nodes and recommending either adding bridge edges or switching to top-down.
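
The three branches above can be sketched as a pure function. This is a minimal illustration, not the sleap-nn implementation: `check_skeleton_for_bottomup` and its signature are hypothetical, the pipeline is passed as a plain string rather than read from `head_configs`, and edges are plain `(source, destination)` index pairs rather than `skeleton.edge_inds`. Connectivity is tested here with a simple traversal from node 0, used only as a stand-in to keep the example short.

```python
from typing import Sequence, Tuple

BOTTOMUP_PIPELINES = ("bottomup", "multi_class_bottomup")


def check_skeleton_for_bottomup(
    pipeline: str, n_nodes: int, edges: Sequence[Tuple[int, int]]
) -> None:
    """Raise ValueError if a bottom-up pipeline gets an unusable skeleton.

    Hypothetical helper: the name and arguments are assumptions made for
    this sketch and do not exist in sleap-nn.
    """
    if pipeline not in BOTTOMUP_PIPELINES:
        return  # This sketch only constrains the bottom-up pipelines.

    # Sub-case 1: a single node can never have a usable edge.
    if n_nodes == 1:
        raise ValueError(
            "Bottom-up training requires a multi-node skeleton with edges. "
            "Your skeleton has 1 node; use 'single_instance' or 'centroid' instead."
        )

    # Sub-case 2: several nodes, but no edges declared.
    if len(edges) == 0:
        raise ValueError(
            f"Bottom-up training requires skeleton edges, but this skeleton has "
            f"{n_nodes} nodes and no edges. Add edges to the skeleton or switch "
            f"to 'centroid' (top-down)."
        )

    # Sub-case 3: collect every node reachable from node 0 (undirected).
    neighbors = {node: set() for node in range(n_nodes)}
    for src, dst in edges:
        neighbors[src].add(dst)
        neighbors[dst].add(src)
    seen, stack = {0}, [0]
    while stack:
        for nxt in neighbors[stack.pop()]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    unreachable = sorted(set(range(n_nodes)) - seen)
    if unreachable:
        raise ValueError(
            f"Skeleton edge graph is disconnected: nodes {unreachable} cannot be "
            f"reached from node 0. Add bridge edges or switch to 'centroid' "
            f"(top-down)."
        )


# Happy path: a connected 3-node chain passes silently.
check_skeleton_for_bottomup("bottomup", 3, [(0, 1), (1, 2)])
```

Keeping the check free of config and tensor types makes it easy to unit-test; the real `check_pipeline_skeleton_compatibility(config)` would presumably extract the pipeline and edge list from the config before delegating to logic like this.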

Implementation note for sub-case 3: a simple union-find over `skeleton.edge_inds` is sufficient; no external graph library needed.
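
A minimal sketch of the union-find approach, assuming edges are available as plain `(source, destination)` integer pairs (for example, after converting `skeleton.edge_inds` to a list). The function name `connected_components` is hypothetical and chosen for illustration only.

```python
from typing import Dict, List, Sequence, Tuple


def connected_components(
    n_nodes: int, edges: Sequence[Tuple[int, int]]
) -> List[List[int]]:
    """Group node indices into connected components using union-find."""
    parent = list(range(n_nodes))

    def find(i: int) -> int:
        # Walk up to the root, halving the path along the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Union the two endpoints of every edge.
    for src, dst in edges:
        root_src, root_dst = find(src), find(dst)
        if root_src != root_dst:
            parent[root_dst] = root_src

    # Bucket nodes by their root.
    groups: Dict[int, List[int]] = {}
    for node in range(n_nodes):
        groups.setdefault(find(node), []).append(node)

    # Largest component first; ties broken by smallest node index.
    return sorted(groups.values(), key=lambda group: (-len(group), group[0]))


# A 5-node skeleton where nodes 3 and 4 are joined to each other
# but not to the 0-1-2 chain.
print(connected_components(5, [(0, 1), (1, 2), (3, 4)]))
# → [[0, 1, 2], [3, 4]]
```

A result with more than one component indicates sub-case 3, and every component after the first gives the node indices to name in the error message.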

### Where the check should live

Putting it in `verify_training_cfg` ensures it fires before `ModelTrainer.get_model_trainer_from_config()` (`sleap_nn/training/model_trainer.py:118`) instantiates the data pipeline (the bottom-up dispatch is at `sleap_nn/data/custom_datasets.py:2135+`), so the user gets a clear, actionable error instead of a deep stack trace.

### Acceptance criteria

- [ ] `verify_training_cfg` raises a clear, actionable `ValueError` (or dedicated config-validation exception) for each of the three sub-cases.
- [ ] Error messages name a specific recommended alternative pipeline.
- [ ] Unit tests cover all three sub-cases plus the happy path (multi-node, connected skeleton).
- [ ] Recommender (`config_generator/recommender.py`) is updated to also catch sub-case 3 (disconnected components), for parity.

### Related

- [talmolab/sleap#1247](https://github.com/talmolab/sleap/issues/1247) — original 2023 enhancement request (legacy TF backend; being closed in favor of this issue).
- A companion issue on the frontend/orchestration side will track surfacing this in the model-config UX so users see the warning before submitting an invalid config.


Hard-fail config validation when bottom-up pipeline is selected with zero / disconnected skeleton edges #567
