MoleculeLoader is not sklearn-CV-sliceable, breaking AptaNet + Benchmarking

LLM generated content, by Claude Opus 4.8

### Summary

`AptaNetPipeline` now consumes a `MoleculeLoader` only (via the MoleculeLoader-only `PairsToFeatures`), but `MoleculeLoader` is not sliceable, so it cannot be used with scikit-learn cross-validation / grid-search. This breaks `Benchmarking` with AptaNet estimators.

### Details

`Benchmarking.run()` calls `sklearn.model_selection.cross_validate(estimator, X, y, cv=...)`, which slices `X` per fold via `_safe_indexing`. Two problems:

- Passing a **list of pairs** (as `test_benchmarking.py` does) is sliceable, but the list is then rejected by `PairsToFeatures` (`TypeError: PairsToFeatures accepts only a MoleculeLoader as input, got list.`).
- Passing a **`MoleculeLoader`** is accepted by `PairsToFeatures`, but `MoleculeLoader` has no `__len__`/`__getitem__`, so `_safe_indexing` fails (`'MoleculeLoader' object is not subscriptable`).

Reproduce:

```python
import numpy as np
from sklearn.utils import _safe_indexing
from pyaptamer.data import MoleculeLoader

ml = MoleculeLoader(data={"aptamer": ["ACGU"] * 40, "protein": ["MK"] * 40})
_safe_indexing(ml, np.arange(10))  # TypeError: not subscriptable
```

### Proposed fix

Make `MoleculeLoader` sklearn-sliceable: add `__len__` (number of materialized samples) and `__getitem__` (integer / array / slice → a sub-`MoleculeLoader` over the selected rows), so it survives `cross_validate` and returns a loader each fold. Then migrate `test_benchmarking.py` to pass a `MoleculeLoader` and re-enable the skipped tests.
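A minimal sketch of the slicing protocol described above, using a hypothetical stand-in class (`SliceableLoaderSketch`, not the real `MoleculeLoader`) that stores columns as a plain dict of lists. The class name, the internal storage, and the choice to return a one-row loader for a single integer are all assumptions made for illustration. The `shape` property is also an assumption and worth verifying against the pinned scikit-learn version: as far as can be told from the scikit-learn source, `_safe_indexing` hands the whole index array to `__getitem__` only when the object exposes `.shape` or `.iloc`, and otherwise indexes element by element and returns a plain list, which `PairsToFeatures` would reject again.

```python
from operator import index as as_int


class SliceableLoaderSketch:
    """Hypothetical stand-in for MoleculeLoader; NOT the pyaptamer class."""

    def __init__(self, data):
        # data: dict mapping column name -> sequence of values (equal lengths)
        if len({len(col) for col in data.values()}) > 1:
            raise ValueError("all columns must have the same length")
        self.data = {name: list(col) for name, col in data.items()}

    def __len__(self):
        # Number of rows; 0 when there are no columns.
        return len(next(iter(self.data.values()), []))

    @property
    def shape(self):
        # Assumption: exposing a 1-D shape lets array-style indexing apply.
        return (len(self),)

    def __getitem__(self, key):
        if isinstance(key, slice):
            rows = list(range(len(self))[key])
        else:
            try:
                # A single integer (Python int or NumPy integer scalar).
                rows = [as_int(key)]
            except TypeError:
                # Any iterable of integers (list, range, integer array).
                # Boolean masks are not handled in this sketch.
                rows = [as_int(k) for k in key]
        # Always return a new loader over the selected rows.
        return SliceableLoaderSketch(
            {name: [col[i] for i in rows] for name, col in self.data.items()}
        )


loader = SliceableLoaderSketch({"aptamer": ["ACGU"] * 40, "protein": ["MK"] * 40})
fold = loader[range(10)]  # a sub-loader with 10 rows, not a list
```

Returning a loader even for a single integer keeps the return type uniform; the real implementation may prefer to return one sample in that case.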

### Affected tests (currently skipped)

- `pyaptamer/benchmarking/tests/test_benchmarking.py::test_benchmarking_with_predefined_split_classification`
- `pyaptamer/benchmarking/tests/test_benchmarking.py::test_benchmarking_with_predefined_split_regression`

Both tests are skipped in the PR that lands the pipeline migration so that `main` stays green; this issue tracks the real fix.

