Fix pandas 2.x ValueError in monomer SDF loading by anagnorisis2peripeteia · Pull Request #20 · Boehringer-Ingelheim/pyPept

anagnorisis2peripeteia · 2026-04-27T15:24:14Z

Summary

Fixes #18

Pandas 2.x introduced Arrow-backed storage for string columns. When
get_monomer_info() and _load_monomer_sdf() attempt to write a parsed
Python list into these columns via df.loc[idx, col] = [...], pandas 2.x
raises:

ValueError: Must have equal len keys and value when setting with an iterable

because it interprets the list as multiple row values rather than a single
list object for one cell.

Fix

Two changes, applied in both sequence.py and monomerlib.py:

1. Cast the three list-valued columns (m_Rgroups, m_RgroupIdx,
   m_attachmentPointIdx) to object dtype before writing into them.
   Only these three columns are cast; the rest of the DataFrame retains
   its pandas 2.x type optimisations.
2. Use df.at[idx, col] (single-cell assignment) instead of
   df.loc[idx, col] (which pandas 2.x misinterprets as a broadcast
   when given an iterable).
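
The two changes can be sketched in isolation as below. This is a minimal illustration, not the code from the diff: the DataFrame contents, the m_name column and the comma-separated cell format are assumptions made for the example; only the three list-valued column names come from this PR.

```python
import pandas as pd

# Hypothetical stand-in for the monomer table. The real table is built from
# the monomer SDF file; these values and the "m_name" column are made up.
df = pd.DataFrame({
    "m_name": ["Ala", "Cys"],
    "m_Rgroups": ["H,OH", "H,OH,H"],
    "m_RgroupIdx": ["1,2", "1,2,3"],
    "m_attachmentPointIdx": ["1,2", "1,2,3"],
})

LIST_COLUMNS = ["m_Rgroups", "m_RgroupIdx", "m_attachmentPointIdx"]

# Change 1: cast only the list-valued columns to object dtype so that a cell
# can hold an arbitrary Python object. Other columns keep their dtype.
df = df.astype({col: object for col in LIST_COLUMNS})

# Change 2: write each parsed list with .at, which always addresses exactly
# one cell, so the list is stored as a single value.
for idx in df.index:
    for col in LIST_COLUMNS:
        # Assumed cell format: comma-separated text. The real parsing
        # logic in pyPept may differ.
        df.at[idx, col] = df.at[idx, col].split(",")
```

After the loop, each cell in the three columns holds a Python list, for example df.at[0, "m_Rgroups"] is ["H", "OH"].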

Note on duplication

get_monomer_info() in sequence.py and _load_monomer_sdf() in
monomerlib.py implement the same SDF loading and list-parsing logic
independently. This fix is applied to both. A follow-up refactor could
consolidate them into a single shared loader to avoid this kind of
divergence in future — happy to open a separate PR for that if useful.
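
One possible shape for such a shared helper is sketched here. The function name parse_list_columns, its signature and the caller-supplied parse callable are hypothetical and are not part of this PR or of pyPept. The sketch assumes the DataFrame index has unique labels.

```python
from typing import Callable, Sequence

import pandas as pd

# Column names taken from this PR; everything else here is illustrative.
LIST_COLUMNS = ("m_Rgroups", "m_RgroupIdx", "m_attachmentPointIdx")


def parse_list_columns(
    df: pd.DataFrame,
    parse: Callable[[str], list],
    columns: Sequence[str] = LIST_COLUMNS,
) -> pd.DataFrame:
    """Return a new DataFrame whose list-valued columns hold parsed lists.

    Hypothetical helper: both loaders could call this instead of carrying
    their own copy of the cast-then-assign logic.
    """
    # astype returns a new DataFrame, so the caller's frame is left untouched.
    out = df.astype({col: object for col in columns})
    for col in columns:
        for idx in out.index:
            # .at writes one cell at a time, so the list is stored whole.
            out.at[idx, col] = parse(out.at[idx, col])
    return out
```

Both get_monomer_info() and _load_monomer_sdf() could then pass their own parsing function, which would keep the dtype cast and the single-cell assignment in one place.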

Pandas 2.x introduced Arrow-backed string columns that reject list
assignment via loc/at, raising:

ValueError: Must have equal len keys and value when setting with an iterable

Fix: cast the three list-valued columns (m_Rgroups, m_RgroupIdx,
m_attachmentPointIdx) to object dtype before writing parsed list values
into them, and use df.at[] (single-cell scalar assignment) instead of
df.loc[] (which pandas 2.x interprets as a multi-row broadcast when given
an iterable).

Applied in both get_monomer_info() (sequence.py) and _load_monomer_sdf()
(monomerlib.py), which duplicate the same loading pattern.

Closes Boehringer-Ingelheim#18

anagnorisis2peripeteia closed this May 1, 2026

anagnorisis2peripeteia deleted the fix/pandas2-monomer-loading branch May 1, 2026 13:53

anagnorisis2peripeteia restored the fix/pandas2-monomer-loading branch May 1, 2026 21:23

anagnorisis2peripeteia reopened this May 1, 2026

anagnorisis2peripeteia wants to merge 1 commit into Boehringer-Ingelheim:master from anagnorisis2peripeteia:fix/pandas2-monomer-loading
