
Allow restraints w.r.t. entity name-based subchains; fix parsing with empty comments #379

Merged
wukevin merged 8 commits into main from kevin/fix-restraints on Jun 3, 2025


@wukevin wukevin commented Jun 2, 2025

Description

Allow restraints to be specified w.r.t. fasta-based chain naming.

Motivation

As in #378, this makes chain naming more consistent and predictable, which streamlines post hoc analysis.

Test plan

Added tests to ensure that FASTA and restraint parsing are compatible only in the expected combinations: if their chain naming is mismatched, the inputs should not load; if it matches, they should load.
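
The compatibility rule these tests exercise can be sketched roughly as follows. This is an illustration only: `restraints_match_chains` and its argument shapes are hypothetical and are not part of chai_lab. The assumption is that a restraint set should load only when every chain it names is among the chain names derived from the FASTA.

```python
def restraints_match_chains(
    chain_names: list[str],
    restraint_chain_pairs: list[tuple[str, str]],
) -> bool:
    """Return True if every chain referenced by a restraint is a known chain name.

    Hypothetical helper for illustration; not the chai_lab implementation.
    """
    known = set(chain_names)
    return all(a in known and b in known for a, b in restraint_chain_pairs)


# Chains named after FASTA entities, restraints written against the same names.
assert restraints_match_chains(["heavy", "light"], [("heavy", "light")])
# Restraints written against letter-style names do not match entity-named chains.
assert not restraints_match_chains(["heavy", "light"], [("A", "B")])
```

In this sketch a mismatch simply returns False; a real loader would more likely raise an error so that the bad input is rejected outright.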

@wukevin wukevin changed the title from "Handle case when no comments are provided" to "Allow restraints w.r.t. entity name-based subchains; fix parsing with empty comments" Jun 3, 2025
@wukevin wukevin marked this pull request as ready for review June 3, 2025 00:13
@wukevin wukevin requested review from arogozhnikov and Copilot June 3, 2025 00:14

Copilot AI left a comment


Pull Request Overview

This PR enables using entity names as subchain identifiers for restraints and fixes handling of empty comment fields during restraint parsing.

  • Introduces entity_name_as_subchain flag throughout raw input loading, chain creation, and feature context pipelines
  • Updates _parse_row to default empty comments to "" instead of NaN
  • Adds tests covering manual chain naming and subchain ID assignment
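
A minimal sketch of what a flag like entity_name_as_subchain might control is shown below. Everything here is an assumption made for illustration: `assign_subchain_ids` is a hypothetical helper, and the positional-letter default (A, B, C, ...) is assumed rather than taken from the chai_lab source.

```python
from string import ascii_uppercase


def assign_subchain_ids(
    entity_names: list[str],
    entity_name_as_subchain: bool = False,
) -> list[str]:
    """Pick one subchain ID per input entity (illustrative only)."""
    if entity_name_as_subchain:
        # Use the name from each entity's FASTA header directly.
        return list(entity_names)
    # Assumed default: positional letters A, B, C, ... (at most 26 chains here).
    if len(entity_names) > len(ascii_uppercase):
        raise ValueError("this sketch only supports up to 26 chains")
    return list(ascii_uppercase[: len(entity_names)])


print(assign_subchain_ids(["heavy", "light"]))
print(assign_subchain_ids(["heavy", "light"], entity_name_as_subchain=True))
```

With the flag enabled, restraints can refer to chains by the same names that appear in the FASTA input instead of by position-dependent identifiers.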

Reviewed Changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| tests/test_restraints.py | Expanded imports and added test_restraints_with_manual_chain_names to verify restraint loading |
| tests/test_inference_dataset.py | Added test_entity_names_as_subchain to check subchain IDs with/without entity_name_as_subchain |
| chai_lab/data/parsing/restraints.py | Changed parsing in _parse_row to map NaN comments to empty strings |
| chai_lab/data/dataset/inference_dataset.py | Added entity_name_as_subchain parameter to raw_inputs_to_entitites_data and load_chains_from_raw |
| chai_lab/chai1.py | Propagated entity_name_as_subchain flag into make_all_atom_feature_context and chain loading |

Comments suppressed due to low confidence (4)

chai_lab/data/dataset/inference_dataset.py:94

  • Update the docstring to document the new entity_name_as_subchain parameter and its effect on source_pdb_chain_id and subchain_id.
def raw_inputs_to_entitites_data(

chai_lab/data/dataset/inference_dataset.py:180

  • Extend the docstring to explain the entity_name_as_subchain flag and how it alters chain IDs when loading raw inputs.
def load_chains_from_raw(

chai_lab/chai1.py:339

  • Add documentation for the new entity_name_as_subchain argument in the function signature, describing its behavior.
def make_all_atom_feature_context(

tests/test_restraints.py:44

  • Add a unit test to verify that parse_pairwise_table correctly sets comment to an empty string when the CSV comment cell is NaN.
def test_restraints_with_manual_chain_names(entity_name_as_subchain: bool):
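
The unit test suggested for the NaN comment case might look something like the sketch below. `comment_or_empty` is a hypothetical stand-in for the comment handling in _parse_row, not the actual chai_lab code. It assumes an empty CSV cell arrives as a float NaN, which is how pandas represents missing values by default when reading a CSV.

```python
import math


def comment_or_empty(value: object) -> str:
    """Map a missing comment (None or float NaN) to an empty string.

    Hypothetical helper for illustration; not the chai_lab implementation.
    """
    if value is None:
        return ""
    if isinstance(value, float) and math.isnan(value):
        return ""
    return str(value)


def test_empty_comment_becomes_empty_string() -> None:
    # Missing values are normalized to "".
    assert comment_or_empty(float("nan")) == ""
    assert comment_or_empty(None) == ""
    # Real comments, including an already-empty string, pass through unchanged.
    assert comment_or_empty("") == ""
    assert comment_or_empty("salt bridge") == "salt bridge"


test_empty_comment_becomes_empty_string()
```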

@wukevin wukevin merged commit ac9c1f3 into main Jun 3, 2025
4 checks passed
@wukevin wukevin deleted the kevin/fix-restraints branch June 3, 2025 00:57