[BUG] fix: dynamic normalization in aa_props and correct prop_indices interaction #614

Open
DZDasherKTB wants to merge 1 commit into gc-os-ai:main from DZDasherKTB:fix/logic-aa-props-dynamic-normalization

DZDasherKTB commented May 1, 2026

Reference Issues/PRs

Fixes #613 : [LOGIC] aa_props(normalize=True) uses stale hardcoded matrix and interacts incorrectly with prop_indices

What does this implement/fix? Explain your changes.

_props.py : aa_props()

Two related logic bugs, both caused by a hardcoded pre-normalized matrix that was
substituted when normalize=True, silently discarding the raw matrix built above it.

Bug 1: normalize=True discarded the raw matrix entirely and substituted a second
~460-line hardcoded literal. No normalization was actually computed at call time,
so the flag was a misnomer, and any correction to a raw value would not propagate
to the normalized output.

Bug 2: When prop_indices and normalize=True were used together, prop_indices
sliced correctly, but normalize=True then replaced props with the full 20×21
hardcoded matrix, discarding the slice, so the caller received all 21 columns
instead of the requested subset. The hardcoded matrix could also silently drift
from the raw matrix over time with no indication to the user.

Both fixed by deleting the hardcoded matrix and replacing the if normalize block
with one dynamic line applied after slicing:

if normalize:
    props = (props - props.mean(axis=0)) / props.std(axis=0)
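
To make the intended order of operations concrete, here is a minimal sketch of slice-then-normalize. It is not the code in _props.py: the function name aa_props_sketch and the 4×3 RAW matrix are hypothetical stand-ins with made-up values, and only the prop_indices and normalize parameter names come from this PR.

```python
import numpy as np

# Hypothetical 4x3 stand-in for the raw 20x21 property matrix (made-up values).
RAW = np.array([
    [1.0, 10.0, 100.0],
    [2.0, 20.0, 300.0],
    [3.0, 30.0, 200.0],
    [4.0, 40.0, 400.0],
])


def aa_props_sketch(prop_indices=None, normalize=True):
    """Illustrative only: select columns first, then z-score what remains."""
    props = RAW if prop_indices is None else RAW[:, prop_indices]
    if normalize:
        # Column-wise z-score using the population std (NumPy default, ddof=0).
        # The arithmetic allocates a new array, so RAW itself is not modified.
        props = (props - props.mean(axis=0)) / props.std(axis=0)
    return props


sliced = aa_props_sketch(prop_indices=[0, 2])
print(sliced.shape)  # two selected columns are kept after normalization
```

One thing this formula does not guard against is a column with zero variance: props.std(axis=0) would be 0 for that column and the division would produce NaN values. Whether that case can occur with the real matrix is not something this sketch establishes.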

What should a reviewer concentrate their feedback on?

  • Confirm normalize is applied after prop_indices slicing.
  • Confirm the z-score formula matches the docstring description.

Did you add any tests for the change?

Manually verified the fix against scipy.stats.zscore:

  • aa_props(normalize=True) vs scipy.stats.zscore(aa_props(normalize=False), axis=0):
    max absolute difference = 0.0 (exact match).
  • Normalized columns confirmed to have mean ≈ 0 and std ≈ 1
    (max deviation < 1e-15, floating point noise only).
  • aa_props(normalize=False) returns the raw matrix unchanged.
  • aa_props(prop_indices=[0,1,2], normalize=True) returns correct shape (20, 3).
  • The old hardcoded matrix had a max diff of ~0.0005 from the correct dynamic
    z-score, confirming it was pre-computed offline and had already drifted.
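
The manual checks above could be captured as an automated regression test. The sketch below is one possible shape for it, under stated assumptions: check_normalization and stand_in_aa_props are hypothetical names, the stand-in uses seeded random data rather than the real property matrix, and the reference z-score is computed with NumPy (scipy.stats.zscore with its default ddof=0 could be used in its place). In a real test, the actual aa_props would be passed in, assuming it accepts prop_indices and normalize as keyword arguments.

```python
import numpy as np

# Hypothetical stand-in for the real matrix: seeded random data, 20 rows x 21 columns.
_RAW = np.random.default_rng(0).normal(size=(20, 21))


def stand_in_aa_props(prop_indices=None, normalize=True):
    """Placeholder with the slice-then-normalize behaviour described in this PR."""
    props = _RAW if prop_indices is None else _RAW[:, prop_indices]
    if normalize:
        props = (props - props.mean(axis=0)) / props.std(axis=0)
    return props


def check_normalization(aa_props_fn):
    """Run the checks from the PR description against any aa_props-like callable."""
    raw = aa_props_fn(normalize=False)
    norm = aa_props_fn(normalize=True)

    # Normalized output should equal a z-score computed from the raw matrix.
    expected = (raw - raw.mean(axis=0)) / raw.std(axis=0)
    assert np.allclose(norm, expected)

    # Every normalized column should have mean ~0 and std ~1.
    assert np.allclose(norm.mean(axis=0), 0.0, atol=1e-12)
    assert np.allclose(norm.std(axis=0), 1.0, atol=1e-12)

    # Slicing and normalizing together should keep only the requested columns.
    subset = aa_props_fn(prop_indices=[0, 1, 2], normalize=True)
    assert subset.shape == (raw.shape[0], 3)
    return True


result = check_normalization(stand_in_aa_props)
```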

Any other comments?

Single-file change. The downstream callers AptaNetPSeAAC and PSeAAC both call
aa_props(normalize=True) and require no changes; the fix is fully contained
in _props.py.

LLM-assisted, by Claude (Anthropic); reviewed and submitted by DZDasherKTB
(manually verified and reviewed the change, along with the matrix, using SciPy)

…teraction

- Deleted the second ~460-line hardcoded pre-normalized matrix that was
  substituted when normalize=True, silently discarding the raw matrix.
- Replaced with a single dynamic z-score line computed at call time:
  props = (props - props.mean(axis=0)) / props.std(axis=0)
- Moved normalize step to AFTER prop_indices slicing, so normalization
  is computed on the selected columns only, not the full 21-column matrix.
- Fixes #LOGIC_ISSUE_NUM