[BUG] fix: dynamic normalization in aa_props and correct prop_indices interaction#614
Open
DZDasherKTB wants to merge 1 commit into
Open
Conversation
…teraction - Deleted the second ~460-line hardcoded pre-normalized matrix that was substituted when normalize=True, silently discarding the raw matrix. - Replaced with a single dynamic z-score line computed at call time: props = (props - props.mean(axis=0)) / props.std(axis=0) - Moved normalize step to AFTER prop_indices slicing, so normalization is computed on the selected columns only, not the full 21-column matrix. - Fixes #LOGIC_ISSUE_NUM
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Reference Issues/PRs
Fixes #613 : [LOGIC] aa_props(normalize=True) uses stale hardcoded matrix and interacts incorrectly with prop_indices
What does this implement/fix? Explain your changes.
_props.py:aa_props()Two related logic bugs, both caused by a hardcoded pre-normalized matrix that was
substituted when
normalize=True, silently discarding the raw matrix built above it.Bug 1:
normalize=Truediscarded the raw matrix entirely and substituted a second~460-line hardcoded literal. No normalization was actually computed at call time,
the flag was a misnomer. Any correction to a raw value would not propagate to the
normalized output.
Bug 2: When
prop_indicesandnormalize=Truewere used together,prop_indicessliced correctly, but
normalize=Truethen replacedpropswith the full 20×21hardcoded matrix, discarding the slice. The hardcoded matrix could silently drift
from the raw matrix over time with no indication to the user.
Both fixed by deleting the hardcoded matrix and replacing the
if normalizeblockwith one dynamic line applied after slicing:
What should a reviewer concentrate their feedback on?
normalizeis applied afterprop_indicesslicing.Did you add any tests for the change?
Manually verified the fix against
scipy.stats.zscore:aa_props(normalize=True)vsscipy.stats.zscore(aa_props(normalize=False), axis=0):max absolute difference =
0.0, exact match.(max deviation <
1e-15, floating point noise only).aa_props(normalize=False)returns the raw matrix unchanged.aa_props(prop_indices=[0,1,2], normalize=True)returns correct shape(20, 3).~0.0005from the correct dynamicz-score, confirming it was pre-computed offline and had already drifted.
Any other comments?
Single file change. The downstream callers
AptaNetPSeAACandPSeAACboth callaa_props(normalize=True)and require no changes, the fix is fully containedin
_props.py.