Commit 0a6ad56

claude authored and Rittick committed
Expand printable lookup table to 256 entries and drop char<127 check
Addresses gemini-code-assist review on PR #30. With a 256-entry table, every byte value >= 127 (DEL and all non-ASCII bytes) maps to False, so the explicit char < 127 guard in the detect_ascii_len and detect_unicode_len hot loops becomes redundant. Removing it saves one comparison per loop iteration on the string-detection hot path. https://claude.ai/code/session_01PHLmsRuiwBQJ3n7gvR7Aa5
1 parent 81f799d commit 0a6ad56

1 file changed

Lines changed: 3 additions & 7 deletions

smda/utility/StringExtractor.py

@@ -4,7 +4,7 @@
 
 from smda.common import SmdaFunction
 
-_IS_PRINTABLE_CHAR_CODE = tuple(chr(char) in string.printable for char in range(127))
+_IS_PRINTABLE_CHAR_CODE = tuple(chr(char) in string.printable for char in range(256))
 
 # ported back from our PR to capa v4.0.0
 # https://github.com/mandiant/capa/blob/v4.0.0/capa/features/extractors/smda/insn.py
@@ -65,10 +65,7 @@ def detect_ascii_len(smda_report, offset, maxlen=None):
         return 0
     char = smda_report.buffer[rva]
     while (
-        char < 127
-        and _IS_PRINTABLE_CHAR_CODE[char]
-        and (maxlen is None or ascii_len < maxlen)
-        and rva + 1 < len(smda_report.buffer)
+        _IS_PRINTABLE_CHAR_CODE[char] and (maxlen is None or ascii_len < maxlen) and rva + 1 < len(smda_report.buffer)
     ):
         ascii_len += 1
         rva += 1
@@ -88,8 +85,7 @@ def detect_unicode_len(smda_report, offset, maxlen=None):
     char = smda_report.buffer[rva]
     second_char = smda_report.buffer[rva + 1]
     while (
-        char < 127
-        and _IS_PRINTABLE_CHAR_CODE[char]
+        _IS_PRINTABLE_CHAR_CODE[char]
         and second_char == 0
         and (maxlen is None or unicode_len < 2 * maxlen)
         and rva + 3 < len(smda_report.buffer)
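
For readers without the full file, the effect of the simplified loop condition can be sketched on a plain `bytes` buffer. This is an illustrative stand-in, not smda's implementation: `printable_run_length` is a hypothetical helper, it takes a raw buffer instead of an `smda_report`, and its bounds handling is deliberately simpler than the `rva + 1 < len(...)` check shown in the diff.

```python
import string

# Same 256-entry table expression as in the commit.
_IS_PRINTABLE_CHAR_CODE = tuple(chr(c) in string.printable for c in range(256))


def printable_run_length(buffer, start, maxlen=None):
    """Count consecutive printable bytes in buffer, beginning at index start.

    Hypothetical helper for illustration; not part of smda.
    """
    length = 0
    pos = start
    while (
        pos < len(buffer)
        # Indexing bytes yields an int in 0..255, always a valid table index.
        and _IS_PRINTABLE_CHAR_CODE[buffer[pos]]
        and (maxlen is None or length < maxlen)
    ):
        length += 1
        pos += 1
    return length
```

Because the table covers the whole byte range, a high byte such as `0xff` simply ends the run through the table lookup, with no separate range comparison in the loop condition.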
