Commit fd41ee6
[BUG] Fix dna2rna() O(n²) performance and docstring parameter name (#337)
#### Reference Issues/PRs
Closes #323.
#### What does this implement/fix? Explain your changes.
`dna2rna()` in `pyaptamer/utils/_rna.py` used `str.replace()` inside a
`for char in result` loop to replace unknown nucleotides with `'N'`.
This caused O(n²) performance because:
1. `str.replace()` scans the entire string for each unique unknown
character
2. The loop iterates over the original string snapshot while `result` is
reassigned on each replacement
The fix replaces the two-step approach (first `str.translate` for T→U,
then loop for unknowns) with a single-pass generator expression that
handles both T→U conversion and unknown→N replacement in one
character-by-character scan, achieving O(n) time complexity.
Benchmark comparison (from issue #323):
| Input Size | Before (loop + `.replace()`) | After (`join` + genexpr) |
Speedup |
|------------|------------------------------|--------------------------|---------|
| 100,000 | ~0.30s | ~0.006s | ~50x |
| 500,000 | ~7.40s | ~0.028s | ~264x |
Also fixes the docstring parameter name from `seq` to `sequence` to
match the actual function signature.
#### What should a reviewer concentrate their feedback on?
- The single-pass approach preserves exact behavior (T→U, unknown→N,
valid characters unchanged)
- All 20 existing RNA tests pass plus 1 new regression test for repeated
unknowns
#### Did you add any tests for the change?
Yes, added `test_dna2rna_repeated_unknowns` which verifies:
- All-unknown sequences are correctly replaced with 'N'
- Mixed valid/unknown sequences are handled correctly
- A 10,000-character unknown sequence runs without timeout (would take
~0.3s with the old O(n²) code)
#### Any other comments?
Full test suite: 338 passed, 3 skipped. All doctests pass.
#### PR checklist
- [x] The PR title starts with either [ENH], [MNT], [DOC], or [BUG].
[BUG] - bugfix, [MNT] - CI, test framework, [ENH] - adding or improving
code, [DOC] - writing or improving documentation or docstrings.
- [x] Added/modified tests
- [x] Used pre-commit hooks when committing to ensure that code is
compliant with hooks. Install hooks with `pre-commit install`.
To run hooks independent of commit, execute `pre-commit run --all-files`
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>1 parent 8f64b12 commit fd41ee6
1 file changed
Lines changed: 3 additions & 3 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
11 | 11 | | |
12 | 12 | | |
13 | 13 | | |
| 14 | + | |
| 15 | + | |
14 | 16 | | |
15 | 17 | | |
16 | 18 | | |
| |||
32 | 34 | | |
33 | 35 | | |
34 | 36 | | |
35 | | - | |
36 | | - | |
37 | | - | |
| 37 | + | |
38 | 38 | | |
39 | 39 | | |
40 | 40 | | |
| |||
0 commit comments