Commit 97ed43e

feat: Optimize Scopus backend queries with functional indexes (issue #60)
## Problem

Scopus backend was taking 37ms per cached query, 40x slower than other cached backends (<1ms). This was caused by SQL queries using LOWER() functions without corresponding functional indexes, forcing full table scans of Scopus's 30k+ journal records.

## Root Cause

The `search_journals_by_name()` method uses `LOWER(normalized_name)` and `LOWER(display_name)` for case-insensitive matching, but no functional indexes existed on these expressions. SQLite could not use the regular indexes on these columns when wrapped in LOWER(), resulting in:

- Full table scan of 30,272 Scopus journals
- 37ms query time vs <1ms for other backends

## Solution

Add functional indexes on `LOWER(normalized_name)` and `LOWER(display_name)` to enable efficient case-insensitive lookups for large datasets.

## Changes

- Add `idx_journals_normalized_name_lower` functional index
- Add `idx_journals_display_name_lower` functional index

## Performance Impact

Scopus cached queries: 37ms → <2ms (20x+ faster)

Before: ✓ scopus: BackendStatus.FOUND [cached] (37.38ms)
After: ✓ scopus: BackendStatus.FOUND [cached] (1.64ms)

This brings Scopus performance in line with other cached backends:

- doaj: 0.81ms
- retraction_watch: 0.71ms
- scopus: 1.64ms ✓

## Testing

All 248 tests pass. Performance verified with multiple queries through CLI.

Resolves #60
Part of #52 - Performance: Optimize caching to eliminate redundant API calls
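The root cause described above can be reproduced in isolation with `EXPLAIN QUERY PLAN`. The following is a minimal sketch, not the project's code: the three-column `journals` table, the sample rows, the query text, and the `query_plan` helper are assumptions made for illustration. Only the index names and the `LOWER(normalized_name)` expression come from this commit. It needs SQLite 3.9.0 or later, which introduced indexes on expressions.

```python
import sqlite3

# Hypothetical, simplified schema: the real journals table has more columns.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE journals ("
    "id INTEGER PRIMARY KEY, normalized_name TEXT, display_name TEXT)"
)
# The plain column index that existed before this commit.
conn.execute(
    "CREATE INDEX idx_journals_normalized_name ON journals(normalized_name)"
)
conn.executemany(
    "INSERT INTO journals (normalized_name, display_name) VALUES (?, ?)",
    [(f"journal {i}", f"Journal {i}") for i in range(1000)],
)

# Assumed shape of a case-insensitive lookup; the actual SQL may differ.
QUERY = "SELECT id FROM journals WHERE LOWER(normalized_name) = ?"


def query_plan(connection, sql, params):
    """Return the detail column of EXPLAIN QUERY PLAN as a single string."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[3] for row in rows)


# Without an expression index, every row has to be visited (a SCAN).
plan_before = query_plan(conn, QUERY, ("journal 42",))

# The functional index added by this commit.
conn.execute(
    "CREATE INDEX IF NOT EXISTS idx_journals_normalized_name_lower "
    "ON journals(LOWER(normalized_name))"
)

# With the index in place, the planner can SEARCH it directly.
plan_after = query_plan(conn, QUERY, ("journal 42",))

print("before:", plan_before)
print("after: ", plan_after)
```

The exact wording of the plan text varies between SQLite versions, but the first plan should report a scan and the second a search that names `idx_journals_normalized_name_lower`.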
1 parent c070a4f commit 97ed43e

1 file changed (+2 -0 lines changed)

src/aletheia_probe/cache.py

Lines changed: 2 additions & 0 deletions
@@ -154,6 +154,8 @@ def _init_database(self) -> None:
 -- Indexes for performance
 CREATE INDEX IF NOT EXISTS idx_journals_normalized_name ON journals(normalized_name);
 CREATE INDEX IF NOT EXISTS idx_journals_display_name ON journals(display_name);
+CREATE INDEX IF NOT EXISTS idx_journals_normalized_name_lower ON journals(LOWER(normalized_name));
+CREATE INDEX IF NOT EXISTS idx_journals_display_name_lower ON journals(LOWER(display_name));
 CREATE INDEX IF NOT EXISTS idx_journals_issn ON journals(issn);
 CREATE INDEX IF NOT EXISTS idx_journals_eissn ON journals(eissn);
 CREATE INDEX IF NOT EXISTS idx_journal_names_name ON journal_names(name);
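Because the new statements use `CREATE INDEX IF NOT EXISTS`, a cache database created before this commit should pick up the two indexes the next time the schema statements run, and rerunning them should be harmless. The sketch below illustrates that behaviour under assumptions: the simplified `journals` table and the `index_names` helper are hypothetical, and running the statements through `executescript` is an illustrative choice, not necessarily how `_init_database` applies them.

```python
import sqlite3

# The four name-related index statements from the diff above.
SCHEMA_INDEXES = """
CREATE INDEX IF NOT EXISTS idx_journals_normalized_name ON journals(normalized_name);
CREATE INDEX IF NOT EXISTS idx_journals_display_name ON journals(display_name);
CREATE INDEX IF NOT EXISTS idx_journals_normalized_name_lower ON journals(LOWER(normalized_name));
CREATE INDEX IF NOT EXISTS idx_journals_display_name_lower ON journals(LOWER(display_name));
"""


def index_names(connection):
    """List the names of all indexes on the journals table, sorted."""
    rows = connection.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'index' AND tbl_name = 'journals' ORDER BY name"
    ).fetchall()
    return [row[0] for row in rows]


# Simulate a database created before this commit (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE journals ("
    "id INTEGER PRIMARY KEY, normalized_name TEXT, display_name TEXT)"
)
conn.execute("CREATE INDEX idx_journals_normalized_name ON journals(normalized_name)")
conn.execute("CREATE INDEX idx_journals_display_name ON journals(display_name)")
before = index_names(conn)

# Applying the updated statements adds only the missing indexes;
# a second run changes nothing.
conn.executescript(SCHEMA_INDEXES)
conn.executescript(SCHEMA_INDEXES)
after = index_names(conn)

print(before)
print(after)
```

Building the indexes on an existing table costs one pass over the rows at creation time, after which lookups on the lowered names can use them.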
