Commit 97451cc
fix: Detect conferences by display name to override OpenAlex misclassification (#136)
OpenAlex sometimes classifies conference proceedings as journals
(e.g., IEEE IPDPS). The tool then applied journal-specific heuristics
to those venues, which produced false positives.
Root cause analysis:
- OpenAlex returns source_type="journal" for conference proceedings
- Example: "Proceedings - IEEE International Parallel and Distributed Processing Symposium"
- The tool already has separate heuristics for journals and conferences
- The journal heuristics were applied because the upstream source_type was wrong
Solution:
- Detect conferences by keywords in display_name: "proceedings", "conference",
"symposium", "workshop"
- Override source_type when these keywords are found
- Apply conference-specific heuristics instead of journal heuristics
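
A minimal sketch of the keyword override described above. The function and
constant names are hypothetical, and case-insensitive substring matching is an
assumption; the actual change in this commit may differ.

```python
from typing import Optional

# Keywords listed in this commit message.
CONFERENCE_KEYWORDS = ("proceedings", "conference", "symposium", "workshop")


def looks_like_conference(display_name: Optional[str]) -> bool:
    """Return True if the venue name contains a conference keyword.

    Assumption: matching is a case-insensitive substring check.
    """
    if not display_name:
        return False
    lowered = display_name.lower()
    return any(keyword in lowered for keyword in CONFERENCE_KEYWORDS)


def effective_source_type(source_type: str, display_name: Optional[str]) -> str:
    """Override the upstream source_type when the name indicates a conference."""
    if looks_like_conference(display_name):
        return "conference"
    return source_type


ipdps = "Proceedings - IEEE International Parallel and Distributed Processing Symposium"
print(effective_source_type("journal", ipdps))
print(effective_source_type("journal", "Nature"))
```

Plain substring matching is simple but would also match a journal whose title
happens to contain one of the keywords; whether that trade-off is acceptable
depends on the venues the tool sees.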
Impact:
- IEEE IPDPS: SUSPICIOUS (0.68) → UNKNOWN (0.30)
- Eliminated false journal red flags:
  - "New journal with high output" → conference-specific analysis
  - "Journal appears inactive" → appropriate for conferences
- Fixes issue #126 false positives for legitimate conference proceedings
Technical details:
- Added display_name inspection before routing to analysis
- Keywords checked: proceedings, conference, symposium, workshop
- Logging added when override occurs (detail logger)
- Updated publication_type field to reflect corrected classification
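
The routing step might look roughly like the sketch below. Everything here is
illustrative: the `route_analysis` function, the dict-shaped input and result,
the `analyzer` key, and the logger name "detail" are assumptions, not the code
from this commit.

```python
import logging

# Hypothetical logger name; the commit only says a "detail logger" is used.
detail_logger = logging.getLogger("detail")

CONFERENCE_KEYWORDS = ("proceedings", "conference", "symposium", "workshop")


def route_analysis(source: dict) -> dict:
    """Inspect display_name before choosing which heuristics to run."""
    display_name = source.get("display_name") or ""
    source_type = source.get("source_type", "unknown")

    lowered = display_name.lower()
    matched = [kw for kw in CONFERENCE_KEYWORDS if kw in lowered]

    if matched and source_type != "conference":
        # Record the override so the corrected classification is traceable.
        detail_logger.info(
            "Overriding source_type %r with 'conference' for %r (keywords: %s)",
            source_type,
            display_name,
            ", ".join(matched),
        )
        source_type = "conference"

    analyzer = "conference" if source_type == "conference" else "journal"
    # publication_type reflects the corrected classification.
    return {"publication_type": source_type, "analyzer": analyzer}


result = route_analysis(
    {
        "display_name": "Proceedings - IEEE International Parallel and "
        "Distributed Processing Symposium",
        "source_type": "journal",
    }
)
print(result)
```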
Testing:
- IEEE IPDPS now correctly analyzed as conference
- All 342 tests pass
- Quality checks pass (ruff, mypy, pytest)
Related: #126 (false positives for legitimate venues)
[AI-assisted]
Co-authored-by: florath-ai-assistant[bot] <Andreas.Florath@telekom.de>
1 parent b586674 commit 97451cc
1 file changed: +22 -1 lines changed