Fix #12271: Integrity checker for year, location, and page numbers in booktitle #15465
Chiragsd13 wants to merge 10 commits into JabRef:main
Enhance BooktitleChecker to flag booktitle values that contain:
- A 4-digit year (e.g. 2015)
- A country name (e.g. Norway, Austria, Singapore)
- Explicit page-number patterns (e.g. "pp. 1–10", "pages 3-7")

Add Countries.java with a hard-coded set of all UN-recognised country names used for the country-presence check. The set is built as a single pre-compiled regex alternation so the pattern is compiled only once.

Update BooktitleCheckerTest with parameterised tests covering all three new integrity rules and the blank-value / valid-value edge cases.

Closes JabRef#12271
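For readers who want to see what "a single pre-compiled regex alternation" could look like, here is a minimal sketch. It is not the PR's actual `Countries.java`: the class name `CountriesSketch`, the three-name subset, and the `containsCountry` helper are assumptions made for illustration.

```java
import java.util.Set;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical stand-in for Countries.java; the real class is said to list all UN-recognised names.
final class CountriesSketch {
    // Illustrative subset only (assumption).
    private static final Set<String> NAMES = Set.of("Norway", "Austria", "Singapore");

    // Compiled once when the class is loaded: \b(?:\QNorway\E|\QAustria\E|...)\b
    // (the order of alternatives is unspecified because Set.of has no defined order).
    private static final Pattern ANY_COUNTRY = Pattern.compile(
            NAMES.stream()
                 .map(Pattern::quote) // quote names in case they contain regex metacharacters
                 .collect(Collectors.joining("|", "\\b(?:", ")\\b")));

    // True if any listed country name occurs as a whole word in the booktitle.
    static boolean containsCountry(String booktitle) {
        return ANY_COUNTRY.matcher(booktitle).find();
    }
}
```

Keeping the pattern in a `static final` field means every call to `containsCountry` reuses the same compiled `Pattern` instead of rebuilding the alternation per entry.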
Review Summary by Qodo
Enhance BooktitleChecker with year, location, and page detection
Walkthroughs
Description
• Adds three new integrity checks to BooktitleChecker for year, location, and page numbers
• Detects 4-digit years (1000–2999) in booktitle fields
• Detects country names from UN-recognized list using pre-compiled regex
• Detects explicit page-number patterns (pp., p., pages keywords)
• Creates Countries utility class with hard-coded country name set
• Adds comprehensive parameterized tests covering all new checks
• Adds localization keys for new warning messages
Diagram
flowchart LR
BC["BooktitleChecker"]
YC["Year Check<br/>1000-2999"]
CC["Country Check<br/>UN list"]
PC["Page Check<br/>pp/pages"]
CO["Countries<br/>utility class"]
L10N["Localization<br/>keys"]
BC --> YC
BC --> CC
BC --> PC
CC --> CO
BC --> L10N
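The year and page checks in the diagram can be sketched as two small patterns. This is only an illustration under assumptions: the class name `BooktitlePatternsSketch`, both regexes, and the helper methods are hypothetical and not taken from the PR's code.

```java
import java.util.regex.Pattern;

// Hypothetical sketch of the "Year Check" and "Page Check" boxes above.
final class BooktitlePatternsSketch {
    // A standalone 4-digit number between 1000 and 2999.
    private static final Pattern YEAR = Pattern.compile("\\b[12]\\d{3}\\b");

    // "p.", "pp.", "page" or "pages" followed by a number, e.g. "pp. 1234" or "pages 3".
    private static final Pattern PAGES = Pattern.compile("\\b(?:pp?\\.|pages?)\\s*\\d+");

    static boolean containsYear(String booktitle) {
        return YEAR.matcher(booktitle).find();
    }

    static boolean containsPages(String booktitle) {
        return PAGES.matcher(booktitle).find();
    }
}
```

One caveat with a pattern this simple: a page number such as `1234` in `pp. 1234-1242` also falls in the 1000–2999 range, so the year pattern above would match it too. Whether the PR guards against that overlap is not stated here.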
File Changes
1. jablib/src/main/java/org/jabref/logic/integrity/BooktitleChecker.java
Code Review by Qodo
jablib/src/test/java/org/jabref/logic/integrity/BooktitleCheckerTest.java
jablib/src/main/java/org/jabref/logic/integrity/BooktitleChecker.java
Outdated
jablib/src/main/java/org/jabref/logic/integrity/BooktitleChecker.java
Outdated
CI Infrastructure Issue:
- Extract year, country, and page-number checks into separate
ValueChecker classes (BooktitleContainsYearChecker,
BooktitleContainsCountryChecker, BooktitleContainsPagesChecker)
so all three issues in one booktitle are reported independently
- Fix word-boundary regex in country checker: replace [a-z]
lookarounds with \p{Alnum} so tokens like USA2015 are not
mis-flagged as locations
- Register all three new checkers in FieldCheckers for BOOKTITLE
- Strengthen tests: use assertEquals with exact expected message
instead of assertNotEquals(Optional.empty()); add regression
test for alphanumeric token false-positive
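To make the lookaround fix and the independent reporting concrete, here is a rough sketch. Everything in it is an assumption for illustration: the class name `BooktitleCheckersSketch`, the three-name country subset, the exact shape of the earlier `[a-z]` pattern, and the message strings (the PR uses localization keys whose wording is not shown here).

```java
import java.util.List;
import java.util.Optional;
import java.util.regex.Pattern;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical sketch; not the PR's BooktitleContains*Checker classes.
final class BooktitleCheckersSketch {
    // Illustrative subset of country names (assumption).
    private static final String NAMES = "(?:USA|Norway|Singapore)";

    // Assumed earlier form: only lowercase letters block a match, so "USA2015" is flagged.
    static final Pattern COUNTRY_OLD = Pattern.compile("(?<![a-z])" + NAMES + "(?![a-z])");

    // Fixed form: any adjacent ASCII letter or digit blocks the match.
    static final Pattern COUNTRY = Pattern.compile("(?<!\\p{Alnum})" + NAMES + "(?!\\p{Alnum})");

    static final Pattern YEAR = Pattern.compile("(?<!\\p{Alnum})[12]\\d{3}(?!\\p{Alnum})");
    static final Pattern PAGES = Pattern.compile("\\b(?:pp?\\.|pages?)\\s*\\d+");

    // One check: returns the message if the pattern occurs, otherwise empty.
    static Optional<String> check(Pattern pattern, String message, String value) {
        return pattern.matcher(value).find() ? Optional.of(message) : Optional.empty();
    }

    // Runs every check separately so each issue in one booktitle is reported.
    static List<String> checkAll(String value) {
        return Stream.of(
                        check(YEAR, "booktitle contains a year", value),
                        check(COUNTRY, "booktitle contains a country name", value),
                        check(PAGES, "booktitle contains page numbers", value))
                .flatMap(Optional::stream)
                .collect(Collectors.toList());
    }
}
```

With these patterns, `COUNTRY_OLD` finds a match in `USA2015` while `COUNTRY` does not, which is the false positive the regression test is described as covering. Because each check returns its own `Optional`, a booktitle with both a year and a country yields two messages rather than only the first one found.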
✅ All tests passed
🏷️ Commit: 7a50c37
Related issues and pull requests
Closes #12271
PR Description
Enhances `BooktitleChecker` with three new integrity checks that warn when a `booktitle` field contains data that belongs in dedicated fields: a 4-digit year (1000–2999), a country name from a hard-coded UN-recognised list, or an explicit page-number pattern (`pp. X`, `pages X`). A new `Countries.java` utility class holds the country set and builds a single pre-compiled regex alternation at class-load time for efficient matching.

Steps to test
Create an `@inproceedings` entry and set the `booktitle` field to:
- `2015 {IEEE} International Conference on Digital Signal Processing, {DSP} 2015, Singapore` → flagged for year and location
- `European Conference on Circuit Theory and Design, {ECCTD} 2015, Trondheim, Norway` → flagged for year and location
- `Advances in Neural Information Processing Systems, pp. 1234-1242` → flagged for page numbers
- `International Conference on Machine Learning` → no warning

Checklist
- I described the change in `CHANGELOG.md` in a way that can be understood by the average user (if change is visible to the user)