Skip to content
Open
Show file tree
Hide file tree
Changes from 3 commits
Commits
Show all changes
24 commits
Select commit Hold shift + click to select a range
323b345
Add RefChecker logic for reference validation Relates to #13604
NishantDG-SST Apr 1, 2026
3ada77b
Update CHANGELOG for #13604
NishantDG-SST Apr 1, 2026
02e4614
Merge remote-tracking branch 'upstream/main' into 13604-refchecker
NishantDG-SST Apr 1, 2026
6463d4a
Updated CHANGELOG
NishantDG-SST Apr 1, 2026
36ac651
Updated e message and comment
NishantDG-SST Apr 1, 2026
ca51c10
Removed Optional.orElse(), added getFieldLatexFree and normalized val…
NishantDG-SST Apr 2, 2026
e35a4d5
Merge remote-tracking branch 'upstream/main' into 13604-refchecker
NishantDG-SST Apr 2, 2026
27acc3e
Added RefChecker tests for UNSURE, arXiv fallback, CrossRef path
NishantDG-SST Apr 3, 2026
71b93d2
Fixed some Logger warn messages, Added docs for clear classification …
NishantDG-SST Apr 4, 2026
99f82c7
Merge branch 'main' into 13604-refchecker
NishantDG-SST Apr 4, 2026
01a522b
Added RefChecker tests from RefChecker test suite
NishantDG-SST Apr 4, 2026
529deed
Merge branch '13604-refchecker' of https://github.com/NishantDG-SST/j…
NishantDG-SST Apr 4, 2026
4b07cf2
Merge branch 'main' into 13604-refchecker
NishantDG-SST Apr 5, 2026
a31f80e
Added RefCheckerUnit tests
NishantDG-SST Apr 7, 2026
57c8933
Merge branch '13604-refchecker' of https://github.com/NishantDG-SST/j…
NishantDG-SST Apr 7, 2026
d31e92d
Merge remote-tracking branch 'upstream/main' into 13604-refchecker
NishantDG-SST Apr 7, 2026
28d15c6
Added more RefCheckerUnit tests of crossRef, arXiv, DOI
NishantDG-SST Apr 7, 2026
291bdee
Merge branch 'main' into 13604-refchecker
NishantDG-SST Apr 10, 2026
56e8a9e
Added bestOfSameLevelUnsure RefCheckerUnit test
NishantDG-SST Apr 10, 2026
215378a
Merge branch '13604-refchecker' of https://github.com/NishantDG-SST/j…
NishantDG-SST Apr 10, 2026
08c316a
Apply suggestions from code review
wanling0000 Apr 11, 2026
d936953
Merge branch 'main' into 13604-refchecker
NishantDG-SST Apr 11, 2026
d3b5338
Addressed arXivMismatchReturnsFake
NishantDG-SST Apr 11, 2026
33e811c
Merge branch '13604-refchecker' of https://github.com/NishantDG-SST/j…
NishantDG-SST Apr 11, 2026
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -9,8 +9,8 @@

## [Unreleased]

### Added

Check failure on line 12 in CHANGELOG.md

View workflow job for this annotation

GitHub Actions / Markdown

Headings should be surrounded by blank lines

CHANGELOG.md:12 MD022/blanks-around-headings Headings should be surrounded by blank lines [Expected: 1; Actual: 0; Below] [Context: "### Added"] https://github.com/DavidAnson/markdownlint/blob/v0.40.0/doc/md022.md

- We added RefChecker logic to validate entries against DOI, CrossRef, and arXiv sources [#13604](https://github.com/JabRef/jabref/issues/13604)

Check failure on line 13 in CHANGELOG.md

View workflow job for this annotation

GitHub Actions / Markdown

Lists should be surrounded by blank lines

CHANGELOG.md:13 MD032/blanks-around-lists Lists should be surrounded by blank lines [Context: "- We added RefChecker logic to..."] https://github.com/DavidAnson/markdownlint/blob/v0.40.0/doc/md032.md
- We added support for downloading full-text PDFs from Wiley journals via the Wiley TDM API. [#13404](https://github.com/JabRef/jabref/issues/13404)
- We added `--key-patterns` option to CLI parameters to allows users to set a citation key's pattern for a specific entry type. [#14707](https://github.com/JabRef/jabref/issues/14707)
- We added a CLI option `--field-formatters` to the `convert` and `generate-bib-from-aux` commands to apply field formatters during export. [#11520](https://github.com/JabRef/jabref/issues/11520)
Expand Down
43 changes: 43 additions & 0 deletions jablib/src/main/java/org/jabref/logic/database/DuplicateCheck.java
Original file line number Diff line number Diff line change
Expand Up @@ -3,6 +3,7 @@
import java.util.Collection;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Objects;
Expand All @@ -22,6 +23,7 @@
import org.jabref.model.entry.field.BibField;
import org.jabref.model.entry.field.Field;
import org.jabref.model.entry.field.FieldProperty;
import org.jabref.model.entry.field.InternalField;
import org.jabref.model.entry.field.OrFields;
import org.jabref.model.entry.field.StandardField;
import org.jabref.model.entry.identifier.ISBN;
Expand All @@ -33,6 +35,7 @@

/// This class contains utility method for duplicate checking of entries.
public class DuplicateCheck {
public static final double COMPARE_ENTRIES_THRESHOLD = 0.8; // The threshold that determines if entries are likely to be of the same publication
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just out of curiosity: why 0.8?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So imo at 80%

  • The Title should have a 100% match
  • A mismatching year usually drops the score below 0.8 when with fewer fields.
  • At least the Primary Author matches correctly

the rest 20%

  • Secondary Authors might mismatch or not be present
  • If the DOI is missing in one but present in the other, but the Titles are identical then score stays high

To hit exactly 0.8 with one total mismatch we will need:
5 total common fields 4 of them must be 100% identical (1 + 1 + 1 + 1 + 0) / 5 = 0.8

private static final double DUPLICATE_THRESHOLD = 0.75; // The overall threshold to signal a duplicate pair

private static final Logger LOGGER = LoggerFactory.getLogger(DuplicateCheck.class);
Expand Down Expand Up @@ -338,4 +341,44 @@ public Optional<BibEntry> containsDuplicate(final BibDatabase database,
final BibDatabaseMode bibDatabaseMode) {
return database.getEntries().stream().filter(other -> isDuplicate(entry, other, bibDatabaseMode)).findFirst();
}

/// Computes a similarity score between two entries based on their shared non-internal fields.
///
/// It does not return early when a DOI or other identifier matches.
/// This means that an entry with the correct DOI but a wrong author name will still receive a low score.
/// The old [#isDuplicate] method would have called that a duplicate immediately, which is wrong
/// for reference checking purposes.it ignores JabRefs internal fields such as the citation key and entry type marker.
/// A fetcher never populates those fields so including them in the comparison would produce incorrect results.
/// For eg, if a local entry has a citation key and the fetched entry does not, including the
/// citation key field would skew the score.
///
/// The returned score is a number in the range [0.0, 1.0].
/// A score of 1.0 means every shared non-internal field is identical.
/// A score at or above [#COMPARE_ENTRIES_THRESHOLD] means the entries are likely the same publication.
/// A score of 0.0 is returned when the two entries share no non-internal fields at all.
///
/// @param one the first entry to compare
/// @param two the second entry to compare
/// @return similarity score in [0.0, 1.0]
public static double compareEntries(BibEntry one, BibEntry two) {
StringSimilarity stringSimilarity = new StringSimilarity();

List<Field> commonFields = one.getFields().stream()
.filter(field -> !(field instanceof InternalField))
.filter(field -> two.getField(field).isPresent())
.toList();

if (commonFields.isEmpty()) {
return 0.0;
}

return commonFields.stream()
.mapToDouble(field -> {
String firstValue = one.getField(field).orElse("");
String secondValue = two.getField(field).orElse("");
Copy link

Copilot AI Apr 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

compareEntries uses getField(...) (raw field content) which can penalize harmless formatting differences (LaTeX braces/escaping, whitespace, line breaks) and lead to false FAKE/UNSURE classifications. Since DuplicateCheck already normalizes via getFieldLatexFree(...) in its comparison logic, consider using latex-free/normalized values here as well to keep scoring consistent with the rest of the duplicate-checking implementation.

Suggested change
String firstValue = one.getField(field).orElse("");
String secondValue = two.getField(field).orElse("");
String firstValue = one.getFieldLatexFree(field).orElse("");
String secondValue = two.getFieldLatexFree(field).orElse("");

Copilot uses AI. Check for mistakes.
return stringSimilarity.similarity(firstValue, secondValue);
})
.average()
.orElse(0.0);
}
}
Original file line number Diff line number Diff line change
@@ -0,0 +1,8 @@
package org.jabref.logic.refcheck;

import org.jabref.model.entry.BibEntry;

import org.jspecify.annotations.Nullable;

public record RefCheckResult(RefValidity validity, @Nullable BibEntry otherEntry, double similarityScore) {
}
225 changes: 225 additions & 0 deletions jablib/src/main/java/org/jabref/logic/refcheck/RefChecker.java
Original file line number Diff line number Diff line change
@@ -0,0 +1,225 @@
package org.jabref.logic.refcheck;

import java.util.Optional;

import org.jabref.logic.database.DuplicateCheck;
import org.jabref.logic.importer.FetcherException;
import org.jabref.logic.importer.fetcher.ArXivFetcher;
import org.jabref.logic.importer.fetcher.CrossRef;
import org.jabref.logic.importer.fetcher.DoiFetcher;
import org.jabref.model.entry.BibEntry;
import org.jabref.model.entry.identifier.ArXivIdentifier;
import org.jabref.model.entry.identifier.DOI;

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

/// Validates bibliographic entries against authoritative online sources.
///
/// For each entry 3 sources are tried in this order:
///
/// 1. DOI lookup via [DoiFetcher], using the DOI already present in the entry
/// 2. DOI discovery via [CrossRef], then fetching via [DoiFetcher]
/// 3. arXiv identifier lookup via [ArXivFetcher]
///
/// The lookup stops as soon as a [RefValidity#REAL] result is found.
/// If no source produces a real result, the best result among all attempts is returned.
/// UNSURE is considered better than FAKE.
/// Among results with the same validity, the one with the higher similarity score is preferred.
///
/// No LLM is used. Classification is based entirely on
/// [org.jabref.logic.database.DuplicateCheck#compareEntries].
public class RefChecker {

private static final Logger LOGGER = LoggerFactory.getLogger(RefChecker.class);

/// A score below this but above zero means we found something but it does not match well.
/// This boundary separates UNSURE from FAKE when a candidate was found.
private static final double UNSURE_LOWER_BOUND = 0.5;
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, why 0.5

Copy link
Copy Markdown
Contributor Author

@NishantDG-SST NishantDG-SST Apr 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A score of 0.5 usually means the Title matches but major fields like year or Author are wrong or missing.
So, less than 50% similarity its more likely a different paper with a similar title than it is an UNSURE case

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please use self describing constant for magic numbers


private final DoiFetcher doiFetcher;
private final ArXivFetcher arXivFetcher;
private final CrossRef crossRef;

/// Creates a RefChecker using only the two main fetchers.
/// CrossRef is created with its default constructor.
public RefChecker(DoiFetcher doiFetcher, ArXivFetcher arXivFetcher) {
this(doiFetcher, arXivFetcher, new CrossRef());
}

/// Creates a RefChecker with all three dependencies supplied explicitly
///
/// @param doiFetcher the DOI fetcher
/// @param arXivFetcher the arXiv fetcher
/// @param crossRef the CrossRef fetcher used for DOI discovery
public RefChecker(DoiFetcher doiFetcher, ArXivFetcher arXivFetcher, CrossRef crossRef) {
this.doiFetcher = doiFetcher;
this.arXivFetcher = arXivFetcher;
this.crossRef = crossRef;
}

/// Checks a single entry against authoritative online sources and returns a classification.
///
/// The three sources are tried in order. If any of them produces a [RefValidity#REAL] result,
/// that result is returned immediately without trying the remaining sources.
///
/// If none of the sources produce a real result, the best result from all three attempts is returned.
///
/// Individual fetcher failures are logged and treated as "not found" so that
/// a network error from one source does not make the other sources fail
public RefCheckResult check(BibEntry entry) {
RefCheckResult doiResult = checkByDoi(entry);
if (doiResult.validity() == RefValidity.REAL) {
return doiResult;
}

RefCheckResult crossRefResult = checkByCrossRef(entry);
if (crossRefResult.validity() == RefValidity.REAL) {
return crossRefResult;
}

RefCheckResult arXivResult = checkByArXiv(entry);
if (arXivResult.validity() == RefValidity.REAL) {
return arXivResult;
}

return bestOf(doiResult, crossRefResult, arXivResult);
}
Comment on lines +84 to +85
Copy link

Copilot AI Apr 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If none of the sources yields any candidate (otherEntry == null / score 0.0), bestOf(...) currently returns FAKE (because each lookup returns FAKE on “not found”). That conflates “not found / could not verify” with “verified mismatch” and can mislabel obscure/older but real publications as fake. Consider returning UNSURE when no authoritative candidate was found from any source, and reserving FAKE for the case where a candidate exists but similarity is low.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the intended behavior is:

  • no candidate found -> FAKE
  • fetch failure -> FAKE
  • low similarity candidate -> FAKE

then I think it’s consistent as it is now,
probably don’t need to change the “not found / could not verify” to UNSURE.

From the classify(...) comments, it seems like this is already the intended behavior.

@koppor do you think this is fine as-is, or do you see this as being too broad semantically?


/// Tries to validate the entry using the DOI already stored in it.
/// Returns a FAKE result with no matched entry if the entry has no DOI
private RefCheckResult checkByDoi(BibEntry entry) {
Optional<DOI> doi = entry.getDOI();
if (doi.isEmpty()) {
return new RefCheckResult(RefValidity.FAKE, null, 0.0);
}

Optional<BibEntry> found;
try {
found = doiFetcher.performSearchById(doi.get().asString());
} catch (FetcherException e) {
LOGGER.warn("DOI lookup failed for {}: {}", doi.get().asString(), e.getMessage());
return new RefCheckResult(RefValidity.FAKE, null, 0.0);
}

if (found.isEmpty()) {
return new RefCheckResult(RefValidity.FAKE, null, 0.0);
}

return classify(entry, found.get());
}

/// Tries to validate the entry by first discovering a DOI through CrossRef,
/// then fetching the authoritative entry using that DOI.
/// Returns a FAKE result with no matched entry if CrossRef finds no DOI
private RefCheckResult checkByCrossRef(BibEntry entry) {
Optional<DOI> foundDoi;
try {
foundDoi = crossRef.findIdentifier(entry);
} catch (FetcherException e) {
LOGGER.warn("CrossRef lookup failed: {}", e.getMessage());
return new RefCheckResult(RefValidity.FAKE, null, 0.0);
}

if (foundDoi.isEmpty()) {
return new RefCheckResult(RefValidity.FAKE, null, 0.0);
}

Optional<BibEntry> found;
try {
found = doiFetcher.performSearchById(foundDoi.get().asString());
} catch (FetcherException e) {
LOGGER.warn("DOI fetch after CrossRef discovery failed for {}: {}", foundDoi.get().asString(), e.getMessage());
return new RefCheckResult(RefValidity.FAKE, null, 0.0);
}

if (found.isEmpty()) {
return new RefCheckResult(RefValidity.FAKE, null, 0.0);
}

return classify(entry, found.get());
}

/// Tries to validate the entry using arXiv.
/// First looks for an arXiv identifier, then fetches the full entry by that identifier.
/// Returns a FAKE result with no matched entry if no arXiv identifier is found
private RefCheckResult checkByArXiv(BibEntry entry) {
Optional<ArXivIdentifier> identifier;
try {
identifier = arXivFetcher.findIdentifier(entry);
} catch (FetcherException e) {
LOGGER.warn("arXiv identifier lookup failed: {}", e.getMessage());
return new RefCheckResult(RefValidity.FAKE, null, 0.0);
}

if (identifier.isEmpty()) {
return new RefCheckResult(RefValidity.FAKE, null, 0.0);
}

Optional<BibEntry> found;
try {
found = arXivFetcher.performSearchById(identifier.get().asString());
} catch (FetcherException e) {
LOGGER.warn("arXiv fetch failed for {}: {}", identifier.get().asString(), e.getMessage());
return new RefCheckResult(RefValidity.FAKE, null, 0.0);
}

if (found.isEmpty()) {
return new RefCheckResult(RefValidity.FAKE, null, 0.0);
}

return classify(entry, found.get());
}

/// Compares the local entry against the authoritative entry and returns a classified result.
private static RefCheckResult classify(BibEntry local, BibEntry authoritative) {
double score = DuplicateCheck.compareEntries(local, authoritative);

if (score >= DuplicateCheck.COMPARE_ENTRIES_THRESHOLD) {
return new RefCheckResult(RefValidity.REAL, authoritative, score);
}

if (score >= UNSURE_LOWER_BOUND) {
return new RefCheckResult(RefValidity.UNSURE, authoritative, score);
}

return new RefCheckResult(RefValidity.FAKE, authoritative, score);
}

/// Returns the best result from the three attempts.
///
/// REAL > UNSURE > FAKE.
/// Among results with the same validity the one with the higher similarity score wins.
///
/// This method is only called when none of the three results is REAL.
private static RefCheckResult bestOf(RefCheckResult doiResult, RefCheckResult crossRefResult,
RefCheckResult arXivResult) {
RefCheckResult best = doiResult;

if (rank(crossRefResult) > rank(best)
|| (rank(crossRefResult) == rank(best)
&& crossRefResult.similarityScore() > best.similarityScore())) {
best = crossRefResult;
}

if (rank(arXivResult) > rank(best)
|| (rank(arXivResult) == rank(best)
&& arXivResult.similarityScore() > best.similarityScore())) {
best = arXivResult;
}

return best;
}

/// REAL > UNSURE > FAKE
private static int rank(RefCheckResult result) {
return switch (result.validity()) {
case REAL ->
2;
case UNSURE ->
1;
case FAKE ->
0;
};
}
}
Original file line number Diff line number Diff line change
@@ -0,0 +1,7 @@
package org.jabref.logic.refcheck;

public enum RefValidity {
REAL,
UNSURE,
FAKE
}
Original file line number Diff line number Diff line change
Expand Up @@ -7,6 +7,7 @@
import org.jabref.model.entry.BibEntry;
import org.jabref.model.entry.BibEntryTypesManager;
import org.jabref.model.entry.field.Field;
import org.jabref.model.entry.field.InternalField;
import org.jabref.model.entry.field.StandardField;
import org.jabref.model.entry.types.StandardEntryType;

Expand Down Expand Up @@ -627,4 +628,61 @@ void differentInCollectionWithTheSameISBNAreNotDuplicates() {

assertFalse(duplicateChecker.isDuplicate(entryOne, entryTwo, BibDatabaseMode.BIBTEX));
}

@Test
void entriesShareNoNonInternalFields() {
BibEntry one = new BibEntry().withField(StandardField.TITLE, "Some Title");
BibEntry two = new BibEntry().withField(StandardField.AUTHOR, "Some Author");

double score = DuplicateCheck.compareEntries(one, two);

assertEquals(0.0, score);
}

@Test
void sameEntryIsComparedWithItself() {
BibEntry entry = getSimpleArticle();

double score = DuplicateCheck.compareEntries(entry, entry);

assertEquals(1.0, score);
}

@Test
void ignoresCitationKeyWhenDeterminingScore() {
BibEntry one = new BibEntry()
.withField(InternalField.KEY_FIELD, "citationKey1")
.withField(StandardField.TITLE, "Identical Title");
BibEntry two = new BibEntry()
.withField(InternalField.KEY_FIELD, "citationKey2")
.withField(StandardField.TITLE, "Identical Title");

double score = DuplicateCheck.compareEntries(one, two);

assertEquals(1.0, score);
}

@Test
void entriesWithIdenticalTitles() {
BibEntry one = new BibEntry().withField(StandardField.TITLE, "Reinforcement learning: An introduction");
BibEntry two = new BibEntry().withField(StandardField.TITLE, "Reinforcement learning: An introduction");

double score = DuplicateCheck.compareEntries(one, two);

assertTrue(score >= DuplicateCheck.COMPARE_ENTRIES_THRESHOLD);
}

@Test
void entriesWithCompletelyDifferentFields() {
BibEntry one = new BibEntry()
.withField(StandardField.TITLE, "Performance on a Signal")
.withField(StandardField.AUTHOR, "Richard Atkinson");
BibEntry two = new BibEntry()
.withField(StandardField.TITLE, "Rest in Treatment")
.withField(StandardField.AUTHOR, "Elizabeth Ballard");

double score = DuplicateCheck.compareEntries(one, two);

assertTrue(score < DuplicateCheck.COMPARE_ENTRIES_THRESHOLD);
}
Comment on lines +665 to +687
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Action required

3. Weak threshold asserts in tests 📘 Rule violation ☼ Reliability

New tests use predicate assertions (assertTrue(score >= threshold) / < threshold) instead of
asserting exact expected values, weakening regression detection. This violates the unit test
requirement to assert exact values/outputs where possible.
Agent Prompt
## Issue description
The added `DuplicateCheck.compareEntries` tests use threshold-based `assertTrue` predicates, which are considered weak checks.

## Issue Context
Update the tests to assert exact expected values (or exact expected structures) to strengthen regression detection.

## Fix Focus Areas
- jablib/src/test/java/org/jabref/logic/database/DuplicateCheckTest.java[665-687]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools

}
Loading
Loading