Rewrite journal abbrevation repository#15336
Rewrite journal abbrevation repository#15336LoayTarek5 wants to merge 67 commits intoJabRef:mainfrom
Conversation
Review Summary by QodoRefactor JournalAbbreviationRepository to use Postgres with fuzzy matching
WalkthroughsDescription• Migrate journal abbreviations from H2 MVStore to Postgres database • Implement fuzzy matching using pg_trgm trigram similarity extension • Generate SQL file instead of MVStore for built-in abbreviations • Refactor repository API with specific getter methods for abbreviation types • Make Abbreviation fields final for immutability and thread safety Diagramflowchart LR
CSV["CSV Files<br/>Journal Lists"]
Generator["JournalListSqlGenerator<br/>Generates SQL"]
SQL["journal-list.sql<br/>Postgres Schema"]
DB["Postgres Database<br/>Built-in Abbreviations"]
Repo["JournalAbbreviationRepository<br/>Query via SQL"]
Custom["Custom Abbreviations<br/>In-Memory"]
CSV --> Generator
Generator --> SQL
SQL --> DB
DB --> Repo
Custom --> Repo
Repo --> Fuzzy["Fuzzy Matching<br/>pg_trgm"]
File Changes1. build-support/src/main/java/JournalListMvGenerator.java
|
Code Review by Qodo
|
jablib/src/main/java/org/jabref/logic/journals/JournalAbbreviationLoader.java
Show resolved
Hide resolved
jablib/src/main/java/org/jabref/logic/journals/JournalAbbreviationLoader.java
Show resolved
Hide resolved
| public void export(@NonNull BibDatabaseContext databaseContext, | ||
| Path file, | ||
| @NonNull List<BibEntry> entries) throws IOException { | ||
| export(databaseContext, file, entries, List.of(), JournalAbbreviationLoader.loadBuiltInRepository()); | ||
| JournalAbbreviationRepository repository = abbreviationRepository != null | ||
| ? abbreviationRepository | ||
| : new JournalAbbreviationRepository(); // fallback to demo data | ||
| export(databaseContext, file, entries, List.of(), repository); | ||
| } |
There was a problem hiding this comment.
3. Exports lack real abbreviations 🐞 Bug ✓ Correctness
TemplateExporter.export falls back to a demo-only JournalAbbreviationRepository when no repository is provided, and built-in/custom TemplateExporter instances are constructed without injecting the real repository, so JournalAbbreviator formatting in exports becomes effectively non-functional.
Agent Prompt
### Issue description
Exports currently fall back to a demo-only abbreviation repository because exporters are constructed without injecting a real `JournalAbbreviationRepository`, breaking `\format[JournalAbbreviator]{...}` behavior in export layouts.
### Issue Context
`TemplateExporter.export()` now selects `new JournalAbbreviationRepository()` when no repository was provided; both built-in exporters and custom exporters are created without supplying one.
### Fix Focus Areas
- jablib/src/main/java/org/jabref/logic/exporter/TemplateExporter.java[203-210]
- jablib/src/main/java/org/jabref/logic/exporter/ExporterFactory.java[27-70]
- jablib/src/main/java/org/jabref/logic/preferences/JabRefCliPreferences.java[1839-1856]
ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools
| static void main(String[] args) throws IOException { | ||
| boolean verbose = (args.length == 1) && ("--verbose".equals(args[0])); | ||
|
|
||
| Path abbreviationsDirectory = Path.of("jablib", "src", "main", "abbrv.jabref.org", "journals"); | ||
| if (!Files.exists(abbreviationsDirectory)) { | ||
| System.out.println("Path " + abbreviationsDirectory.toAbsolutePath() + " does not exist"); | ||
| System.exit(0); | ||
| } |
There was a problem hiding this comment.
4. Sql generator main hidden 🐞 Bug ✓ Correctness
JournalListSqlGenerator declares a package-private main method, which prevents the generateJournalListSQL Gradle/JBang task from running and therefore prevents journal-list.sql from being generated.
Agent Prompt
### Issue description
The SQL generator is used as an executable script by the Gradle `generateJournalListSQL` task, but its `main` method is not `public`, so the script cannot be launched.
### Issue Context
If this task fails, `journal-list.sql` will not be generated and built-in abbreviations provisioning will break.
### Fix Focus Areas
- build-support/src/main/java/JournalListSqlGenerator.java[41-43]
- jablib/build.gradle.kts[94-105]
ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
I investigated more and i think the solution would be Squash and Merge becouace it takes all my commits (including the bad intermediate one), combines them into ONE single commit could i use "--force-with-lease" ? |
koppor
left a comment
There was a problem hiding this comment.
Some minor initial comments
However, how do you deal with org.jabref.logic.journals.JournalAbbreviationRepository#findBestFuzzyMatched. The issue pointed to that somehow and especially linked how postgres can do.
test-support/build.gradle.kts
Outdated
| id("java-library") | ||
| } | ||
|
|
||
| dependencies { |
There was a problem hiding this comment.
Read on at https://github.com/JabRef/jabref/blob/main/docs/code-howtos/dependency-management.md
We don't use depdencies any more
|
|
||
| public static synchronized DataSource getDataSource() throws Exception { | ||
| if (pg == null) { | ||
| pg = EmbeddedPostgres.builder().start(); |
| exports org.jabref.languageserver.controller; | ||
| exports org.jabref.languageserver.util; | ||
|
|
||
| requires transitive org.jabref.jablib; |
There was a problem hiding this comment.
this one requires java.sql - so no need ffor line 20`?
There was a problem hiding this comment.
jablib uses requires java.sql (not requires transitive), so java.sql is not transitively available to jabls,
LspLauncher.java directly imports javax.sql.DataSource, jabls needs its own requires java.sql
i also ran ./gradlew :jabls:checkAllModuleInfo, which confirms the current declaration is correct, and when I delete it, it fails.
| for (int i = 0; i < total; i += BATCH_SIZE) { | ||
| int end = Math.min(i + BATCH_SIZE, total); | ||
|
|
||
| writer.write("INSERT INTO journal_abbreviation (name, abbreviation, dotless_abbreviation, shortest_unique_abbreviation) VALUES\n"); |
There was a problem hiding this comment.
oh, this really is hand crafted.
Maybe try with https://github.com/jOOQ/jOOQ
If this is too much, use https://github.com/querydsl/querydsl/tree/master/querydsl-sql or JDBI - https://jdbi.org/#_fluent_api
There was a problem hiding this comment.
The JournalListSqlGenerator is a JBang build-time script that generates a static .sql file, it doesn't execute SQL directly.
the hand crafted SQL uses ON CONFLICT batch inserts for efficiency
the project currently only has jooq:jool (lambda utilities), not the full jOOQ SQL DSL or JDBI,
should I add the full jOOQ SQL library as a JBang dependency to this script, or is the current approach acceptable?given it's a build time file generator, i want to avoid adding a heavy dependency just for string building, what do you think?
|
|
||
| for (int j = i; j < end; j++) { | ||
| Abbreviation abbr = entries[j]; | ||
| String dotless = abbr.getAbbreviation().replace(".", " ").replace(" ", " ").trim(); |
There was a problem hiding this comment.
Don't we have a util class for this?
75c4750 to
77a1f5e
Compare
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
🚨 TestLens detected 50 failed tests 🚨Here is what you can do:
Test Summary
🏷️ Commit: 8eaa5e0 Test Failures (first 5 of 50)ArXivFetcherTest > abstractIsCleanedUp() (:jablib:fetcherTest in Fetcher Tests / Fetcher tests)ArXivFetcherTest > findFullTextByDOI() (:jablib:fetcherTest in Fetcher Tests / Fetcher tests)
ArXivFetcherTest > findFullTextByTitle() (:jablib:fetcherTest in Fetcher Tests / Fetcher tests)
ArXivFetcherTest > findFullTextByTitleWithCurlyBracket() (:jablib:fetcherTest in Fetcher Tests / Fetcher tests)
ArXivFetcherTest > findFullTextByTitleWithCurlyBracketAndPartOfAuthor() (:jablib:fetcherTest in Fetcher Tests / Fetcher tests)
Muted TestsSelect tests to mute in this pull request:
Reuse successful test results:
Click the checkbox to trigger a rerun:
Learn more about TestLens at testlens.app. |
|
I think the jbang build failure is a CI issue jablib:6.0-SNAPSHOT is not available in the local Maven repo when jbang tries to resolve it, so i believe that it is unrelated to this PR. |
|
Your pull request conflicts with the target branch. Please merge with your code. For a step-by-step guide to resolve merge conflicts, see https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/addressing-merge-conflicts/resolving-a-merge-conflict-using-the-command-line. |


Related issues and pull requests
Closes #12571
PR Description
Refactors JournalAbbreviationRepository to use Postgres instead of in-memory HashMaps for built-in journal abbreviations. Abbreviations are now queried on demand via SQL instead of being fully loaded at startup, reducing memory usage and load time.
Steps to test
1- Run JabRef with the profiler, then open a .bib file





2- Select one or more entries
3- Go to Quality -> Cleanup entries
4- Select "Abbreviate (default)"
5- Click "Clean up"
6- Then go back to the profiler and stop it
Checklist
CHANGELOG.mdin a way that can be understood by the average user (if change is visible to the user)