
Add heuristic detection for delimiters in importing process#15521

Open
mikezhanghaozhe wants to merge 5 commits into JabRef:main from
mikezhanghaozhe:fix-for-issue-12974

Conversation

@mikezhanghaozhe
Contributor

Related issues and pull requests

Closes #12974

PR Description

Import delimiters and display delimiters are now separate: the default delimiters in IMPORT_KEYWORD_DELIMITERS are used during import, and the user's preferred delimiter is used for display. This lets the import-side list stay flexible and accommodate ";".

Things to note:

  1. For now, the default delimiters are only [";", ","]. The list is flexible and can be combined with or replaced by the user's preferred delimiters later if needed.
  2. The list of default delimiters is ordered by priority: if ";" is detected, "," is no longer treated as a delimiter during import. This addresses the concern that "," can be a valid part of a keyword. #12974 Comment
  3. Keywords in a citation such as """ @Article{, Keywords={asdf,asdf,asdf}, } """ in BibtexParserTest are now deduplicated during import. The previous unit test preserved the duplicate keywords.
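
The priority rule and the deduplication described in the notes above can be sketched in plain Java as follows. This is a minimal illustration, not the code in this PR: the class `ImportDelimiterSketch` and its methods are hypothetical, and keywords are modeled as plain strings rather than JabRef's `KeywordList`.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Pattern;

// Hypothetical stand-in for the import logic; not JabRef's KeywordList.
final class ImportDelimiterSketch {

    // Candidate delimiters in priority order, mirroring IMPORT_KEYWORD_DELIMITERS.
    static final List<Character> CANDIDATES = List.of(';', ',');

    // Returns the first candidate that occurs in the input, or ',' if none does.
    static char detect(String keywords, List<Character> candidates) {
        for (char candidate : candidates) {
            if (keywords.indexOf(candidate) >= 0) {
                return candidate;
            }
        }
        return ',';
    }

    // Splits on the detected delimiter, trims each part, drops empty parts,
    // and removes duplicates while keeping first-seen order.
    static List<String> parseImport(String keywords) {
        char delimiter = detect(keywords, CANDIDATES);
        Set<String> unique = new LinkedHashSet<>();
        for (String part : keywords.split(Pattern.quote(String.valueOf(delimiter)))) {
            String trimmed = part.trim();
            if (!trimmed.isEmpty()) {
                unique.add(trimmed);
            }
        }
        return new ArrayList<>(unique);
    }

    public static void main(String[] args) {
        // ';' is present, so the comma stays inside the first keyword.
        System.out.println(parseImport("machine learning, deep; optics"));
        // Only ',' is present, and the three identical keywords collapse to one.
        System.out.println(parseImport("asdf,asdf,asdf"));
    }
}
```

With this ordering, a string that contains both characters is split only on ";", which is what keeps a keyword such as "machine learning, deep" in one piece.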

Steps to test

  1. Create a new entry whose keywords field contains ";".
Screenshot 2026-04-09 at 10 21 45 AM
  2. Keywords are separated properly.
Screenshot 2026-04-09 at 10 22 04 AM
  3. The BibTeX source also shows the delimiter the user prefers ("," in this case).
Screenshot 2026-04-09 at 10 22 12 AM

Checklist

  • I own the copyright of the code submitted and I license it under the MIT license
  • I manually tested my changes in running JabRef (always required)
  • I added JUnit tests for changes (if applicable)
  • I added screenshots in the PR description (if change is visible to the user)
  • [/] I added a screenshot in the PR description showing a library with a single entry with me as author and as title the issue number
  • I described the change in CHANGELOG.md in a way that can be understood by the average user (if change is visible to the user)
  • [/] I checked the user documentation for up to dateness and submitted a pull request to our user documentation repository

Add KeywordList.parseImport with default delimiters (checking semicolons first, then commas). It detects which default delimiter is in use and normalizes keywords to the user's preferred delimiter (from the preferences) during import.
@qodo-free-for-open-source-projects
Contributor

Review Summary by Qodo

Add heuristic keyword delimiter detection for BibTeX import

✨ Enhancement


Walkthroughs

Description
• Implement heuristic keyword delimiter detection during BibTeX import
  - Prioritizes semicolon over comma to avoid false splits
  - Normalizes imported keywords to user's preferred delimiter
• Add KeywordList.parseImport() method with delimiter detection logic
• Update keyword parsing in BibtexParser to use new import method
• Remove duplicate keywords during import process
• Update test cases to reflect deduplication behavior
Diagram
flowchart LR
  A["BibTeX Import"] --> B["BibtexParser.parseField"]
  B --> C["KeywordList.parseImport"]
  C --> D["detectImportDelimiter"]
  D --> E["Check Delimiters in Priority Order"]
  E --> F["Semicolon Found?"]
  F -->|Yes| G["Use Semicolon"]
  F -->|No| H["Use Comma"]
  G --> I["Normalize to User Delimiter"]
  H --> I
  I --> J["Store Keywords"]

File Changes

1. jablib/src/main/java/org/jabref/logic/importer/fileformat/BibtexParser.java ✨ Enhancement +25/-3

Integrate heuristic keyword delimiter detection in parser

• Add imports for Keyword and KeywordList classes
• Define IMPORT_KEYWORD_DELIMITERS constant with priority order [';', ',']
• Update keyword field parsing to use KeywordList.parseImport() for heuristic delimiter detection
• Handle both single and duplicate keyword field scenarios with proper delimiter conversion
• Add comments explaining the import process and delimiter handling

jablib/src/main/java/org/jabref/logic/importer/fileformat/BibtexParser.java


2. jablib/src/main/java/org/jabref/model/entry/KeywordList.java ✨ Enhancement +26/-0

Add import-specific keyword parsing with delimiter detection

• Add parseImport() method that detects delimiter heuristically from keyword string
• Add detectImportDelimiter() static method that checks delimiters in priority order
• Falls back to comma if no delimiters from the list are found
• Supports hierarchical keywords and maintains existing parse behavior

jablib/src/main/java/org/jabref/model/entry/KeywordList.java


3. jablib/src/test/java/org/jabref/logic/importer/fileformat/BibtexParserTest.java 🧪 Tests +16/-2

Update tests for keyword deduplication and semicolon support

• Update parseDuplicateKeywordsWithOnlyOneEntry() to expect deduplicated keywords
• Update parseDuplicateKeywordsWithTwoEntries() to expect deduplicated keywords
• Add new test parseSemicolonSeparatedKeywords() to verify semicolon delimiter handling
• Verify that semicolon-separated keywords are normalized to comma delimiter

jablib/src/test/java/org/jabref/logic/importer/fileformat/BibtexParserTest.java


4. jablib/src/test/java/org/jabref/model/entry/KeywordListTest.java 🧪 Tests +31/-0

Add comprehensive tests for keyword import parsing

• Add parseImportSingleKeyword() test for single keyword import
• Add parseImportWithSemicolonDelimiter() test for semicolon-separated keywords
• Add parseImportWithCommaDelimiter() test for comma-separated keywords
• Add parseImportPrefersSemicolonOverComma() test verifying priority order
• Add parseImportHierarchicalChain() test for hierarchical keyword import

jablib/src/test/java/org/jabref/model/entry/KeywordListTest.java


5. CHANGELOG.md 📝 Documentation +1/-0

Document keyword delimiter detection feature

• Add entry documenting heuristic keyword delimiter detection feature
• Reference issue #12974 for context
• Explain that special delimiters like semicolon are replaced with comma during import

CHANGELOG.md



@qodo-free-for-open-source-projects
Contributor

qodo-free-for-open-source-projects bot commented Apr 9, 2026

Code Review by Qodo

🐞 Bugs (2)   📘 Rule violations (4)   📎 Requirement gaps (0)   🎨 UX Issues (0)
🐞 ≡ Correctness (2)
📘 ≡ Correctness (1) ⚙ Maintainability (3)



Action required

1. IMPORT_KEYWORD_DELIMITERS not static final 📘
Description
IMPORT_KEYWORD_DELIMITERS is an instance field named like a constant (UPPER_SNAKE_CASE), which
violates existing naming conventions and may trigger style/check rules. It should be a `private
static final` constant (or renamed to lowerCamelCase if intentionally instance-scoped).
Code

jablib/src/main/java/org/jabref/logic/importer/fileformat/BibtexParser.java[R112-115]

+    // Default delimiters to try when importing keywords, in priority order.
+    // It overrides the delimiter stored in preference.
+    private final List<Character> IMPORT_KEYWORD_DELIMITERS = List.of(';', ',');
+
Evidence
The checklist requires following existing formatting/naming conventions. The added field uses
constant-style naming but is not static final, unlike nearby constants in the same class.

AGENTS.md
jablib/src/main/java/org/jabref/logic/importer/fileformat/BibtexParser.java[112-115]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`IMPORT_KEYWORD_DELIMITERS` is named like a constant but is not `static final`, which breaks the project's naming/style conventions.

## Issue Context
The surrounding fields in `BibtexParser` use `private static final ...` for constants; introducing an instance-level UPPER_SNAKE_CASE field is inconsistent and may fail style checks.

## Fix Focus Areas
- jablib/src/main/java/org/jabref/logic/importer/fileformat/BibtexParser.java[112-115]
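
A minimal sketch of the suggested declaration, assuming the list is meant to be shared by all parser instances. The enclosing class `ParserConstantsSketch` and its accessor are hypothetical and exist only to make the fragment compile on its own.

```java
import java.util.List;

// Hypothetical, trimmed-down class; only the field declaration is the point.
final class ParserConstantsSketch {

    // One shared, immutable list for every instance, in priority order.
    private static final List<Character> IMPORT_KEYWORD_DELIMITERS = List.of(';', ',');

    // Accessor added only so the constant can be inspected from outside this sketch.
    static List<Character> importKeywordDelimiters() {
        return IMPORT_KEYWORD_DELIMITERS;
    }
}
```

Because `List.of` returns an unmodifiable list, making the field `static final` gives a true constant: neither the reference nor the contents can change.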



2. Changelog claims comma replacement 📘
Description
The new CHANGELOG entry states that ";" will be replaced with "," during import, but the
implementation re-serializes keywords using the user's configured separator, which is not
necessarily a comma. This makes the release note misleading for end users.
Code

CHANGELOG.md[14]

+- We added support for heuristic keywords delimiter detection: special delimiters such as ";" will be replaced with "," in the importing process. [#12974](https://github.com/JabRef/jabref/issues/12974)
Evidence
The checklist requires changelog entries to be end-user focused and professionally accurate. The
changelog claims a hard-coded comma replacement, while the code explicitly uses
importFormatPreferences.bibEntryPreferences().getKeywordSeparator() when writing the imported
keywords.

AGENTS.md
CHANGELOG.md[14-14]
jablib/src/main/java/org/jabref/logic/importer/fileformat/BibtexParser.java[810-815]
Best Practice: Learned patterns

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
The CHANGELOG entry incorrectly claims that semicolons are replaced with commas during import.

## Issue Context
The implementation detects import delimiters heuristically but writes keywords back using the user's configured keyword separator, not always a comma.

## Fix Focus Areas
- CHANGELOG.md[14-14]
- jablib/src/main/java/org/jabref/logic/importer/fileformat/BibtexParser.java[810-815]



3. Missing OpenFastTrace requirement entry 📘
Description
This PR introduces a user-visible change to BibTeX keyword import behavior (heuristic delimiter
detection and semicolon splitting) but does not add a corresponding OpenFastTrace req~...~1
requirement in docs/requirements/. This breaks requirements tracing for a significant behavior
change.
Code

jablib/src/main/java/org/jabref/model/entry/KeywordList.java[R91-115]

+    /// Parses the keyword list using heuristic delimiter detection for the importing process.
+    /// Tries each delimiter in the provided list in priority order; if none found, falls back to comma.
+    ///
+    /// @param keywordString a String of keywordChains
+    /// @param delimiters    a List of delimiters used for separating the keywords in the importing process
+    /// @return a parsed list containing the keywordChains
+    public static KeywordList parseImport(@NonNull String keywordString, @NonNull List<Character> delimiters) {
+        Character delimiter = detectImportDelimiter(keywordString, delimiters);
+        return parse(keywordString, delimiter);
+    }
+
+    /// Detects which delimiter to use by checking the keyword string for each candidate in priority order.
+    ///
+    /// @param keywordString a String of keywordChains
+    /// @param delimiters    a List of delimiters used for separating the keywords in the importing process
+    /// @return a character representing the delimiter to use
+    static Character detectImportDelimiter(@NonNull String keywordString, @NonNull List<Character> delimiters) {
+        for (Character delimiter : delimiters) {
+            if (keywordString.indexOf(delimiter) >= 0) {
+                return delimiter;
+            }
+        }
+        // Falls back to comma if none of the delimiters are found.
+        return ',';
+    }
Evidence
The checklist requires adding an OpenFastTrace requirement entry for significant bug fixes/features.
The diff shows new import parsing behavior in KeywordList.parseImport(...) and new tests
validating it, but no accompanying requirement documentation was added.

AGENTS.md
jablib/src/main/java/org/jabref/model/entry/KeywordList.java[91-115]
jablib/src/test/java/org/jabref/logic/importer/fileformat/BibtexParserTest.java[2181-2193]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
A significant import behavior change was implemented without adding a corresponding OpenFastTrace requirement entry under `docs/requirements/`.

## Issue Context
The PR adds heuristic delimiter detection for keyword import (`parseImport`/`detectImportDelimiter`) and new tests for semicolon-separated keywords.

## Fix Focus Areas
- jablib/src/main/java/org/jabref/model/entry/KeywordList.java[91-115]
- jablib/src/test/java/org/jabref/logic/importer/fileformat/BibtexParserTest.java[2181-2193]



4. Comma keywords get split 🐞
Description
BibtexParser re-serializes imported keywords using the user’s preferred delimiter without escaping
delimiter characters inside a keyword, so a keyword containing ',' will be split into multiple
keywords on later reads. This silently corrupts keyword data when importing semicolon-delimited
keywords that contain commas and the user preference is ','.
Code

jablib/src/main/java/org/jabref/logic/importer/fileformat/BibtexParser.java[R810-816]

+                    if (StandardField.KEYWORDS == field) { // If there are no duplicated keywords fields encountered yet.
+                        // Import the keywordList with default, heuristic delimiter.
+                        KeywordList parsed = KeywordList.parseImport(content, IMPORT_KEYWORD_DELIMITERS);
+                        // Re-serialize with the user's preference delimiter.
+                        entry.setField(field, parsed.getAsString(
+                                importFormatPreferences.bibEntryPreferences().getKeywordSeparator()));
+                    } else {
Evidence
BibtexParser converts imported keywords to a single string via KeywordList.getAsString using the
preference delimiter. Downstream, JabRef parses the stored keywords field with KeywordList.parse
using the same preference delimiter, but getAsString does not escape delimiter characters within
keywords, so embedded commas become indistinguishable from separators; KeywordList.serialize
contains the required escaping logic but is not used here.

/jablib/src/main/java/org/jabref/logic/importer/fileformat/BibtexParser.java[750-819]
/jablib/src/main/java/org/jabref/model/entry/BibEntry.java[982-997]
/jablib/src/main/java/org/jabref/model/entry/KeywordList.java[53-89]
/jablib/src/main/java/org/jabref/model/entry/KeywordList.java[176-179]
/jablib/src/main/java/org/jabref/model/entry/KeywordList.java[117-133]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
When importing keywords, the code re-serializes `KeywordList` using `getAsString(preferenceDelimiter)`, which does not escape delimiter characters inside individual keywords. This corrupts data for cases like `keywordOne, keywordTwo; keywordThree` with preference delimiter `,`.

## Issue Context
- `KeywordList.parse(...)` supports escaping via `Keyword.DEFAULT_ESCAPE_SYMBOL`.
- `KeywordList.serialize(...)` already implements correct escaping for delimiters and hierarchical delimiters.
- `BibEntry` later parses the stored keywords field using the preference delimiter.

## Fix Focus Areas
- jablib/src/main/java/org/jabref/logic/importer/fileformat/BibtexParser.java[769-818]
- jablib/src/main/java/org/jabref/model/entry/KeywordList.java[117-179]
- jablib/src/main/java/org/jabref/model/entry/BibEntry.java[785-804]

## What to change
- Ensure the string written into `StandardField.KEYWORDS` escapes the *output* delimiter when it occurs inside a keyword node.
- Prefer re-serializing via an escaping-aware routine (e.g., add a `KeywordList.serializeWithSpaces(...)` or adjust usage to `KeywordList.serialize(...)` and ensure spacing expectations are handled consistently).
- Add a regression test covering import of semicolon-separated keywords containing commas with preference delimiter `,` (expect embedded commas to remain part of the keyword, e.g. stored as `keywordOne\, keywordTwo, keywordThree` or equivalent escaped form).
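
The escaping step can be illustrated with a self-contained sketch. It does not call `KeywordList.serialize(...)`; the class `EscapingJoinSketch`, the use of a backslash as escape symbol, and the choice to also escape the escape symbol itself are assumptions made for illustration.

```java
import java.util.List;
import java.util.StringJoiner;

// Hypothetical helper; not JabRef's KeywordList.serialize.
final class EscapingJoinSketch {

    // Assumed escape symbol for this sketch.
    static final char ESCAPE = '\\';

    // Prefixes every occurrence of the output delimiter (and of the escape
    // symbol itself) with the escape symbol, so a later parse can tell
    // literal characters from separators.
    static String escape(String keyword, char delimiter) {
        StringBuilder escaped = new StringBuilder();
        for (char c : keyword.toCharArray()) {
            if (c == delimiter || c == ESCAPE) {
                escaped.append(ESCAPE);
            }
            escaped.append(c);
        }
        return escaped.toString();
    }

    // Joins the escaped keywords with the delimiter followed by a space.
    static String join(List<String> keywords, char delimiter) {
        StringJoiner joiner = new StringJoiner(delimiter + " ");
        for (String keyword : keywords) {
            joiner.add(escape(keyword, delimiter));
        }
        return joiner.toString();
    }

    public static void main(String[] args) {
        List<String> imported = List.of("keywordOne, keywordTwo", "keywordThree");
        System.out.println(join(imported, ','));
        System.out.println(join(imported, ';'));
    }
}
```

Only the delimiter chosen for output needs escaping: the same two keywords joined with ";" keep their comma untouched.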




Remediation recommended

5. Trivial keyword parsing comments 📘
Description
Several newly added comments restate exactly what the code does (e.g., "Parse..." and "Add each
keyword individually"), adding noise without explaining rationale. This reduces readability and
conflicts with the project's comment guidelines.
Code

jablib/src/main/java/org/jabref/logic/importer/fileformat/BibtexParser.java[R772-777]

+                    // Parse the new content with the heuristic delimiter.
+                    KeywordList importedKeywords = KeywordList.parseImport(content, IMPORT_KEYWORD_DELIMITERS);
+                    Character outputDelimiter = importFormatPreferences.bibEntryPreferences().getKeywordSeparator();
+                    // Add each keyword individually.
+                    for (Keyword kw : importedKeywords) {
+                        entry.addKeyword(kw, outputDelimiter);
Evidence
The checklist discourages trivial comments that simply narrate the code. The added comments
immediately precede code that already clearly expresses the same actions.

AGENTS.md
jablib/src/main/java/org/jabref/logic/importer/fileformat/BibtexParser.java[772-777]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
New comments restate nearby code behavior instead of documenting intent/rationale.

## Issue Context
The keyword import logic is readable without narration; comments should explain the why (e.g., why delimiter priority is needed), not the what.

## Fix Focus Areas
- jablib/src/main/java/org/jabref/logic/importer/fileformat/BibtexParser.java[772-777]



6. Escaped delimiter mis-detected 🐞
Description
KeywordList.detectImportDelimiter uses raw indexOf checks, so escaped delimiter literals (e.g.,
"\;") can cause the wrong delimiter to be selected. That can make parseImport pick ';' even when
separators are commas, producing wrong keyword tokenization.
Code

jablib/src/main/java/org/jabref/model/entry/KeywordList.java[R107-115]

+    static Character detectImportDelimiter(@NonNull String keywordString, @NonNull List<Character> delimiters) {
+        for (Character delimiter : delimiters) {
+            if (keywordString.indexOf(delimiter) >= 0) {
+                return delimiter;
+            }
+        }
+        // Falls back to comma if none of the delimiters are found.
+        return ',';
+    }
Evidence
KeywordList.parse explicitly supports escaping (it treats characters following the escape symbol as
literals), which means delimiter characters can appear in the string without acting as separators.
detectImportDelimiter does not account for this and will treat any occurrence—including escaped
ones—as evidence the delimiter is in use.

/jablib/src/main/java/org/jabref/model/entry/KeywordList.java[53-79]
/jablib/src/main/java/org/jabref/model/entry/KeywordList.java[102-115]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`detectImportDelimiter` selects a delimiter by `indexOf`, which counts escaped delimiter characters (e.g., `\;`) as present. This can select the wrong delimiter and break import parsing.

## Issue Context
`KeywordList.parse` already implements escape-aware scanning; delimiter detection should follow the same rules.

## Fix Focus Areas
- jablib/src/main/java/org/jabref/model/entry/KeywordList.java[97-115]

## What to change
- Rework `detectImportDelimiter` to scan `keywordString` character-by-character while tracking escaping (similar to `parse`), and only treat *unescaped* delimiter characters as candidates.
- Add a unit test demonstrating that `parseImport("one\\;two, three", [';', ','])` does **not** select `;` just because it appears escaped.
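
One possible shape for the escape-aware scan, as a standalone sketch. The class `EscapeAwareDetectionSketch` is hypothetical, and a backslash is assumed as the escape symbol; the real `Keyword.DEFAULT_ESCAPE_SYMBOL` is not referenced here.

```java
import java.util.List;

// Hypothetical helper; not the detectImportDelimiter from this PR.
final class EscapeAwareDetectionSketch {

    // Assumed escape symbol for this sketch.
    static final char ESCAPE = '\\';

    // True if the delimiter occurs at least once without a preceding escape symbol.
    static boolean containsUnescaped(String text, char delimiter) {
        boolean escaped = false;
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (escaped) {
                escaped = false; // the character after an escape symbol is a literal
            } else if (c == ESCAPE) {
                escaped = true;
            } else if (c == delimiter) {
                return true;
            }
        }
        return false;
    }

    // Returns the first candidate with an unescaped occurrence, or ',' if none has one.
    static char detect(String text, List<Character> candidates) {
        for (char candidate : candidates) {
            if (containsUnescaped(text, candidate)) {
                return candidate;
            }
        }
        return ',';
    }

    public static void main(String[] args) {
        // The only ';' is escaped, so detection falls through to ','.
        System.out.println(detect("one\\;two, three", List.of(';', ',')));
    }
}
```

The scan checks the escape flag before anything else, so an escaped escape symbol (two backslashes in a row) does not hide a delimiter that follows it.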


@github-actions github-actions bot added the labels good second issue, component: import-load, and component: preferences on Apr 9, 2026
@mikezhanghaozhe mikezhanghaozhe changed the title from "Fix for issue 12974" to "Add heuristic detection for delimiters in importing process" on Apr 9, 2026
@jabref-machine
Collaborator

JUnit tests of jablib are failing. You can see which checks are failing by locating the box "Some checks were not successful" on the pull request page. To see the test output, locate "Source Code Tests / Unit tests (pull_request)" and click on it.

You can then run these tests in IntelliJ to reproduce the failing tests locally. We offer a quick test running howto in the section Final build system checks in our setup guide.

@github-actions
Contributor

Your pull request conflicts with the target branch.

Please merge with your code. For a step-by-step guide to resolve merge conflicts, see https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/addressing-merge-conflicts/resolving-a-merge-conflict-using-the-command-line.


Labels

component: import-load
component: preferences
good second issue (Issues that involve a tour of two or three interweaved components in JabRef)
status: changes-required (Pull requests that are not yet complete)


Development

Successfully merging this pull request may close these issues.

Treat ";" as keyword separator when importing bibtex data or replace them with ","

2 participants