@psoujany
Contributor

@psoujany psoujany commented Jun 5, 2025

We use jtreg to run OpenJDK tests for the JDK 11/17/21 releases on platforms whose default charset is not UTF-8. We found that the latest jtreg code uses Files.newBufferedReader(path) to read group file data (TEST.GROUPS) from OpenJDK via GroupManager (https://github.com/openjdk/jtreg/blob/master/src/share/classes/com/sun/javatest/regtest/config/GroupManager.java#L102C44-L102C61).

Files.newBufferedReader(Path) always returns a BufferedReader that decodes UTF-8, regardless of the platform default. We see discrepancies when using this version of jtreg on platforms where Charset.defaultCharset() is not UTF-8 (JDK 11 and JDK 17).

Hence, we would like to propose a fix: use Charset.defaultCharset() with Files.newBufferedReader(Path path, Charset cs) instead of Files.newBufferedReader(path), and change Files.readString(Path) to Files.readString(Path, Charset cs), in the following jtreg files:
https://github.com/openjdk/jtreg/blob/master/src/share/classes/com/sun/javatest/regtest/config/GroupManager.java#L102C44-L102C61
https://github.com/openjdk/jtreg/blob/master/src/share/classes/com/sun/javatest/regtest/config/ExtraPropDefns.java#L309

We've also tested this fix on OpenJDK-supported platforms such as Linux, Windows, and macOS.
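
To illustrate the failure mode and the proposed change, here is a minimal sketch. It is not the jtreg code: the class name `GroupFileCharsetSketch`, the helper `countLines`, and the sample file content are assumptions made up for this example. It writes a small file in ISO 8859-1 and reads it once with an explicit charset and once as UTF-8, which is what Files.newBufferedReader(Path) does implicitly.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.MalformedInputException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class GroupFileCharsetSketch {

    // Hypothetical helper (not part of jtreg): reads the whole file with the
    // given charset and returns the number of lines.
    static int countLines(Path file, Charset cs) throws IOException {
        try (BufferedReader in = Files.newBufferedReader(file, cs)) {
            int n = 0;
            while (in.readLine() != null) {
                n++;
            }
            return n;
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("TEST", ".groups");
        try {
            // U+00E9 becomes the single byte 0xE9 in ISO 8859-1; followed by a
            // newline, that byte sequence is not valid UTF-8.
            String content = "# caf\u00e9\ntier1 = some/test/dir\n";
            Files.write(file, content.getBytes(StandardCharsets.ISO_8859_1));

            // Reading with the charset the file was written in succeeds.
            System.out.println("lines: " + countLines(file, StandardCharsets.ISO_8859_1));

            // Reading as UTF-8 (the implicit choice of newBufferedReader(Path)) fails.
            try {
                countLines(file, StandardCharsets.UTF_8);
            } catch (MalformedInputException e) {
                System.out.println("UTF-8 read failed: " + e);
            }

            // The proposed direction: pass Charset.defaultCharset() explicitly,
            // so the reader follows the platform (or file.encoding) setting.
            // Whether this succeeds depends on the charset of the running JVM.
            System.out.println("default charset: " + Charset.defaultCharset());
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

On a JVM whose default charset matches the encoding of the file, passing Charset.defaultCharset() to `countLines` would behave like the ISO 8859-1 call above.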


Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue

Issue

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jtreg.git pull/267/head:pull/267
$ git checkout pull/267

Update a local copy of the PR:
$ git checkout pull/267
$ git pull https://git.openjdk.org/jtreg.git pull/267/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 267

View PR using the GUI difftool:
$ git pr show -t 267

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jtreg/pull/267.diff

Using Webrev

Link to Webrev Comment


@bridgekeeper

bridgekeeper bot commented Jun 5, 2025

👋 Welcome back psoujany! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk

openjdk bot commented Jun 5, 2025

@psoujany This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

7904021: Parsing group files using non-UTF-8 encoding fails

Reviewed-by: cstein

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 6 new commits pushed to the master branch:

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

As you do not have Committer status in this project an existing Committer must agree to sponsor your change. Possible candidates are the reviewers of this PR (@sormuras) but any other Committer may sponsor as well.

➡️ To flag this PR as ready for integration with the above commit message, type /integrate in a new comment. (Afterwards, your sponsor types /sponsor in a new comment to perform the integration).

@psoujany psoujany changed the title from "7904021: Parsing group files(TEST.GROUPS) on non-UTF-8 encoding platforms fails with java.nio.charset.MalformedInputExceptionFixjtreg" to "7904021: Parsing group files(TEST.GROUPS) on non-UTF-8 encoding platforms fails with java.nio.charset.MalformedInputException" Jun 5, 2025
@openjdk openjdk bot added the rfr Pull request is ready for review label Jun 5, 2025
@mlbridge

mlbridge bot commented Jun 5, 2025

Webrevs

@sormuras
Member

sormuras commented Jun 9, 2025

Looking at the former implementation, without NIO API usage, the files read by those methods were assumed to be encoded in ISO 8859-1. Defaulting to UTF-8 was a breaking change, although storing files in UTF-8 is a common pattern nowadays.

Using Charset.defaultCharset() as proposed in the pull request introduces yet another behaviour. Maybe better? Another solution would be to revert to use the original ISO 8859-1. 🤔

@sormuras
Member

Please update the title of this PR to read: Parsing group files using non-UTF-8 encoding fails

Also, fix the PR body to appear not empty.

@psoujany psoujany changed the title from "7904021: Parsing group files(TEST.GROUPS) on non-UTF-8 encoding platforms fails with java.nio.charset.MalformedInputException" to "7904021: Parsing group files using non-UTF-8 encoding fails" Jun 10, 2025
@psoujany psoujany changed the title from "7904021: Parsing group files using non-UTF-8 encoding fails" to "Parsing group files using non-UTF-8 encoding fails" Jun 10, 2025
@openjdk openjdk bot removed the rfr Pull request is ready for review label Jun 10, 2025
@psoujany psoujany changed the title from "Parsing group files using non-UTF-8 encoding fails" to "7904021: Parsing group files using non-UTF-8 encoding fails" Jun 10, 2025
@openjdk openjdk bot added the rfr Pull request is ready for review label Jun 10, 2025
@psoujany
Contributor Author

@sormuras I've updated the PR title and body. Thank you.

@sormuras
Member

sormuras commented Jun 16, 2025

The two places changed in this PR aren't the only places in which jtreg reads input from files. Not touching those other places might lead to unexpected or divergent behaviour.

With https://openjdk.org/jeps/400 UTF-8 is the default charset of the standard Java APIs. Yes, that relates to Java 18+, but did you try to store those group files in UTF-8 encoding in your local environment?

Did you try passing file.encoding as a system property to the jtreg runtime? For example: jtreg -J-Dfile.encoding=ISO-8859-1 ...
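
A small probe can show which charset a JVM actually picked up when file.encoding is passed on the command line. This is only a sketch: the class name `DefaultCharsetProbe` and its `describe` method are made up for illustration, and how a given JDK release treats a particular file.encoding value is not something this example asserts.

```java
import java.nio.charset.Charset;

class DefaultCharsetProbe {

    // Hypothetical helper: reports the file.encoding system property next to
    // the charset the running JVM uses as its default.
    static String describe() {
        return "file.encoding=" + System.getProperty("file.encoding")
                + ", defaultCharset=" + Charset.defaultCharset().name();
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

After compiling, running it as `java -Dfile.encoding=ISO-8859-1 DefaultCharsetProbe` and again without the option makes it easy to compare the two settings on the platform in question.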

@psoujany
Contributor Author

psoujany commented Jun 18, 2025

In our testing, changing these two places resolved our issue. We will check for other places where the change is required.

Yes, we tried keeping TEST.groups in UTF-8, but this required the .java files to be in UTF-8 as well. When we stored the .java files in UTF-8, we encountered javac compilation issues due to the encoding mismatch on non-UTF-8 platforms.

We also tried passing file.encoding to jtreg but still hit the MalformedInputException. Hence, we used Charset.defaultCharset().

@sormuras
Member

sormuras commented Jul 8, 2025

Right, using Charset.defaultCharset() would also enable support of file.encoding.

Before the conversion to NIO, the Properties.load(InputStream) method was used. Its documentation says:

Reads a property list (key and element pairs) from the input byte stream. The input stream is in a simple line-oriented format as specified in load(Reader) and is assumed to use the ISO 8859-1 character encoding; that is each byte is one Latin1 character. Characters not in Latin1, and certain special characters, are represented in keys and elements using Unicode escapes as defined in section 3.3 of The Java Language Specification.

Since jtreg 7, with CODETOOLS-7903091 included, Properties.load(Reader) is called instead, and it receives the UTF-8 decoding reader returned by Files.newBufferedReader(file).

Thus, all-in-all, your change resolves a regression. In the light of that, I'll approve and sponsor this pull request.
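
The difference between the two Properties.load overloads described above can be reproduced in isolation. The following is a sketch under stated assumptions, not the jtreg code: the class `PropertiesLoadSketch`, its helper methods, and the sample property are invented for this example. The reader is built with a strict UTF-8 decoder so that it reports malformed input, as the reader returned by Files.newBufferedReader does.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.MalformedInputException;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

class PropertiesLoadSketch {

    // Sample data (made up): "name = café" encoded as ISO 8859-1, so the
    // accented character is the single byte 0xE9.
    static final byte[] LATIN1 =
            "name = caf\u00e9\n".getBytes(StandardCharsets.ISO_8859_1);

    // Old style: load(InputStream) treats every byte as one Latin1 character.
    static Properties loadFromStream(byte[] data) throws IOException {
        Properties p = new Properties();
        p.load(new ByteArrayInputStream(data));
        return p;
    }

    // New style: load(Reader) with a reader that decodes UTF-8 strictly.
    // A fresh CharsetDecoder reports malformed input instead of replacing it.
    static Properties loadFromUtf8Reader(byte[] data) throws IOException {
        Properties p = new Properties();
        try (Reader r = new InputStreamReader(
                new ByteArrayInputStream(data),
                StandardCharsets.UTF_8.newDecoder())) {
            p.load(r);
        }
        return p;
    }

    public static void main(String[] args) throws IOException {
        System.out.println("stream: " + loadFromStream(LATIN1).getProperty("name"));
        try {
            loadFromUtf8Reader(LATIN1);
        } catch (MalformedInputException e) {
            System.out.println("reader: " + e);
        }
    }
}
```

The same bytes load cleanly through the stream overload and fail through the strict UTF-8 reader, which matches the regression described in this comment.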

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Jul 8, 2025
@psoujany
Contributor Author

psoujany commented Jul 8, 2025

/integrate

@openjdk openjdk bot added the sponsor Pull request is ready to be sponsored label Jul 8, 2025
@openjdk

openjdk bot commented Jul 8, 2025

@psoujany
Your change (at version 6f38fe9) is now ready to be sponsored by a Committer.

@sormuras
Member

sormuras commented Jul 8, 2025

/sponsor

@openjdk

openjdk bot commented Jul 8, 2025

Going to push as commit 439cb91.
Since your change was applied there have been 6 commits pushed to the master branch:

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot added the integrated Pull request has been integrated label Jul 8, 2025
@openjdk openjdk bot closed this Jul 8, 2025
@openjdk openjdk bot removed ready Pull request is ready to be integrated rfr Pull request is ready for review sponsor Pull request is ready to be sponsored labels Jul 8, 2025
@openjdk

openjdk bot commented Jul 8, 2025

@sormuras @psoujany Pushed as commit 439cb91.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

@psoujany
Contributor Author

Hi @sormuras, I'd like this change to be present in jtreg 7.3.1, which is the minimum jtreg version for JDK 11 and 17. Could you please help me get this fix included in the jtreg 7.3.1 binaries? Thank you.
