Skip to content

Copy GraalVM under shorter path on Windows CI #4917

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 20 commits into from
Apr 26, 2025

Conversation

alexarchambault
Copy link
Collaborator

This adds a nativeImageMaybeCopyGraal command on NativeImageModule. This command only has an effect if NativeImageModule#nativeImageGraalVmCopyBase: Option[os.Path] is non-empty. By default, it's non-empty only in Windows CI environments, where it's Some(os.Path("C:/jvms")). nativeImageMaybeCopyGraal copies the GraalVM distribution under a sub-directory of it, like C:\jvms\5aa5d09c. 5aa5d09c is a hash of the original GraalVM distribution path.

If this directory exists, NativeImageModule#nativeImageTool uses it rather than the original GraalVM distribution, to get the path of native-image. This offers a workaround for too long paths of GraalVM header files, that we run into sometimes on the CI (more on fork CIs than in the main repo, in my experience).

This allows to get rid of the coursier archive cache customizations on the CI, that were meant to work around that problem too. The resulting paths were sometimes still too long, given they contain the URL of the GraalVM distribution, like C:\coursier-arc\https\github.com\graalvm\graalvm-ce-builds\releases\download\jdk-23.0.1\graalvm-community-jdk-23.0.1_windows-x64_bin.zip\. In constrast, with this PR, we get GraalVM distribution paths like C:\jvms\5aa5d09c.

They might disappear across runs, given they're externally managed
These are run like the '/** Usage' ones, but are not put on the website
@alexarchambault
Copy link
Collaborator Author

This also copies the changes in NativeImageModule in dist/package.mill, so that we can benefit from them right away. Applying ci/mill-bootstrap.patch when re-bootstrapping should remove those.

@alexarchambault
Copy link
Collaborator Author

About #4907 (comment), this could be added in coursier. Maybe with something like

ArchiveCache().withShortPathDir(Some(new File("C:/jvms")))

That way, we'd only have to keep NativeImageModule#nativeImageGraalVmCopyBase here (maybe moved to JvmWorkerModule)

@lihaoyi
Copy link
Member

lihaoyi commented Apr 11, 2025

Looking at this PR it feels like it's exposing too much complexity in the API: asking the user to configure an explicit path, and then asking the user to call a separate command to copy things to that explicit path, both seem like incidental complexity that we should try and avoid.

Are there any other options? e.g. Can we always copy the Graal binary to a shorter path in ~/.cache/mill/graal/<hash> before running it?

@alexarchambault
Copy link
Collaborator Author

i reverted all changes, and made it use coursier/coursier#3373 (this needs coursier 2.1.25-M5, whose release is on-going). Going to try to enable it in the bootstrap job via ci/mill-bootstrap.patch (and we should benefit from it here after re-bootstrapping).

Comment on lines 625 to 626
if (isWin) os.Path(System.getenv("UserProfile")) / ".mill/cache/jvm"
else os.home / ".cache/mill/jvm"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This snippet here looks a bit unusual. I thought we just use os.home for most things? Or are we using UserProfile in other parts of the code as well?

Copy link
Collaborator Author

@alexarchambault alexarchambault Apr 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, mill.bat uses it here. I also used it in JLineNativeLoader, added in #4882.

Ideally, that directory should be obtained by a native Windows call rather than via the environment. Coursier does it via directories-jvm, but it's quite cumbersome (requires JNI or newer JVM foreign API stuff, or running a powershell script…). Given mill.bat relies on the env for that, it's better just to do the same in Mill IMO.

@@ -51,6 +53,11 @@ trait JvmWorkerModule extends OfflineSupportModule with CoursierModule {

def zincLogDebug: T[Boolean] = Task.Input(Task.ctx().log.debugEnabled)

def useShortJvmPath(jvmId: String): Boolean =
Properties.isWin && (
jvmId.startsWith("graalvm") || jvmId.startsWith("liberica-nik")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's liberica-nik?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's an alternative GraalVM distribution. Coursier uses it, as it has support for more platforms (like Alpine Linux on arm64, but not Windows on arm64 though)

 Conflicts:
	build.mill
	core/util/src/mill/util/Jvm.scala
	scalalib/src/mill/scalalib/JvmWorkerModule.scala
@alexarchambault
Copy link
Collaborator Author

Recently run into this issue on my fork here and here, with a somewhat cryptic error message looking like

dist.native.nativeImage os.SubprocessException: Result of C:\coursier-arc\https\github.com\graalvm\graalvm-ce-builds\releases\download\jdk-23.0.1\graalvm-community-jdk-23.0.1_windows-x64_bin.zip\graalvm-community-openjdk-23.0.1+11.1\bin\native-image.cmd�: 1

Copy link
Member

@lihaoyi lihaoyi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good to me

@alexarchambault alexarchambault marked this pull request as ready for review April 25, 2025 10:44
Comment on lines 629 to 631
os.Path(System.getenv("UserProfile")) / ".mill/cache/jvm"
else {
val cacheBase = sys.env.get("XDG_CACHE_HOME").map(os.Path(_)).getOrElse(os.home / ".cache")
Copy link
Member

@lefou lefou Apr 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two things.

  1. When ctx is defined, ctx.get.env is probably the better source for environment variables.
  2. We should either use System.getenv or sys.env.get but not both.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

val env = ctx.map(_.env).getOrElse(sys.env)

Copy link
Collaborator Author

@alexarchambault alexarchambault Apr 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can check ctx.env for XDG_CACHE_HOME, but I'm more wary of it for UserProfile. On Windows, env var names are case-insensitive, so that the path can be set as Path for example. In that case, System.getenv("PATH") respects the case-insensitiveness on Windows, and finds the variable. sys.env.get("PATH") doesn't. And IIUC, ctx.env.get("PATH") wouldn't either.

So I'd stick to System.getenv for UserProfile, as this is used on Windows only, and unlikely to be customized by users. But ctx.env and both System.getenv and sys.env are fine outside of Windows, for XDG_CACHE_HOME.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On Windows, the environment isn't really a Map[String, String], more a Map[CaseInsensitiveString, String]

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please add a comment, since this isn't obvious.

Also, I'd assume Mill already suffers form this nifty details. Ideally, we have a map with case-insensitive keys semantics in TaskCtx.env. Also, sys.env seems to be a bad abstraction when it doesn't work correctly under Windows, so we should always use System.getenv. Nothing we can solve here, but I think we should review Mill regarding this issue.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like the case-insensitive lookup is already implemented in sys.env.

https://www.scala-lang.org/api/current/scala/sys.html#env-0

An immutable Map representing the current system environment.

If lookup fails, use System.getenv(_) for case-insensitive lookup on a certain platform. If that also fails, throw NoSuchElementException.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just realized that TaskCtx.env suffers from that too, yes.

And indeed, sys.env is a bad abstraction. I tend to always use System.getenv in my projects.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just checked that again in a VM to be 100% sure

Capture d’écran 2025-04-25 à 14 53 22

The JDK sources confirm that too.

@alexarchambault alexarchambault merged commit d2552c4 into com-lihaoyi:main Apr 26, 2025
35 of 37 checks passed
@alexarchambault alexarchambault deleted the short-graalvm-paths branch April 26, 2025 10:58
@lefou lefou added this to the 1.0.0-RC1 milestone Apr 26, 2025
@lihaoyi
Copy link
Member

lihaoyi commented Apr 27, 2025

@alexarchambault seems like this has caused some tests to break in main https://github.com/com-lihaoyi/mill/actions/runs/14680562076/job/41202842913

@alexarchambault
Copy link
Collaborator Author

@alexarchambault seems like this has caused some tests to break in main https://github.com/com-lihaoyi/mill/actions/runs/14680562076/job/41202842913

These were broken before merging this PR. The broken task is dist.native.nativeImage, not an example or an integration test. The changes here couldn't break this task (not until a re-bootstrap, that is). I'd expect a re-bootstrap to actually fix those jobs (thanks to the PR here).

lihaoyi added a commit to lihaoyi/mill-1 that referenced this pull request Apr 28, 2025
@lihaoyi
Copy link
Member

lihaoyi commented Apr 28, 2025

@alexarchambault the revert #5009 fixes the tests that have been broken consistently since this PR landed. Rebootstrapping is currently broken due to the same error (https://github.com/com-lihaoyi/mill/actions/runs/14699556292).

I couldn't figure out how this PR causes that breakage either, but will merge the revert first just so we can fix main and unblock re-bootstrapping, and we can figure out how to re-land this in a follow up

lihaoyi added a commit that referenced this pull request Apr 28, 2025
This reverts commit d2552c4.

Just to see if this can fix the main build

CC @alexarchambault
@lihaoyi
Copy link
Member

lihaoyi commented Apr 28, 2025

Publishing seems to be making more progress now since the revert https://github.com/com-lihaoyi/mill/actions/runs/14700178878/job/41248197683

@alexarchambault
Copy link
Collaborator Author

Apparently, the coursier version bump seems to be the problem, rather than the use of shorter paths in NativeImageModule (that wasn't used in the build of Mill itself here). #5015 has issues with long GraalVM paths both on the CI of both my fork and the main repo. I suspect some weird issues with reproduciblity and the state of the coursier cache for JVMs, but can't pinpoint where exactly.

But enabling the use of short GraalVM paths alongside bumping coursier (making sure the Mill build itself relies on shorter paths this time) seems to work, see #5018.

alexarchambault added a commit that referenced this pull request Apr 28, 2025
Comes from #4917, but shouldn't
be a problem
alexarchambault added a commit that referenced this pull request Apr 28, 2025
…utions (new attempt) (#5018)

New attempt at #4917

It seems the too long paths are being an issue as soon as we bump the
coursier version ([this
job](https://github.com/alexarchambault/mill/actions/runs/14705154651/attempts/1)
only bumps the coursier version, and results in issues with too long
GraalVM paths).

Unlike #4917, this not only uses
short paths in examples and integration tests, but also in the Mill
build itself. It seems this makes `dist.native.nativeImage` happy.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants