Skip to content

Use prebuilt binaries instead of building protoc from sources #19727

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 3 commits into
base: 30.x
Choose a base branch
from

Conversation

comius
Copy link
Contributor

@comius comius commented Dec 18, 2024

Fetch released prebuilt binaries and use them in a platform based select. That selects appropriate binary for the platform or falls back to sources. (It's mostly just about wiring stuff together for WORKSPACE and bzlmod).

Works with both WORKSPACE and bzlmod mode. It even works with --incompatible_enable_proto_toolchain_resolution, because the only configured toolchain now points to the target that has the select.

This PR should be applied during the release: after archives with prebuilt binaries are created (so that we can compute checksums), but before the source archive is created.

Please ignore changes to .pb.h and versions, those are there just to make examples work. That's because version of prebuilt binaries need to match the sources.

@comius comius requested review from a team as code owners December 18, 2024 18:03
@comius comius requested review from acozzette and googleberg and removed request for a team December 18, 2024 18:03
@mkruskal-google mkruskal-google added the 🅰️ safe for tests Mark a commit as safe to run presubmits over label Dec 18, 2024
@github-actions github-actions bot removed the 🅰️ safe for tests Mark a commit as safe to run presubmits over label Dec 18, 2024
Copy link
Member

@googleberg googleberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks pretty good if we can put the versioning back to where it was.

PROTOBUF_PYTHON_VERSION = "5.29.3"
PROTOBUF_PHP_VERSION = "4.29.3"
PROTOBUF_RUBY_VERSION = "4.29.3"
PROTOC_VERSION = "29.1"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Once you revert these version changes, the PR should be a lot smaller.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, without the versions it's smaller. But none of the tests pass, because code is generated with a different version.

@@ -229,12 +229,22 @@ alias(
visibility = ["//visibility:public"],
)

cc_binary(
alias(
name = "protoc",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this make it impossible to opt-out? If that's the case, we should probably hold off on this until 30.x at least

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we enable this with some sort of flag, and flip the default in a subsequent breaking change if desired instead?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it's possible to add a completely Starlark flag, to control if prebuilt binaries are used or not.

@zhangskz
Copy link
Member

This PR should be applied during the release: after archives with prebuilt binaries are created (so that we can compute checksums), but before the source archive is created.

Today we tag a release commit and then create binaries so the binaries match the github release tag. Manipulating our source archive to include a commit updating prebuilt binaries might be doable, but would make it deviate from the tagged release commit (and github's source archive) where I think we'd also have to remove prebuilts in the interim. I assume we can't have the release commit point to its own artifacts?

@deannagarcia deannagarcia added the 🅰️ safe for tests Mark a commit as safe to run presubmits over label Dec 20, 2024
@github-actions github-actions bot removed the 🅰️ safe for tests Mark a commit as safe to run presubmits over label Dec 20, 2024
@comius
Copy link
Contributor Author

comius commented Dec 20, 2024

This PR should be applied during the release: after archives with prebuilt binaries are created (so that we can compute checksums), but before the source archive is created.

Today we tag a release commit and then create binaries so the binaries match the github release tag. Manipulating our source archive to include a commit updating prebuilt binaries might be doable, but would make it deviate from the tagged release commit (and github's source archive) where I think we'd also have to remove prebuilts in the interim. I assume we can't have the release commit point to its own artifacts?

1st option
I think I could reorganize this PR, so that during a release you overwrite a single .bzl file with the checksums of prebuilt binaries. The tag would point to sources with the original .bzl file.

2nd option
Since Bazel is hermetic it can generate identical binaries and .zip files every time. (I'm not sure how release zips are generated at the moment). So in theory we could overwrite checksums before the release, even before the bzl files exist. Tag them and then do the release.

3rd option

  • Generate zips.
  • Make a commit with new checksums.
  • Tag
  • Make a release

@fmeum
Copy link

fmeum commented Dec 30, 2024

While I like the direction, what's the reason for not registering a toolchain per OS/architecture with --incompatible_enable_proto_toolchain_resolution plus the existing source-based toolchain last? That would also provide a natural opt-in to source-based toolchains: explicitly registering the source based toolchain first in the main projects MODULE.bazel or .bazelrc file.

The downsides of the select approach I see compared to toolchains are that it doesn't influence the exec platform and that prebuilt support would be limited to OS/arch combinations provided by this repo.

@acozzette acozzette removed their request for review January 7, 2025 19:10
@comius
Copy link
Contributor Author

comius commented Jan 24, 2025

While I like the direction, what's the reason for not registering a toolchain per OS/architecture with --incompatible_enable_proto_toolchain_resolution plus the existing source-based toolchain last?

The registered toolchain points to //:protoc, so the select also gets used. --incompatible_enable_proto_toolchain_resolution works if there's a prebuilt protoc for first execution plaform.

If there are many execution plaforms and the first one doesn't have prebuilt binary it would fall back to the sources.

@JasonLunn
Copy link
Contributor

@comius - can you update this PR to resolve the merge conflicts?

@comius comius force-pushed the prebuilt-binaries branch from da73f4a to 03dc0fc Compare March 4, 2025 10:30
@comius comius requested review from a team as code owners March 4, 2025 10:30
@comius comius requested review from anandolee, thomasvl, googleberg, Logofile, jskeet and acozzette and removed request for a team March 4, 2025 10:30
@comius comius changed the base branch from 29.x to 30.x March 4, 2025 10:30
@comius
Copy link
Contributor Author

comius commented Mar 4, 2025

@comius - can you update this PR to resolve the merge conflicts?

Done. (I rebased to 30.x)

@JasonLunn JasonLunn added 🅰️ safe for tests Mark a commit as safe to run presubmits over and removed wait for user action labels Mar 4, 2025
@github-actions github-actions bot removed the 🅰️ safe for tests Mark a commit as safe to run presubmits over label Mar 4, 2025
@JasonLunn JasonLunn added 🅰️ safe for tests Mark a commit as safe to run presubmits over and removed wait for user action labels Mar 5, 2025
@github-actions github-actions bot removed the 🅰️ safe for tests Mark a commit as safe to run presubmits over label Mar 5, 2025
@JasonLunn
Copy link
Contributor

The latest test run has issues across a wide spectrum of configurations, including:

  • all Windows example tests
  • all CMake tests
  • all Java and C# tests
  • Most objective-c tests

@comius
Copy link
Contributor Author

comius commented Mar 5, 2025

@JasonLunn I can fix the failures - but perhaps you missed the point, that this PR is intended to demonstrate how to use prebuilt binaries and even with fixed failures, you wouldn't be able to merge it, because it depends on a released version of binaries.

@comius
Copy link
Contributor Author

comius commented Mar 19, 2025

Hey all,

I've got an idea over a lunch break, how to implement this, without modifying protobuf release procedures. It wouldn't be hard to implement at all. The only caveat is, that it works only with bzlmod.

The sketch of the solution is:

load("@bazel_skylib//rules:native_binary.bzl", "native_binary")

native_binary(
  name = "protoc",
  src = "bin/protoc",
  out = "protoc",
  visibility = ["//visibility:public"],
)
  • it is also patched also with a trivial MODULE.bazel file.
  • protobuf MODULE would depend on all of the new modules, matching the version
  • releases would be tested with prebuilt binaries disabled; and laziness of bzlmod would help us not to download unexisting dependencies
  • after release, all the modules start to exist on BCR and prebuilt binaries start to work

Why does this work?
There's no need to compute checksums. They are computed during release to BCR and verified by bzlmod. The content of protobuf MODULE is unchanged during the release.

WDYT? Should we move forward with this?

cc @pcloudy @fmeum @meisterT

@fmeum
Copy link

fmeum commented Mar 19, 2025

That's a pretty decent approach.

The only downside I see is that users may be led to (mistakenly) believe that they should depend on protobuf-linux-x86_64 directly when they see it in the BCR. In any case, we would have to add "nodep" deps from the protobuf-* modules back to protobuf to ensure that Minimum Version Selection doesn't end up selecting different versions for the various architectures.

This approach doesn't preclude the use of toolchains instead of select in the future, which is nice.

@alexeagle
Copy link
Contributor

I agree, that's a nice formulation that avoids touching the existing release procedure for the protobuf module and allows OSS maintainers to own this setup without google3 access. I think we should use toolchains rather than select from the beginning, but this is easily done based on what's already in toolchains_protoc.

@acozzette acozzette removed their request for review April 7, 2025 20:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants