[DRAFT] feat! replace webpack by rsbuild for faster build #252

Draft: wants to merge 3 commits into base: release

regisb
Contributor

@regisb regisb commented Apr 9, 2025

This is the result of a long investigation to accelerate the building of MFE apps in Open edX. Many of the intermediate results can be found in this issue: openedx/wg-frontend#184

The tl;dr is that building MFEs takes a very long time, and that's because of Webpack. In this PR, we replace Webpack with rsbuild, resulting in a considerable performance improvement.

This PR is dependent on the following upstream PRs:

Open questions

  • Does this still support env.config.jsx customisations? I think it should, but I haven't checked.
  • Does this build Paragon? I have no idea. Probably not. To this day I still don't know what Paragon is.

TODO

Things we need to address before (if?) we merge:

  • Remove references to regisb/rsbuild in plugin.py
  • Pin rsbuild dependencies in Dockerfile
  • Add changelog entry

Benchmarks

Configuration

These benchmarks are performed on my personal computer: an Intel(R) Core(TM) i5-4670K CPU @ 3.40GHz with 4 processors and 20 GB RAM. My custom Docker builder runs with max-parallelism = 4, as per these instructions:

$ docker buildx ls
NAME/NODE        DRIVER/ENDPOINT                           STATUS    BUILDKIT   PLATFORMS
max4cpu*         docker-container                                               
 \_ max4cpu0      \_ unix:///var/run/docker.sock           running   v0.13.1    linux/amd64 (+3), linux/386

All plugins are disabled, except for "mfe" and "forum", and there is no bind-mount:

tutor plugins disable all
tutor plugins enable mfe forum
tutor mounts list

The "mfe" image is built from scratch on the "main" branch; then, we re-build the image by clearing the cache for the "*-prod" build steps, which are the ones that run npm run build:

# Update environment
tutor config save
# Fill cache
tutor images build mfe
# Build without cache
time tutor images build --docker-arg=--no-cache-filter=account-prod,authn-prod,authoring-prod,communications-prod,discussions-prod,ecommerce-prod,gradebook-prod,learner-dashboard-prod,learning-prod,ora-grading-prod,payment-prod,profile-prod mfe
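
The long `--no-cache-filter` value above is easy to mistype. As a minimal sketch (the `MFES` list and `build_command` helper are illustrative names, not part of Tutor), the same command line can be generated from the list of MFE names:

```python
import shlex

# MFE names taken from the benchmark command above; each one has a
# "<name>-prod" build stage in the "mfe" image.
MFES = [
    "account", "authn", "authoring", "communications", "discussions",
    "ecommerce", "gradebook", "learner-dashboard", "learning",
    "ora-grading", "payment", "profile",
]


def build_command(mfes):
    """Return the argv that rebuilds the "mfe" image while skipping the
    Docker cache for every "<name>-prod" stage (hypothetical helper)."""
    stages = ",".join(f"{name}-prod" for name in mfes)
    return [
        "tutor", "images", "build",
        f"--docker-arg=--no-cache-filter={stages}",
        "mfe",
    ]


# Print the command so it can be pasted after `time` in a shell.
print(shlex.join(build_command(MFES)))
```

Passing a single name, for instance `build_command(["learning"])`, gives the cache filter used for the single-MFE reference below; the `--target` option would still have to be added by hand.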

The changes in this PR are tested by checking out this branch:

cd <path to>/tutor-mfe
git checkout regisb/rsbuild

Then run the commands above.

We list the benchmarks by date, because we expect these figures to change as we iterate on this PR.

Reference times

Reference 1: total build time

On the main branch, the total elapsed ("real") time of that last step is 468 s (7 min 48 s). This will serve as a point of reference for all benchmarks.

Reference 2: no parallelism build of the learning MFE

Another interesting point of reference is the time that is needed to build the learning MFE, without parallelism -- that is, without concurrent builds of other MFEs. This is a figure that can be obtained by running:

time tutor images build --docker-arg=--no-cache-filter=learning-prod --target=learning-prod mfe

And then checking the time to build the MFE in the Docker logs:

 => [learning-prod 1/1] RUN npm run build                                119.5s

Here, the reference time is 119.5s.
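
Reading these timings off the Docker logs by hand gets tedious when comparing many runs. The following is a rough sketch of how a step line like the one above could be parsed; the regular expression is an assumption based only on the log lines shown in this PR, and `parse_step` is a made-up helper name:

```python
import re

# Matches BuildKit progress lines of the form shown in this PR, e.g.
#  => [learning-prod 1/1] RUN npm run build        119.5s
# This pattern is an assumption derived from those samples only.
STEP_RE = re.compile(
    r"=>\s+\[(?P<stage>\S+)\s+\d+/\d+\]\s+RUN\s+(?P<cmd>.+?)\s+"
    r"(?P<secs>\d+(?:\.\d+)?)s\s*$"
)


def parse_step(line):
    """Return (stage, command, seconds) for a RUN step line, or None."""
    match = STEP_RE.search(line)
    if match is None:
        return None
    return match.group("stage"), match.group("cmd"), float(match.group("secs"))


sample = " => [learning-prod 1/1] RUN npm run build                                119.5s"
print(parse_step(sample))
```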

Single-thread build of the learning app

2025/04/10 Locked cache

We added the following locked cache in the build step:

RUN --mount=type=cache,target=/openedx/app/node_modules/.cache,sharing=locked rsbuild build

This forces the build steps to run sequentially rather than in parallel. The benefit of this change is that we no longer need to customise our Docker builder with max-parallelism=x. The potential downside is a drop in performance, but in practice we don't observe a noticeable one.
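
As an analogy only, and not a description of how BuildKit is implemented, the effect of a cache mount declared with sharing=locked can be pictured as a mutex around the build step: a second writer waits until the first one releases the mount. The `LockedCache` class and `run_concurrently` function below are invented for this illustration:

```python
import threading
import time


class LockedCache:
    """Toy stand-in for a cache mount declared with sharing=locked."""

    def __init__(self):
        self._mount = threading.Lock()   # only one writer may hold the mount
        self._stats = threading.Lock()   # protects the counters below
        self.active = 0                  # builds currently holding the mount
        self.peak = 0                    # highest value ever seen for `active`

    def build(self, seconds=0.01):
        with self._mount:                # later builds wait here
            with self._stats:
                self.active += 1
                self.peak = max(self.peak, self.active)
            time.sleep(seconds)          # pretend to run the bundler
            with self._stats:
                self.active -= 1


def run_concurrently(cache, count):
    """Start `count` builds at once and return the peak concurrency observed."""
    threads = [threading.Thread(target=cache.build) for _ in range(count)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return cache.peak


print("peak concurrent builds:", run_concurrently(LockedCache(), 4))
```

Even though four builds are started together, the lock lets only one of them hold the cache at a time, which is why the builder no longer needs its own parallelism limit.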

  • Ref 1: 96s
  • Ref 2: 9.2s (practically unchanged)

2025/04/09 Initial results

  • Ref 1: 85-105s (4.46-5.5x improvement)
  • Ref 2: 7.1s (16.8x improvement)

All MFEs build in under 25s, even though there are 4 builds in parallel:

 => [communications-prod 2/2] RUN rsbuild build                                       21.7s
 => [learner-dashboard-prod 2/2] RUN rsbuild build                                    23.0s
 => [learning-prod 2/2] RUN rsbuild build                                             22.7s
 => [discussions-prod 2/2] RUN rsbuild build                                          22.6s
 => [gradebook-prod 2/2] RUN rsbuild build                                            24.1s
 => [ora-grading-prod 2/2] RUN rsbuild build                                          22.6s
 => [account-prod 2/2] RUN rsbuild build                                              22.7s
 => [authn-prod 2/2] RUN rsbuild build                                                22.7s
 => [profile-prod 2/2] RUN rsbuild build                                              18.6s
 => [authoring-prod 2/2] RUN rsbuild build                                            18.5s
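
A quick way to read the step list above is to compare the slowest step with the sum of all steps. The sketch below hard-codes the timings from that log (`STEP_SECONDS` and `summarize` are illustrative names). The steps add up to about 219 s, well above the 85-105 s total, which is consistent with the steps overlapping:

```python
# Step timings (seconds) copied from the build log above.
STEP_SECONDS = {
    "communications-prod": 21.7,
    "learner-dashboard-prod": 23.0,
    "learning-prod": 22.7,
    "discussions-prod": 22.6,
    "gradebook-prod": 24.1,
    "ora-grading-prod": 22.6,
    "account-prod": 22.7,
    "authn-prod": 22.7,
    "profile-prod": 18.6,
    "authoring-prod": 18.5,
}


def summarize(steps):
    """Return the slowest stage, its duration, and the sum of all durations."""
    slowest = max(steps, key=steps.get)
    return {
        "slowest": slowest,
        "slowest_s": steps[slowest],
        "sum_s": round(sum(steps.values()), 1),
    }


print(summarize(STEP_SECONDS))
```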

@bradenmacdonald

Is there anything you found that doesn't work / isn't supported by rsbuild? If not, we should just replace it everywhere and not just for tutor-mfe.

@regisb
Contributor Author

regisb commented Apr 10, 2025

> Is there anything you found that doesn't work / isn't supported by rsbuild?

No, I haven't. That's because there's nothing magical about our MFEs: they are just frontend code, bundled as regular js/html/css/images etc.

That being said, I haven't extensively tested the 10 officially maintained MFEs, so it's quite possible I missed some issues.

> If not, we should just replace it everywhere and not just for tutor-mfe.

I'm not sure anymore about that. While I worked on this project, I discovered some places where the webpack abstraction leaked into the MFE codebase, and that really feels like an anti-pattern.

Instead, I now think that we should remove all of the build code from all MFEs. There shouldn't be any MFE-specific build step. There should be a simple, documented standard to build any MFE. And we should be able to run something like mfe-build --dev ./frontend-app-learning. This "mfe-build" tool could use rsbuild under the hood, or something else, and we should be able to replace that bundler easily in the future, when a better, shinier bundler inevitably comes out.
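
To make the idea concrete, here is a very rough sketch of what the command-line surface of such a tool could look like. Everything in it is hypothetical: "mfe-build" does not exist, and the `--bundler` option, the `BUNDLERS` table and the `plan` function are invented for illustration. The only real commands referenced are `rsbuild dev` and `rsbuild build`:

```python
import argparse
from pathlib import Path

# Hypothetical registry: bundler name -> (dev command, production command).
# Swapping bundlers would mean adding an entry here, not touching any MFE.
BUNDLERS = {
    "rsbuild": (["rsbuild", "dev"], ["rsbuild", "build"]),
}


def make_parser():
    parser = argparse.ArgumentParser(prog="mfe-build")
    parser.add_argument("path", type=Path, help="path to the MFE checkout")
    parser.add_argument("--dev", action="store_true", help="run a dev server")
    parser.add_argument("--bundler", default="rsbuild", choices=sorted(BUNDLERS))
    return parser


def plan(argv):
    """Return (command, working directory) without running anything."""
    args = make_parser().parse_args(argv)
    dev_cmd, prod_cmd = BUNDLERS[args.bundler]
    return (dev_cmd if args.dev else prod_cmd), args.path


print(plan(["--dev", "./frontend-app-learning"]))
```

The point of the sketch is that the MFE repository only needs to be a directory the tool can be pointed at; which bundler runs inside it is a detail of the tool.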

Note that the locked cache in `rsbuild build` prevents concurrent
builds. This means that we no longer have to limit the parallelism of
the Docker builder, as we used to:
https://docs.tutor.edly.io/troubleshooting.html#high-resource-consumption-by-docker-on-tutor-images-build
@hinakhadim
Contributor

Awesome! I tested this PR on my system, and it significantly reduces the build time: from 2007 seconds (using npm run build / Webpack) to 754 seconds (using rsbuild).

With Webpack: [+] Building 2007.7s (127/127) FINISHED docker-container:doublecpu

With Rsbuild: [+] Building 754.1s (138/138) FINISHED docker-container:doublecpu

I am now trying to start the system with tutor local launch and will give more feedback once that completes.

@bradenmacdonald

> Instead, I now think that we should remove all of the build code from all MFEs. There shouldn't be any MFE-specific build step. There should be a simple, documented standard to build any MFE. And we should be able to run something like mfe-build --dev ./frontend-app-learning. This "mfe-build" tool could use rsbuild under the hood, or something else, and we should be able to replace that bundler easily in the future, when a better, shinier bundler inevitably comes out.

This makes sense to me, and I think it may even have been the original idea behind frontend-build, but things have drifted away from that a bit over time.

@DawoudSheraz DawoudSheraz moved this from Pending Triage to In Progress in Tutor project management Apr 11, 2025
@Danyal-Faheem
Contributor

I tested this out on an arm64 machine (M1 MacBook), and it seems that the rsbuild step breaks on native arm64 images. Here's the traceback:

> [account-prod 2/2] RUN --mount=type=cache,target=/openedx/app/node_modules/.cache,sharing=locked rsbuild build:
0.372 
0.373   Rsbuild v1.3.5
0.373 
1.046 info    Rspack persistent cache enabled (experimental)
4.527 Panic occurred at runtime. Please file an issue on GitHub with the backtrace below: https://github.com/web-infra-dev/rspack/issues
4.527 Message:  created a new `Panic` from: out of range integral type conversion attempted
4.527 Location: index.crates.io-1949cf8c6b5b557f/rancor-0.1.0/src/lib.rs:648
4.527 
4.527 Backtrace omitted.
4.527 
4.527 Run with RUST_BACKTRACE=1 environment variable to display it.
4.527 Run with RUST_BACKTRACE=full to include source snippets.
4.527 Panic occurred at runtime. Please file an issue on GitHub with the backtrace below: https://github.com/web-infra-dev/rspack/issues
4.527 Message:  created a new `Panic` from: out of range integral type conversion attempted
4.527 Location: index.crates.io-1949cf8c6b5b557f/rancor-0.1.0/src/lib.rs:648
4.527 
4.527 Backtrace omitted.
4.527 
4.527 Run with RUST_BACKTRACE=1 environment variable to display it.
4.527 Run with RUST_BACKTRACE=full to include source snippets.
4.563 Aborted
------
Dockerfile:507
--------------------
 505 |     COPY rsbuild.config.ts /openedx/app/rsbuild.config.ts
 506 |     
 507 | >>> RUN --mount=type=cache,target=/openedx/app/node_modules/.cache,sharing=locked rsbuild build
 508 |     
 509 |     
--------------------
ERROR: failed to solve: process "/bin/sh -c rsbuild build" did not complete successfully: exit code: 134

I then built the image for the amd64 platform with the command tutor images build mfe --no-cache --docker-arg=--platform=linux/amd64, and it worked perfectly.

The results were impressive nonetheless: even under amd64 emulation, the build layers took 2-3x less time than with webpack.

@regisb
Contributor Author

regisb commented Apr 14, 2025

Thanks for testing @Danyal-Faheem! Would you mind trying out debugging with RUST_BACKTRACE, as suggested in the error logs?

@Danyal-Faheem
Contributor

@regisb, running with both RUST_BACKTRACE=full and RUST_BACKTRACE=1 gave me an empty backtrace. I also tried adding RSDOCTOR, as mentioned in the config file, and setting NODE_ENV=development, but to no avail.

------
 > [profile-prod 2/2] RUN --mount=type=cache,target=/openedx/app/node_modules/.cache,sharing=locked rsbuild build:
0.350 
0.351   Rsbuild v1.3.6
0.351 
1.143 info    Rspack persistent cache enabled (experimental)
6.046 Panic occurred at runtime. Please file an issue on GitHub with the backtrace below: https://github.com/web-infra-dev/rspack/issues
6.046 Message:  created a new `Panic` from: out of range integral type conversion attempted
6.046 Location: index.crates.io-1949cf8c6b5b557f/rancor-0.1.0/src/lib.rs:648
6.046 
6.046 Run with COLORBT_SHOW_HIDDEN=1 environment variable to disable frame filtering.
6.047 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ BACKTRACE ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
6.047 <empty backtrace>
6.095 Aborted
------
Dockerfile:582
--------------------
 580 |     COPY rsbuild.config.ts /openedx/app/rsbuild.config.ts
 581 |     
 582 | >>> RUN --mount=type=cache,target=/openedx/app/node_modules/.cache,sharing=locked rsbuild build
 583 |     
 584 |     
--------------------
ERROR: failed to solve: process "/bin/sh -c rsbuild build" did not complete successfully: exit code: 134

@DawoudSheraz
Contributor

I tested this branch on a MacBook M2 and got the same error that Danyal mentioned above (#252 (comment)). However, even when I specified the amd64 platform as a build argument, the build did not complete and failed with the same error. I also ran into npm connection errors, which have been more frequent for me than other build issues, even though I am using a max2cpu builder config. Some stats:

  • Indigo enabled
    • MFE build with --no-cache
      • Failed after 614.9s due to a storage size issue
    • MFE build with --no-cache --no-registry-cache
      • Failed after 1115.7s due to React compatibility issues in Indigo
  • Indigo disabled
    • MFE build with --no-cache --no-registry-cache
      • Failed after 795.1s with a network error
    • Re-tried but no cache
      • Failed after 403.1s with the rsbuild error

@regisb
Contributor Author

regisb commented Apr 22, 2025

I was able to reproduce the issue by building on a remote arm64 server:

$ tutor images build --docker-arg=--platform=arm64 --docker-arg=--target=learner-dashboard-prod mfe
...
0.427 
0.427   Rsbuild v1.3.9
0.427 
1.165 info    Rspack persistent cache enabled (experimental)
9.090 Panic occurred at runtime. Please file an issue on GitHub with the backtrace below: https://github.com/web-infra-dev/rspack/issues
9.090 Message:  created a new `Panic` from: out of range integral type conversion attempted
9.090 Location: index.crates.io-1949cf8c6b5b557f/rancor-0.1.0/src/lib.rs:648
9.090 
9.090 Backtrace omitted.
9.090 
9.090 Run with RUST_BACKTRACE=1 environment variable to display it.
9.090 Run with RUST_BACKTRACE=full to include source snippets.
9.254 Aborted (core dumped)

The issue disappears when I disable the cache, which is tagged as "experimental". So what I did was to disable caching on arm64. I also filed the following upstream issue: web-infra-dev/rspack#10118

Thus, Apple users should now be able to test this PR again. 👀 @Danyal-Faheem @DawoudSheraz
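
The arm64 workaround described above boils down to a single decision: enable the persistent cache everywhere except on arm64. The commits themselves are not shown in this conversation, so the following is only a sketch of that decision logic, with made-up names (`ARM64_ALIASES`, `persistent_cache_enabled`), and not the actual change:

```python
import platform

# Names under which an arm64 CPU is commonly reported (assumption:
# "arm64" on macOS, "aarch64" on Linux).
ARM64_ALIASES = {"arm64", "aarch64"}


def persistent_cache_enabled(machine=None):
    """Return False on arm64, where the experimental cache panics
    (see web-infra-dev/rspack#10118), and True everywhere else."""
    machine = (machine or platform.machine()).lower()
    return machine not in ARM64_ALIASES


print("cache enabled on this machine:", persistent_cache_enabled())
```

Once the upstream issue is resolved, the check could simply be dropped.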

regisb added 2 commits April 22, 2025 09:59
Caching should be disabled on this platform until the following issue is
resolved: web-infra-dev/rspack#10118
@Danyal-Faheem
Contributor

Danyal-Faheem commented Apr 23, 2025

@regisb I tested it with the updated commits and it's building on arm64 natively now!

The results are also impressive:

Running with a max-parallelism of 2 (I had to do this, otherwise the npm clean-install step fails due to network issues), the rsbuild steps are 3-4x faster on average than npm build.

Some stats:

Build Time Comparison

| Component         | Webpack/NPM (s) | rsbuild (s) |
|-------------------|-----------------|-------------|
| Full MFE Build    | 1228.1          | 748.7       |
| Learner-Dashboard | 111.9           | 48.5        |
| Learning          | 136.3           | 11.1        |
| Profile           | 86.0            | 13.9        |

I think this is really impressive but we will have to make changes to the npm clean-install step as well to get the full benefits of rsbuild.
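
To put the figures from the comparison above side by side, here is a small sketch that computes the ratio for each row (`TIMES` and `speedups` are illustrative names). The full build improves by roughly 1.6x while the individual bundling steps improve far more, which fits the remark that the remaining time is spent elsewhere, such as in the npm clean-install step:

```python
# (webpack_seconds, rsbuild_seconds) copied from the comparison above.
TIMES = {
    "Full MFE Build": (1228.1, 748.7),
    "Learner-Dashboard": (111.9, 48.5),
    "Learning": (136.3, 11.1),
    "Profile": (86.0, 13.9),
}


def speedups(times):
    """Return {component: webpack_time / rsbuild_time}."""
    return {name: before / after for name, (before, after) in times.items()}


for name, ratio in speedups(TIMES).items():
    print(f"{name:<18} {ratio:5.2f}x")
```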
