Contributor

@marxin marxin commented Dec 22, 2025

I collected statistics for all our current Python WA modules, where the change helps reduce the compilation outliers. The total build time of all the modules shrinks from 600s to 200s. The function size threshold (100KB of LLVM IR) was chosen based on the following numbers:

┌─────────────────────────────────────────────────────────────────────────────────┬───────────────────────┬─────────────────┬────────────────┬────────────┬───────────────┐
│ Function                                                                        │   LLVM IR size (in B) │   Before (in s) │   After (in s) │   x faster │ Note          │
├─────────────────────────────────────────────────────────────────────────────────┼───────────────────────┼─────────────────┼────────────────┼────────────┼───────────────┤
│ <module>:function_810                                                           │                191932 │          111.63 │           1    │      111.6 │ after is <1 s │
│ <module>:function_365                                                           │                518749 │           48.63 │           1.43 │       34   │               │
│ <module>:function_9039                                                          │                464830 │           48.08 │           2.86 │       16.8 │               │
│ <module>:function_5166                                                          │                468479 │           43.58 │           6.94 │        6.3 │               │
│ <module>:function_177                                                           │                404727 │           31.35 │           1.07 │       29.3 │               │
│ <module>:function_12519                                                         │                443508 │           29.63 │           8.8  │        3.4 │               │
│ <module>:function_311                                                           │                235271 │           29.22 │           1    │       29.2 │ after is <1 s │
│ <module>:function_22880                                                         │                313023 │           25.17 │           6.92 │        3.6 │               │
│ <module>:function_20346                                                         │                227496 │           24.12 │           7    │        3.4 │               │
│ libpq.so.5:__wasm_apply_data_relocs                                             │                340545 │           19.67 │           1    │       19.7 │ after is <1 s │
│ <module>:function_336                                                           │                247915 │           13.62 │           1    │       13.6 │ after is <1 s │
│ <module>:function_22552                                                         │                169301 │           11.8  │           3.16 │        3.7 │               │
│ <module>:function_1183                                                          │                194823 │           10.35 │           1    │       10.3 │ after is <1 s │
│ <module>:function_808                                                           │                387141 │            9.34 │           1.54 │        6.1 │               │
│ <module>:function_17074                                                         │                127432 │            8.71 │           2.76 │        3.2 │               │
│ <module>:function_139                                                           │                179688 │            7.07 │           1    │        7.1 │ after is <1 s │
│ <module>:function_812                                                           │                179498 │            6.61 │           1    │        6.6 │ after is <1 s │
│ <module>:function_23605                                                         │                 84627 │            6.49 │           4.83 │        1.3 │               │
│ <module>:function_22322                                                         │                 90254 │            6.11 │           4.93 │        1.2 │               │
│ loop.cpython-313-wasm32-wasi.so:__pyx_pymod_exec_loop                           │                145683 │            6.02 │           1    │        6   │ after is <1 s │
│ <module>:function_21507                                                         │                115251 │            5.87 │           1.9  │        3.1 │               │
│ offsets.cpython-313-wasm32-wasi.so:__Pyx_CreateStringTabAndInitStrings          │                 41411 │            4.86 │           4.95 │        1   │               │
│ <module>:function_218                                                           │                 38860 │            4.64 │           4.64 │        1   │               │
│ <module>:function_21167                                                         │                 60100 │            4.6  │           3.97 │        1.2 │               │
│ <module>:function_17626                                                         │                 60179 │            3.52 │           2.4  │        1.5 │               │
│ hashtable.cpython-313-wasm32-wasi.so:__Pyx_CreateStringTabAndInitStrings        │                 34755 │            3.41 │           3.45 │        1   │               │
│ <module>:function_9037                                                          │                140197 │            3.32 │           1    │        3.3 │ after is <1 s │
│ groupby.cpython-313-wasm32-wasi.so:__pyx_pymod_exec_groupby                     │                115042 │            3.23 │           1    │        3.2 │ after is <1 s │
│ _generator.cpython-313-wasm32-wasi.so:__Pyx_CreateStringTabAndInitStrings       │                 33293 │            3.18 │           3.21 │        1   │               │
│ <module>:function_695                                                           │                109653 │            2.83 │           1    │        2.8 │ after is <1 s │
│ <module>:function_18038                                                         │                145429 │            2.62 │           1    │        2.6 │ after is <1 s │
│ lib.cpython-313-wasm32-wasi.so:__Pyx_CreateStringTabAndInitStrings              │                 30045 │            2.59 │           2.6  │        1   │               │
│ mtrand.cpython-313-wasm32-wasi.so:__Pyx_CreateStringTabAndInitStrings           │                 28927 │            2.45 │           2.47 │        1   │               │
│ timestamps.cpython-313-wasm32-wasi.so:__Pyx_CreateStringTabAndInitStrings       │                 28346 │            2.45 │           2.45 │        1   │               │
│ etree.cpython-313-wasm32-wasi.so:__pyx_pymod_exec_etree                         │                132010 │            2.38 │           1    │        2.4 │ after is <1 s │
│ <module>:function_21626                                                         │                 45890 │            2.2  │           2.28 │        1   │               │
│ etree.cpython-313-wasm32-wasi.so:__wasm_apply_data_relocs                       │                 91676 │            2.17 │           2.51 │        0.9 │               │
│ timedeltas.cpython-313-wasm32-wasi.so:__Pyx_CreateStringTabAndInitStrings       │                 26355 │            2.08 │           2.08 │        1   │               │
│ <module>:function_1678                                                          │                 34104 │            2.06 │           1.99 │        1   │               │
│ <module>:function_197                                                           │                 69549 │            1.94 │           2.07 │        0.9 │               │
│ <module>:function_18505                                                         │                 47317 │            1.85 │           1.75 │        1.1 │               │
│ _multiarray_umath.cpython-313-wasm32-wasi.so:__wasm_apply_data_relocs           │                 74951 │            1.76 │           1.91 │        0.9 │               │
│ _multiarray_umath.cpython-313-wasm32-wasi.so:__wasm_apply_data_relocs           │                 76106 │            1.76 │           1.99 │        0.9 │               │
│ <module>:function_2070                                                          │                136716 │            1.75 │           1    │        1.8 │ after is <1 s │
│ hashtable.cpython-313-wasm32-wasi.so:__pyx_pymod_exec_hashtable                 │                161367 │            1.58 │           1    │        1.6 │ after is <1 s │
│ index.cpython-313-wasm32-wasi.so:__Pyx_CreateStringTabAndInitStrings            │                 22482 │            1.48 │           1.44 │        1   │               │
│ <module>:function_19449                                                         │                 30969 │            1.45 │           1.32 │        1.1 │               │
│ period.cpython-313-wasm32-wasi.so:__Pyx_CreateStringTabAndInitStrings           │                 21974 │            1.44 │           1.41 │        1   │               │
│ algos.cpython-313-wasm32-wasi.so:__Pyx_CreateStringTabAndInitStrings            │                 22552 │            1.44 │           1.55 │        0.9 │               │
│ _write.cpython-313-wasm32-wasi.so:__pyx_pymod_exec__write                       │                 61709 │            1.43 │           1.53 │        0.9 │               │
│ <module>:function_252                                                           │                 20719 │            1.38 │           1.37 │        1   │               │
│ <module>:function_7717                                                          │                 77908 │            1.38 │           1.43 │        1   │               │
│ <module>:function_21490                                                         │                 29077 │            1.37 │           1.16 │        1.2 │               │
│ <module>:function_17019                                                         │                 30498 │            1.35 │           1.29 │        1   │               │
│ parsers.cpython-313-wasm32-wasi.so:__Pyx_CreateStringTabAndInitStrings          │                 20742 │            1.3  │           1.3  │        1   │               │
│ <module>:function_22505                                                         │                 28464 │            1.27 │           1.1  │        1.2 │               │
│ <module>:function_18467                                                         │                 52383 │            1.21 │           1    │        1.2 │ after is <1 s │
│ __init__.cpython-313-wasm32-wasi.so:add_apsw_constants                          │                 31287 │            1.19 │           1.23 │        1   │               │
│ corecext.cpython-313-wasm32-wasi.so:__pyx_pymod_exec_corecext                   │                 52555 │            1.19 │           1.19 │        1   │               │
│ dtypes.cpython-313-wasm32-wasi.so:__Pyx_CreateStringTabAndInitStrings           │                 19827 │            1.17 │           1.15 │        1   │               │
│ _sqlite_ext.cpython-313-wasm32-wasi.so:__pyx_pymod_exec__sqlite_ext             │                 37829 │            1.16 │           1.15 │        1   │               │
│ <module>:function_12304                                                         │                 55745 │            1.14 │           1.1  │        1   │               │
│ <module>:function_23392                                                         │                 27160 │            1.13 │           1    │        1.1 │ after is <1 s │
│ <module>:function_14405                                                         │                 29056 │            1.13 │           1    │        1.1 │ after is <1 s │
│ <module>:function_18023                                                         │                 24761 │            1.09 │           1    │        1.1 │ after is <1 s │
│ aggregations.cpython-313-wasm32-wasi.so:__pyx_pymod_exec_aggregations(_object*) │                 35550 │            1.07 │           1.12 │        1   │               │
│ algos.cpython-313-wasm32-wasi.so:__pyx_pymod_exec_algos                         │                111592 │            1.07 │           1    │        1.1 │ after is <1 s │
│ <module>:function_28130                                                         │                 36104 │            1.07 │           1    │        1.1 │ after is <1 s │
│ _http_parser.cpython-313-wasm32-wasi.so:__pyx_pymod_exec__http_parser           │                 95902 │            1.06 │           1.15 │        0.9 │               │
│ <module>:function_15118                                                         │                 21743 │            1.04 │           1    │        1   │ after is <1 s │
│ parsing.cpython-313-wasm32-wasi.so:__Pyx_CreateStringTabAndInitStrings          │                 18431 │            1.03 │           1    │        1   │ after is <1 s │
│ nattype.cpython-313-wasm32-wasi.so:__pyx_pymod_exec_nattype                     │                 51922 │            1.02 │           1.05 │        1   │               │
│ <module>:function_26195                                                         │                 28699 │            1.02 │           1    │        1   │ after is <1 s │
│ interval.cpython-313-wasm32-wasi.so:__Pyx_CreateStringTabAndInitStrings         │                 18594 │            1    │           1.01 │        1   │               │
└─────────────────────────────────────────────────────────────────────────────────┴───────────────────────┴─────────────────┴────────────────┴────────────┴───────────────┘
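The "x faster" column above appears to be the ratio of the "Before" and "After" timings, rounded to one decimal, with sub-second "After" values reported as 1 s (per the "after is <1 s" note). A minimal sketch of that arithmetic follows; the `speedup` helper is hypothetical and is not part of the PR:

```rust
/// Hypothetical helper (not from the PR): ratio of compile times,
/// rounded to one decimal place as in the "x faster" column.
fn speedup(before_s: f64, after_s: f64) -> f64 {
    // Assumption: sub-second "after" timings are treated as 1 s,
    // which is what the "after is <1 s" note suggests.
    let after = after_s.max(1.0);
    (before_s / after * 10.0).round() / 10.0
}

fn main() {
    // Timings taken from rows of the table above.
    println!("function_365:  {}x", speedup(48.63, 1.43));
    println!("function_9039: {}x", speedup(48.08, 2.86));
    // The 0.5 s here is an illustrative value for an "after is <1 s" row.
    println!("function_810:  {}x", speedup(111.63, 0.5));
}
```

Ratios below 1 (for example 0.9) mean the function compiled slightly slower after the change.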

Copilot AI review requested due to automatic review settings December 22, 2025 14:46
@marxin marxin changed the title feat: use no-opt for extremelly large functions feat(LLVM): use no-opt for extremelly large functions Dec 22, 2025
@marxin marxin changed the title feat(LLVM): use no-opt for extremelly large functions feat(LLVM): use no-opt for extremely large functions Dec 22, 2025
Contributor

Copilot AI left a comment

Pull request overview

This PR introduces a compile-time optimization that selectively disables LLVM optimizations for extremely large functions (>100KB of LLVM IR). This reduces total build time from 600s to 200s by avoiding expensive optimization passes on functions where the optimization overhead outweighs the runtime benefits.

Key Changes:

  • Added optional unoptimized target machine to FuncTranslator for handling large functions
  • Introduced target_machine_with_opt() method to create target machines with configurable optimization levels
  • Implemented size-based threshold (100KB) to switch between optimized and unoptimized compilation
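The size-based selection described in these bullets could look roughly like the sketch below. It is self-contained and does not use the real LLVM bindings: `OptLevel`, `TargetMachine`, the field names, `select_target_machine`, and the reading of "100KB" as `100 * 1024` bytes are all assumptions made for illustration, not the PR's actual code.

```rust
/// Hypothetical stand-in for an LLVM optimization level.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum OptLevel {
    None,
    Aggressive,
}

/// Hypothetical stand-in for an LLVM target machine.
#[derive(Debug)]
struct TargetMachine {
    opt_level: OptLevel,
}

/// Threshold from the PR description: 100KB of LLVM IR.
/// Assumption: interpreted here as 100 * 1024 bytes.
const LARGE_FUNCTION_IR_BYTES: usize = 100 * 1024;

/// Simplified translator holding an optimized target machine and an
/// optional unoptimized one for very large functions.
struct FuncTranslator {
    target_machine: TargetMachine,
    target_machine_no_opt: Option<TargetMachine>,
}

impl FuncTranslator {
    /// Pick the unoptimized machine only when one is configured and the
    /// function's IR exceeds the threshold; otherwise use the optimized one.
    fn select_target_machine(&self, ir_size_bytes: usize) -> &TargetMachine {
        match &self.target_machine_no_opt {
            Some(no_opt) if ir_size_bytes > LARGE_FUNCTION_IR_BYTES => no_opt,
            _ => &self.target_machine,
        }
    }
}

fn main() {
    let translator = FuncTranslator {
        target_machine: TargetMachine { opt_level: OptLevel::Aggressive },
        target_machine_no_opt: Some(TargetMachine { opt_level: OptLevel::None }),
    };
    // IR sizes taken from two rows of the statistics table.
    for size in [191_932usize, 84_627] {
        let tm = translator.select_target_machine(size);
        println!("{size} bytes of IR -> {:?}", tm.opt_level);
    }
}
```

Keeping the unoptimized machine optional means a caller that never configures it gets the previous behavior, with every function compiled by the optimized machine.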

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 6 comments.

File: Description

  • lib/compiler-llvm/src/translator/code.rs: Added optional unoptimized target machine field and logic to select between optimized/unoptimized compilation based on function size
  • lib/compiler-llvm/src/config.rs: Added target_machine_with_opt() method to create target machines with configurable optimization levels
  • lib/compiler-llvm/src/compiler.rs: Updated FuncTranslator initialization in parallel compilation path to pass both optimized and unoptimized target machines

Member

@Arshia001 Arshia001 left a comment

*zen mode activated*

@marxin marxin enabled auto-merge (squash) December 22, 2025 15:26
@syrusakbary
Member

What is the speed difference when running Python? Also, can we have just some passes but detect which one is actually slowing things down? (rather than just have 0 passes)

@marxin
Contributor Author

marxin commented Dec 22, 2025

What is the speed difference when running Python? Also, can we have just some passes but detect which one is actually slowing things down?

On my AMD 12-core Zen machine, python/python goes from 92s to 10s. On my Framework laptop, it goes from 135s to 30s.

My expectation is that the outliers are so huge that multiple LLVM passes will be struggling. Should I create an issue and address it in more detail?

@marxin marxin disabled auto-merge December 23, 2025 08:20
@marxin
Contributor Author

marxin commented Dec 23, 2025

I've just checked php-benchmark-script and pystone, and the benchmark results show no change between before and after the PR.

@syrusakbary
Member

My expectation is that the outliers are so huge that multiple LLVM passes will be struggling. Should I create an issue and address it in more detail?

Yes, please

@marxin
Contributor Author

marxin commented Dec 23, 2025

Yes, please

#6005

@marxin marxin merged commit 3c25cdb into main Dec 23, 2025
251 of 267 checks passed
@marxin marxin deleted the llvm-no-opt-for-large-fns branch December 23, 2025 19:03
4 participants