Description
Feature or enhancement
Proposal:
Here are a few optimizing macros. clang on Linux does not "see" some of them, because by default it only claims compatibility with GCC 4.2.1, so any macro gated on a newer GCC version stays disabled:
- `-fgnuc-version` (https://clang.llvm.org/docs/ClangCommandLineReference.html#cmdoption-clang-fgnuc-version): "Sets various macros to claim compatibility with the given GCC version (default is 4.2.1)"
- verified on https://godbolt.org/z/Gah6sh8EE
None of these are seen by clang-cl on Windows, because clang-cl does not set `__GNUC__` (most probably because too much code out there would then assume "ah - I am on Linux"). clang-cl does set `__clang__`, though.
IMHO, "syncing" them between GCC/clang on Linux and clang-cl on Windows is preferable.
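
To illustrate the difference, here is a minimal sketch of the two gating styles. The `DEMO_*` macros and `demo_*` functions are hypothetical names made up for this example; they are not the CPython or expat macros, and the sketch does not claim to match what the linked PRs do. It assumes that a `__has_attribute` feature test is an acceptable alternative to a pure `__GNUC__` version check.

```c
#include <stdio.h>

/* Fallback for compilers without __has_attribute, so the #if below
   still parses there. */
#ifndef __has_attribute
#  define __has_attribute(x) 0
#endif

/* True only when the compiler claims GCC 4.3 or newer. clang on Linux
   claims 4.2.1 by default, and clang-cl does not define __GNUC__ at all. */
#if defined(__GNUC__) \
    && ((__GNUC__ > 4) || (__GNUC__ == 4 && __GNUC_MINOR__ >= 3))
#  define DEMO_GCC_AT_LEAST_4_3 1
#else
#  define DEMO_GCC_AT_LEAST_4_3 0
#endif

/* Variant 1 (hypothetical): gated on the claimed GCC version only. */
#if DEMO_GCC_AT_LEAST_4_3
#  define DEMO_HOT_GATED __attribute__((hot))
#  define DEMO_HOT_GATED_ENABLED 1
#else
#  define DEMO_HOT_GATED
#  define DEMO_HOT_GATED_ENABLED 0
#endif

/* Variant 2 (hypothetical): additionally asks the compiler whether it
   knows the attribute, independent of the GCC version it claims. */
#if DEMO_GCC_AT_LEAST_4_3 || __has_attribute(hot)
#  define DEMO_HOT_TESTED __attribute__((hot))
#  define DEMO_HOT_TESTED_ENABLED 1
#else
#  define DEMO_HOT_TESTED
#  define DEMO_HOT_TESTED_ENABLED 0
#endif

/* Report which variant expanded to the attribute (1) or to nothing (0). */
int demo_gated_enabled(void) { return DEMO_HOT_GATED_ENABLED; }
int demo_tested_enabled(void) { return DEMO_HOT_TESTED_ENABLED; }

/* A function marked with the feature-tested variant. */
DEMO_HOT_TESTED int demo_sum(const int *values, int count)
{
    int total = 0;
    for (int i = 0; i < count; i++) {
        total += values[i];
    }
    return total;
}

/* Print the outcome for the compiler that built this file. */
void demo_report(void)
{
    printf("version-gated: %d, feature-tested: %d\n",
           demo_gated_enabled(), demo_tested_enabled());
}
```

Calling `demo_report()` from a `main` built with different compilers shows which variant each one picks up. The feature-tested variant is enabled whenever the version-gated one is, so switching styles should not take the attribute away from GCC builds.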
Neither seen on Linux nor on Windows: #130891 would fix:
Lines 323 to 325 in 98fa4a4
Seen on Linux, not seen on Windows: #131019 would fix:
Lines 1460 to 1462 in 98fa4a4
Seen on Linux, not seen on Windows:
cpython/Modules/expat/expat_external.h
Lines 115 to 117 in 98fa4a4
Neither seen on Linux nor on Windows:
cpython/Modules/expat/expat_external.h
Lines 122 to 124 in 98fa4a4
The last two are in vendored code, but I've temporarily modified it (01183d7) and then reverted again (1c4a55d).
Enabling them all for clang-cl on Windows is performance-neutral with respect to the pyperformance benchmark suite.
Benchmark | clang.release.19.1.1.92e5f826ac | clang.release.19.1.1.16a7f4607e.pyHot |
---|---|---|
Geometric mean | (ref) | 1.01x faster |

Benchmark | clang.pgo.19.1.1.92e5f826ac | clang.pgo.19.1.1.16a7f4607e.pyHot |
---|---|---|
Geometric mean | (ref) | 1.01x slower |

Benchmark | clang.release.20.1.0-rc2.92e5f826ac | clang.release.20.1.0-rc2.16a7f4607e.pyHot |
---|---|---|
Geometric mean | (ref) | 1.01x faster |

Benchmark | clang.pgo.20.1.0-rc2.92e5f826ac | clang.pgo.20.1.0-rc2.16a7f4607e.pyHot |
---|---|---|
Geometric mean | (ref) | 1.00x slower |

Details

Benchmark | clang.release.19.1.1.92e5f826ac | clang.release.19.1.1.16a7f4607e.pyHot |
---|---|---|
telco | 11.9 ms | 10.9 ms: 1.10x faster |
xml_etree_parse | 236 ms | 217 ms: 1.09x faster |
logging_format | 15.8 us | 15.0 us: 1.05x faster |
async_tree_eager | 145 ms | 138 ms: 1.05x faster |
async_tree_none_tg | 375 ms | 358 ms: 1.05x faster |
unpickle_list | 5.83 us | 5.57 us: 1.05x faster |
xml_etree_iterparse | 157 ms | 150 ms: 1.05x faster |
unpickle | 23.0 us | 22.0 us: 1.05x faster |
async_tree_memoization_tg | 460 ms | 442 ms: 1.04x faster |
xml_etree_generate | 142 ms | 137 ms: 1.04x faster |
nqueens | 125 ms | 120 ms: 1.04x faster |
async_tree_memoization | 490 ms | 472 ms: 1.04x faster |
async_tree_io | 876 ms | 844 ms: 1.04x faster |
logging_simple | 14.3 us | 13.8 us: 1.04x faster |
deepcopy_reduce | 3.92 us | 3.78 us: 1.04x faster |
crypto_pyaes | 104 ms | 101 ms: 1.04x faster |
pprint_pformat | 2.18 sec | 2.10 sec: 1.04x faster |
async_tree_none | 391 ms | 378 ms: 1.03x faster |
pprint_safe_repr | 1.06 sec | 1.02 sec: 1.03x faster |
async_tree_eager_memoization | 290 ms | 281 ms: 1.03x faster |
json_dumps | 16.5 ms | 16.0 ms: 1.03x faster |
fannkuch | 580 ms | 562 ms: 1.03x faster |
scimark_sparse_mat_mult | 5.76 ms | 5.59 ms: 1.03x faster |
async_tree_eager_io | 822 ms | 798 ms: 1.03x faster |
xml_etree_process | 96.4 ms | 93.7 ms: 1.03x faster |
async_tree_eager_tg | 307 ms | 299 ms: 1.03x faster |
scimark_fft | 481 ms | 470 ms: 1.03x faster |
coroutines | 31.9 ms | 31.2 ms: 1.02x faster |
async_tree_io_tg | 853 ms | 834 ms: 1.02x faster |
pathlib | 255 ms | 250 ms: 1.02x faster |
typing_runtime_protocols | 224 us | 220 us: 1.02x faster |
django_template | 53.7 ms | 52.8 ms: 1.02x faster |
sympy_expand | 650 ms | 640 ms: 1.02x faster |
unpickle_pure_python | 305 us | 300 us: 1.02x faster |
async_tree_eager_memoization_tg | 411 ms | 405 ms: 1.02x faster |
async_tree_cpu_io_mixed_tg | 752 ms | 741 ms: 1.02x faster |
chaos | 88.2 ms | 86.9 ms: 1.02x faster |
sqlite_synth | 3.57 us | 3.52 us: 1.01x faster |
tomli_loads | 2.70 sec | 2.66 sec: 1.01x faster |
pickle_pure_python | 444 us | 438 us: 1.01x faster |
sqlglot_normalize | 150 ms | 148 ms: 1.01x faster |
regex_compile | 171 ms | 169 ms: 1.01x faster |
mako | 17.3 ms | 17.1 ms: 1.01x faster |
sqlglot_parse | 1.67 ms | 1.65 ms: 1.01x faster |
sympy_sum | 211 ms | 208 ms: 1.01x faster |
hexiom | 8.26 ms | 8.19 ms: 1.01x faster |
sqlglot_transpile | 2.06 ms | 2.04 ms: 1.01x faster |
sqlglot_optimize | 74.0 ms | 73.4 ms: 1.01x faster |
python_startup | 43.2 ms | 42.9 ms: 1.01x faster |
async_generators | 540 ms | 536 ms: 1.01x faster |
gc_traversal | 4.82 ms | 4.79 ms: 1.01x faster |
comprehensions | 23.2 us | 23.1 us: 1.01x faster |
generators | 38.1 ms | 37.8 ms: 1.01x faster |
richards_super | 73.9 ms | 73.4 ms: 1.01x faster |
deepcopy | 376 us | 373 us: 1.01x faster |
genshi_text | 29.8 ms | 29.6 ms: 1.01x faster |
pickle_dict | 32.3 us | 32.2 us: 1.00x faster |
scimark_sor | 168 ms | 169 ms: 1.01x slower |
go | 145 ms | 146 ms: 1.01x slower |
pyflate | 596 ms | 602 ms: 1.01x slower |
logging_silent | 133 ns | 135 ns: 1.01x slower |
dulwich_log | 130 ms | 132 ms: 1.01x slower |
regex_v8 | 35.2 ms | 35.6 ms: 1.01x slower |
spectral_norm | 128 ms | 130 ms: 1.02x slower |
docutils | 3.60 sec | 3.66 sec: 1.02x slower |
sympy_integrate | 26.4 ms | 26.8 ms: 1.02x slower |
scimark_monte_carlo | 90.7 ms | 92.6 ms: 1.02x slower |
float | 102 ms | 105 ms: 1.02x slower |
2to3 | 429 ms | 439 ms: 1.02x slower |
nbody | 151 ms | 155 ms: 1.03x slower |
genshi_xml | 71.7 ms | 74.2 ms: 1.04x slower |
Geometric mean | (ref) | 1.01x faster |

Benchmark | clang.pgo.19.1.1.92e5f826ac | clang.pgo.19.1.1.16a7f4607e.pyHot |
---|---|---|
2to3 | 465 ms | 380 ms: 1.22x faster |
async_generators | 506 ms | 490 ms: 1.03x faster |
coroutines | 27.1 ms | 26.4 ms: 1.03x faster |
pidigits | 233 ms | 228 ms: 1.02x faster |
pickle_dict | 27.8 us | 27.3 us: 1.02x faster |
sympy_sum | 187 ms | 184 ms: 1.02x faster |
typing_runtime_protocols | 186 us | 183 us: 1.02x faster |
raytrace | 309 ms | 305 ms: 1.02x faster |
unpickle | 16.6 us | 16.4 us: 1.01x faster |
genshi_xml | 60.4 ms | 59.5 ms: 1.01x faster |
regex_compile | 151 ms | 149 ms: 1.01x faster |
scimark_sparse_mat_mult | 4.82 ms | 4.77 ms: 1.01x faster |
unpack_sequence | 55.7 ns | 55.1 ns: 1.01x faster |
sqlglot_parse | 1.42 ms | 1.41 ms: 1.01x faster |
sqlglot_transpile | 1.75 ms | 1.73 ms: 1.01x faster |
telco | 9.01 ms | 8.91 ms: 1.01x faster |
logging_format | 12.9 us | 12.8 us: 1.01x faster |
unpickle_list | 5.04 us | 4.99 us: 1.01x faster |
nqueens | 95.3 ms | 94.4 ms: 1.01x faster |
async_tree_eager_io | 720 ms | 714 ms: 1.01x faster |
sympy_expand | 556 ms | 551 ms: 1.01x faster |
scimark_lu | 124 ms | 123 ms: 1.01x faster |
docutils | 3.09 sec | 3.07 sec: 1.01x faster |
chaos | 69.1 ms | 68.7 ms: 1.01x faster |
sqlglot_optimize | 63.5 ms | 63.1 ms: 1.01x faster |
sympy_integrate | 22.9 ms | 22.7 ms: 1.01x faster |
spectral_norm | 106 ms | 105 ms: 1.00x faster |
scimark_fft | 352 ms | 351 ms: 1.00x faster |
deepcopy | 298 us | 300 us: 1.00x slower |
generators | 34.0 ms | 34.2 ms: 1.01x slower |
meteor_contest | 119 ms | 119 ms: 1.01x slower |
logging_silent | 106 ns | 106 ns: 1.01x slower |
tomli_loads | 2.21 sec | 2.22 sec: 1.01x slower |
pickle_pure_python | 367 us | 369 us: 1.01x slower |
regex_effbot | 3.21 ms | 3.24 ms: 1.01x slower |
pyflate | 514 ms | 518 ms: 1.01x slower |
sqlite_synth | 3.41 us | 3.44 us: 1.01x slower |
deltablue | 3.66 ms | 3.69 ms: 1.01x slower |
unpickle_pure_python | 247 us | 249 us: 1.01x slower |
nbody | 126 ms | 128 ms: 1.01x slower |
scimark_sor | 140 ms | 141 ms: 1.01x slower |
mdp | 3.13 sec | 3.16 sec: 1.01x slower |
pprint_safe_repr | 891 ms | 899 ms: 1.01x slower |
go | 126 ms | 127 ms: 1.01x slower |
richards_super | 52.0 ms | 52.6 ms: 1.01x slower |
async_tree_eager | 116 ms | 117 ms: 1.01x slower |
regex_dna | 204 ms | 207 ms: 1.01x slower |
create_gc_cycles | 1.49 ms | 1.51 ms: 1.01x slower |
richards | 45.4 ms | 46.0 ms: 1.01x slower |
deepcopy_memo | 33.4 us | 34.1 us: 1.02x slower |
async_tree_eager_tg | 267 ms | 273 ms: 1.02x slower |
json_loads | 31.2 us | 31.9 us: 1.02x slower |
pprint_pformat | 1.79 sec | 1.85 sec: 1.03x slower |
gc_traversal | 5.03 ms | 5.28 ms: 1.05x slower |
xml_etree_parse | 208 ms | 220 ms: 1.06x slower |
async_tree_io | 759 ms | 832 ms: 1.10x slower |
asyncio_tcp | 1.38 sec | 1.52 sec: 1.10x slower |
xml_etree_process | 78.5 ms | 87.4 ms: 1.11x slower |
xml_etree_generate | 114 ms | 128 ms: 1.11x slower |
async_tree_memoization_tg | 392 ms | 449 ms: 1.15x slower |
async_tree_io_tg | 746 ms | 855 ms: 1.15x slower |
async_tree_memoization | 414 ms | 477 ms: 1.15x slower |
async_tree_none_tg | 325 ms | 382 ms: 1.17x slower |
xml_etree_iterparse | 141 ms | 172 ms: 1.22x slower |
Geometric mean | (ref) | 1.01x slower |

Benchmark | clang.release.20.1.0-rc2.92e5f826ac | clang.release.20.1.0-rc2.16a7f4607e.pyHot |
---|---|---|
spectral_norm | 139 ms | 124 ms: 1.13x faster |
pickle_list | 5.89 us | 5.46 us: 1.08x faster |
sqlite_synth | 3.71 us | 3.51 us: 1.06x faster |
pickle_dict | 32.3 us | 30.7 us: 1.05x faster |
unpickle | 20.8 us | 20.0 us: 1.04x faster |
json_loads | 43.0 us | 41.3 us: 1.04x faster |
unpickle_list | 5.35 us | 5.15 us: 1.04x faster |
mako | 16.9 ms | 16.3 ms: 1.04x faster |
pprint_safe_repr | 1.01 sec | 976 ms: 1.03x faster |
crypto_pyaes | 102 ms | 98.7 ms: 1.03x faster |
coverage | 111 ms | 108 ms: 1.03x faster |
coroutines | 30.3 ms | 29.5 ms: 1.03x faster |
telco | 10.4 ms | 10.2 ms: 1.03x faster |
json_dumps | 15.6 ms | 15.3 ms: 1.02x faster |
asyncio_websockets | 547 ms | 534 ms: 1.02x faster |
scimark_sparse_mat_mult | 5.94 ms | 5.81 ms: 1.02x faster |
pprint_pformat | 2.07 sec | 2.02 sec: 1.02x faster |
unpickle_pure_python | 300 us | 294 us: 1.02x faster |
xml_etree_parse | 218 ms | 214 ms: 1.02x faster |
xml_etree_generate | 135 ms | 133 ms: 1.02x faster |
async_generators | 510 ms | 501 ms: 1.02x faster |
typing_runtime_protocols | 217 us | 213 us: 1.02x faster |
mdp | 3.72 sec | 3.67 sec: 1.02x faster |
scimark_fft | 437 ms | 431 ms: 1.01x faster |
bench_thread_pool | 1.79 ms | 1.77 ms: 1.01x faster |
deepcopy_reduce | 3.71 us | 3.66 us: 1.01x faster |
docutils | 3.56 sec | 3.52 sec: 1.01x faster |
async_tree_memoization_tg | 433 ms | 428 ms: 1.01x faster |
sqlglot_transpile | 2.02 ms | 2.00 ms: 1.01x faster |
genshi_xml | 69.7 ms | 69.0 ms: 1.01x faster |
xml_etree_process | 92.2 ms | 91.3 ms: 1.01x faster |
fannkuch | 539 ms | 535 ms: 1.01x faster |
sqlglot_normalize | 144 ms | 143 ms: 1.01x faster |
float | 102 ms | 101 ms: 1.01x faster |
raytrace | 361 ms | 358 ms: 1.01x faster |
sqlglot_parse | 1.64 ms | 1.63 ms: 1.01x faster |
gc_traversal | 4.84 ms | 4.80 ms: 1.01x faster |
nqueens | 117 ms | 116 ms: 1.01x faster |
meteor_contest | 123 ms | 123 ms: 1.01x faster |
sqlglot_optimize | 71.4 ms | 71.1 ms: 1.00x faster |
comprehensions | 23.0 us | 22.9 us: 1.00x faster |
pidigits | 240 ms | 240 ms: 1.00x faster |
unpack_sequence | 55.0 ns | 55.2 ns: 1.00x slower |
chaos | 84.3 ms | 84.7 ms: 1.00x slower |
dulwich_log | 126 ms | 126 ms: 1.00x slower |
regex_compile | 165 ms | 166 ms: 1.00x slower |
hexiom | 8.01 ms | 8.07 ms: 1.01x slower |
async_tree_cpu_io_mixed_tg | 708 ms | 714 ms: 1.01x slower |
async_tree_eager | 134 ms | 135 ms: 1.01x slower |
richards_super | 73.2 ms | 73.9 ms: 1.01x slower |
deltablue | 4.31 ms | 4.35 ms: 1.01x slower |
asyncio_tcp_ssl | 3.59 sec | 3.64 sec: 1.01x slower |
2to3 | 418 ms | 423 ms: 1.01x slower |
scimark_sor | 159 ms | 162 ms: 1.02x slower |
python_startup | 40.9 ms | 41.7 ms: 1.02x slower |
scimark_lu | 143 ms | 146 ms: 1.02x slower |
async_tree_eager_cpu_io_mixed | 551 ms | 565 ms: 1.03x slower |
go | 146 ms | 150 ms: 1.03x slower |
generators | 38.2 ms | 39.7 ms: 1.04x slower |
nbody | 136 ms | 142 ms: 1.05x slower |
Geometric mean | (ref) | 1.01x faster |

Benchmark | clang.pgo.20.1.0-rc2.92e5f826ac | clang.pgo.20.1.0-rc2.16a7f4607e.pyHot |
---|---|---|
pickle_pure_python | 383 us | 364 us: 1.05x faster |
pprint_safe_repr | 863 ms | 840 ms: 1.03x faster |
regex_effbot | 3.20 ms | 3.13 ms: 1.02x faster |
pickle_list | 4.77 us | 4.66 us: 1.02x faster |
typing_runtime_protocols | 178 us | 174 us: 1.02x faster |
pprint_pformat | 1.78 sec | 1.74 sec: 1.02x faster |
xml_etree_generate | 110 ms | 108 ms: 1.02x faster |
richards | 45.1 ms | 44.3 ms: 1.02x faster |
scimark_sor | 138 ms | 136 ms: 1.01x faster |
gc_traversal | 5.21 ms | 5.15 ms: 1.01x faster |
xml_etree_process | 76.3 ms | 75.5 ms: 1.01x faster |
async_tree_eager | 113 ms | 111 ms: 1.01x faster |
xml_etree_parse | 202 ms | 201 ms: 1.01x faster |
nqueens | 92.3 ms | 91.4 ms: 1.01x faster |
coroutines | 24.9 ms | 24.7 ms: 1.01x faster |
mako | 13.4 ms | 13.3 ms: 1.01x faster |
meteor_contest | 118 ms | 118 ms: 1.00x faster |
unpickle_pure_python | 247 us | 246 us: 1.00x faster |
sqlglot_normalize | 120 ms | 119 ms: 1.00x faster |
deltablue | 3.69 ms | 3.71 ms: 1.00x slower |
sympy_integrate | 22.6 ms | 22.7 ms: 1.00x slower |
deepcopy | 289 us | 291 us: 1.01x slower |
sympy_sum | 181 ms | 182 ms: 1.01x slower |
unpack_sequence | 55.1 ns | 55.4 ns: 1.01x slower |
2to3 | 370 ms | 373 ms: 1.01x slower |
asyncio_tcp_ssl | 3.52 sec | 3.55 sec: 1.01x slower |
sqlite_synth | 3.20 us | 3.22 us: 1.01x slower |
async_tree_eager_io | 701 ms | 707 ms: 1.01x slower |
sqlglot_parse | 1.38 ms | 1.40 ms: 1.01x slower |
pidigits | 228 ms | 230 ms: 1.01x slower |
async_tree_io_tg | 727 ms | 735 ms: 1.01x slower |
dulwich_log | 115 ms | 117 ms: 1.01x slower |
python_startup | 39.4 ms | 39.9 ms: 1.01x slower |
chaos | 67.0 ms | 67.9 ms: 1.01x slower |
raytrace | 299 ms | 303 ms: 1.01x slower |
nbody | 119 ms | 120 ms: 1.01x slower |
async_tree_eager_tg | 260 ms | 264 ms: 1.01x slower |
scimark_lu | 122 ms | 124 ms: 1.02x slower |
python_startup_no_site | 34.0 ms | 34.5 ms: 1.02x slower |
regex_dna | 204 ms | 208 ms: 1.02x slower |
crypto_pyaes | 81.1 ms | 82.7 ms: 1.02x slower |
scimark_fft | 341 ms | 349 ms: 1.02x slower |
scimark_sparse_mat_mult | 4.53 ms | 4.65 ms: 1.03x slower |
sympy_str | 320 ms | 329 ms: 1.03x slower |
bench_thread_pool | 1.63 ms | 1.68 ms: 1.03x slower |
deepcopy_reduce | 2.96 us | 3.06 us: 1.03x slower |
tomli_loads | 2.20 sec | 2.28 sec: 1.03x slower |
pathlib | 232 ms | 241 ms: 1.04x slower |
telco | 8.45 ms | 8.77 ms: 1.04x slower |
unpickle | 15.6 us | 16.2 us: 1.04x slower |
pickle | 13.5 us | 14.3 us: 1.05x slower |
async_tree_memoization | 405 ms | 428 ms: 1.06x slower |
async_tree_io | 740 ms | 784 ms: 1.06x slower |
Geometric mean | (ref) | 1.00x slower |

Has this already been discussed elsewhere?
This is a minor feature, which does not need previous discussion elsewhere