
[Opt] Remove redundant cfg optimization, to fix struct vec crash bug #8691

Merged: feisuzhu merged 4 commits into taichi-dev:master from hughperkins:hp/remove-redundant-cfg-opt on Apr 29, 2025

Conversation

@hughperkins
Contributor

…ev/taichi/issues/8675

Issue: #8675

Brief Summary

Remove redundant cfg optimization, to fix struct vec crash bug

Walkthrough

In #8675, the cfg optimization pass causes a crash. The following script reproduces it:

import taichi as ti
ti.init(arch=ti.cpu, offline_cache=False, advanced_optimization=False,
        cpu_max_num_threads=1, debug=True, log_level=ti.DEBUG)


@ti.dataclass
class DataClassTest:
    v: ti.types.vector(3, dtype=ti.f64)

    @ti.func
    def manipulate_elements(self) -> float:
        self.v = [1.23, 2.34, 3.45]
        idx = 1
        return self.v[idx]  # crashes
        # return self.v[1]  # works as expected

d = DataClassTest.field(shape=())

@ti.kernel
def my_kernel() -> float:
    val = d[None].manipulate_elements()
    return val

def main():
    print('1')
    ret = my_kernel()
    print('ret', ret)
    print('2')

main()

Running on master:

[Taichi Build] Hughs-MacBook-Air:taichi hugh$ python ~/git/taichi-play/8675.py
[Taichi] version 1.8.0, llvm 15.0.7, commit 816eed64, osx, python 3.10.17
[Taichi] Starting on arch=arm64
[D 04/27/25 09:21:42.771 1481615] [parallel_executor.cpp:worker_loop@71] Starting worker thread.
[D 04/27/25 09:21:42.771 1481616] [parallel_executor.cpp:worker_loop@71] Starting worker thread.
[D 04/27/25 09:21:42.771 1481615] [parallel_executor.cpp:worker_loop@86] Worker thread initialized and running.
[D 04/27/25 09:21:42.771 1481616] [parallel_executor.cpp:worker_loop@86] Worker thread initialized and running.
[D 04/27/25 09:21:42.771 1481617] [parallel_executor.cpp:worker_loop@71] Starting worker thread.
[D 04/27/25 09:21:42.771 1481617] [parallel_executor.cpp:worker_loop@86] Worker thread initialized and running.
[D 04/27/25 09:21:42.771 1481618] [parallel_executor.cpp:worker_loop@71] Starting worker thread.
[D 04/27/25 09:21:42.771 1481618] [parallel_executor.cpp:worker_loop@86] Worker thread initialized and running.
1
[D 04/27/25 09:21:42.832 1481524] [kernel_compilation_manager.cpp:KernelCompilationManager@56] Create KernelCompilationManager with offline_cache_file_path = /Users/hugh/.cache/taichi/ticache

[E 04/27/25 09:21:42.832 1481524] Received signal 11 (Segmentation fault: 11)



                            * Taichi Core - Stack Traceback *
==========================================================================================
|                       Module |  Offset | Function                                      |
|----------------------------------------------------------------------------------------|
* taichi_python.cpython-310-darwin.so |     136 | taichi::Logger::error(std::__1::basic_ |
                                         | string<char, std::__1::char_traits<char>, std |
                                         | ::__1::allocator<char>> const&, bool)         |
* taichi_python.cpython-310-darwin.so |     372 | taichi::(anonymous namespace)::signal_ |
                                         | handler(int)                                  |

Running with this branch:

[Taichi Build] Hughs-MacBook-Air:taichi hugh$ python ~/git/taichi-play/8675.py
[Taichi] version 1.8.0, llvm 15.0.7, commit 816eed64, osx, python 3.10.17
[Taichi] Starting on arch=arm64
[D 04/27/25 09:21:58.524 1482254] [parallel_executor.cpp:worker_loop@71] Starting worker thread.
[D 04/27/25 09:21:58.524 1482256] [parallel_executor.cpp:worker_loop@71] Starting worker thread.
[D 04/27/25 09:21:58.524 1482256] [parallel_executor.cpp:worker_loop@86] Worker thread initialized and running.
[D 04/27/25 09:21:58.524 1482253] [parallel_executor.cpp:worker_loop@71] Starting worker thread.
[D 04/27/25 09:21:58.525 1482253] [parallel_executor.cpp:worker_loop@86] Worker thread initialized and running.
[D 04/27/25 09:21:58.524 1482255] [parallel_executor.cpp:worker_loop@71] Starting worker thread.
[D 04/27/25 09:21:58.525 1482255] [parallel_executor.cpp:worker_loop@86] Worker thread initialized and running.
[D 04/27/25 09:21:58.524 1482254] [parallel_executor.cpp:worker_loop@86] Worker thread initialized and running.
1
[D 04/27/25 09:21:58.585 1482173] [kernel_compilation_manager.cpp:KernelCompilationManager@56] Create KernelCompilationManager with offline_cache_file_path = /Users/hugh/.cache/taichi/ticache
ret 2.3399999141693115
2
[D 04/27/25 09:21:58.619 1482173] [offline_cache.h:run@199] Start cleaning cache
[D 04/27/25 09:21:58.620 1482173] [offline_cache.h:load_metadata_with_checking@83] Offline cache metadata file /Users/hugh/.cache/taichi/ticache/ticache.tcb not found
[D 04/27/25 09:21:58.620 1482173] [offline_cache.h:operator()@191] Stop cleaning cache
[D 04/27/25 09:21:58.620 1482173] [offline_cache.h:load_metadata_with_checking@83] Offline cache metadata file /Users/hugh/.cache/taichi/ticache/ticache.tcb not found
[Taichi Build] Hughs-MacBook-Air:taichi hugh$

Before settling on simply removing this pass, I was originally going to fix it. However, I ran the various tests in tests/python/test_optimization.py, printing out the IR before and after the cfg pass, and saw no effect from this pass in any of the tests I tried.

So, I conclude that:

  • keeping this cfg pass can cause a crash (see above)
  • removing it does not affect any of the optimization test results (presumably because there is an earlier cfg pass that runs)
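
One way to make the "no effect" check concrete is to diff the IR text captured before and after the pass. The sketch below is only an illustration: it assumes the two dumps are already available as strings (how they are captured is not shown), the helper names `ir_diff` and `pass_is_noop` are hypothetical, and the sample IR text is invented rather than real Taichi output.

```python
import difflib


def ir_diff(before: str, after: str) -> list[str]:
    """Return unified-diff lines between two IR dumps (empty if identical)."""
    return list(
        difflib.unified_diff(
            before.splitlines(),
            after.splitlines(),
            fromfile="before_cfg_pass",
            tofile="after_cfg_pass",
            lineterm="",
        )
    )


def pass_is_noop(before: str, after: str) -> bool:
    """True when the pass left the IR text unchanged."""
    return not ir_diff(before, after)


# Placeholder IR text, invented for illustration only (not real Taichi output).
before = "$0 = alloca\n$1 = const 1\n$2 = load $0"
after_same = before
after_changed = "$0 = alloca\n$2 = load $0"

print(pass_is_noop(before, after_same))
print(pass_is_noop(before, after_changed))
for line in ir_diff(before, after_changed):
    print(line)
```

A textual diff only shows that the printed IR is identical. It says nothing about internal state that the printer does not emit, so it supports the conclusion above rather than proving it.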

@bobcao3 bobcao3 self-requested a review April 28, 2025 07:44
Comment thread taichi/transforms/compile_to_offloads.cpp Outdated
Collaborator

@bobcao3 bobcao3 left a comment

LGTM, thanks

@hughperkins
Contributor Author

Awesome, thanks! 🙌 Note: I don't have a 'merge' button available, although it looks like everything is passing/approved.

@hughperkins
Contributor Author

@feisuzhu Do you have any concerns on this PR? More tests perhaps? 😀 Maybe we need at least:

  • a test that fails without this change?

Note that there are already optimization tests in https://github.com/taichi-dev/taichi/blob/master/tests/python/test_optimization.py, though I agree they appear to me, by the nature of being quite short relative to the optimization code itself, to plausibly be not very complete?

@feisuzhu
Contributor

> @feisuzhu Do you have any concerns on this PR? More tests perhaps? 😀 Maybe we need at least:
>
>   • a test that fails without this change?
>
> Note that there are already optimization tests in https://github.com/taichi-dev/taichi/blob/master/tests/python/test_optimization.py, though I agree they appear to me, by the nature of being quite short relative to the optimization code itself, to plausibly be not very complete?

Nope, I'm the owner of CI subsystem. If @bobcao3 says it's ok, it's ok ;]

@feisuzhu feisuzhu merged commit f9169e8 into taichi-dev:master Apr 29, 2025
15 checks passed
@hughperkins
Contributor Author

Awesome. Thank you 🙌

@hughperkins hughperkins deleted the hp/remove-redundant-cfg-opt branch April 29, 2025 22:31