A strategy discussed on Discord:

> [The easiest way to benefit from parallelism is to use function passes. Then MLIR will process all functions in parallel. This does not work if your pass does module-level changes such as deleting unreachable functions. However, it is perfect if the pass does function local rewrites. 1 Minute compilation time sounds quite slow but I guess it depends a lot on what transformations you are actually running](https://discord.com/channels/636084430946959380/642426447167881246/1185621153730019348)

> [Is it possible to create the operations for separate functions in parallel? If so, I could create a work queue for threads to go build each function in order to leverage all available cores. Currently I do this on a single thread.](https://discord.com/channels/636084430946959380/642426447167881246/1185634061302124716)

> [I guess in theory you could first create the functions sequentially and then in parallel fill the function bodies yes.](https://discord.com/channels/636084430946959380/642426447167881246/1185635663522041866)

> [Filling the functions in parallel works! I'll need to test it with a much larger program to see the benefits though. \[...\] Generating 100% of the IR takes 2 seconds now. My reachable function analysis brings that down to a fraction of a second during generation now and of course the passes have less to work on so they're faster too.](https://discord.com/channels/636084430946959380/642426447167881246/1185700179110809631)
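
The first suggestion (write function passes, let the pass manager parallelize) might look roughly like the sketch below. This is not from the linked discussion; it assumes a recent MLIR C++ API with `PassWrapper`, and the pass name `my-function-local` and the empty walk body are placeholders for a real rewrite. Because the pass is anchored on `func.func` and only touches IR nested under it, the pass manager is free to run it over each function on a separate thread.

```cpp
#include "mlir/Dialect/Func/IR/FuncOps.h"
#include "mlir/Pass/Pass.h"
#include "mlir/Support/TypeID.h"

namespace {
// A function-local pass: anchored on func.func, so the pass manager runs it
// once per function and can schedule those runs on multiple threads.
struct MyFunctionLocalPass
    : public mlir::PassWrapper<MyFunctionLocalPass,
                               mlir::OperationPass<mlir::func::FuncOp>> {
  MLIR_DEFINE_EXPLICIT_INTERNAL_INLINE_TYPE_ID(MyFunctionLocalPass)

  llvm::StringRef getArgument() const final { return "my-function-local"; }

  void runOnOperation() override {
    mlir::func::FuncOp fn = getOperation();
    // Only touch IR nested under `fn`. Mutating sibling functions or the
    // parent module from here would break the parallel scheduling contract
    // described in the quote above.
    fn.walk([](mlir::Operation *op) {
      (void)op; // placeholder for an actual function-local rewrite
    });
  }
};
} // namespace
```

Registering it as a nested pass (e.g. `pm.addNestedPass<mlir::func::FuncOp>(std::make_unique<MyFunctionLocalPass>())`) keeps it function-scoped; module-level cleanups like deleting unreachable functions would instead need a module pass and lose this parallelism.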
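The second part of the discussion (create the functions sequentially, then fill the bodies in parallel) could be structured as below. Again a sketch, not the code from the thread: it assumes a recent MLIR where `mlir::parallelForEach` lives in `mlir/IR/Threading.h` and the arith dialect headers are under `mlir/Dialect/Arith`; `buildModule`, the `fn_N` names, and the constant-returning bodies are made-up placeholders for real codegen.

```cpp
#include "llvm/ADT/SmallVector.h"
#include "llvm/ADT/Twine.h"
#include "mlir/Dialect/Arith/IR/Arith.h"
#include "mlir/Dialect/Func/IR/FuncOps.h"
#include "mlir/IR/Builders.h"
#include "mlir/IR/BuiltinOps.h"
#include "mlir/IR/MLIRContext.h"
#include "mlir/IR/OwningOpRef.h"
#include "mlir/IR/Threading.h"

mlir::OwningOpRef<mlir::ModuleOp> buildModule(mlir::MLIRContext &ctx,
                                              unsigned numFuncs) {
  ctx.loadDialect<mlir::func::FuncDialect, mlir::arith::ArithDialect>();
  mlir::OpBuilder builder(&ctx);
  auto loc = builder.getUnknownLoc();
  auto module = mlir::ModuleOp::create(loc);

  // Phase 1 (sequential): create empty func.func ops. Inserting into the
  // module's block mutates shared IR, so this part stays on one thread.
  llvm::SmallVector<mlir::func::FuncOp> funcs;
  builder.setInsertionPointToEnd(module.getBody());
  auto fnType = builder.getFunctionType({}, {builder.getI32Type()});
  for (unsigned i = 0; i < numFuncs; ++i)
    funcs.push_back(builder.create<mlir::func::FuncOp>(
        loc, ("fn_" + llvm::Twine(i)).str(), fnType));

  // Phase 2 (parallel): fill the bodies. Each worker uses its own OpBuilder
  // and only writes into its own function's region. parallelForEach falls
  // back to serial execution if the context has threading disabled.
  mlir::parallelForEach(&ctx, funcs, [&](mlir::func::FuncOp fn) {
    mlir::Block *entry = fn.addEntryBlock();
    mlir::OpBuilder b = mlir::OpBuilder::atBlockEnd(entry);
    auto c = b.create<mlir::arith::ConstantIntOp>(fn.getLoc(), /*value=*/42,
                                                  /*width=*/32);
    b.create<mlir::func::ReturnOp>(fn.getLoc(), c.getResult());
  });

  return module;
}
```

The constraint this encodes is the one from the quotes: the step that touches the shared module block is single-threaded, while the parallel step only creates IR inside disjoint function regions. Type and attribute uniquing in the `MLIRContext` is internally synchronized when multi-threading is enabled, so per-thread builders are safe there.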