Thanks a lot for the detailed description! It does look like a bug / missing optimization in |
-
Hi all,
I have a case like this:
I use `compute_with` to merge the for loops of `sum_1` and `sum_2`. I get the following IR (taken after unrolling):
Pay attention to `produce input_c`:
I expected the allocation size of `input_c` to be 8 * 8, with for-loop extents of 8 and 8. Instead, the allocation size of `input_c` is 32 * 64, which means `input_c` computes all of its data on every iteration of the outer for loop.
So, is this a bug in the Halide backend, or is there something wrong with my schedule?
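To make the size mismatch concrete, here is a minimal Python sketch of the arithmetic. It is not Halide code. It assumes a pointwise consumer, a full `input_c` domain of 32 x 64, and outer loops that walk over 8 x 8 tiles; the helper names `required_region` and `allocation_size` and the tile layout are hypothetical and only serve the illustration.

```python
def required_region(consumer_extent, halo=0):
    """Extent the producer must cover, per dimension, for one consumer region.

    `halo` is the stencil radius; 0 models a pointwise consumer (assumption).
    """
    return tuple(e + 2 * halo for e in consumer_extent)


def allocation_size(extent):
    """Number of elements in a buffer with the given per-dimension extent."""
    size = 1
    for e in extent:
        size *= e
    return size


full_domain = (32, 64)  # assumed full domain of input_c
tile = (8, 8)           # assumed tile handled by one outer-loop iteration

# Footprint if the producer is computed only for the current tile.
per_tile = allocation_size(required_region(tile))

# Footprint if the producer is computed for the whole domain.
whole = allocation_size(required_region(full_domain))

# Number of outer-loop iterations under the tiling assumption.
tiles = (full_domain[0] // tile[0]) * (full_domain[1] // tile[1])

print(per_tile, whole, tiles)
```

Under these assumptions the per-tile footprint is 64 elements and the whole-domain footprint is 2048 elements, so recomputing the whole domain on each of the 32 outer iterations would do 32 times the necessary work.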
--
I found a way to solve the problem: `clone_in` the `input_c`.
The schedule is:
The IR is:
I now get the expected allocation size and for-loop extents, but `input_c` is computed twice. Is there any way to eliminate this redundant computation?
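The redundancy can be counted with a small Python sketch. It is not Halide code and does not suggest a schedule; it only contrasts two evaluation strategies for one 8 x 8 tile. The producer body (`x + y`), the two consumer bodies, and the function names `cloned_tile` and `shared_tile` are hypothetical stand-ins, since the original pipeline is not shown here.

```python
TILE = 8
calls = {"count": 0}


def input_c(x, y):
    """Hypothetical stand-in for the producer; counts how often it runs."""
    calls["count"] += 1
    return x + y


def cloned_tile():
    """Each consumer evaluates its own copy of the producer for the tile."""
    s1 = sum(input_c(x, y) for y in range(TILE) for x in range(TILE))
    s2 = sum(2 * input_c(x, y) for y in range(TILE) for x in range(TILE))
    return s1, s2


def shared_tile():
    """The producer fills one tile-sized buffer that both consumers read."""
    buf = [[input_c(x, y) for x in range(TILE)] for y in range(TILE)]
    s1 = sum(buf[y][x] for y in range(TILE) for x in range(TILE))
    s2 = sum(2 * buf[y][x] for y in range(TILE) for x in range(TILE))
    return s1, s2


calls["count"] = 0
cloned_result = cloned_tile()
cloned_calls = calls["count"]

calls["count"] = 0
shared_result = shared_tile()
shared_calls = calls["count"]

print(cloned_calls, shared_calls, cloned_result == shared_result)
```

Both strategies produce the same sums, but the cloned version evaluates the producer twice per tile element while the shared version evaluates it once. The question above amounts to asking whether a Halide schedule can get the shared behavior while keeping the 8 x 8 allocation.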