Description
Hypothesis: `compute_at` and `store_at` relationships should be defined at the consumer with respect to its producers, not at a producer with respect to its consumers. So (ignoring renaming, to keep the change clear), the schedule would say `consumer.compute_at(producer, loop_var_of_consumer)` rather than `producer.compute_at(consumer, loop_var_of_consumer)`.
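To make the flip concrete, here is a minimal sketch on a toy two-stage pipeline; the commented-out call is the hypothesized consumer-side form, not current Halide API.

```c++
#include "Halide.h"
using namespace Halide;

void sketch() {
    Func producer("producer"), consumer("consumer");
    Var x("x"), y("y");
    producer(x, y) = x + y;
    consumer(x, y) = producer(x, y) + producer(x + 1, y);

    // Today: the producer's schedule names its consumer and one of the
    // consumer's loop variables.
    producer.compute_at(consumer, y);

    // Hypothesized (not real API): the consumer declares where its producer
    // gets computed, still anchored at one of the consumer's own loops.
    // consumer.compute_at(producer, y);
}
```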
Context: @abadams and I were talking briefly about the remaining pain points / incidental complexity in using something like C++ classes in the vein of generators as the way to manage and structure Halide modules.
- Generators (or similar) work quite well for wrapping up and generating code from top-level pipelines.
- Generators (or similar) become more of a mess for modules which will be composed. IIRC there are two major issues:
  1. It gets verbose to write a long sequence of field assignments to wire up arguments. This is well addressed by stubs, which have some hope of being reasonably approximated inline with some template work.
  2. Scheduling has to be done after algorithm generation, because in general it needs access to the generator params of the downstream algorithm.
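To make issue (2) concrete, here is a rough sketch of a hypothetical producer module written against the current Generator interface; the class name, parameter names, and the trivial algorithm are invented for illustration.

```c++
#include "Halide.h"
using namespace Halide;

class ProducerModule : public Generator<ProducerModule> {
public:
    Input<Buffer<float>> input{"input", 2};
    Output<Func> intermediate{"intermediate", Float(32), 2};

    void generate() {
        Var x("x"), y("y");
        intermediate(x, y) = input(x, y) * 2.f;
        // We cannot write intermediate.compute_at(<consumer>, <var>) here:
        // the consuming module hasn't been generated yet, and its loop
        // structure may depend on its own GeneratorParams. So scheduling
        // of this Func ends up deferred to a later stage.
    }
};

HALIDE_REGISTER_GENERATOR(ProducerModule, producer_module)
```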
Ignoring for a moment the obvious challenge of changing how we track, and how the language specifies, `compute_at`/`store_at` relationships: would this alone solve (2)? It seems like it would both make module composition generally easier and reduce the number of stages (in the staged-programming sense) from ~3 (generate algorithm, schedule algorithm, run pipeline) to ~2 (generate and schedule algorithm, run pipeline). That seems like a big mental and practical win.
It also seems like this model of producer-consumer relationships in scheduling (thinking from the consumer’s perspective about how the producer is scheduled with respect to it) better matches the lowering process, anyway.
Thinking out loud, the main case it seems like this wouldn’t fully simplify is when you want to compute inner parts of a producer module relative to some outer context further downstream than the consumer module itself. Does this really come up? If it does, the fully general case still seems to require allowing deferred scheduling.
Also: does this mean `compute_at` should become a function on a module (i.e., on the standard Generator interface)? This would define the logical root for that module, which could be used just by the output `Func`, or potentially by many `Func`s in the sub-pipeline. And this could be done now, as a standard convention for defining and setting up one or a few local-context loop levels as `ScheduleParam`s, without changing `Func`, the innards of the schedule representation and lowering, or tens of thousands of lines of existing schedule code.
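A minimal sketch of that convention, under the assumption that a `ScheduleParam<LoopLevel>` can be declared with a default and passed to `compute_at` as shown (the exact constructor and conversion details, like the rest of this toy module, are assumptions):

```c++
#include "Halide.h"
using namespace Halide;

class ProducerModuleWithRoot : public Generator<ProducerModuleWithRoot> {
public:
    Input<Buffer<float>> input{"input", 2};
    Output<Func> output{"output", Float(32), 2};

    // The module's "logical root": the consumer-side loop level at which
    // this sub-pipeline is computed. Defaults to the root of the pipeline.
    ScheduleParam<LoopLevel> compute_level{"compute_level", LoopLevel::root()};

    void generate() {
        Var x("x"), y("y");
        output(x, y) = input(x, y) * 2.f;
    }

    void schedule() {
        // The output (and, in a bigger module, other internal Funcs) hang
        // off the one loop level the composing pipeline chose.
        output.compute_at(compute_level);
    }
};

HALIDE_REGISTER_GENERATOR(ProducerModuleWithRoot, producer_module_with_root)
```

The composing pipeline would then set `compute_level` to one of its own loop levels when instantiating the module, which is exactly the consumer-specifies-the-producer direction argued for above.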
Another retrospectively obvious argument in favor of this, and perhaps the most important one both for having a clear mental model and for improving modularity, is that it’s the consumer who knows the characteristics of its access, and those characteristics determine the potential reuse and redundant compute of the producer.
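The classic two-stage blur makes this concrete: the 3-row stencil in the consumer is what creates both the reuse and the potential redundant recompute of the producer, and only the consumer knows that. A minimal sketch:

```c++
#include "Halide.h"
using namespace Halide;

void sketch() {
    Var x("x"), y("y");
    Func in("in"), blur_x("blur_x"), blur_y("blur_y");
    in(x, y) = cast<float>(x + y);

    // Producer: horizontal blur.
    blur_x(x, y) = (in(x - 1, y) + in(x, y) + in(x + 1, y)) / 3.f;
    // Consumer: vertical blur reading a 3-row window of blur_x.
    blur_y(x, y) = (blur_x(x, y - 1) + blur_x(x, y) + blur_x(x, y + 1)) / 3.f;

    // The trade-off is entirely determined by the consumer's access pattern:
    blur_x.compute_at(blur_y, y);  // each blur_x row recomputed ~3x, tiny footprint
    // blur_x.compute_root();      // each row computed once, whole-image footprint
}
```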