[feat] Support mapping of multi-cycle operations with three strategies (exclusive, distributed, inclusive) #49
Sorry, when I merged my code with the updated branch, some of the code got duplicated...
6e031d8 to 3969cbc
    echo "Multi-Cycle Test Failed! The count of DFG nodes in the exclusive strategy should be 11, but got $exclusive_dfg_count."
    exit 1
elif [ "$inclusive_dfg_count" -ne 11 ]; then
    echo "Multi-Cycle Test Failed! The count of DFG nodes in the inclusive strategy should be 11, but got $inclusive_dfg_count."
Do we have any case where the inclusive II is smaller than the exclusive one?
I would like to see an example as well. Thanks to HobbitQia for his extraordinary work!
@@ -64,6 +64,10 @@ CGRANode::CGRANode(int t_id, int t_x, int t_y) {
  m_mapped = false;
  m_DVFSLatencyMultiple = 1;
  m_synced = false;

  // Indicates whether this CGRA node can execute multiple operations
  // simultaneously. (e.g., single-cycle overlaps with multi-cycle)
This comment means exactly "inclusive", right?
yes
@@ -293,6 +298,9 @@ bool CGRANode::canOccupy(DFGNode* t_opt, int t_cycle, int t_II) {
  if (not t_opt->isMultiCycleExec(getDVFSLatencyMultiple())) {
    // Single-cycle opt:
    for (int cycle=t_cycle%t_II; cycle<m_cycleBoundary; cycle+=t_II) {
      if (!canMultipleOps() && !m_dfgNodesWithOccupyStatus[cycle]->empty()) {
This is the key part of the "inclusive" implementation, right? If so, please add a comment. Please also mention why we don't need to handle "exclusive" and "distributed" explicitly here.
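For reference while reviewing, the single-cycle check in the hunk above can be sketched in isolation roughly as follows. This is a simplified model, not the mapper's code: `TileSlots`, `canPlaceSingleCycle`, and the `allowMultipleOps` flag are hypothetical names standing in for `m_dfgNodesWithOccupyStatus` and `canMultipleOps()`. It only models the rule "without multi-op support, a non-empty slot rejects the placement" and ignores the FU-conflict and pipelining checks that the real function presumably performs afterwards.

```cpp
#include <cassert>
#include <vector>

// Hypothetical, simplified stand-in for a tile's per-cycle occupancy table.
// slots[c] holds the IDs of the DFG nodes occupying the tile at cycle c.
struct TileSlots {
    std::vector<std::vector<int>> slots;
    bool allowMultipleOps;  // plays the role of canMultipleOps() in the diff
};

// Returns true if a single-cycle op may be placed at `cycle` under initiation
// interval `ii`. A modulo schedule repeats every `ii` cycles, so every slot
// congruent to `cycle` modulo `ii` has to be checked.
bool canPlaceSingleCycle(const TileSlots& tile, int cycle, int ii) {
    assert(ii > 0 && cycle >= 0);
    const int boundary = static_cast<int>(tile.slots.size());
    for (int c = cycle % ii; c < boundary; c += ii) {
        // Without multi-op support, any existing occupant blocks the slot.
        if (!tile.allowMultipleOps && !tile.slots[c].empty()) {
            return false;
        }
    }
    return true;
}
```

In this model, a tile with the flag cleared rejects any slot that already holds a node, while a tile with the flag set falls through to whatever further checks follow (none in this sketch).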
@@ -302,6 +310,19 @@ bool CGRANode::canOccupy(DFGNode* t_opt, int t_cycle, int t_II) {
  } else {
    // Multi-cycle opt.
    for (int cycle=t_cycle%t_II; cycle<m_cycleBoundary; cycle+=t_II) {
      // Can not support simultaneous execution of multiple operations.
      if (!canMultipleOps()) {
        int exec_latency = t_opt->getExecLatency(getDVFSLatencyMultiple());
Why do we need getDVFSLatencyMultiple()? I don't quite understand what we are trying to handle here.
Actually, I am not familiar with the DVFS code. I just saw that other places get the latency in a similar way, e.g. t_opt->isMultiCycleExec(getDVFSLatencyMultiple()) (CGRANode.cpp:298), so I followed that pattern.
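To make the intent of the multi-cycle branch easier to discuss, here is a minimal sketch of an exclusive occupancy check. It is an assumption about the behaviour, not a copy of the PR: `canPlaceMultiCycleExclusive` and the `occupied` vector are hypothetical, and the execution latency is passed in directly rather than derived from the DVFS latency multiple.

```cpp
#include <cassert>
#include <vector>

// Hypothetical model: occupied[c] is true if the tile is busy at cycle c.
// A multi-cycle op that starts at `cycle` is assumed to hold the tile for
// `execLatency` consecutive cycles, repeated every `ii` cycles.
bool canPlaceMultiCycleExclusive(const std::vector<bool>& occupied,
                                 int cycle, int ii, int execLatency) {
    assert(ii > 0 && cycle >= 0 && execLatency > 0);
    const int boundary = static_cast<int>(occupied.size());
    for (int start = cycle % ii; start < boundary; start += ii) {
        // Every cycle in the execution window must be free.
        for (int d = 0; d < execLatency && start + d < boundary; ++d) {
            if (occupied[start + d]) {
                return false;
            }
        }
    }
    return true;
}
```

The sketch treats the latency as a plain cycle count; how the DVFS multiple changes that count is left to the existing getExecLatency() logic.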
I'm so excited to share my updates on the mapper with you. The main changes are listed below:
- param.json (defaults to the exclusive strategy). Under the exclusive strategy, a multi-cycle operation occupies the tile exclusively, and no other node can be mapped onto that tile until it finishes its computation. Under the inclusive strategy, operations of different types can be mapped to the same tile if they don't share the same FU or if they are pipelinable. This feature is mainly implemented in CGRANode::canOccupy() (CGRANode.cpp:269-375). Under the distributed strategy, each multi-cycle node is split into multiple nodes according to its execution latency, which is implemented in DFGsplitNodes() (DFG.cpp:50-93).
- test/multicycle. Currently, I can only determine whether the kernel is mapped under the distributed strategy, as it increases the size of the DFG. However, I am not yet sure how to verify the exclusive and inclusive strategies. For now, I rely solely on the DFG size to make this judgment, which is not a valid criterion.
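As an illustration of why the distributed strategy is visible in the DFG size, the splitting step can be sketched as below. This is a hedged toy model, not the code in DFG.cpp: `Node`, `Edge`, `Graph`, and `splitMultiCycle` are hypothetical, node IDs are assumed to equal their index in `nodes`, and each node of latency L is assumed to become a chain of L single-cycle nodes.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Node { int id; int latency; };  // hypothetical DFG node
struct Edge { int src; int dst; };     // hypothetical dependence edge
struct Graph {
    std::vector<Node> nodes;
    std::vector<Edge> edges;
};

// Replaces every node whose latency is L > 1 with a chain of L single-cycle
// nodes. Incoming edges attach to the head of the chain and outgoing edges
// leave from its tail. Assumes in.nodes[i].id == i.
Graph splitMultiCycle(const Graph& in) {
    Graph out;
    const std::size_t n = in.nodes.size();
    std::vector<int> head(n), tail(n);
    int nextId = 0;
    for (std::size_t i = 0; i < n; ++i) {
        const int pieces = in.nodes[i].latency > 1 ? in.nodes[i].latency : 1;
        head[i] = nextId;
        for (int p = 0; p < pieces; ++p) {
            out.nodes.push_back({nextId, 1});
            if (p > 0) {
                out.edges.push_back({nextId - 1, nextId});  // chain edge
            }
            ++nextId;
        }
        tail[i] = nextId - 1;
    }
    for (const Edge& e : in.edges) {
        out.edges.push_back({tail[e.src], head[e.dst]});
    }
    return out;
}
```

In this model, a three-node graph whose middle node has latency 3 grows to five nodes, which is the kind of size change the test can detect. The exclusive and inclusive strategies leave the node count unchanged, so they need a different criterion.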