[feat] Support mapping of multi-cycle operations with three strategies (exclusive, distributed, inclusive) #49

Open
wants to merge 7 commits into
base: master
from

Conversation

HobbitQia
Collaborator

I'm so excited to share my updates on the mapper with you. The main changes are listed below:

  1. Users can specify the strategy in param.json (the default is the exclusive strategy). Under the exclusive strategy, a multi-cycle operation occupies its tile exclusively, and no other node can be mapped onto that tile until the operation finishes its computation. Under the inclusive strategy, operations of different types can be mapped onto the same tile if they do not share the same FU or if they are pipelinable; this is mainly implemented in CGRANode::canOccupy() (CGRANode.cpp:269-375). Under the distributed strategy, each multi-cycle node is split into multiple nodes according to its execution latency, which is implemented in DFGsplitNodes() (DFG.cpp:50-93).
  2. Test cases are provided in test/multicycle. Currently, the tests can only confirm that a kernel was mapped under the distributed strategy, because that strategy increases the size of the DFG. I am not yet sure how to verify the exclusive and inclusive strategies; for now the tests check only the DFG size, which is not a valid criterion for those two.
  3. The strategy selection engine is not included in this PR, since I think it is still simple and crude. If you think it would be meaningful for the current framework, I'm glad to improve it so that it can be integrated.
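
To illustrate why the distributed strategy increases the DFG size, here is a minimal sketch of splitting a multi-cycle node into a chain of single-cycle nodes. It is not the code in DFG.cpp: `ToyNode` and `splitMultiCycleNodes` are hypothetical names, and the real DFG classes carry far more state (edges, opcodes, predicates).

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, simplified DFG node; not the mapper's DFGNode class.
struct ToyNode {
  std::string name;
  int latency;             // execution latency in cycles
  std::vector<int> succs;  // indices of successor nodes
};

// Splits every node with latency > 1 into a chain of single-cycle nodes.
// The first piece keeps the original index, so incoming edges stay valid;
// the last piece inherits the original successors.
std::vector<ToyNode> splitMultiCycleNodes(const std::vector<ToyNode>& dfg) {
  std::vector<ToyNode> out = dfg;
  for (std::size_t i = 0; i < dfg.size(); ++i) {
    const int latency = dfg[i].latency;
    if (latency <= 1) continue;  // single-cycle nodes are left untouched
    const std::vector<int> original_succs = dfg[i].succs;
    out[i].name = dfg[i].name + "_0";
    out[i].latency = 1;
    out[i].succs.clear();
    int prev = static_cast<int>(i);
    for (int part = 1; part < latency; ++part) {
      out.push_back(ToyNode{dfg[i].name + "_" + std::to_string(part), 1, {}});
      const int cur = static_cast<int>(out.size()) - 1;
      out[prev].succs.push_back(cur);  // chain the pieces in order
      prev = cur;
    }
    out[prev].succs = original_succs;
  }
  return out;
}
```

In this toy model, a three-node DFG whose middle node has latency 3 grows to five nodes after splitting, which is the kind of size change the distributed test relies on. The exclusive and inclusive strategies leave the node count unchanged, which matches the concern that DFG size alone cannot validate them.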

@HobbitQia
Collaborator Author

Sorry, when I merged my code with the updated branch, some of the code ended up duplicated.

@HobbitQia force-pushed the master branch 2 times, most recently from 6e031d8 to 3969cbc, on April 25, 2025 09:14
  echo "Multi-Cycle Test Failed! The count of DFG nodes in the exclusive strategy should be 11, but got $exclusive_dfg_count."
  exit 1
elif [ "$inclusive_dfg_count" -ne 11 ]; then
  echo "Multi-Cycle Test Failed! The count of DFG nodes in the inclusive strategy should be 11, but got $inclusive_dfg_count."
Owner

Do we have any case where the inclusive II is smaller than the exclusive one?

Collaborator

I would like to see an example as well. Thanks to HobbitQia for his extraordinary work!
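
As a purely illustrative aside (not a result from this PR's test cases), a back-of-the-envelope bound suggests when the inclusive strategy could reach a smaller II than the exclusive one. The sketch below assumes a hypothetical setup in which all operations compete for a single tile; `exclusiveSlots` and `inclusiveSlots` are made-up helper names, and the inclusive figure optimistically assumes every operation is pipelinable or uses its own FU.

```cpp
#include <cassert>
#include <vector>

// Exclusive: an operation with latency L blocks the tile for L cycles, so the
// tile needs at least the sum of all latencies as schedule slots.
int exclusiveSlots(const std::vector<int>& latencies) {
  int slots = 0;
  for (int latency : latencies) slots += latency;
  return slots;
}

// Inclusive (optimistic assumption: every operation is pipelinable or has its
// own FU): each operation only needs one issue cycle on the tile.
int inclusiveSlots(const std::vector<int>& latencies) {
  return static_cast<int>(latencies.size());
}
```

With one 3-cycle operation and two 1-cycle operations on the same tile, this toy bound is 5 slots for exclusive and 3 for inclusive. It is only a resource lower bound under the stated assumptions; whether the mapper actually finds a smaller II depends on routing and data dependences, which the sketch ignores.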

@@ -64,6 +64,10 @@ CGRANode::CGRANode(int t_id, int t_x, int t_y) {
  m_mapped = false;
  m_DVFSLatencyMultiple = 1;
  m_synced = false;

  // Indicates whether this CGRA node can execute multiple operations
  // simultaneously. (e.g., single-cycle overlaps with multi-cycle)
Owner

This comment means exactly "inclusive", right?

Collaborator Author

yes

@@ -293,6 +298,9 @@ bool CGRANode::canOccupy(DFGNode* t_opt, int t_cycle, int t_II) {
  if (not t_opt->isMultiCycleExec(getDVFSLatencyMultiple())) {
    // Single-cycle opt:
    for (int cycle=t_cycle%t_II; cycle<m_cycleBoundary; cycle+=t_II) {
      if (!canMultipleOps() && !m_dfgNodesWithOccupyStatus[cycle]->empty()) {
Owner

This is the key implementation of "inclusive", right? If so, please add a comment. Please also mention why we don't need to handle "exclusive" and "distributed" explicitly here.
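
For readers following the thread, the distinction can be sketched with a simplified occupancy model. This is not the code in CGRANode.cpp: `ToyOp`, `ToyTile`, and the `fu`/`pipelined` fields are hypothetical, and the sharing rule is just one reading of the PR description (operations may share a tile if they use different FUs or are pipelinable). Cycles are assumed non-negative.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct ToyOp {
  int fu;          // hypothetical functional-unit id
  int latency;     // execution latency in cycles
  bool pipelined;  // whether the FU can accept a new op every cycle
};

struct ToyTile {
  struct Occupant { int fu; bool pipelined; bool issue; };

  std::vector<std::vector<Occupant>> slots;  // one entry per cycle in the II
  bool canMultipleOps;                       // true models "inclusive"

  ToyTile(int ii, bool can_multiple_ops)
      : slots(static_cast<std::size_t>(ii)), canMultipleOps(can_multiple_ops) {}

  bool canOccupy(const ToyOp& op, int cycle) const {
    const int ii = static_cast<int>(slots.size());
    // An op longer than II would overlap its own next iteration unless it
    // is pipelined on a tile that allows overlap.
    if (op.latency > ii && !(canMultipleOps && op.pipelined)) return false;
    for (int offset = 0; offset < op.latency; ++offset) {
      for (const Occupant& other : slots[(cycle + offset) % ii]) {
        if (!canMultipleOps) return false;  // exclusive: any occupant blocks
        if (other.fu != op.fu) continue;    // different FUs may overlap
        const bool both_pipelined = other.pipelined && op.pipelined;
        const bool same_issue_cycle = other.issue && offset == 0;
        if (!both_pipelined || same_issue_cycle) return false;
      }
    }
    return true;
  }

  void occupy(const ToyOp& op, int cycle) {
    const int ii = static_cast<int>(slots.size());
    for (int offset = 0; offset < op.latency; ++offset) {
      slots[(cycle + offset) % ii].push_back({op.fu, op.pipelined, offset == 0});
    }
  }
};
```

In this model only the inclusive case needs extra logic: with `canMultipleOps` set to false, the check falls back to "the slot must be empty", which is the exclusive behavior. Nodes that were already split into single-cycle pieces would go through that same default path, which may be why the distributed strategy needs no special handling here; that reading is an assumption, not something confirmed in the thread.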

@@ -302,6 +310,19 @@ bool CGRANode::canOccupy(DFGNode* t_opt, int t_cycle, int t_II) {
  } else {
    // Multi-cycle opt.
    for (int cycle=t_cycle%t_II; cycle<m_cycleBoundary; cycle+=t_II) {
      // Can not support simultaneous execution of multiple operations.
      if (!canMultipleOps()) {
        int exec_latency = t_opt->getExecLatency(getDVFSLatencyMultiple());
Owner

Why do we need getDVFSLatencyMultiple()? I don't quite understand what we are trying to handle here.

Collaborator Author

Actually, I am not familiar with the DVFS code. I saw that other places get the latency in a similar way, e.g. t_opt->isMultiCycleExec(getDVFSLatencyMultiple()) (CGRANode.cpp:298), so I followed that pattern.

Labels
new feature New feature or request
3 participants