@changhai0109 commented Dec 7, 2023

Summary

This PR contains commits that belong to #151; the new commits start from 39c83b0. It was tested with chakra updated to mlcommons/chakra#76. It is not clear whether the original chakra works.

  • Added attr checking before issuing nodes. (Dropped: not supported, because the pending chakra PR for optional attrs has not been merged.)
  • Re-categorized node type to mem/comm/gpu_comp/cpu_comp, updated corresponding hw_resources types.
  • Fixed bug of queue_levels when queues.size()=1. (Dropped: covered in another PR.)
  • Updated workload generation script in example runs.
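
To illustrate the re-categorization bullet above, here is a minimal sketch of how nodes could be sorted into the four categories and tracked per hardware resource type. All names (NodeCategory, SketchNode, categorize, HwResourceSketch) are hypothetical and chosen for illustration; they are not the actual types in this PR or in chakra, and the classification order is an assumption.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical categories mirroring the four names in the summary:
// mem / comm / gpu_comp / cpu_comp.
enum class NodeCategory { Mem, Comm, GpuComp, CpuComp };

// Hypothetical, simplified stand-in for a workload node (not the chakra type).
struct SketchNode {
    bool is_mem;     // memory load/store node
    bool is_comm;    // communication node
    bool is_cpu_op;  // flagged as running on the CPU
};

// Assumption: memory and communication are classified first, so the CPU flag
// only decides between the two compute categories.
inline NodeCategory categorize(const SketchNode& n) {
    if (n.is_mem) return NodeCategory::Mem;
    if (n.is_comm) return NodeCategory::Comm;
    return n.is_cpu_op ? NodeCategory::CpuComp : NodeCategory::GpuComp;
}

// Hypothetical resource tracker: one capacity and one in-flight counter
// per category.
class HwResourceSketch {
public:
    explicit HwResourceSketch(const std::array<int, 4>& capacity)
        : capacity_(capacity), in_flight_{} {}

    bool is_available(NodeCategory c) const {
        return in_flight_[idx(c)] < capacity_[idx(c)];
    }

    // Returns false (and changes nothing) when the category is saturated.
    bool occupy(NodeCategory c) {
        if (!is_available(c)) return false;
        ++in_flight_[idx(c)];
        return true;
    }

    void release(NodeCategory c) {
        assert(in_flight_[idx(c)] > 0);
        --in_flight_[idx(c)];
    }

private:
    static std::size_t idx(NodeCategory c) { return static_cast<std::size_t>(c); }
    std::array<int, 4> capacity_;
    std::array<int, 4> in_flight_;
};
```

With this layout, a communication node that also carries the CPU flag still lands in the comm category, and a saturated comm resource does not block a gpu_comp node from being issued.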

Test Plan

Tested using runs/example, assuming the workload was generated with the new chakra (with #76, and after the text converter update in #78, to make sure the generated workloads are correct).

Positive case:

cd runs/example
./run.sh

which produces the following output:

[2023-12-07 11:55:08.693] [topology::RingTopology] [info] ring of node 0, id: 0 dimension: local total nodes in ring: 64 index in ring: 0total nodes in ring: 1
[2023-12-07 11:55:08.693] [topology::RingTopology] [info] ring of node 0, id: 0 dimension: local total nodes in ring: 64 index in ring: 0total nodes in ring: 1
[2023-12-07 11:55:08.693] [topology::RingTopology] [info] ring of node 0, id: 0 dimension: local total nodes in ring: 64 index in ring: 0total nodes in ring: 1
[2023-12-07 11:55:08.693] [topology::RingTopology] [info] ring of node 0, id: 0 dimension: local total nodes in ring: 64 index in ring: 0total nodes in ring: 1
[2023-12-07 11:55:14.567] [workload] [info] sys[8] finished, 1082454160 cycles
[2023-12-07 11:55:14.567] [workload] [info] sys[9] finished, 1082454160 cycles
[2023-12-07 11:55:14.567] [workload] [info] sys[10] finished, 1082454160 cycles
[2023-12-07 11:55:14.567] [workload] [info] sys[11] finished, 1082454160 cycles
[2023-12-07 11:55:14.567] [workload] [info] sys[12] finished, 1082454160 cycles
[2023-12-07 11:55:14.567] [workload] [info] sys[13] finished, 1082454160 cycles
[2023-12-07 11:55:14.567] [workload] [info] sys[14] finished, 1082454160 cycles
[2023-12-07 11:55:14.567] [workload] [info] sys[15] finished, 1082454160 cycles
[2023-12-07 11:55:14.567] [workload] [info] sys[16] finished, 1082454160 cycles
[2023-12-07 11:55:14.568] [workload] [info] sys[17] finished, 1082454160 cycles
[2023-12-07 11:55:14.568] [workload] [info] sys[18] finished, 1082454160 cycles
[2023-12-07 11:55:14.568] [workload] [info] sys[19] finished, 1082454160 cycles
[2023-12-07 11:55:14.568] [workload] [info] sys[20] finished, 1082454160 cycles
[2023-12-07 11:55:14.568] [workload] [info] sys[21] finished, 1082454160 cycles
[2023-12-07 11:55:14.568] [workload] [info] sys[22] finished, 1082454160 cycles
[2023-12-07 11:55:14.568] [workload] [info] sys[23] finished, 1082454160 cycles
[2023-12-07 11:55:14.568] [workload] [info] sys[24] finished, 1082454160 cycles
[2023-12-07 11:55:14.568] [workload] [info] sys[25] finished, 1082454160 cycles
[2023-12-07 11:55:14.568] [workload] [info] sys[26] finished, 1082454160 cycles
[2023-12-07 11:55:14.568] [workload] [info] sys[27] finished, 1082454160 cycles
[2023-12-07 11:55:14.568] [workload] [info] sys[28] finished, 1082454160 cycles
[2023-12-07 11:55:14.568] [workload] [info] sys[29] finished, 1082454160 cycles
[2023-12-07 11:55:14.568] [workload] [info] sys[30] finished, 1082454160 cycles
[2023-12-07 11:55:14.568] [workload] [info] sys[31] finished, 1082454160 cycles
[2023-12-07 11:55:14.569] [workload] [info] sys[32] finished, 1082454160 cycles
[2023-12-07 11:55:14.569] [workload] [info] sys[33] finished, 1082454160 cycles
[2023-12-07 11:55:14.569] [workload] [info] sys[34] finished, 1082454160 cycles
[2023-12-07 11:55:14.569] [workload] [info] sys[35] finished, 1082454160 cycles
[2023-12-07 11:55:14.569] [workload] [info] sys[36] finished, 1082454160 cycles
[2023-12-07 11:55:14.569] [workload] [info] sys[37] finished, 1082454160 cycles
[2023-12-07 11:55:14.569] [workload] [info] sys[38] finished, 1082454160 cycles
[2023-12-07 11:55:14.569] [workload] [info] sys[39] finished, 1082454160 cycles
[2023-12-07 11:55:14.569] [workload] [info] sys[40] finished, 1082454160 cycles
[2023-12-07 11:55:14.569] [workload] [info] sys[41] finished, 1082454160 cycles
[2023-12-07 11:55:14.569] [workload] [info] sys[42] finished, 1082454160 cycles
[2023-12-07 11:55:14.569] [workload] [info] sys[43] finished, 1082454160 cycles
[2023-12-07 11:55:14.569] [workload] [info] sys[44] finished, 1082454160 cycles
[2023-12-07 11:55:14.569] [workload] [info] sys[45] finished, 1082454160 cycles
[2023-12-07 11:55:14.569] [workload] [info] sys[46] finished, 1082454160 cycles
[2023-12-07 11:55:14.569] [workload] [info] sys[47] finished, 1082454160 cycles
[2023-12-07 11:55:14.570] [workload] [info] sys[48] finished, 1082454160 cycles
[2023-12-07 11:55:14.570] [workload] [info] sys[49] finished, 1082454160 cycles
[2023-12-07 11:55:14.570] [workload] [info] sys[50] finished, 1082454160 cycles
[2023-12-07 11:55:14.570] [workload] [info] sys[51] finished, 1082454160 cycles
[2023-12-07 11:55:14.570] [workload] [info] sys[52] finished, 1082454160 cycles
[2023-12-07 11:55:14.570] [workload] [info] sys[53] finished, 1082454160 cycles
[2023-12-07 11:55:14.570] [workload] [info] sys[54] finished, 1082454160 cycles
[2023-12-07 11:55:14.570] [workload] [info] sys[55] finished, 1082454160 cycles
[2023-12-07 11:55:14.570] [workload] [info] sys[56] finished, 1082454160 cycles
[2023-12-07 11:55:14.570] [workload] [info] sys[57] finished, 1082454160 cycles
[2023-12-07 11:55:14.570] [workload] [info] sys[58] finished, 1082454160 cycles
[2023-12-07 11:55:14.570] [workload] [info] sys[59] finished, 1082454160 cycles
[2023-12-07 11:55:14.570] [workload] [info] sys[60] finished, 1082454160 cycles
[2023-12-07 11:55:14.570] [workload] [info] sys[61] finished, 1082454160 cycles
[2023-12-07 11:55:14.570] [workload] [info] sys[62] finished, 1082454160 cycles
[2023-12-07 11:55:14.571] [workload] [info] sys[63] finished, 1082454160 cycles
[2023-12-07 11:55:14.571] [workload] [info] sys[0] finished, 1082454160 cycles
[2023-12-07 11:55:14.571] [workload] [info] sys[1] finished, 1082454160 cycles
[2023-12-07 11:55:14.571] [workload] [info] sys[2] finished, 1082454160 cycles
[2023-12-07 11:55:14.571] [workload] [info] sys[3] finished, 1082454160 cycles
[2023-12-07 11:55:14.571] [workload] [info] sys[4] finished, 1082454160 cycles
[2023-12-07 11:55:14.571] [workload] [info] sys[5] finished, 1082454160 cycles
[2023-12-07 11:55:14.571] [workload] [info] sys[6] finished, 1082454160 cycles
[2023-12-07 11:55:14.571] [workload] [info] sys[7] finished, 1082454160 cycles
[2023-12-07 11:55:14.573] [System] [info] Exiting

Negative case:

Because there is no attr check, there is no negative case for now.

Additional Notes

This should be merged after PR #151 so that it can fast-forward onto a clean history.
Fixed a bug: in the previous example, communication nodes were routed to skip_invalid because node->is_cpu_op()==True and node->runtime()==0. According to workload.cc:130, such a node is treated as invalid, which is not correct for communication nodes.
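
The routing problem described above can be sketched as follows. This is not the code at workload.cc:130; the struct, the enum, and both functions are hypothetical, and the "after" variant shows only one plausible way to avoid the misrouting (checking the communication category before the zero-runtime CPU check).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified view of a node for routing purposes.
struct RoutedNodeSketch {
    bool is_comm;          // communication node
    bool is_cpu_op;        // flagged as a CPU op
    std::uint64_t runtime; // recorded runtime; 0 for the nodes in question
};

enum class Route { Issue, SkipInvalid };

// Behavior as described in the note: only the CPU flag and the runtime are
// inspected, so a communication node with runtime 0 is treated as invalid.
inline Route route_before_fix(const RoutedNodeSketch& n) {
    if (n.is_cpu_op && n.runtime == 0) return Route::SkipInvalid;
    return Route::Issue;
}

// Assumed fix: communication nodes are recognized first and always issued;
// the zero-runtime check then applies to compute nodes only.
inline Route route_after_fix(const RoutedNodeSketch& n) {
    if (n.is_comm) return Route::Issue;
    if (n.is_cpu_op && n.runtime == 0) return Route::SkipInvalid;
    return Route::Issue;
}
```

A communication node with the CPU flag set and a runtime of 0 is skipped by the first function and issued by the second, while an empty CPU compute node is skipped by both.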

@changhai0109 force-pushed the changhai-improve-workload-layer branch 2 times, most recently from 3bdf217 to 8413047, January 19, 2024 21:03
@changhai0109 force-pushed the changhai-improve-workload-layer branch from 8413047 to da258d6, January 25, 2024 21:33
@changhai0109
Contributor Author

This needs the updated chakra; however, the corresponding chakra PRs are still pending merge.

@jinsun-yoo
Collaborator

Will resubmit after rebase & cleanup

@willjwon added the enhancement, bugfix, and workload-layer labels, Apr 24, 2024
@changhai0109 force-pushed the changhai-improve-workload-layer branch from eacf93e to 3820106, April 29, 2024 23:19
@changhai0109 force-pushed the changhai-improve-workload-layer branch from 3820106 to 703352a, May 28, 2024 21:23
@changhai0109
Contributor Author

Finished the rebase and cleanup.

@changhai0109
Contributor Author

Ready for review.

@changhai0109
Contributor Author

Old PR, dropped.
