Thread 0 needs to load the shared memory data of threads T0, T8, T16, and T24 into its registers.

Correct.

CUTLASS has a special shared memory store layout.

Correct.

Slides 45-48 just take a deep dive into one example to show why the special layout avoids bank conflicts when loading from shared memory into registers.
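To make the bank-conflict point concrete, here is a minimal host-side sketch in Python. It is not CUTLASS's actual layout and not the example from the slides. It assumes shared memory with 32 banks of 4-byte words, a hypothetical 32x32 tile of 32-bit words, and a simple XOR swizzle of the column index by the row index. The names `linear_index`, `swizzled_index`, and `max_conflict_degree` are made up for illustration.

```python
from collections import Counter

NUM_BANKS = 32   # assumption: 32 banks, each one 32-bit word wide
TILE_COLS = 32   # hypothetical tile width, in 32-bit words
WARP_SIZE = 32


def linear_index(row, col):
    """Plain row-major word index inside the tile."""
    return row * TILE_COLS + col


def swizzled_index(row, col):
    """Row-major index with the column XOR-ed by the row (illustrative swizzle)."""
    return row * TILE_COLS + (col ^ (row % TILE_COLS))


def max_conflict_degree(index_fn, col):
    """Thread t reads element (t, col); return the largest number of
    threads in the warp that land on the same bank."""
    banks = Counter(index_fn(t, col) % NUM_BANKS for t in range(WARP_SIZE))
    return max(banks.values())


if __name__ == "__main__":
    # Column read with the plain layout: every thread hits the same bank.
    print("linear  :", max_conflict_degree(linear_index, 5))
    # Same logical read with the swizzled layout: every thread hits a different bank.
    print("swizzled:", max_conflict_degree(swizzled_index, 5))
```

In this sketch the swizzle only permutes elements within a row, so every element still has a unique slot. The store side has to apply the same mapping, which is why the layout used when writing to shared memory matters for the later loads into registers.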

Answer selected by MARD1NO