Hi, @hsharma35,
I've been reading the code in accelerator.py, and I'm confused about how compute_cycles is calculated.
In the get_compute_cycles function:
    """
    Compute instruction
    args:
        ic: Input Channels
        oc: Output Channels
        ow: Output Width
        oh: Output Height
        kw: Kernel Width
        kh: Kernel Height
        b: Batch Size
        im2col: boolean. If true, we assume the cpu does im2col. Otherwise,
                we do convolutions channel-wise
    """
    overhead = 0
    if im2col:
        ni = kw * kh * ic
        no = oc
        batch = b * oh * ow
        compute_cycles = batch * ceil_a_by_b(no, self.M) * \
                (ceil_a_by_b(ni, self.N * self.get_perf_factor(iprec, wprec)) + overhead)
    else:
        compute_cycles = b * ceil_a_by_b(oc, self.M) * \
                ow * oh * kw * kh * \
                (ceil_a_by_b(ic, self.N * self.get_perf_factor(iprec, wprec)) + overhead)
    return compute_cycles
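To check my reading of the formula, I rewrote it as a standalone sketch. This is not the repository code: `M`, `N`, and `perf_factor` are passed in as plain arguments (in accelerator.py they come from `self`), and I am assuming `ceil_a_by_b` is ordinary ceiling division.

```python
def ceil_a_by_b(a, b):
    """Ceiling division; assumed to match the helper in accelerator.py."""
    return -(-a // b)


def compute_cycles_sketch(ic, oc, ow, oh, kw, kh, b, N, M,
                          perf_factor=1, im2col=True, overhead=0):
    """Standalone restatement of the quoted formula (hypothetical helper).

    N, M: array dimensions, taken as plain arguments here.
    perf_factor: stands in for self.get_perf_factor(iprec, wprec).
    """
    lanes = N * perf_factor
    if im2col:
        # The whole kw * kh * ic reduction is tiled across the N dimension.
        ni = kw * kh * ic
        batch = b * oh * ow
        return batch * ceil_a_by_b(oc, M) * (ceil_a_by_b(ni, lanes) + overhead)
    # Channel-wise: only ic is tiled; each kernel position costs its own pass.
    return (b * ceil_a_by_b(oc, M) * ow * oh * kw * kh
            * (ceil_a_by_b(ic, lanes) + overhead))


# 3x3 kernel, 3 input channels, 32 output channels, 8x8 output, 8x8 array.
with_im2col = compute_cycles_sketch(3, 32, 8, 8, 3, 3, 1, N=8, M=8, im2col=True)
channel_wise = compute_cycles_sketch(3, 32, 8, 8, 3, 3, 1, N=8, M=8, im2col=False)
```

With these made-up sizes the two branches differ only in how the reduction dimension is padded up to a multiple of `N`, and `overhead` is added once per reduction tile group in both.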
My questions are:
- In a systolic array, the partial sums produced by the PEs need to propagate downward to the bottom each cycle. Is the forwarding latency considered (for example, in a 3×3 systolic array, the first output would need to wait three cycles, corresponding to the array’s height)?
- If the above assumption is correct, does the `overhead` account for this? If not, what exactly is the purpose of the `overhead`?
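To make the two questions concrete, here is a small sketch of the two ways I can imagine the forwarding latency being counted. Both are my own assumptions, not something I found in the code: `array_height` is a hypothetical parameter, the performance factor is taken to be 1, and only the im2col branch is modeled.

```python
def ceil_a_by_b(a, b):
    """Ceiling division; assumed to match the helper in accelerator.py."""
    return -(-a // b)


def im2col_cycles(ni, no, batch, N, M, overhead=0):
    """im2col branch of the quoted formula, with perf factor assumed to be 1."""
    return batch * ceil_a_by_b(no, M) * (ceil_a_by_b(ni, N) + overhead)


def cycles_with_one_time_fill(ni, no, batch, N, M, array_height):
    """Hypothetical model: pay the pipeline fill latency once for the whole layer."""
    return im2col_cycles(ni, no, batch, N, M) + array_height


# 3x3 array, as in the example in my first question.
array_height = 3
steady = im2col_cycles(ni=27, no=3, batch=4, N=3, M=3)
# Reading 1: the latency is folded into `overhead`, so it is paid per output tile.
per_tile = im2col_cycles(ni=27, no=3, batch=4, N=3, M=3, overhead=array_height)
# Reading 2: the latency is paid once, and the pipeline stays full afterwards.
once = cycles_with_one_time_fill(27, 3, 4, 3, 3, array_height)
```

Because `overhead` sits inside the multiplied term, a nonzero value is charged for every output position and output-channel tile, which is quite different from a one-time fill. That is why I am unsure which of the two (if either) `overhead` is meant to represent.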