Lack of gradient calculation for Feedforward layer in ops.py

The function `convert_to_bwd_base_ops` of `class FeedForwardOp(Op)` only calculates the gradient used to update the weights. May I know if there is a reason this function ignores the gradient that should be passed on to the next layer in the backward pass (i.e. the gradient with respect to the layer's input)? The current implementation of `convert_to_bwd_base_ops` is shown below:

> 	def convert_to_bwd_base_ops(self):
> 		"""Convert operation to backward base operations"""
> 		self.bwd_base_ops = []
> 
> 		if not self.fwd_base_ops: self.convert_to_fwd_base_ops()
> 
> 		# Incoming gradients are assumed to be in the activation buffer
> 		del_f_size = (self.input_size[0], self.input_size[1], self.ff_weight_size[2])
> 
> 		# Get weight update matrix (del_W = x_[i-1].T * del_i)
> 		ff_op = MatrixMultOp(f'{self.op_name}_f[wgt]', self.config, [], Op.transpose_size(self.input_size), del_f_size, mode='bwd')
> 		self.bwd_base_ops.append(ff_op) 
> 		assert self.ff_weight_size == ff_op.output_size()
> 
> 		self.bwd_base_ops.append(MemoryStoreOp(f'{self.op_name}_f[wgt]-s', self.config, self.ff_weight_size, 'weight', overwrite=True))
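For reference, here is a minimal sketch of what I would expect the backward pass to contain. It assumes the layer computes a plain `y = x @ W` (no bias or activation) and uses small 2-D lists instead of the simulator's size tuples. The helpers `matmul`, `transpose` and `feed_forward_backward` are hypothetical names for illustration only and are not part of `ops.py`.

```python
# Sketch of the two gradients of a linear layer y = x @ W.
# Assumption: no bias, no activation; names are illustrative, not from ops.py.

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    assert len(a[0]) == len(b), "inner dimensions must match"
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

def transpose(m):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(row) for row in zip(*m)]

def feed_forward_backward(x, w, del_out):
    """Return (del_w, del_x) for y = x @ w given the incoming gradient del_out."""
    # Weight update, as in the existing code: del_W = x.T @ del_out
    del_w = matmul(transpose(x), del_out)
    # Gradient for the next layer in the backward pass: del_x = del_out @ W.T
    del_x = matmul(del_out, transpose(w))
    return del_w, del_x

x = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0]]        # shape (2, 3): 2 tokens, 3 input features
w = [[1.0, 0.0],
     [0.0, 1.0],
     [1.0, 1.0]]             # shape (3, 2): 3 input features, 2 outputs
del_out = [[1.0, 0.0],
           [0.0, 1.0]]       # shape (2, 2): same shape as y

del_w, del_x = feed_forward_backward(x, w, del_out)
# del_w has the shape of w, del_x has the shape of x
```

Only the first product (`del_w`) seems to have a counterpart in `convert_to_bwd_base_ops`. I could not find a second matrix multiplication corresponding to `del_x`.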


Lack of gradient calculation for Feedforward layer in ops.py #11
