Incorrect application of SigmoidPrime to activated sums. #143

@AHartNtkn

In the "Put it all together" section from the lecture for module 1, there's the following code:

# Weighted sum of inputs / weights
weighted_sum = np.dot(inputs, weights)
# Activate!
activated_output = sigmoid(weighted_sum)
# Calc error
error = correct_outputs - activated_output
adjustments = error * sigmoid_derivate(activated_output)

The adjustments should be adjustments = error * sigmoid_derivate(weighted_sum). Intuitively, this is because we want the derivative of the activation function evaluated at the points defined by the weighted sums, not at the points after activation. It is the activation function we're taking the derivative of, after all.
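A minimal sketch of the corrected update (the sample inputs, targets, and the name sigmoid_derivative are hypothetical, assuming the standard derivative σ'(x) = σ(x)(1 − σ(x))):

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def sigmoid_derivative(x):
    # Derivative of sigmoid, evaluated at the raw (pre-activation) input x
    s = sigmoid(x)
    return s * (1 - s)

# Hypothetical toy data
inputs = np.array([[0, 1], [1, 0], [1, 1]])
correct_outputs = np.array([[1], [1], [0]])
weights = np.random.randn(2, 1)

# Weighted sum of inputs / weights
weighted_sum = np.dot(inputs, weights)
# Activate!
activated_output = sigmoid(weighted_sum)
# Calc error
error = correct_outputs - activated_output
# Evaluate the derivative at the weighted sums, not at the activated outputs
adjustments = error * sigmoid_derivative(weighted_sum)
```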

The NeuralNetwork class constructed throughout module 2 applies sigmoidPrime to the activated output.

self.z2_delta = self.z2_error * self.sigmoidPrime(self.activated_hidden)

should be

self.z2_delta = self.z2_error * self.sigmoidPrime(self.hidden_sum)

and

self.o_delta = self.o_error * self.sigmoidPrime(o)

should be

self.o_delta = self.o_error * self.sigmoidPrime(self.output_sum)

By sheer coincidence, this error cancels out with the error in #134. As it happens, the difference between the incorrect and correct sigmoidPrime implementations is an application of sigmoid, and the difference between hidden_sum and activated_hidden is also an application of sigmoid. That means self.sigmoidPrime(self.activated_hidden) (using the incorrect definition of sigmoidPrime) is equivalent to self.sigmoidPrime(self.hidden_sum) (using the correct definition of sigmoidPrime). As a result, the implementation of NeuralNetwork would break if sigmoid were swapped with just about any other activation function, but it works with sigmoid, seemingly by accident.
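The cancellation can be checked numerically. This sketch assumes the buggy sigmoidPrime from #134 is x * (1 - x) (i.e., it omits the application of sigmoid, as described above); the sample values are hypothetical:

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def sigmoidPrime_incorrect(x):
    # Assumed buggy version from #134: treats x as if it were already activated
    return x * (1 - x)

def sigmoidPrime_correct(x):
    # True derivative of sigmoid at the raw input x
    s = sigmoid(x)
    return s * (1 - s)

hidden_sum = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
activated_hidden = sigmoid(hidden_sum)

# Two wrongs make a right: the buggy derivative applied to the activation
# equals the correct derivative applied to the pre-activation sum.
print(np.allclose(sigmoidPrime_incorrect(activated_hidden),
                  sigmoidPrime_correct(hidden_sum)))  # True
```

This equivalence holds only because sigmoidPrime factors through sigmoid itself; with almost any other activation (e.g. tanh or ReLU) the two expressions would differ.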
