Hi Jordi,
First of all, thanks so much for the notebooks. They really help me follow the course.
I have one question about nnCostFunction in your notebook 4, where J = ... np.sum((np.log(a3.T)*(y_matrix)+np.log(1-a3).T*(1-y_matrix))).
I think this performs matrix multiplication, giving a 10*10 matrix (n_label * n_label). Let's call this cost matrix Jc. Jc contains not only how the predicted values for one label differ from their corresponding targets (the diagonal elements), but also how they differ from the targets of other labels (the off-diagonal elements). For example, the multiplication would multiply the column of predicted values in np.log(a3.T) for one label (e.g. k) with every column of targets.
Then the code sums all elements of this matrix, which seems to overcount J. Instead of summing all the elements, I think only the diagonal elements are needed.
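To make the concern concrete, here is a small sketch with made-up data. The shapes (5 examples, 3 labels), the random values, and the helper name cost_terms are my own assumptions for illustration and are not taken from the notebook. I also leave out the -1/m factor. It compares summing every element of the matrix product with summing only its diagonal.

```python
import numpy as np

rng = np.random.default_rng(0)
m, K = 5, 3  # hypothetical sizes: 5 examples, 3 labels

# Hypothetical predictions in (0, 1): one row per example, one column per label
a3 = rng.uniform(0.05, 0.95, size=(m, K))
# Hypothetical one-hot targets with the same layout
labels = rng.integers(0, K, size=m)
y_matrix = np.eye(K)[labels]


def cost_terms(a3, y_matrix):
    """Return the (K, K) matrix-product form Jc and the element-wise sum."""
    log_p, log_q = np.log(a3), np.log(1 - a3)
    # Matrix product: (K, m) @ (m, K) -> (K, K), the "Jc" described above
    Jc = log_p.T @ y_matrix + log_q.T @ (1 - y_matrix)
    # Element-wise product: each prediction meets only its own target
    elementwise = np.sum(log_p * y_matrix + log_q * (1 - y_matrix))
    return Jc, elementwise


Jc, elementwise = cost_terms(a3, y_matrix)
sum_all = Jc.sum()        # includes the off-diagonal cross-label terms
sum_diag = np.trace(Jc)   # diagonal only

print("sum of all elements:", sum_all)
print("sum of diagonal:    ", sum_diag)
print("element-wise sum:   ", elementwise)
```

In this sketch the diagonal sum matches the element-wise sum, while the sum over all elements is larger in magnitude. One thing I am not sure about: if a3 and y_matrix are plain NumPy arrays (np.ndarray) rather than np.matrix objects, then * is already element-wise and the sum in the notebook would not include the cross-label terms.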
Please see this picture, which accompanies my description in case it is confusing.

Please let me know if I misunderstood the code.
Best regards and thanks again,
-Tua