Matrix sum in Neural network's cost function

Hi Jordi,

First of all, thanks so much for the notebooks. They really help me follow the course.
I have one question about **nnCostFunction** in your **notebook 4**, where `J = ... np.sum((np.log(a3.T)*(y_matrix)+np.log(1-a3).T*(1-y_matrix)))`.

I think this does matrix multiplication, giving a 10×10 matrix (n_label × n_label). Let's call this cost matrix **Jc**. **Jc** contains not only how the predicted values for one label differ from their corresponding targets (the diagonal elements), but also how they differ from the targets of other labels (the off-diagonal elements). For example, the multiplication would multiply the column of predicted values `np.log(a3.T)` for one label (e.g. k) with every column of targets.

Then the code sums all elements of this matrix. This seems to over-calculate **J**. Instead of summing all the elements, I think only the diagonal elements are needed. 
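To make the two readings concrete, here is a minimal sketch (not the notebook's code) that compares them on small random data. The sizes, the random inputs, and the way `Jc` is built are my assumptions for illustration. Whether `*` is a matrix product depends on the operand types: for plain `np.ndarray` it is element-wise, while for `np.matrix` it is a matrix product. I have not checked which type the notebook uses.

```python
import numpy as np

rng = np.random.default_rng(0)
m, k = 6, 3  # hypothetical sizes: 6 examples, 3 labels (the notebook has 10 labels)

# Hypothetical stand-ins for the notebook's variables.
a3 = rng.uniform(0.05, 0.95, size=(k, m))          # predictions, one column per example
y_matrix = np.eye(k)[rng.integers(0, k, size=m)]   # one-hot targets, shape (m, k)

log_a3_T = np.log(a3.T)                            # shape (m, k)

# Reading 1: operands are ndarrays, so * multiplies element by element.
elementwise_sum = np.sum(log_a3_T * y_matrix)

# Reading 2 (my assumption of the matrix-product reading): a (k, k) matrix Jc
# whose entry [i, j] pairs predictions for label i with targets for label j.
Jc = log_a3_T.T @ y_matrix
full_sum = np.sum(Jc)      # includes off-diagonal (cross-label) terms
diag_sum = np.trace(Jc)    # diagonal terms only

print("element-wise sum:", elementwise_sum)
print("trace of Jc:     ", diag_sum)
print("sum of all Jc:   ", full_sum)
```

In this sketch the trace of `Jc` equals the element-wise sum, while the sum over all of `Jc` also picks up the cross-label terms. So if the operands are ndarrays, summing the element-wise product would already give the diagonal-only value.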

Please see this picture, which accompanies my description in case it is confusing.
![img_20170829_155209](https://user-images.githubusercontent.com/26705716/29824563-3029bc38-8cd2-11e7-9d48-1eef31ebb50f.jpg)

Please let me know if I misunderstood the code.

Best regards and thanks again,
-Tua
