- x = features
- f(x) = the function that makes a prediction/classification based on the inputs
- Need a loss function
- Need an optimization function
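The four bullets above can be sketched end to end. This is a minimal, hypothetical example (a linear model, squared-error loss, plain gradient descent); the specific shapes and learning rate are assumptions, not from the notes:

```python
import numpy as np

# x = features; f(x) = w.x + b makes the prediction
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))            # 100 samples, 3 features
true_w = np.array([1.0, -2.0, 0.5])      # assumed "ground truth" for the demo
y = X @ true_w + 0.1 * rng.normal(size=100)

w = np.zeros(3)
b = 0.0
lr = 0.1                                  # assumed learning rate
for _ in range(200):
    pred = X @ w + b                      # f(x): prediction from the inputs
    err = pred - y
    loss = np.mean(err ** 2)              # the loss function
    grad_w = 2 * X.T @ err / len(y)       # the optimization step:
    grad_b = 2 * err.mean()               # gradient descent on the loss
    w -= lr * grad_w
    b -= lr * grad_b

print(np.round(w, 2))                     # should approach true_w
```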

- Pink: input
- Blue: H1
- Yellow: H2
- Green: Output
- Need to make sure that the shapes of the matrices are correct

- Output matrix of first step is 1x3
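A quick shape check for the layers above. The hidden-layer sizes here are assumptions for illustration (a 1×2 input and a 3-unit first hidden layer, consistent with the 1×3 first-step output in the notes):

```python
import numpy as np

x  = np.ones((1, 2))     # input (pink), assumed shape 1x2
W1 = np.ones((2, 3))     # input -> H1 (blue)
W2 = np.ones((3, 2))     # H1 -> H2 (yellow)
W3 = np.ones((2, 1))     # H2 -> output (green)

h1  = x @ W1             # (1x2)(2x3) -> 1x3, the first-step output
h2  = h1 @ W2            # (1x3)(3x2) -> 1x2
out = h2 @ W3            # (1x2)(2x1) -> 1x1

print(h1.shape, h2.shape, out.shape)
```

If an inner dimension does not match, numpy raises an error immediately, which is exactly the shape check the notes call for.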

- The output layer is usually trying to produce a probability distribution
- Sigmoid will give a value between 0 and 1 for each output independently
- Softmax will give values that all sum to 1
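A small sketch of the difference. Sigmoid squashes each value independently into (0, 1); softmax normalizes the whole vector so it sums to 1:

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))      # each value independently in (0, 1)

def softmax(z):
    e = np.exp(z - z.max())          # subtract max for numerical stability
    return e / e.sum()               # all values sum to 1

z = np.array([2.0, 1.0, 0.1])
print(sigmoid(z))                    # three independent values in (0, 1)
print(softmax(z))                    # a probability distribution over 3 classes
```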
- Max of 32k, so English text needs to be broken down into tokens
- This means things will likely run faster, because you are working with the tokenized values instead of raw text
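A toy sketch of tokenization: mapping words to integer IDs under a vocabulary cap. Real models use subword schemes (e.g. BPE) rather than whole words, and the 32k cap here is just the figure from the notes applied as a vocabulary limit, which is an assumption:

```python
# Toy word-level tokenizer (hypothetical; real tokenizers use subwords)
vocab = {}

def tokenize(text, max_vocab=32_000):
    ids = []
    for word in text.lower().split():
        if word not in vocab:
            if len(vocab) >= max_vocab:
                raise ValueError("vocabulary full")
            vocab[word] = len(vocab)  # assign the next free ID
        ids.append(vocab[word])
    return ids

print(tokenize("the model sees the tokens"))  # → [0, 1, 2, 0, 3]
```

The model then operates on these integer IDs, which is why the notes say working with tokenized values tends to be faster than working with raw text.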
- We will be using back propagation
- We will calculate a loss based on how far off our answers were from the correct answer
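Backpropagation at its smallest: compute how far the answer is from the correct one, then push the gradient backwards through the chain rule. This is a hypothetical one-neuron sketch (sigmoid activation, squared-error loss), with a numerical check that the gradient is right:

```python
import numpy as np

x, y_true = 1.5, 1.0                 # assumed input and correct answer
w, b = 0.4, 0.1                      # assumed starting parameters

# Forward pass
z = w * x + b
y = 1 / (1 + np.exp(-z))             # prediction
loss = (y - y_true) ** 2             # how far off we were

# Backward pass (chain rule, one link at a time)
dloss_dy = 2 * (y - y_true)
dy_dz = y * (1 - y)                  # derivative of sigmoid
dz_dw = x
grad_w = dloss_dy * dy_dz * dz_dw    # backpropagated gradient for w

# Sanity check with a finite difference
eps = 1e-6
z2 = (w + eps) * x + b
y2 = 1 / (1 + np.exp(-z2))
num_grad = ((y2 - y_true) ** 2 - loss) / eps

print(grad_w, num_grad)              # the two should agree closely
```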
- Entropy
- Measures uncertainty
- An unfair coin (two heads) has an entropy of 0
- The outcome is certain
- A fair coin has an entropy of 1 bit
- 50/50
- A fair 1000-sided die has much higher entropy (~10 bits)
- The outcome is far less certain: 1 in 1000
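The three entropy examples can be computed directly from the definition H(p) = -Σ p·log2(p):

```python
import numpy as np

def entropy(p):
    p = np.asarray(p, dtype=float)
    p = p[p > 0]                        # treat 0 * log(0) as 0
    return -(p * np.log2(p)).sum()      # entropy in bits

print(entropy([1.0, 0.0]))              # two-headed coin: 0 bits, certain
print(entropy([0.5, 0.5]))              # fair coin: 1 bit, 50/50
print(entropy([1 / 1000] * 1000))       # fair 1000-sided die: ~9.97 bits
```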

- D is the divergence value (KL divergence): how far the predicted distribution is from the true one
- Loss will never be exactly 0 (if it is, you did something wrong)
- There will always be some non-zero ambiguity with text
- "Today will be ______ ___" (could be a day, weather, an event, etc.)
- Useful for sparse categorical targets, where the label is a class index rather than a one-hot vector
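A sketch tying these together. With a one-hot true distribution, cross-entropy equals the KL divergence D, and the "sparse categorical" form computes the same value from just the class index. The particular numbers are assumptions for illustration:

```python
import numpy as np

def cross_entropy(p, q):
    # H(p, q) = H(p) + D_KL(p || q); for one-hot p, H(p) = 0,
    # so this is exactly the divergence D from the true distribution
    p, q = np.asarray(p), np.asarray(q)
    return -(p[p > 0] * np.log(q[p > 0])).sum()

q = np.array([0.7, 0.2, 0.1])   # model's predicted distribution (assumed)
p = np.array([1.0, 0.0, 0.0])   # one-hot truth: class 0

print(cross_entropy(p, q))      # -log(0.7): not 0, since q isn't certain

target = 0                      # sparse categorical form: just the index
print(-np.log(q[target]))       # same value, no one-hot vector needed
```

Note the loss only reaches 0 when the model assigns probability 1 to the correct class; with genuinely ambiguous text that should never happen.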
