Fixed typo in dimensions of bias term #2643
Open
Description of changes:
By submitting this pull request, I confirm that you can use, modify,
copy, and redistribute this contribution, under the terms of your
choice.
In the Vectorization section, we have
$\mathbf{X} \in \mathbb{R}^{n \times d}$ and $\mathbf{W} \in \mathbb{R}^{d \times q}$.
Now, since
$$\mathbf{O} = \mathbf{X} \mathbf{W} + \mathbf{b}$$
and
$$\hat{\mathbf{Y}} = \mathrm{softmax}(\mathbf{O}),$$
the product $\mathbf{X} \mathbf{W}$ lies in $\mathbb{R}^{n \times q}$, so we require $\mathbf{b} \in \mathbb{R}^{n \times q}$ for the sum to be defined as an elementwise matrix addition. This ensures that the softmax operation can be computed row-wise on $\mathbf{O}$.
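The shape argument can be checked with a small sketch. The code below is only an illustration in plain Python, not code from the book: the helper names `matmul`, `add_bias`, and `softmax_rows` and the sizes `n = 2`, `d = 3`, `q = 4` are assumptions chosen for the example. It also shows that a single bias row of shape $1 \times q$, repeated for each of the $n$ rows (as broadcasting does in array libraries), gives the same $\mathbf{O}$ as an explicit $n \times q$ bias.

```python
import math

def matmul(X, W):
    """Multiply an n x d matrix by a d x q matrix (lists of lists)."""
    return [[sum(x * w for x, w in zip(row, col)) for col in zip(*W)]
            for row in X]

def add_bias(XW, b):
    """Add b to XW. b is either n x q, or 1 x q (reused for every row)."""
    n = len(XW)
    if len(b) == 1:
        b = b * n  # mimic broadcasting: repeat the single row n times
    if len(b) != n or any(len(br) != len(r) for br, r in zip(b, XW)):
        raise ValueError("bias shape is incompatible with XW")
    return [[o + bi for o, bi in zip(row, brow)]
            for row, brow in zip(XW, b)]

def softmax_rows(O):
    """Apply softmax to each row of O independently."""
    out = []
    for row in O:
        m = max(row)  # subtract the row maximum for numerical stability
        exps = [math.exp(o - m) for o in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out

# Hypothetical sizes: n = 2 examples, d = 3 features, q = 4 classes.
X = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]                      # n x d
W = [[0.1 * (i + j) for j in range(4)] for i in range(3)]   # d x q
b_row = [[0.0, 0.1, 0.2, 0.3]]                              # 1 x q
b_full = b_row * 2                                          # n x q

O_row = add_bias(matmul(X, W), b_row)
O_full = add_bias(matmul(X, W), b_full)
Y_hat = softmax_rows(O_full)

print(len(O_full), len(O_full[0]))  # shape of O: n rows, q columns
print(O_row == O_full)              # both bias forms give the same O
```

Each row of `Y_hat` sums to 1, since the softmax is taken over the $q$ entries of one row of $\mathbf{O}$ at a time.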