Adjusting parameters by sign and magnitude of gradient

https://github.com/karpathy/micrograd/blame/c911406e5ace8742e5841a7e0df113ecb5d54685/demo.ipynb#L271C13-L271C45

I really appreciate your videos! Such a gift to all of us.

When adjusting parameters after computing the loss, the example multiplies the step size by the gradient itself, so both the sign _and magnitude_ of the gradient scale the update. With a steep gradient near a local minimum, a large gradient value can make the parameter jump far past the desired solution. With a shallow gradient, the parameter may struggle to reach its local minimum in the given number of iterations.

Thus, I think the adjustment should be a step size times the sign of the gradient.
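
To make the comparison concrete, here is a minimal sketch of the two update rules on a toy one-parameter loss, `curvature * p**2`. This is not the code from the notebook: the function names, the toy loss, and the values of `step_size`, `p0`, and `iterations` are all assumptions chosen for illustration.

```python
# Toy comparison of two update rules (hypothetical names, not from micrograd).
# Loss: curvature * p**2, so the gradient is 2 * curvature * p and the minimum is at p = 0.

def sign(g):
    """Return -1, 0, or 1 according to the sign of g."""
    return (g > 0) - (g < 0)

def gradient_step(p, grad, step_size):
    # Update scaled by both the sign and the magnitude of the gradient.
    return p - step_size * grad

def sign_step(p, grad, step_size):
    # Update scaled by the sign of the gradient only.
    return p - step_size * sign(grad)

def minimize(update, curvature, p0=1.0, step_size=0.1, iterations=20):
    """Apply `update` for a fixed number of iterations and return the final p."""
    p = p0
    for _ in range(iterations):
        grad = 2.0 * curvature * p
        p = update(p, grad, step_size)
    return p

# Steep (curvature=20.0) and shallow (curvature=0.01) versions of the same loss.
for curvature in (20.0, 0.01):
    print(curvature,
          minimize(gradient_step, curvature),
          minimize(sign_step, curvature))
```

With these assumed values, the magnitude-scaled update overshoots and grows on the steep loss and barely moves on the shallow one, while the sign-only update moves by exactly `step_size` per iteration in both cases. One caveat: with a fixed `step_size`, the sign-only update generally cannot settle closer to the minimum than about one step.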

What are your thoughts?
