Attention and layer normalization by PineapplePulp · Pull Request #67 · Iainmon/ChAI

PineapplePulp · 2025-04-14T02:07:26Z

Updated softmax to accept a user-specified axis, and added a matmul that broadcasts its operands with .expand(), in order to implement self-attention. Output projection layers, bias, and multi-head capability will be added in a future sprint; the class is already named MultiheadAttention in anticipation of that, even though it currently implements single-head attention only.
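For readers unfamiliar with the pieces involved, here is a rough sketch of how an axis-aware softmax and a broadcasting matmul combine into single-head self-attention. It is written in plain Python on nested lists rather than in Chapel, and it is not the code in this PR: the helper names (`matmul2d`, `batched_matmul`, `self_attention`) are hypothetical, the `[b] * len(a)` trick only imitates what `.expand()` does, and the scaled dot-product form is assumed because it is the usual formulation.

```python
import math

def softmax(x, axis=-1):
    """Softmax over one axis of a 2-D nested list (0 = columns, 1 or -1 = rows)."""
    if axis in (1, -1):
        out = []
        for row in x:
            m = max(row)  # subtract the row max for numerical stability
            exps = [math.exp(v - m) for v in row]
            total = sum(exps)
            out.append([e / total for e in exps])
        return out
    if axis == 0:
        # Softmax down the columns: transpose, apply row-wise, transpose back.
        cols = softmax([list(c) for c in zip(*x)], axis=1)
        return [list(r) for r in zip(*cols)]
    raise ValueError("axis must be 0, 1, or -1 for a 2-D input")

def matmul2d(a, b):
    """Plain (n x k) @ (k x m) matrix product."""
    b_cols = list(zip(*b))
    return [[sum(p * q for p, q in zip(row, col)) for col in b_cols] for row in a]

def batched_matmul(a, b):
    """(batch x n x k) @ (k x m): the 2-D operand is shared across the batch."""
    # Repeating a reference adds a batch dimension without copying data,
    # which is the idea behind expand (assumption: illustrative stand-in only).
    expanded = [b] * len(a)
    return [matmul2d(a_i, b_i) for a_i, b_i in zip(a, expanded)]

def self_attention(x, wq, wk, wv):
    """Single-head scaled dot-product self-attention; no bias, no output projection."""
    q, k, v = (batched_matmul(x, w) for w in (wq, wk, wv))
    scale = math.sqrt(len(wq[0]))  # square root of the query/key width
    out = []
    for q_i, k_i, v_i in zip(q, k, v):
        k_t = [list(c) for c in zip(*k_i)]
        scores = [[s / scale for s in row] for row in matmul2d(q_i, k_t)]
        # Normalize over the key axis so each query's weights sum to 1.
        out.append(matmul2d(softmax(scores, axis=-1), v_i))
    return out
```

With identity projection matrices and a batch holding one two-token sequence, `self_attention([[[1.0, 0.0], [0.0, 1.0]]], I, I, I)` returns the attention weights themselves, so each output row sums to 1 and each token attends most strongly to itself.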

Also added ndarray.variance and layer normalization.
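As a point of reference for how the two additions relate, layer normalization can be written directly on top of a variance routine. The sketch below is plain Python on nested lists, not the Chapel code in this PR. It assumes population variance (dividing by N), normalization over the last axis, and an `eps` of 1e-5; whether `ndarray.variance` follows the same conventions is not stated here. The `gamma` and `beta` parameters are hypothetical names for the optional scale and shift.

```python
import math

def variance(xs):
    """Population variance of a flat list (divides by N, not N - 1)."""
    mean = sum(xs) / len(xs)
    return sum((v - mean) ** 2 for v in xs) / len(xs)

def layer_norm(x, gamma=None, beta=None, eps=1e-5):
    """Normalize each row of a 2-D nested list over its last axis."""
    out = []
    for row in x:
        mean = sum(row) / len(row)
        # eps keeps the division well-defined when a row is constant.
        inv_std = 1.0 / math.sqrt(variance(row) + eps)
        normed = [(v - mean) * inv_std for v in row]
        if gamma is not None:
            normed = [g * n for g, n in zip(gamma, normed)]  # per-feature scale
        if beta is not None:
            normed = [n + b for n, b in zip(normed, beta)]   # per-feature shift
        out.append(normed)
    return out
```

Without `gamma` and `beta`, every output row has a mean of approximately 0 and a variance just below 1, the small shortfall coming from `eps`.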

PineapplePulp closed this Apr 27, 2025

PineapplePulp force-pushed the main branch from 36369ec to 0b1c8b8 Compare April 27, 2025 23:32

PineapplePulp wants to merge 0 commits into Iainmon:main from PineapplePulp:main
