Related work

To update this document:

Add bib entries to columnformers.bib.
Make edits to RELATED_WORK_pandoc.md.
Generate RELATED_WORK.md with compiled citations using pandoc by running make.

Topographic neural networks

[1] is a major inspiration for this work. The authors introduce the All-TNN architecture, which is essentially a CNN without weight sharing across spatial locations.
[2], [3] are other important works studying the emergence of topography in neural networks.
[4] discusses the biological implausibility of weight sharing and proposes some strategies for training locally connected networks without weight sharing.
[5] shows that imposing topographic constraints on the hidden units of a CNN results in emergent processing “streams” similar to the primate dorsal and ventral streams.
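
To make "a CNN without weight sharing" concrete, here is a minimal NumPy sketch of a locally connected layer, where every output position has its own kernel. It is an illustration only, not the All-TNN implementation from [1]; the function name, tensor layout, stride of 1, and lack of padding are all assumptions made for the example.

```python
import numpy as np

def locally_connected_2d(x, weights, bias):
    """Locally connected layer (hypothetical helper): stride 1, no padding.

    x:       (H, W, C_in) input
    weights: (H_out, W_out, k, k, C_in, C_out), one kernel per output position
    bias:    (H_out, W_out, C_out)
    """
    h_out, w_out, k, _, _, c_out = weights.shape
    out = np.empty((h_out, w_out, c_out))
    for i in range(h_out):
        for j in range(w_out):
            patch = x[i:i + k, j:j + k, :]  # (k, k, C_in)
            # Unlike a convolution, position (i, j) uses its own kernel.
            out[i, j] = np.tensordot(patch, weights[i, j], axes=3) + bias[i, j]
    return out

# Example shapes: an 8x8 RGB-like input with 3x3 kernels gives a 6x6 output.
rng = np.random.default_rng(0)
x = rng.normal(size=(8, 8, 3))
w = rng.normal(size=(6, 6, 3, 3, 3, 4))
b = np.zeros((6, 6, 4))
y = locally_connected_2d(x, w, b)
```

If every `weights[i, j]` were the same array, this would reduce to an ordinary convolution; dropping that constraint multiplies the kernel parameter count by the number of output positions.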

Alternatives to transformers

Attention free transformers (AFT) [6]. The idea of the additive bias in place of the multiplicative query is especially relevant.
RWKV, which builds on AFT [7].
Graph attention networks [8].
Capsule networks [9], which have a similar inspiration to what we’re exploring.
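
As a rough illustration of the additive bias mentioned for AFT, the sketch below follows our reading of the "AFT-full" formulation in [6]: the query-key dot product is replaced by a learned pairwise position bias added to the keys, and the query only gates the result. The function name, single-head layout, and unbatched shapes are assumptions for the example, not the reference implementation.

```python
import numpy as np

def aft_full(q, k, v, w):
    """Sketch of AFT-full (hypothetical helper).

    q, k, v: (T, d) query, key, value
    w:       (T, T) learned pairwise position bias
    """
    # logits[t, s, :] = k[s, :] + w[t, s]; the bias is added, not multiplied.
    logits = k[None, :, :] + w[:, :, None]                # (T, T, d)
    logits = logits - logits.max(axis=1, keepdims=True)   # stabilise exp
    weights = np.exp(logits)
    weights = weights / weights.sum(axis=1, keepdims=True)  # softmax over s
    mixed = (weights * v[None, :, :]).sum(axis=1)         # (T, d)
    gate = 1.0 / (1.0 + np.exp(-q))                       # sigmoid(q)
    return gate * mixed

rng = np.random.default_rng(0)
T, d = 5, 4
q, k, v = (rng.normal(size=(T, d)) for _ in range(3))
w = rng.normal(size=(T, T))
out = aft_full(q, k, v, w)
```

Note that the mixing weights are computed per feature rather than per head, so no T-by-T query-key product is ever formed.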

General inspiration

The perspective in [10] viewing the cortex as a uniform sheet of computational modules, and thinking of attention as communication.
Geoff Hinton’s discussion of weight sharing and local contrastive distillation in [11].
The discussion of geometry constraining brain function in [12].
Spatially embedded recurrent networks in [13].

References

[1] Z. Lu et al., “End-to-end topographic networks as models of cortical map formation and human visual behaviour: Moving beyond convolutions,” arXiv preprint arXiv:2308.09431, 2023, doi: 10.48550/arXiv.2308.09431.

[2] F. R. Doshi and T. Konkle, “Cortical topographic motifs emerge in a self-organized map of object space,” Science Advances, 2023, doi: 10.1126/sciadv.ade8187.

[3] E. Margalit et al., “A unifying principle for the functional organization of visual cortex,” bioRxiv, 2023, doi: 10.1101/2023.05.18.541361.

[4] R. Pogodin, Y. Mehta, T. Lillicrap, and P. E. Latham, “Towards biologically plausible convolutional networks,” Advances in Neural Information Processing Systems, 2021.

[5] D. Finzi et al., “A single computational objective drives specialization of streams in visual cortex,” bioRxiv, 2023, doi: 10.1101/2023.12.19.572460.

[6] S. Zhai et al., “An attention free transformer,” arXiv preprint arXiv:2105.14103, 2021.

[7] B. Peng et al., “RWKV: Reinventing RNNs for the transformer era,” arXiv preprint arXiv:2305.13048, 2023.

[8] P. Veličković et al., “Graph attention networks,” in International Conference on Learning Representations, 2018.

[9] S. Sabour, N. Frosst, and G. E. Hinton, “Dynamic routing between capsules,” Advances in Neural Information Processing Systems, 2017.

[10] A. Karpathy, “Introduction to transformers.” https://youtu.be/XfpMkf4rD6E?si=AM9AWDegUaFB7KCe, 2023.

[11] G. Hinton, “The robot brains season 2 episode 22.” https://www.therobotbrains.ai/who-is-geoff-hinton-part-two, 2022.

[12] J. C. Pang et al., “Geometric constraints on human brain function,” Nature, 2023.

[13] J. Achterberg et al., “Spatially embedded recurrent neural networks reveal widespread links between structural and functional neuroscience findings,” Nature Machine Intelligence, 2023.
