This repo contains a comprehensive introduction to information theory for social scientists and others who do not necessarily have a mathematical background, with a particular emphasis on the role of information theory in language modelling. The material also connects these mathematical concepts to relevant interpretations and extensions in the social sciences. It is best accessed through the Quarto-rendered tutorial at mikaelbrunila.com/information-theory/, but can also be used by cloning the repo and running the notebooks locally.
This material also functions as an appendix to a number of my academic articles on large language models (LLMs) and information theory, including "Cosine Capital: Large Language Models and the Embedding of All Things" and "Taking AI Into the Tunnels".
As of November 18, 2025, I have only finished the first part, which introduces the idea of language as a probability distribution and bits as a representation of these probabilities: "differences which make a difference," in the words of anthropologist Gregory Bateson. I am working on completing the remaining parts on information theory, followed by a section on the relationship between information theory and, respectively, embeddings (using Word2Vec) and attention (using GPT-2).
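To give a taste of the first part's core idea, here is a minimal Python sketch of treating language as a probability distribution and measuring each outcome in bits of surprisal, -log2(p). The toy distribution over next words is invented for illustration and does not come from the tutorial itself.

```python
import math

# A toy (made-up) probability distribution over the next word
# in some context; the probabilities sum to 1.
next_word_probs = {"the": 0.5, "a": 0.25, "cat": 0.125, "quark": 0.125}

def surprisal_bits(p: float) -> float:
    """Information content of an outcome with probability p, in bits."""
    return -math.log2(p)

for word, p in next_word_probs.items():
    print(f"{word!r}: p = {p}, surprisal = {surprisal_bits(p)} bits")
```

Halving an outcome's probability adds exactly one bit of surprisal, which is the sense in which bits register "differences which make a difference": rarer events carry more information.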
@online{brunila2025informationtheory,
author = {Brunila, Mikael},
title = {From Bits to Embeddings – A Critical Introduction to Information Theory},
year = {2025},
url = {https://mikaelbrunila.com/information-theory},
note = {Online tutorial},
langid = {en}
}