|
8 | 8 | "# Generalized Linear Models using Pynapple & NeMos\n", |
9 | 9 | "In this notebook, we will use the NeMos and Pynapple packages (supported by the [Flatiron Institute](https://neurorse.flatironinstitute.org)) to model spiking neural data using [Generalized Linear Models (GLMs)](https://en.wikipedia.org/wiki/Generalized_linear_model). We will explain what GLMs are and what their components are, then use the Pynapple and NeMos Python packages to preprocess real data from the Allen Institute and fit a GLM to predict spiking neural data as a function of stimuli.\n", |
10 | 10 | "\n", |
11 | | - "A GLM is a regression model which trains a filter to predict a value (output) as it relates to some other variable (or input). \n", |
| 11 | + "A GLM is a regression model which trains a filter to predict a value (output) as it relates to some other variable (input). It is called \"generalized\" because it is a generalization of [ordinary linear regression](https://en.wikipedia.org/wiki/Linear_regression). Ordinary linear regression assumes that a constant change in a predictor leads to a constant change in the response variable, but this assumption no longer holds for some types of response variables. In particular, spike counts, which we want to model here, are never negative. Moreover, spikes do not vary in a constant manner: usually, the variance of spike counts changes with the mean firing rate, so neurons with higher average firing rates tend to be more variable than neurons with lower average firing rates. Thus, a model of this kind of phenomenon must account for these restrictions (:\n", |
12 | 12 | "\n", |
13 | | - ":::{admonition} Why are GLMs called GLMs?\n", |
14 | | - ":class: info\n", |
15 | | - ":class: dropdown\n", |
16 | | - "\n", |
17 | | - "It is called \"generalized\" because it constitutes a generalization of [ordinary linear regression](https://en.wikipedia.org/wiki/Linear_regression). Ordinary linear regression assumes that a constant change in a predictor leads to a constant change in the response variable, but this assumption is no longer useful for some types of response variables. In particular, when interested in modeling spikes, these are never expected to be negative. Moreover, spikes do not vary in a constant manner (usually, the variance of spike counts changes with the mean firing rate: neurons with average higher firing rates tend to have a higher variability than neurons with average lower firing rates). Thus, if interested in modeling this kind of phenomena, the model of choice must account for these restrictions (:\n", |
18 | | - ":::\n", |
19 | | - "\n", |
20 | | - "\n", |
21 | | - "\n", |
22 | | - "In the neuroscience context, we can use a particular type of GLM to predict spikes: linear-nonlinear-Poisson (LNP) model.\n", |
| 13 | + "In the neuroscience context, we can use a particular type of GLM to predict spikes: the linear-nonlinear-Poisson (LNP) model.\n", |
23 | 14 | "<figure>\n", |
24 | | - "<img src=\"lnp_model.svg\" style=\"width:130%\"/>\n", |
| 15 | + "<img src=\"lnp_model.svg\" style=\"width:50%\"/>\n", |
25 | 16 | "<figcaption align = \"center\"> LNP model schematic. Modified from <a href=\"https://www.nature.com/articles/nature07140\">Pillow et al., 2008</a></figcaption>\n", |
26 | 17 | "</figure>\n", |
27 | | - " That is, the model receives one or more inputs and then:\n", |
28 | | - "\n", |
29 | | - "1. Sends them through a linear filter or transformation\n", |
| 18 | + " This type of model receives one or more inputs, sends them through a linear \"filter\" or transformation, passes that transformation through a nonlinearity to get the firing rate, and uses that firing rate as the mean of a Poisson process to generate spikes. We will go through each of these steps one by one:\n", |
30 | 19 | "\n", |
31 | | - " The input (s) (also known as \"predictor(s)\") are first passed through a linear transformation: \n", |
| 20 | + "1. Sends them through a linear \"filter\" or transformation\n", |
| 21 | + " \n", |
| 22 | + " The inputs (also known as \"predictors\") are first passed through a linear transformation:\n", |
32 | 23 | " \n", |
33 | | - " $WX + c$\n", |
| 24 | + " $$\n", |
| 25 | + " \\begin{aligned}\n", |
| 26 | + " L(X) = WX + c\n", |
| 27 | + " \\end{aligned}\n", |
| 28 | + " $$\n", |
| 29 | + "\n", |
| 30 | + " where $L$ is the filter, $X$ is the input (in matrix form), $W$ is a weight matrix and $c$ is an intercept vector.\n", |
34 | 31 | "\n", |
35 | | - " This scales (makes bigger or smaller) or shifts (up or down) the input. When there is zero input, this is equivalent to changing the baseline rate of the neuron, which is how the intercept should be interpreted. So far, this is the same treatment of an ordinary linear regression. \n", |
| 32 | + " $L$ scales (makes bigger or smaller) or shifts (up or down) the input. When there is zero input, this is equivalent to changing the baseline rate of the neuron, which is how the intercept should be interpreted. So far, this is the same treatment of an ordinary linear regression. \n", |
36 | 33 | "\n", |
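The linear step can be sketched in a few lines of NumPy (toy shapes and values chosen purely for illustration; when fitting a GLM, NeMoS computes this internally):

```python
import numpy as np

# Toy example: 2 predictors observed over 5 time points (hypothetical values)
X = np.array([[0.0, 1.0, 2.0, 1.0, 0.0],
              [1.0, 0.0, 1.0, 0.0, 1.0]])   # shape (2, 5)

W = np.array([[0.5, -0.3]])   # weight matrix, shape (1, 2): one output neuron
c = np.array([0.1])           # intercept: the baseline term

L_X = W @ X + c[:, None]      # linear predictor L(X) = WX + c, shape (1, 5)
print(L_X)
```

Note that nothing stops `L_X` from being negative here; that is exactly what the nonlinearity in the next step fixes.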
37 | | - "2. Passes the transformation through a nonlinearity to obtain a firing rate.\n", |
| 34 | + "2. Passes the transformation through a nonlinearity to get the firing rate.\n", |
38 | 35 | " \n", |
39 | | - " The aim of a GLM is to predict spiking activity. In particular, to predict a neuron's firing rate, which must be non-negative. This, as mentioned, is what the non-linearity part of the model handles: by passing the linear transformation through an exponential function, it is assured that it will always be non-negative. \n", |
| 36 | + " The aim of an LNP model is to predict the firing rate of a neuron and use it to generate spikes, but if we were to keep $L(X)$ as it is, we would quickly notice that we could obtain negative values for firing rates, which makes no sense! This is what the nonlinearity part of the model handles: by passing the linear transformation through an exponential function, it is ensured that the resulting firing rate will always be non-negative. \n", |
40 | 37 | "\n", |
41 | | - " As such, the firing rate, according to GLMs, would be defined as:\n", |
| 38 | + " As such, the firing rate in an LNP model is defined as:\n", |
42 | 39 | "\n", |
43 | | - " $\\lambda = exp(WX + c)$\n" |
| 40 | + " $$\n", |
| 41 | + " \\begin{aligned}\n", |
| 42 | + " \\lambda = \\exp(L(X))\n", |
| 43 | + " \\end{aligned}\n", |
| 44 | + " $$\n", |
| 45 | + "\n", |
| 46 | + " where $\\lambda$ is a vector containing the firing rates corresponding to each timepoint." |
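Steps 2 and 3 can be sketched with NumPy as well (toy linear-predictor values, not the NeMoS API):

```python
import numpy as np

rng = np.random.default_rng(seed=0)

L_X = np.array([-0.2, 0.6, 0.8, 0.6, -0.2])  # toy linear-predictor values
rate = np.exp(L_X)          # nonlinearity: every rate is strictly positive
spikes = rng.poisson(rate)  # one Poisson draw of spike counts per time bin

# Even the negative predictor values map to small but positive rates
print(rate.min() > 0)  # prints: True
```

Rerunning the Poisson draw with a different seed gives a different spike train from the same firing rate, which is exactly the stochasticity discussed below.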
44 | 47 | ] |
45 | 48 | }, |
46 | 49 | { |
47 | 50 | "cell_type": "markdown", |
48 | 51 | "id": "2a3ef026", |
49 | 52 | "metadata": {}, |
50 | 53 | "source": [ |
51 | | - ":::{admonition} A note on non-linearity\n", |
| 54 | + ":::{admonition} A note on nonlinearity\n", |
| 55 | + ":class: info\n", |
| 56 | + ":class: dropdown\n", |
| 57 | + "\n", |
| 58 | + "\n", |
| 59 | + "In NeMoS, the nonlinearity is kept fixed. We default to the exponential, but a small number of other choices, such as soft-plus, are allowed. The allowed choices guarantee both the non-negativity constraint described above, as well as convexity, i.e. a single optimal solution. In principle, one could choose a more complex nonlinearity, but convexity is not guaranteed in general.\n", |
| 60 | + ":::" |
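Both allowed choices map any real-valued linear predictor to a positive firing rate; a quick numerical check (independent of NeMoS itself):

```python
import numpy as np

x = np.linspace(-5.0, 5.0, 11)          # toy linear-predictor values
exp_rate = np.exp(x)                     # exponential nonlinearity
softplus_rate = np.log1p(np.exp(x))      # soft-plus: log(1 + e^x)

# Both are strictly positive everywhere, so both are valid firing rates
print((exp_rate > 0).all(), (softplus_rate > 0).all())  # prints: True True
```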
| 61 | + ] |
| 62 | + }, |
| 63 | + { |
| 64 | + "cell_type": "markdown", |
| 65 | + "id": "7d32d362", |
| 66 | + "metadata": {}, |
| 67 | + "source": [ |
| 68 | + ":::{admonition} What is the difference between a \"link function\" and the \"nonlinearity\"?\n", |
52 | 69 | ":class: info\n", |
53 | 70 | ":class: dropdown\n", |
54 | 71 | "\n", |
| 72 | + "The link function specifies the relationship between the linear predictor and the mean of the distribution function. If $g$ is the link function, $L(\\cdot)$ the linear predictor and $\\lambda$ the mean of the distribution function, then:\n", |
| 73 | + "\n", |
| 74 | + "$$\n", |
| 75 | + "\\begin{aligned}\n", |
| 76 | + "g(\\lambda) = L(\\cdot)\n", |
| 77 | + "\\end{aligned}\n", |
| 78 | + "$$\n", |
| 79 | + "\n", |
| 80 | + "$$\n", |
| 81 | + "\\begin{aligned}\n", |
| 82 | + "\\lambda = g^{-1}(L(\\cdot))\n", |
| 83 | + "\\end{aligned}\n", |
| 84 | + "$$\n", |
| 85 | + "\n", |
| 86 | + "The \"nonlinearity\" is the name for the inverse of the link function, $g^{-1}(\\cdot)$.\n", |
55 | 87 | "\n", |
56 | | - "In NeMoS, the non-linearity is kept fixed. We default to the exponential, but a small number of other choices, such as soft-plus, are allowed. The allowed choices guarantee both the non-negativity constraint described above, as well as convexity, i.e. a single optimal solution. In principle, one could choose a more complex non-linearity, but convexity is not guaranteed in general.\n", |
57 | 88 | ":::" |
58 | 89 | ] |
59 | 90 | }, |
|
66 | 97 | "\n", |
67 | 98 | " A Poisson process is a special type of point process, in which the events are statistically independent. With this type of GLM, each spike train is a sample from a Poisson process with mean equal to the firing rate, i.e., the output of the linear-nonlinear parts of the model. \n", |
68 | 99 | "\n", |
69 | | - " Remember, spiking is a stochastic process. That means that a given firing rate can give rise to a variety of different spike trains. Given that this is a stochastic process that could produce an infinite number of possible spike trains, how do we compare our model against the single observed spike train we have? We use the log-likelihood. This quantifies how likely it is to observe the given spike train for the computed firing rate: if $y(t)$ is the spike counts and $\\lambda(t)$ the firing rate, the equation for the log-likelihood is\n", |
| 100 | + " Remember, spiking is a stochastic process. That means that a given firing rate can give rise to a variety of different spike trains. Given that this is a stochastic process that could produce an infinite number of possible spike trains, how do we compare our model against the single observed spike train we have? We use the log-likelihood. This quantifies how likely it is to observe the given spike train for the computed firing rate: if $y(t)$ is the spike count and $\\lambda(t)$ the firing rate at time $t$, the equation for the log-likelihood is\n", |
| 101 | + "\n", |
| 102 | + " $$\n", |
| 103 | + " \\begin{aligned}\n", |
| 104 | + "\n", |
| 105 | + " \\sum_{t} \\log P(y(t)|\\lambda(t)) = \\sum_{t} \\left[ y(t)\\log(\\lambda(t)) - \\lambda(t) \\right]\n", |
70 | 106 | "\n", |
71 | | - " $\\sum_{t}logP(y(t)|\\lambda(t) = \\sum_{t}y(t)log(\\lambda(t))) - \\lambda(t) - log(y(t)!)$" |
| 107 | + " \\end{aligned}\n", |
| 108 | + " $$\n", |
| 109 | + "\n", |
| 110 | + " This is the objective function of the GLM model: we are trying to find the firing rate that maximizes the likelihood of the observed spike train." |
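As a quick numerical sanity check of the formula above (toy counts and rates, using SciPy rather than the tutorial's dataset), the simplified log-likelihood differs from the full Poisson log-likelihood only by a constant that does not depend on the firing rate:

```python
import numpy as np
from scipy.special import gammaln
from scipy.stats import poisson

y = np.array([0, 2, 1, 3, 0])               # observed spike counts per bin
lam = np.array([0.5, 1.5, 1.0, 2.5, 0.4])   # model firing rates per bin

# Formula from the text, with the constant term dropped
ll_simplified = np.sum(y * np.log(lam) - lam)

# Full Poisson log-likelihood, then remove the model-independent constant
ll_full = poisson.logpmf(y, lam).sum()
constant = -gammaln(y + 1).sum()             # equals -sum(log(y!))

print(np.isclose(ll_simplified, ll_full - constant))  # prints: True
```

Because the constant is the same for any choice of $\lambda$, maximizing either quantity over the model parameters gives the same answer.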
| 111 | + ] |
| 112 | + }, |
| 113 | + { |
| 114 | + "cell_type": "markdown", |
| 115 | + "id": "02063283", |
| 116 | + "metadata": {}, |
| 117 | + "source": [ |
| 118 | + ":::{admonition} A note on log-likelihood of a Poisson distribution\n", |
| 119 | + ":class: info\n", |
| 120 | + ":class: dropdown\n", |
| 121 | + "\n", |
| 122 | + "To be precise, the log-likelihood of a Poisson distribution is:\n", |
| 123 | + "\n", |
| 124 | + "$$\n", |
| 125 | + "\\begin{aligned}\n", |
| 126 | + "\n", |
| 127 | + "\\sum_{t} \\log P(y(t)|\\lambda(t)) = \\sum_{t} \\left[ y(t)\\log(\\lambda(t)) - \\lambda(t) \\right] - \\sum_{t}\\log(y(t)!)\n", |
| 128 | + "\n", |
| 129 | + "\\end{aligned}\n", |
| 130 | + "$$\n", |
| 131 | + "\n", |
| 132 | + "However, one can see that the term $- \\sum_{t}\\log(y(t)!)$ does not depend on $\\lambda$, and thus is independent of the model. Because of that, it is normally ignored.\n", |
| 133 | + "\n", |
| 134 | + ":::" |
| 135 | + ] |
| 136 | + }, |
| 137 | + { |
| 138 | + "cell_type": "markdown", |
| 139 | + "id": "b2d41fee", |
| 140 | + "metadata": {}, |
| 141 | + "source": [ |
| 142 | + ":::{admonition} What is the \"link function\" in the case of an LNP model?\n", |
| 143 | + ":class: info\n", |
| 144 | + ":class: dropdown\n", |
| 145 | + "\n", |
| 146 | + "In the case of an LNP model, the distribution function is a Poisson process with mean $\\lambda$. The \"nonlinearity\", as mentioned before, is an exponential. So the \"link function\" is its inverse, the logarithm!\n", |
| 147 | + ":::" |
72 | 148 | ] |
73 | 149 | }, |
74 | 150 | { |
75 | 151 | "cell_type": "markdown", |
76 | 152 | "id": "55cf8020", |
77 | 153 | "metadata": {}, |
78 | 154 | "source": [ |
79 | | - ":::{admonition} More resources\n", |
| 155 | + ":::{admonition} More resources on GLMs\n", |
80 | 156 | ":class: seealso\n", |
81 | 157 | ":class: dropdown\n", |
82 | 158 | "\n", |
83 | | - "If you would like to learn more in depth about GLMs, you can refer to:\n", |
| 159 | + "If you would like to learn more about GLMs, you can refer to:\n", |
84 | 160 | "\n", |
85 | 161 | "- [NeMoS GLM tutorial](https://nemos.readthedocs.io/en/latest/background/plot_00_conceptual_intro.html): for a more detailed explanation of all the components of a GLM within the NeMoS framework, as well as some nice visualizations of all the steps of the input transformation!\n", |
| 162 | + "- [Neuromatch Academy GLM tutorial](https://compneuro.neuromatch.io/tutorials/W1D3_GeneralizedLinearModels/student/W1D3_Tutorial1.html): for a bit more detailed explanation of the components of a GLM, slides and some coding exercises to ensure comprehension.\n", |
86 | 163 | "- [YouTube video on LNP](https://www.youtube.com/watch?v=i62gffPrZYA): although outside the context of neuroscience, it goes step by step explaining what LNPs are, with visualizations and notes on limitations of the model (:\n", |
87 | 164 | ":::" |
88 | 165 | ] |
|
92 | 169 | "id": "c5945cfd", |
93 | 170 | "metadata": {}, |
94 | 171 | "source": [ |
95 | | - "\n", |
96 | | - "We will be analyzing data from the [Visual Coding - Neuropixels dataset](https://portal.brain-map.org/circuits-behavior/visual-coding-neuropixels), published by the Allen Institute. This dataset uses [extracellular electrophysiology probes](https://www.nature.com/articles/nature24636) to record spikes from multiple regions in the brain during passive visual stimulation. For simplicity, we will focus on the activity of neurons in the visual cortex (VISp) during passive visual stimulation: full-field flashes, of color either black or white. \n", |
| 172 | + "For this tutorial, we will be analyzing data from the [Visual Coding - Neuropixels dataset](https://portal.brain-map.org/circuits-behavior/visual-coding-neuropixels), published by the Allen Institute. This dataset uses [extracellular electrophysiology probes](https://www.nature.com/articles/nature24636) to record spikes from multiple regions in the brain during passive visual stimulation. For simplicity, we will focus on the activity of neurons in the visual cortex (VISp) during passive visual stimulation: full-field flashes, of color either black or white. \n", |
97 | 173 | "\n", |
98 | 174 | "Our aim is to model spiking activity from neurons (spiking rates and spike trains) in the primary visual cortex (VISp) as a function of the presented visual stimuli. We will also fit a second model that includes an extra predictor, spike history, to see whether adding history improves the performance of the model." |
99 | 175 | ] |
|
1959 | 2035 | "### References" |
1960 | 2036 | ] |
1961 | 2037 | }, |
1962 | | - { |
1963 | | - "cell_type": "markdown", |
1964 | | - "id": "228bc865", |
1965 | | - "metadata": {}, |
1966 | | - "source": [ |
1967 | | - "### Delete later\n", |
1968 | | - "\n", |
1969 | | - "<div class=\"admonition info\">\n", |
1970 | | - "<p class=\"admonition-title\">Resources</p>\n", |
1971 | | - "<p>\n", |
1972 | | - "If you would like to learn more in depth about GLMs, you can refer to:\n", |
1973 | | - "\n", |
1974 | | - "<p>\n", |
1975 | | - "\n", |
1976 | | - "- [Nemos GLM tutorial](https://nemos.readthedocs.io/en/latest/background/plot_00_conceptual_intro.html): for a bit more detailed explanation of all the components of a GLM within the nemos framework, as well as some nice visualizations of all the steps of the input transformation!</p>\n", |
1977 | | - "\n", |
1978 | | - "\n", |
1979 | | - "</p>\n", |
1980 | | - "</div>" |
1981 | | - ] |
1982 | | - }, |
1983 | 2038 | { |
1984 | 2039 | "cell_type": "markdown", |
1985 | 2040 | "id": "79dec3fc", |
|
2019 | 2074 | "- improve responsiveness explanation/notation\n", |
2020 | 2075 | "- Should I z-score firing rates so it's easier to compare? in single unit raster\n", |
2021 | 2076 | "- Add smoothing to plotting added to hist!!!\n", |
2022 | | - "- Add single cell prediction perievent" |
| 2077 | + "- Add single cell prediction perievent\n", |
| 2078 | + "- Read all of the texts\n", |
| 2079 | + "- center image" |
2023 | 2080 | ] |
| 2081 | + }, |
| 2082 | + { |
| 2083 | + "cell_type": "markdown", |
| 2084 | + "id": "338acb72", |
| 2085 | + "metadata": {}, |
| 2086 | + "source": [] |
2024 | 2087 | } |
2025 | 2088 | ], |
2026 | 2089 | "metadata": { |
|