|
# Assignment 3: Context Maintenance and Retrieval (CMR)

## Overview
|
In this assignment, you will implement the **Context Maintenance and Retrieval (CMR) model** as described in [Polyn, Norman, & Kahana (2009)](https://www.dropbox.com/scl/fi/98pui63j3o62xu96ciwhy/PolyEtal09.pdf?rlkey=42sc17ll573sm83g4q8q9x9nq). CMR is a **context-based model of memory search** that extends the **Temporal Context Model (TCM)** to explain how **temporal, semantic, and source context** jointly influence recall. You will fit your implementation to the task-switching free recall data from Polyn et al. (2009) and evaluate how well the model explains the observed recall patterns.
|
## Data Format and Preprocessing

The dataset comprises sequences of presented and recalled words (concrete nouns) from multiple trials of a free recall experiment. While studying each word, participants were asked to judge either the referent's *size* (would it fit in a shoebox?) or its *animacy* (does it refer to a living thing?). The dataset also includes information about the similarities in meaning between all of the stimuli (semantic similarities).

Code for downloading and loading the dataset into Python, along with a more detailed description of its contents, may be found in the [template notebook for this assignment](https://github.com/ContextLab/memory-models-course/blob/main/content/assignments/Assignment_3%3AContext_Maintenance_and_Retrieval_Model/cmr_assignment_template.ipynb).
|
## High-level Model Description

The Context Maintenance and Retrieval (CMR) model comprises three main components:

### 1. **Feature layer ($F$)**

The feature layer represents the experience of the *current moment*. It comprises a representation of the item being studied (an indicator vector of length number-of-items + 1) concatenated with a representation of the current "source context" (also an indicator vector, of length number-of-sources).

### 2. **Context layer ($C$)**

The context layer represents a *recency-weighted average* of experiences up to now. Analogous to the feature layer, the context layer comprises a representation of temporal context (a vector of length number-of-items + 1, representing a transformed version of the item history) concatenated with a representation of the source context (a vector of length number-of-sources, representing a transformed version of the history of sources).

### 3. **Association matrices**

The feature and context layers of the model interact through a pair of association matrices:

- $M^{FC}$ controls how activations in $F$ affect activity in $C$
- $M^{CF}$ controls how activations in $C$ affect activity in $F$
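
To make the shapes concrete, here is a minimal sketch (in Python/NumPy) of how the model's state might be allocated. The variable names are illustrative rather than prescribed, and treating $M^{FC}_{pre}$ as an identity matrix is an assumption borrowed from common CMR implementations; the description above does not pin it down.

```python
import numpy as np

def init_cmr_state(n_items, n_sources):
    """Allocate CMR's feature layer, context layer, and association matrices (a sketch)."""
    n_f = (n_items + 1) + n_sources    # length of the feature layer F
    n_c = n_f                          # the context layer C has the same length

    f = np.zeros(n_f)                  # feature layer: item indicator ++ source indicator
    c = np.zeros(n_c)                  # context layer: recency-weighted history
    c[n_items] = 1.0                   # assumption: context starts on the "dummy" item unit
    # (how the source sub-region of context is initialized is not specified above)

    M_FC_pre = np.eye(n_c, n_f)        # assumption: pre-experimental F->C associations (identity)
    M_FC_exp = np.zeros((n_c, n_f))    # experimental F->C associations, learned during study
    M_CF_exp = np.zeros((n_f, n_c))    # experimental C->F associations, learned during study
    # M_CF_pre (not built here) holds s * LSA cosine similarities; see "Model dynamics" below

    return f, c, M_FC_pre, M_FC_exp, M_CF_exp
```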
## Model dynamics

### Encoding

Items are presented one at a time in succession; all of the steps in this section are run for each new item. As described below, following a task shift an "extra" (non-recallable) item is "presented," causing $c_i$, $M^{FC}$, and $M^{CF}$ to update.

1. As each new item (indexed by $i$) is presented, the feature layer $F$ is set to $f_i = f_{item} \oplus f_{source}$, where:
   - $f_{item}$ is an indicator vector of length number-of-items + 1. Each item is assigned a unique position, along with an additional "dummy" item that is used to represent non-recallable items.
   - $f_{source}$ is an indicator vector of length number-of-sources. Each possible "source" (i.e., unique situation or task experienced alongside each item) gets one index.
   - $\oplus$ is the concatenation operator.

2. Next, the feature activations project onto the context layer:
   - We compute $c^{IN} = M^{FC} f_i$
   - Then we evolve context using $c_i = \rho_i c_{i - 1} + \beta c^{IN}$, where
     - $\rho_i = \sqrt{1 + \beta^2 \left[ \left( c_{i - 1} \cdot c^{IN} \right)^2 - 1 \right]} - \beta \left( c_{i - 1} \cdot c^{IN} \right)$ (this choice of $\rho_i$ keeps $c_i$ at unit length).
     - In setting $\rho_i$, the computations are performed separately for the "item" and "source" parts of context: $\beta_{enc}^{temp}$ is used to evolve context for the item features and $\beta^{source}$ is used to evolve context for the source features.
     - After a task shift, a "placeholder" item is "presented," and a fourth drift rate parameter ($d$) is used in place of $\beta$.
3. Next, we update $M^{FC}$ (whose experimental component, $M^{FC}_{exp}$, is initialized to all zeros):
   - Let $\Delta M^{FC}_{exp} = c_i f_i^T$, and accumulate $M^{FC}_{exp} = M^{FC}_{exp} + \Delta M^{FC}_{exp}$
   - $M^{FC} = (1 - \gamma^{FC}) M^{FC}_{pre} + \gamma^{FC} M^{FC}_{exp}$
4. Also update $M^{CF}$:
   - $M^{CF}_{pre}$ is fixed at the matrix of LSA $\cos \theta$ values across words, multiplied by $s$ (this matrix occupies the item-features-by-temporal-context block of $M^{CF}_{pre}$; the remaining entries are zero)
   - $M^{CF}_{exp}$ is initialized to all zeros
   - Let $\Delta M^{CF}_{exp} = \phi_i \left( L^{CF} \circ f_i c_i^T \right)$, where $\circ$ denotes the element-wise (Hadamard) product and
     - $L^{CF} = \left[ \begin{array}{cc} L_{tw}^{CF} & L_{sw}^{CF} \\ L_{ts}^{CF} & L_{ss}^{CF} \end{array} \right]$
       - $t$ represents temporal context
       - $s$ represents source *context* if listed first, or source *features* if listed second
       - $w$ represents item features
       - $L^{CF}_{tw}$ is set to all ones; size: (number-of-items + 1) by (number-of-items + 1)
       - $L^{CF}_{sw}$ is a parameter of the model (every entry of this block is set to the same value); size: (number-of-items + 1) by number-of-sources
       - $L^{CF}_{ts}$ is set to all zeros; size: number-of-sources by (number-of-items + 1)
       - $L^{CF}_{ss}$ is set to all zeros; size: number-of-sources by number-of-sources
     - $\phi_i = \phi_s e^{-\phi_d (i - 1)} + 1$, where $i$ is the serial position of the current item (this term produces the primacy effect)
   - Accumulate $M^{CF}_{exp} = M^{CF}_{exp} + \Delta M^{CF}_{exp}$
   - $M^{CF} = M^{CF}_{pre} + M^{CF}_{exp}$
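
Putting encoding steps 1-4 together, here is a minimal sketch of a single study-event update. It assumes the state and identity $M^{FC}_{pre}$ from the initialization sketch above, normalizes $c^{IN}$ before drifting (standard in CMR implementations, though not stated explicitly above), and takes `params` to be a hypothetical dict containing the fitted parameters plus the prebuilt element-wise scaling matrix `L_CF`.

```python
import numpy as np

def present_item(item_idx, source_idx, serial_pos, state, params, is_placeholder=False):
    """One CMR encoding step (a sketch). For the post-task-shift "placeholder,"
    pass item_idx = n_items (the dummy item) and is_placeholder=True."""
    f, c, M_FC_pre, M_FC_exp, M_CF_exp = state
    n_items, n_sources = params['n_items'], params['n_sources']

    # Step 1: set the feature layer (item indicator concatenated with source indicator)
    f = np.zeros((n_items + 1) + n_sources)
    f[item_idx] = 1.0                        # index n_items is reserved for the "dummy" item
    f[(n_items + 1) + source_idx] = 1.0

    # Step 2: project the features onto context, then drift each context sub-region
    M_FC = (1 - params['gamma_FC']) * M_FC_pre + params['gamma_FC'] * M_FC_exp
    c_in = M_FC @ f
    temp, src = slice(0, n_items + 1), slice(n_items + 1, None)
    for region, beta in [(temp, params['beta_enc_temp']), (src, params['beta_source'])]:
        if is_placeholder:                   # after a task shift, d replaces beta (assumption: both regions)
            beta = params['d']
        cin_sub = c_in[region]
        norm = np.linalg.norm(cin_sub)
        if norm > 0:                         # assumption: normalize c_in before combining
            cin_sub = cin_sub / norm
        dot = float(c[region] @ cin_sub)
        rho = np.sqrt(1 + beta**2 * (dot**2 - 1)) - beta * dot
        c[region] = rho * c[region] + beta * cin_sub

    # Steps 3-4: Hebbian updates to the experimental association matrices
    M_FC_exp += np.outer(c, f)
    phi = params['phi_s'] * np.exp(-params['phi_d'] * (serial_pos - 1)) + 1
    M_CF_exp += phi * params['L_CF'] * np.outer(f, c)   # L^CF is applied element-wise

    return (f, c, M_FC_pre, M_FC_exp, M_CF_exp)
```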
### Retrieval

Recall is guided by *context* using a *leaky accumulator*. Given the current context, the leaky accumulator process runs until either (a) any item crosses a threshold value of 1 (at which point the item is recalled, its features are reinstated in $F$, context is updated as described below, and the retrieval process restarts), **or** (b) more than 9000 time steps elapse without any item crossing the threshold (1 timestep is roughly equivalent to 1 ms).
1. First compute $f^{IN} = M^{CF} c_i$, where $c_i$ is the current context

2. Next, use $f^{IN}$ to guide the leaky accumulator:
   - Initialize $x_s$ to a vector of number-of-items + 1 zeros, corresponding to the item portion of $f^{IN}$ ($s$ indexes the step of the accumulation process)
   - While no not-yet-recalled element of $x_s$ (ignoring the final "unrecallable" dummy item) is greater than or equal to 1:
     - Set $x_s = x_{s - 1} + \left( f^{IN} - \kappa x_{s - 1} - \lambda N x_{s - 1} \right) d \tau + \epsilon \sqrt{d \tau}$, where
       - $dt = 100$
       - $d \tau = \frac{dt}{\tau}$
       - $N_{ij} = 0$ if $i = j$ and $1$ otherwise (so the $\lambda N$ term implements lateral inhibition between competing items)
       - $\epsilon \sim \mathcal{N}\left(0, \eta \right)$
     - If any *already recalled* item crosses the threshold, reset its value to 0.95 (this simulates "[inhibition of return](https://en.wikipedia.org/wiki/Inhibition_of_return)").
     - If any element of $x_s$ drops below 0, reset it to 0.
   - When an item "wins" the recall competition:
     - Reinstate its features in $F$ (as though we were presenting that item as the next $f_i$)
     - Update context from $f_i$ using the same equation for $c_i$ as during presentation (with $\beta_{rec}^{temp}$ as the temporal drift rate).
     - Don't update $M^{CF}$ or $M^{FC}$.
     - Recall the item (i.e., add it to the output sequence), then restart the accumulator with the updated context.
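
Here is a minimal sketch of the accumulator-based recall competition, restricted to the item portion of $f^{IN}$. The threshold of 1, the 0.95 reset, and the 9000-step cutoff come from the description above; the function and argument names are illustrative.

```python
import numpy as np

def recall_competition(f_in_items, already_recalled, params, rng, max_steps=9000):
    """Run the leaky accumulator until a not-yet-recalled item reaches threshold (a sketch).

    f_in_items: the item portion of f_IN (length number-of-items + 1; last entry = dummy item).
    Returns the index of the winning item, or None if nothing wins within max_steps.
    """
    kappa, lam, eta, tau = params['kappa'], params['lambda'], params['eta'], params['tau']
    dtau = params.get('dt', 100) / tau
    x = np.zeros(len(f_in_items))

    for _ in range(max_steps):
        noise = rng.normal(0.0, eta, size=x.shape)
        inhibition = lam * (x.sum() - x)      # lateral inhibition: equivalent to lambda * N @ x
        x = x + (f_in_items - kappa * x - inhibition) * dtau + noise * np.sqrt(dtau)
        x = np.maximum(x, 0.0)                # evidence cannot go negative
        x[-1] = 0.0                           # the "dummy" item can never be recalled
        x[already_recalled] = np.minimum(x[already_recalled], 0.95)  # inhibition of return
        winners = np.flatnonzero(x >= 1.0)
        if winners.size > 0:                  # report the strongest item past threshold
            return int(winners[np.argmax(x[winners])])
    return None                               # cutoff reached: the recall attempt ends
```

In a full simulation you would call this repeatedly: after each winner, reinstate its features, drift context (using $\beta_{rec}^{temp}$ for the temporal sub-region), append the winner to `already_recalled`, recompute $f^{IN}$, and stop once the function returns `None`.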
## Fitting the Model

In total, there are 13 to-be-learned parameters of CMR (each is a scalar):
1. $\beta_{enc}^{temp}$: drift rate of temporal context during encoding
2. $\beta_{rec}^{temp}$: drift rate of temporal context during recall
3. $\beta^{source}$: drift rate of source context (during encoding)
4. $d$: temporary contextual drift rate applied when a "placeholder item" is presented after a source (task) change
5. $L_{sw}^{CF}$: scale of the associative connections between source context and item features
6. $\gamma^{FC}$: relative contribution of $M_{exp}^{FC}$ vs. $M_{pre}^{FC}$ to $M^{FC}$
7. $s$: scale factor applied to the semantic similarities when computing $M_{pre}^{CF}$
8. $\phi_s$: primacy effect scaling parameter
9. $\phi_d$: primacy effect decay parameter
10. $\kappa$: decay rate of the leaky accumulator
11. $\lambda$: lateral inhibition parameter of the leaky accumulator
12. $\eta$: standard deviation of the noise in the leaky accumulator
13. $\tau$: time constant of the leaky accumulator

Fit the model to the following curves and measures from the Polyn et al. (2009) dataset (provided in the template notebook):
- Probability of first recall
- Serial position curve
- Lag-CRP
- Temporal clustering factor
- Source clustering factor
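
As an illustration of the analysis code you will need, here is a minimal sketch of computing the lag-CRP from recall sequences. It assumes each trial's recalls are coded as 1-indexed study positions in output order (an assumption about the data format; adapt it to the format used in the template notebook).

```python
import numpy as np

def lag_crp(recalls, list_length):
    """Conditional response probability as a function of lag (a sketch).

    recalls: list of lists; recalls[t] holds the 1-indexed study positions recalled on
             trial t, in output order (values outside 1..list_length are ignored).
    Returns a dict mapping lag -> P(transition | transition was possible).
    """
    lags = range(-(list_length - 1), list_length)
    actual = {lag: 0 for lag in lags}      # numerator: observed transitions at each lag
    possible = {lag: 0 for lag in lags}    # denominator: transitions that could have occurred

    for trial in recalls:
        cleaned = [p for p in trial if 1 <= p <= list_length]
        seen = set()
        for prev, nxt in zip(cleaned[:-1], cleaned[1:]):
            if prev in seen:               # transition out of a repeated recall: skip it
                continue
            seen.add(prev)
            if nxt in seen:                # transition into a repeat: skip it
                continue
            for candidate in range(1, list_length + 1):
                if candidate not in seen:  # lags that were still available at this point
                    possible[candidate - prev] += 1
            actual[nxt - prev] += 1

    return {lag: actual[lag] / possible[lag] if possible[lag] else np.nan for lag in lags}
```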
There are several possible ways to fit the model to these curves and measures. My recommended approach is:
1. Split the dataset into a training set and a test set
2. Compute the above curves/measures for the training set and concatenate them into a single vector
3. Use [scipy.optimize.minimize](https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.minimize.html#scipy.optimize.minimize) to find the set of model parameters that minimizes the mean squared error between the observed curves and the CMR-estimated curves (generated using the given parameters)
4. Compare the observed vs. CMR-estimated performance (using the best-fitting parameters) on the test data
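
Here is a rough sketch of how steps 2 and 3 might look in code. `simulate_cmr` (runs the model and returns simulated recall sequences) and `summary_vector` (concatenates the five curves/measures above into one vector) are hypothetical helpers you would write yourself. Because the accumulator is stochastic, a derivative-free method such as Nelder-Mead, combined with averaging over several simulated replications, tends to behave better than gradient-based optimizers; treat the details below as one reasonable starting point rather than a prescription.

```python
import numpy as np
from scipy.optimize import minimize

def fit_cmr(train_trials, x0, n_reps=5, seed=0):
    """Fit CMR's 13 parameters by minimizing the mean squared error between observed and
    simulated summary curves (a sketch; simulate_cmr and summary_vector are assumed helpers)."""
    target = summary_vector(train_trials)             # observed curves/measures, concatenated

    def loss(theta):
        rng = np.random.default_rng(seed)             # fixed noise -> a more stable loss surface
        sims = [summary_vector(simulate_cmr(theta, train_trials, rng)) for _ in range(n_reps)]
        return float(np.mean((np.mean(sims, axis=0) - target) ** 2))

    return minimize(loss, x0, method='Nelder-Mead',
                    options={'maxiter': 2000, 'xatol': 1e-3, 'fatol': 1e-4})

# Illustrative usage (the starting values are placeholders, not values from the paper):
# result = fit_cmr(train_trials, x0=np.full(13, 0.5))
# print(result.x)   # compare with Table 1 of Polyn et al. (2009)
```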
## Summary of Implementation Tasks

1. Use the descriptions above to implement CMR in Python
2. Write code for constructing the behavioral curves/measures listed above
3. Fit CMR's parameters to the dataset provided in the template notebook (compare with Table 1 in Polyn et al., 2009)
4. Plot the observed vs. CMR-estimated curves/measures
5. Write a **brief discussion** (3-5 sentences) addressing:
   - **Does the model explain the data well?**
   - **Which patterns are well captured?**
   - **Where does the model fail, and why?**
   - **Potential improvements or limitations of CMR.**
|
## Submission Instructions
|
- Submit (on [canvas](https://canvas.dartmouth.edu/courses/71051/assignments/517355)) a **Google Colaboratory notebook** (or similar) that includes:
  - Your **full implementation** of the CMR model.
  - **Markdown cells** explaining your code, methodology, and results.
  - **All required plots** comparing model predictions to observed data.
|
|