Advanced analysis and visualization of free recall data in Python.
Features:
- A large library of advanced analyses, tested against published benchmarks
- Flexible analysis customization and plotting
- Tools for exploratory analysis of large datasets
- Extensive automated testing to ensure analysis correctness
- Based around a simple and flexible table-based data format
- Comprehensive documentation and user guide
The name Psifr is pronounced "cipher". It's taken from Psi, in reference to the field of psychology, and FR for free recall.
If you use Psifr, please help support open-source scientific software by citing it in your publications.
Morton, N. W. (2020). Psifr: Analysis and visualization of free recall data. Journal of Open Source Software, 5(54), 2669, https://doi.org/10.21105/joss.02669
You can install the latest stable version of Psifr using pip:

    pip install psifr

You can also install the development version directly from the code repository on GitHub:

    pip install git+https://github.com/mortonne/psifr

To plot a serial position curve for a sample dataset:

    from psifr import fr
    df = fr.sample_data('Morton2013')  # load a sample dataset
    data = fr.merge_free_recall(df)  # score recalls against the study lists
    recall = fr.spc(data)  # calculate the serial position curve
    g = fr.plot_spc(recall)  # plot recall by serial position

See the user guide for detailed documentation on importing and analyzing free recall datasets.
Psifr expects data to be in a simple standard format. For example, if subject 1 studied a list of the words "absence", "hollow", "pupil", then recalled "pupil", "absence", the data would be represented in a spreadsheet like this:
| subject | list | trial_type | position | item |
|---|---|---|---|---|
| 1 | 1 | study | 1 | absence |
| 1 | 1 | study | 2 | hollow |
| 1 | 1 | study | 3 | pupil |
| 1 | 1 | recall | 1 | pupil |
| 1 | 1 | recall | 2 | absence |
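
As an illustration of this layout, the following sketch builds the rows shown in the table from plain Python lists and writes them out as CSV text. It uses only the standard library. The helper name `to_rows` is hypothetical and is not part of Psifr; the column names are the ones shown in the table.

```python
import csv
import io

# Column names taken from the example table above.
FIELDS = ["subject", "list", "trial_type", "position", "item"]


def to_rows(subject, list_number, study_items, recall_items):
    """Hypothetical helper: build one row per study or recall event."""
    rows = []
    for trial_type, items in (("study", study_items), ("recall", recall_items)):
        # Positions are 1-based and restart for each trial type.
        for position, item in enumerate(items, start=1):
            rows.append(
                {
                    "subject": subject,
                    "list": list_number,
                    "trial_type": trial_type,
                    "position": position,
                    "item": item,
                }
            )
    return rows


rows = to_rows(1, 1, ["absence", "hollow", "pupil"], ["pupil", "absence"])

# Write to an in-memory buffer; swap in open("my_data.csv", "w", newline="")
# to create a file on disk instead.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
writer.writeheader()
writer.writerows(rows)
csv_text = buffer.getvalue()
print(csv_text)
```

Each study or recall event gets its own row, so a single file can hold any number of subjects and lists.
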

Datasets can then be read into Python using Pandas and scored for analysis using Psifr:

    import pandas as pd
    from psifr import fr
    raw = pd.read_csv('my_data.csv')  # read raw data from a CSV file into a Pandas DataFrame
    data = fr.merge_free_recall(raw)  # score recall data and prepare for analysis in Psifr

See importing data and scoring data for details.
A range of analyses can be used to help characterize recall performance and recall dynamics:
- Serial position effects
- Intrusions:
  - Probability of intrusions at different list lags: `pli_list_lag`
- Temporal clustering
- Semantic clustering:
  - Semantic distance conditional response probability (distance-CRP): `distance_crp`
  - Semantic clustering score: `distance_rank`
- Category clustering:
  - Category clustering score: `category_crp`
  - Adjusted ratio of clustering (ARC) and list-based clustering (LBC): `category_clustering`
- Compound clustering:
  - Compound lag-CRP: `lag_crp_compound`
  - Compound semantic clustering: `distance_rank_shifted`
Each of these analyses can be customized to group trials by condition, filter to only include specific recalls, and more. See the User guide and API reference for details.
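
To make concrete what one of these measures captures, here is a plain-Python sketch of a serial position curve computed from rows in the standard format. It is an illustration only and not Psifr's implementation. It assumes that each item appears once per list, and it pools all lists together rather than averaging within subject. The function name `serial_position_curve` is hypothetical.

```python
from collections import defaultdict

# Rows in the standard format: (subject, list, trial_type, position, item).
rows = [
    (1, 1, "study", 1, "absence"),
    (1, 1, "study", 2, "hollow"),
    (1, 1, "study", 3, "pupil"),
    (1, 1, "recall", 1, "pupil"),
    (1, 1, "recall", 2, "absence"),
]


def serial_position_curve(rows):
    """Return {input position: fraction of studied items later recalled}."""
    # Collect the set of recalled items for each (subject, list).
    recalled = defaultdict(set)
    for subject, list_number, trial_type, _, item in rows:
        if trial_type == "recall":
            recalled[(subject, list_number)].add(item)

    # Count, per input position, how many items were studied and recalled.
    n_studied = defaultdict(int)
    n_recalled = defaultdict(int)
    for subject, list_number, trial_type, position, item in rows:
        if trial_type == "study":
            n_studied[position] += 1
            if item in recalled[(subject, list_number)]:
                n_recalled[position] += 1

    return {pos: n_recalled[pos] / n_studied[pos] for pos in sorted(n_studied)}


curve = serial_position_curve(rows)
print(curve)  # {1: 1.0, 2: 0.0, 3: 1.0}
```

For real analyses, prefer the tested functions in Psifr, which also handle cases this sketch ignores, such as intrusions and repeated recalls.
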
In R, the {psifrr} package can be used to call Psifr analyses. It requires a Python installation but allows free recall analysis to be done in RStudio without requiring users to learn Python.
Some of the analyses supported by Psifr are based on analyses implemented in the Matlab toolbox EMBAM.
pybeh is a direct Python port of EMBAM that supports a wide range of analyses.
Quail runs automatic scoring of free recall data, supports calculation and plotting of some common free recall measures, and has tools for measuring the "memory fingerprint" of individuals.
See Google Scholar for a list of publications using Psifr.
Contributions are welcome, including suggesting new features, adding documentation, and identifying bugs. See the contributing guidelines for an overview.
