Programmer: Farhan Bin Faisal
Lab: Meltzer Lab
Date Created: 10 November 2022
This repository contains the full stimulus-generation pipeline and experiment link for a verbal N-back training paradigm. It documents how word and pseudoword lists were created and how they are integrated into the PsychoJS/Pavlovia task.
- Overview
- Repository Structure
- Research Poster
- Experiment Paradigm
- Stimulus Generation Pipeline
- Dependencies
- Attribution
The aim of this project is to generate frequency- and syllable-matched word and pseudoword lists for a verbal N-back task, and to use those stimuli in an online training paradigm implemented with PsychoJS and hosted on Pavlovia.
The pipeline:
- Preprocesses a large English word list (filtering by frequency, syllable count, profanity, etc.).
- Uses SOS to generate matched word lists.
- Uses Wuggy to generate pseudoword lists corresponding to each word list.
- Integrates the resulting stimuli into the N-back experiment.
- preprocess.ipynb: Jupyter notebook for cleaning and filtering the initial word list, lemmatizing items, and preparing SOS-compatible input.
- wordListMaker.m: MATLAB script that interfaces with SOS to generate multiple matched word lists from the preprocessed vocabulary.
- wuggy/postSOS.ipynb: Jupyter notebook that uses Wuggy output to assemble the nonword list corresponding to each word list, and flags items that require manual pseudoword generation.
- generateConditions.ipynb: Optional notebook (if used in your project) for assembling final condition files for the N-back task from the word and pseudoword lists.
- wordLists/: Directory containing the word lists generated by SOS.
- nonWordLists/: Directory containing pseudoword lists (generated by Wuggy and/or manually).
Below is the poster summarizing the N-back training project:
The N-back training task was implemented in JavaScript using PsychoJS and hosted on Pavlovia.
A demo version can be accessed at:
👉 Run the N-back task on Pavlovia
For the demo, please use the following credentials:
- session: 1
- participantID: 19995
The preprocess.ipynb notebook prepares the word list for SOS:
- Loads a CSV file of candidate words into a pandas DataFrame.
- Filters rows by frequency:
  - Retains words with 4.0 < Zipf value < 5.0.
- Computes syllable counts:
  - Uses cmu_dict from NLTK.
- Filters by syllable count:
  - Retains words with 1 or 2 syllables.
- Lemmatizes words:
  - Uses WordNetLemmatizer to replace each word with its lemma.
- Discards:
  - Words not found in cmu_dict.
  - Profane words and proper names.
- Formats the final DataFrame into SOS-compatible input.
- Outputs a tab-delimited text file:
  - Output: sos_input.txt
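The frequency and syllable filters above can be sketched in plain Python. This is a minimal illustration, not the notebook's actual code: the column names (word, zipf, syllables), the helper names, and the tiny TOY_CMU_DICT are all assumptions made for the example. The toy dictionary only mimics the shape of NLTK's CMU Pronouncing Dictionary, where each word maps to a list of pronunciations and vowel phonemes end in a stress digit. Lemmatization and the profanity/proper-name filters are left out, and the exact header layout SOS expects is not reproduced here.

```python
import csv
import io

# Toy stand-in (assumption) for nltk.corpus.cmudict.dict():
# word -> list of pronunciations, each a list of ARPAbet phonemes.
# Vowel phonemes carry a trailing stress digit (0, 1 or 2).
TOY_CMU_DICT = {
    "table": [["T", "EY1", "B", "AH0", "L"]],
    "stone": [["S", "T", "OW1", "N"]],
    "banana": [["B", "AH0", "N", "AE1", "N", "AH0"]],
}


def count_syllables(word, pron_dict):
    """Count vowel phonemes in the first pronunciation; None if the word is missing."""
    prons = pron_dict.get(word.lower())
    if not prons:
        return None
    return sum(1 for phoneme in prons[0] if phoneme[-1].isdigit())


def filter_candidates(rows, pron_dict, zipf_min=4.0, zipf_max=5.0,
                      allowed_syllables=(1, 2)):
    """Keep rows with zipf_min < Zipf < zipf_max and an allowed syllable count."""
    kept = []
    for row in rows:
        zipf = float(row["zipf"])
        if not (zipf_min < zipf < zipf_max):
            continue  # outside the frequency band
        n_syll = count_syllables(row["word"], pron_dict)
        if n_syll is None or n_syll not in allowed_syllables:
            continue  # not in the dictionary, or wrong syllable count
        kept.append({"word": row["word"], "zipf": zipf, "syllables": n_syll})
    return kept


def write_tab_delimited(rows, handle):
    """Write the kept rows as tab-delimited text with a header row."""
    writer = csv.DictWriter(handle, fieldnames=["word", "zipf", "syllables"],
                            delimiter="\t", lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)


candidates = [
    {"word": "table", "zipf": "4.6"},
    {"word": "stone", "zipf": "4.4"},
    {"word": "banana", "zipf": "4.1"},  # three syllables, dropped
    {"word": "the", "zipf": "7.7"},     # too frequent, dropped
    {"word": "blorf", "zipf": "4.2"},   # not in the dictionary, dropped
]
kept = filter_candidates(candidates, TOY_CMU_DICT)
buffer = io.StringIO()  # stands in for open("sos_input.txt", "w", newline="")
write_tab_delimited(kept, buffer)
print(buffer.getvalue())
```

With NLTK installed and the cmudict corpus downloaded, the toy dictionary could be swapped for the real pronunciation dictionary without changing count_syllables.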
The wordListMaker.m MATLAB script uses SOS to generate matched word lists:
- Takes sos_input.txt as input.
- Uses SOS to generate 18 lists of 10 words each.
- Matching criteria:
  - Lists are matched on Zipf frequency and syllable count.
- Constraints:
  - Soft constraints: used to optimize matching on Zipf value and syllables.
  - Hard constraints: enforce minimum syllable count and frequency thresholds.
- Output:
  - Word lists saved in the wordLists/ directory.
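To make the difference between hard and soft constraints concrete, here is a toy Python sketch. It is not SOS, does not use the SOS API, and does not reproduce the SOS optimization algorithm; SOS is a MATLAB toolbox and the real logic lives in wordListMaker.m. The sketch assumes that hard constraints act as an eligibility filter and that soft constraints act as a cost to be reduced, here by greedy random swaps between lists. All function names, thresholds, and the synthetic pool are made up for illustration.

```python
import random
from statistics import mean


def passes_hard_constraints(item, zipf_min=4.0, zipf_max=5.0,
                            allowed_syllables=(1, 2)):
    """Hard constraints (assumed thresholds): an item either qualifies or it does not."""
    return (zipf_min < item["zipf"] < zipf_max
            and item["syllables"] in allowed_syllables)


def mismatch_cost(lists):
    """Soft-constraint cost: spread of per-list means for Zipf plus syllables."""
    zipf_means = [mean(item["zipf"] for item in lst) for lst in lists]
    syll_means = [mean(item["syllables"] for item in lst) for lst in lists]
    return ((max(zipf_means) - min(zipf_means))
            + (max(syll_means) - min(syll_means)))


def match_lists(pool, n_lists, list_size, n_iter=2000, seed=0):
    """Greedy swap search: keep a swap only if it does not increase the cost."""
    rng = random.Random(seed)
    eligible = [item for item in pool if passes_hard_constraints(item)]
    if len(eligible) < n_lists * list_size:
        raise ValueError("not enough eligible items for the requested lists")
    rng.shuffle(eligible)
    lists = [eligible[k * list_size:(k + 1) * list_size] for k in range(n_lists)]
    initial_cost = cost = mismatch_cost(lists)
    for _ in range(n_iter):
        a, b = rng.sample(range(n_lists), 2)  # two different lists
        i, j = rng.randrange(list_size), rng.randrange(list_size)
        lists[a][i], lists[b][j] = lists[b][j], lists[a][i]
        new_cost = mismatch_cost(lists)
        if new_cost <= cost:
            cost = new_cost
        else:
            lists[a][i], lists[b][j] = lists[b][j], lists[a][i]  # undo the swap
    return lists, initial_cost, cost


# Synthetic pool (invented values), plus two items that violate hard constraints.
pool = [{"word": f"w{k}", "zipf": 4.05 + 0.03 * k, "syllables": 1 + k % 2}
        for k in range(30)]
pool.append({"word": "too_frequent", "zipf": 5.5, "syllables": 1})
pool.append({"word": "too_long", "zipf": 4.5, "syllables": 3})

lists, initial_cost, final_cost = match_lists(pool, n_lists=3, list_size=10)
print(f"cost before: {initial_cost:.3f}, after: {final_cost:.3f}")
```

The same cost function can also serve as a quick sanity check on the files in wordLists/: near-identical per-list means indicate that the lists are well matched.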
The wuggy/postSOS.ipynb notebook generates pseudoword lists for each word list:
- Iterates through each file in wordLists/.
- Uses Wuggy to derive nonword (pseudoword) lists corresponding to each word list.
- Outputs:
  - Nonword lists saved to nonWordLists/.
- Logging:
  - Prints the filenames of any word lists containing items that cannot be converted automatically into pseudowords.
  - These flagged words must be handled manually.
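The loop described above might look roughly like the following sketch. Several things here are assumptions rather than facts about the notebook: that each word list is a plain-text file with one word per line, that a failed conversion is represented by None, and that failures are written as a PLACEHOLDER string. The make_pseudoword argument is a hypothetical callable standing in for however the notebook obtains Wuggy's output; the stub pseudowords are invented, not real Wuggy results.

```python
from pathlib import Path
import tempfile

PLACEHOLDER = "MISSING"  # hypothetical marker for items needing manual work


def build_nonword_lists(word_dir, out_dir, make_pseudoword):
    """Write one nonword file per word list; return the names of flagged files."""
    word_dir, out_dir = Path(word_dir), Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    flagged = []
    for word_file in sorted(word_dir.glob("*.txt")):
        words = word_file.read_text().split()
        nonwords = [make_pseudoword(word) for word in words]
        if any(nonword is None for nonword in nonwords):
            flagged.append(word_file.name)  # at least one item failed
        lines = [nw if nw is not None else PLACEHOLDER for nw in nonwords]
        (out_dir / word_file.name).write_text("\n".join(lines) + "\n")
    return flagged


# Invented stub: a lookup table standing in for real pseudoword generation.
stub_output = {"table": "toble", "stone": "stome", "river": "rimer"}

with tempfile.TemporaryDirectory() as tmp:
    word_dir = Path(tmp) / "wordLists"
    word_dir.mkdir()
    (word_dir / "list1.txt").write_text("table\nstone\n")
    (word_dir / "list2.txt").write_text("river\nqueue\n")
    flagged = build_nonword_lists(word_dir, Path(tmp) / "nonWordLists",
                                  stub_output.get)
    list2_nonwords = (Path(tmp) / "nonWordLists" / "list2.txt").read_text().split()

print("Lists needing manual pseudowords:", flagged)
```

Keeping the output file aligned line by line with its word list means a flagged item can later be located by position.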
For words that Wuggy cannot automatically convert:
- Download Wuggy from: http://crr.ugent.be/programs-data/wuggy
- Recommended settings:
  - Language: Orthographic English
  - Match syllable length
  - Match word length
  - Match transition frequency
  - Match 2 out of 3 segments
- Manually input the flagged words and export pseudowords.
- Replace the missing entries in the corresponding nonWordLists files.
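The final replacement step can be done by hand in a text editor, or with a small helper such as the sketch below. It assumes that missing entries are marked with a placeholder string and that the nonword list is aligned item by item with its word list; neither detail is confirmed by the repository, and the names PLACEHOLDER and fill_missing are hypothetical. The example pseudoword is invented.

```python
PLACEHOLDER = "MISSING"  # hypothetical marker for a missing pseudoword


def fill_missing(words, nonwords, manual):
    """Replace placeholders with manually generated pseudowords.

    `manual` maps a source word to its hand-made pseudoword. Returns the
    patched list plus the words that still have no pseudoword.
    """
    filled, unresolved = [], []
    for word, nonword in zip(words, nonwords):
        if nonword == PLACEHOLDER:
            nonword = manual.get(word, PLACEHOLDER)
            if nonword == PLACEHOLDER:
                unresolved.append(word)  # still needs manual generation
        filled.append(nonword)
    return filled, unresolved


words = ["river", "queue", "yacht"]
nonwords = ["rimer", PLACEHOLDER, PLACEHOLDER]
manual = {"queue": "quoue"}  # exported from a manual Wuggy session (invented value)

filled, unresolved = fill_missing(words, nonwords, manual)
print(filled)
print("Still unresolved:", unresolved)
```

Reporting the unresolved words makes it easy to confirm that every flagged item has been handled before the lists are used in the experiment.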
To run the full pipeline, you will need:
- Python
  - pandas
  - nltk (with cmudict and WordNet corpora downloaded)
  - Jupyter Notebook / JupyterLab
- MATLAB
  - SOS (for stimulus optimization) configured to run with wordListMaker.m
- Wuggy
  - Standalone application from UGent for pseudoword generation
- PsychoJS / Pavlovia
  - For hosting and running the N-back experiment online
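Before opening the notebooks, a quick check of the Python side of these dependencies can save time. The helper below is a small sketch and not part of the repository; the name missing_packages is made up for this example. It only checks whether the packages can be imported and does not download anything. The NLTK corpora themselves are fetched separately with NLTK's downloader, using the identifiers cmudict and wordnet.

```python
import importlib.util

REQUIRED_PACKAGES = ("pandas", "nltk")


def missing_packages(names=REQUIRED_PACKAGES):
    """Return the packages from `names` that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_packages()
if missing:
    print("Install before running the notebooks:", ", ".join(missing))
else:
    print("pandas and nltk are importable.")
    # The corpora still need to be downloaded once, for example:
    #   import nltk
    #   nltk.download("cmudict")
    #   nltk.download("wordnet")
```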
This project was developed in the Meltzer Lab by Farhan Bin Faisal as part of an N-back training study.
Please cite or acknowledge the Meltzer Lab and this repository if you reuse the paradigm or stimulus pipeline in your own work.