This repository accompanies the study "Encoding Imagism? Measuring Literary Imageability, Visuality and Concreteness via Multimodal Word Embeddings."
The main directory contains three experiments, each with its own .py file.
- In `experiment_2.py`, set the target file at the top of the script (Hemingway, Woolf, or the Chicago corpus).
- `functions.py` provides helpers for generating embeddings and dictionary-based scores.
- The `data` directory holds the textual input, precomputed embeddings (`data/embeddings/`), and precomputed dictionary scores (`data/measures/`). These can be regenerated by deleting the corresponding files and rerunning the scripts.
- `resources` contains the dictionaries used in the analyses.
- `figs` and `results` are populated automatically when the experiments are executed.
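
The "delete and rerun to regenerate" behaviour described above is a load-or-compute cache. The sketch below only illustrates that pattern and is not the repository's actual code: the function name `load_or_compute`, the use of `pickle`, and the example path are all assumptions, and the real scripts may store embeddings and scores in a different format.

```python
import pickle
from pathlib import Path


def load_or_compute(cache_path, compute_fn):
    """Return the object stored at cache_path, computing and saving it if absent.

    Hypothetical helper: the name and the pickle format are assumptions,
    not taken from functions.py.
    """
    cache_path = Path(cache_path)

    # Cached file present: load it and skip the expensive computation.
    if cache_path.exists():
        with cache_path.open("rb") as fh:
            return pickle.load(fh)

    # Cached file missing (for example, deleted by the user): recompute and save.
    result = compute_fn()
    cache_path.parent.mkdir(parents=True, exist_ok=True)
    with cache_path.open("wb") as fh:
        pickle.dump(result, fh)
    return result
```

With a helper of this shape, a call such as `load_or_compute("data/embeddings/example.pkl", build_embeddings)` (both names hypothetical) would reuse the stored file on every run until that file is deleted, at which point the next run rebuilds it.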
- Runs comfortably on CPU only; on an Apple M3 (16 GB RAM), it completes in a few minutes.
- No GPU needed — all CLIP embeddings are precomputed and loaded directly.
- Requires Python 3.9+ and standard scientific libraries: `numpy`, `pandas`, `matplotlib`, `seaborn`, `scipy`, `scikit-learn`, `spacy`, `torch`, `transformers`
- Install the spaCy model: `python -m spacy download en_core_web_sm`
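
Before running the experiments, it can help to confirm that the requirements listed above are in place. The following is a minimal sketch of such a check, not part of the repository: the names `REQUIRED`, `missing_packages`, and `check_environment` are hypothetical. It relies on two facts: `scikit-learn` is imported as `sklearn`, and a downloaded spaCy model is installed as an importable package of the same name.

```python
import importlib.util
import sys

# Maps each pip package name from the requirements list to its import name.
REQUIRED = {
    "numpy": "numpy",
    "pandas": "pandas",
    "matplotlib": "matplotlib",
    "seaborn": "seaborn",
    "scipy": "scipy",
    "scikit-learn": "sklearn",  # pip name differs from import name
    "spacy": "spacy",
    "torch": "torch",
    "transformers": "transformers",
}


def missing_packages(required=REQUIRED):
    """Return the pip names of packages whose modules cannot be found."""
    return [
        pip_name
        for pip_name, module_name in required.items()
        if importlib.util.find_spec(module_name) is None
    ]


def check_environment(model="en_core_web_sm"):
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    if sys.version_info < (3, 9):
        problems.append("Python 3.9+ is required")
    problems += [f"missing package: {name}" for name in missing_packages()]
    # spaCy models are installed as regular Python packages.
    if importlib.util.find_spec(model) is None:
        problems.append(f"missing spaCy model: {model}")
    return problems


if __name__ == "__main__":
    issues = check_environment()
    for issue in issues:
        print(issue)
    if not issues:
        print("Environment looks complete.")
```

The check only looks modules up with `importlib.util.find_spec` and does not import them, so it stays fast even though `torch` and `transformers` are slow to load.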