This repository contains the code, data, and analysis for the Master's thesis project "Trends in Art History" by Tessel Haagen. The project applies text mining and topic modeling techniques to historical art texts to uncover trends and patterns.
- src/: Source code for data processing, topic modeling, and utility functions.
- notebooks/: Jupyter notebooks for exploratory data analysis, model development, and visualization.
- data/: Contains raw and processed data (large files are gitignored).
- figures/: Output figures and plots generated from the analysis.
- survey answers/: Contains survey data, topic analysis CSVs, and related scripts/notebooks.
- pyproject.toml, poetry.lock: Project dependencies and environment management using Poetry.
- Clone the repository:
    git clone <repo-url>
    cd Tessel_Haagen_Trends_In_Art_History
- Install dependencies (make sure you have Poetry installed):
    poetry install
- Run Jupyter notebooks:
    poetry run jupyter notebook
  Open and run the notebooks in the notebooks/ or survey answers/ directories.
- Topic modeling with BERTopic and KeyBERT.
- Sentiment analysis and survey evaluation.
- Visualization of topic and sentiment trends over time.
- Scripts for preprocessing and analyzing historical art texts.
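Tracking topic trends over time comes down to counting, per period, how often each topic is assigned. The following is a minimal sketch in plain Python, assuming each document already has a publication year and a topic label (for example, one assigned by BERTopic). The helper `topic_share_by_decade` and the toy records are hypothetical and are not taken from this repository's code.

```python
from collections import Counter, defaultdict


def topic_share_by_decade(records):
    """Return {decade: {topic: share}} from (year, topic) pairs.

    Hypothetical helper for illustration; not part of this repository.
    """
    counts = defaultdict(Counter)
    for year, topic in records:
        decade = (year // 10) * 10  # e.g. 1687 -> 1680
        counts[decade][topic] += 1

    shares = {}
    for decade, topic_counts in sorted(counts.items()):
        total = sum(topic_counts.values())
        # Share of each topic among all documents of that decade.
        shares[decade] = {t: n / total for t, n in topic_counts.items()}
    return shares


# Toy input, invented purely to show the shape of the data.
toy_records = [
    (1683, "portraiture"),
    (1687, "portraiture"),
    (1689, "perspective"),
    (1702, "perspective"),
]

trend = topic_share_by_decade(toy_records)
```

The nested dictionary holds one share per topic and decade, so it can be handed to a plotting library to draw one line per topic.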
Some data directories (e.g., data/ecco/, data/eebo/) are excluded from version control because of their size. The data can be obtained from the Text Creation Partnership: https://textcreationpartnership.org/faq/#faq05
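Because these directories are not in the repository, it can help to confirm they are in place before running the notebooks. The sketch below is one way to do that, assuming the data is placed under the two paths named above. The function `missing_data_dirs` is a hypothetical helper, not a script shipped with this project.

```python
from pathlib import Path

# Directory names taken from this README; the check itself is an illustration.
EXPECTED_DIRS = ("data/ecco", "data/eebo")


def missing_data_dirs(repo_root="."):
    """Return the expected data directories that are absent or empty."""
    root = Path(repo_root)
    missing = []
    for rel in EXPECTED_DIRS:
        path = root / rel
        if not path.is_dir() or not any(path.iterdir()):
            missing.append(rel)
    return missing


if __name__ == "__main__":
    for rel in missing_data_dirs():
        print(f"Missing or empty: {rel}")
```

Run from the repository root, it prints nothing when both directories exist and contain files.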
Tessel Haagen
Master's Thesis, Text Mining