This repository contains the reproduction code for the paper "Quantifying Global Foreign Affairs with a Multimodal Dataset of Diplomatic Websites" by Nihat Mugurtay, Kaan Guray Sirin, Mehrdad Heshmat Najafabad, Ahmet Taha Kahya, Fazli Goktug Yilmaz, Yasser Zouzou, Batuhan Bahceci, Ayca Demir, Dogukan Tosun, Meltem Müftüler-Baç, and Onur Varol.
A detailed description of the data can be found on Harvard Dataverse. Please refer to the dataset's README or the journal paper for details on data fields, folder structure, and content.
This repository contains three main folders:
- figures/: Contains the code and Jupyter Notebooks for reproducing the figures presented in the paper. Figure subfolders may also contain external data.
- Sample-WebScraping/: Includes sample scraper and parser scripts that demonstrate the data collection process of the dataset. The samples cover three approaches: dynamic webpages, static webpages, and webpages requiring a proxy.
- statistics/: Contains summary statistics of the dataset and the code used to generate them.
The required Python libraries are listed in requirements.txt. Because these are widely used packages, many existing Python environments will already have them installed.
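To check whether an existing environment already covers the dependencies, something like the following minimal sketch may help. It assumes requirements.txt uses the plain one-package-per-line format; the helper names `parse_requirement_names` and `find_missing` are hypothetical and not part of this repository. It only tests whether each package is installed, not whether the installed version satisfies any pinned constraint, and it does not handle URL or VCS requirements.

```python
import re
from importlib import metadata
from pathlib import Path


def parse_requirement_names(text):
    """Extract bare package names from requirements.txt-style text (simplified)."""
    names = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or line.startswith("-"):  # skip blanks and pip options (-r, -e, ...)
            continue
        # Keep only the leading name; extras, version specifiers, and markers are dropped.
        match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", line)
        if match:
            names.append(match.group(0))
    return names


def find_missing(names):
    """Return the names that have no installed distribution in this environment."""
    missing = []
    for name in names:
        try:
            metadata.version(name)
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    req_file = Path("requirements.txt")  # assumed to sit in the current directory
    if req_file.exists():
        missing = find_missing(parse_requirement_names(req_file.read_text()))
        if missing:
            print("Missing packages:", ", ".join(missing))
        else:
            print("All listed packages are installed.")
    else:
        print("requirements.txt not found in the current directory.")
```

If the script reports missing packages, the virtual environment setup described next is the safer route.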
To set up a virtual environment and install the required libraries:
python -m venv venv
source venv/bin/activate # On macOS/Linux
venv\Scripts\activate # On Windows
pip install -r requirements.txt

Contributors (in no particular order):
- Nihat Mugurtay
- Kaan Guray Sirin
- Mehrdad Heshmat Najafabad
- Ahmet Taha Kahya
- Fazli Goktug Yilmaz
- Onur Varol
@article{,
title={},
author={},
journal={},
year={},
url={}
}

This work is supported by TUBITAK under grant agreement 223K173. We also thank TUBITAK 121C220 for their partial support.