This repository contains my personal module report for DCU, 2024. The experiments reproduce those from "From Pretraining Data to Language Models to Downstream Tasks: Tracking the Trails of Political Biases Leading to Unfair NLP Models", presented at ACL 2023. I used the newer models Llama-3 and GPT-4o for comparison, and fine-tuned GPT-2 on the POLITICS news dataset.
The code for this part, along with the environment requirements, can be found in PoliLean Public.
For the dataset used for fine-tuning, visit POLITICS.
This folder contains two Python scripts: one for generating a smaller training set and another for fine-tuning the model.
Prompts, statements, example responses, and scores are provided in this folder.