Contrastive Learning for Sentiment Classification

Computational Intelligence Lab Project 2022, ETH Zürich

Justin Studer, Hrishikesh Ghodkih, Joel Neuner-Jehle

The paper can be found here.

Project Description

Contrastive learning provides a framework for comparing many data types with each other. We propose to use it for sentiment classification of informal text snippets and find that it performs on par with methods that rely on the same base model, with the potential to outperform them when using larger datasets. We also explore the viability of hard sample mining in this context and find that it does not offer any benefits.

[Figure: the basic idea of the MoCo framework used in this project]
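
In short, MoCo trains a "query" encoder against a slowly changing "key" encoder, which is updated as an exponential moving average of the query encoder's weights, while a queue of past key encodings serves as a large pool of negatives. As a minimal sketch of that momentum update (the function name and the coefficient m=0.999 are illustrative assumptions, not values from our code):

```python
import torch

@torch.no_grad()
def momentum_update(encoder_q, encoder_k, m=0.999):
    # Nudge the key encoder towards the query encoder:
    # theta_k <- m * theta_k + (1 - m) * theta_q
    for p_q, p_k in zip(encoder_q.parameters(), encoder_k.parameters()):
        p_k.data.mul_(m).add_(p_q.data, alpha=1.0 - m)
```

The slow update keeps the encodings sitting in the queue consistent with each other, which is what lets the negative pool grow far beyond a single batch.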

Frequently Asked Questions

Q: Where can I find the code?

The code can be found under "code/contrastive approach/". The appendix with the ablations we ran on the alternative methods is "Paper Appendix (Ablations on other methods).pdf".

Within that folder:

- "binary_classification_baseline.py" is what we call BCE BERT in the paper
- "grubert.py" is GRUBERT
- "main.py" is the contrastive approach with MoCo
- "main_visualization.py" is what we used to visualize the latent space
- "naive_bayes_baseline.py" is what we used for Linear SVC, Logistic Regression, Bernoulli NB and Gaussian NB

Q: What does the code do?

The base model we used is bert-base-uncased, a ~110M-parameter transformer that was pretrained for natural language tasks like the one at hand. The input data is tokenized and fed through that model.
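
For illustration, tokenizing and encoding with the Hugging Face transformers library looks roughly like this (a minimal sketch, not our exact pipeline; the example sentences are made up):

```python
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

# Tokenize a batch of tweets and encode them with BERT.
batch = tokenizer(["what a great day!", "this is awful..."],
                  padding=True, truncation=True, return_tensors="pt")
with torch.no_grad():
    outputs = model(**batch)
embeddings = outputs.last_hidden_state[:, 0]  # [CLS] encoding per tweet
```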

binary_classification_baseline.py is the baseline we used as a comparison. It simply uses regular binary cross-entropy as the training objective.
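
Schematically, that baseline is just a BERT encoder with a single-logit head trained against BCEWithLogitsLoss. The class below is an illustrative sketch, not the actual script:

```python
import torch.nn as nn
from transformers import AutoModel

class SentimentClassifier(nn.Module):
    """BERT encoder with a single-logit head for binary sentiment."""
    def __init__(self):
        super().__init__()
        self.bert = AutoModel.from_pretrained("bert-base-uncased")
        self.head = nn.Linear(self.bert.config.hidden_size, 1)

    def forward(self, input_ids, attention_mask):
        hidden = self.bert(input_ids=input_ids, attention_mask=attention_mask)
        cls = hidden.last_hidden_state[:, 0]  # [CLS] encoding
        return self.head(cls).squeeze(-1)     # one logit per example

loss_fn = nn.BCEWithLogitsLoss()  # binary cross-entropy on the raw logit
```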

main.py contains the contrastive learning version which was (mostly) inspired by "MoCoV1" (link) and DyHardCode (link).

As a TL;DR: the contrastive loss objective essentially tries to make the encodings of corresponding elements "as close as possible" in some sense. It pulls together corresponding pairs and pushes apart elements that do not belong together. For our problem, the goal is that in the end there will be two clusters in the latent space, one corresponding to positive-sentiment texts and one to negative-sentiment texts.
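
Concretely, the MoCo-style InfoNCE objective scores each query against its positive key and against the queue of negatives, then asks cross-entropy to pick out the positive. A minimal sketch, assuming all encodings are L2-normalized (the function name and temperature value are illustrative):

```python
import torch
import torch.nn.functional as F

def info_nce_loss(q, k_pos, queue, temperature=0.07):
    """MoCo-style InfoNCE: the positive key sits at logit index 0."""
    # q: (N, D) query encodings; k_pos: (N, D) positive keys;
    # queue: (K, D) negative keys from the MoCo queue.
    l_pos = torch.einsum("nd,nd->n", q, k_pos).unsqueeze(-1)  # (N, 1)
    l_neg = torch.einsum("nd,kd->nk", q, queue)               # (N, K)
    logits = torch.cat([l_pos, l_neg], dim=1) / temperature
    labels = torch.zeros(q.size(0), dtype=torch.long, device=q.device)
    return F.cross_entropy(logits, labels)
```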

Q: How do I run the code?

Make sure to unzip twitter-datasets before running anything. Navigate to "code/contrastive approach/" (this is important because relative paths were used!), then just do

python main.py

or whatever your favourite way of running Python code is.

Requirements to run:

- Python 3.9+
- torch 1.11+
- transformers 4.9.1+
- An internet connection (because the pre-trained model needs to be downloaded first)

You can inspect the training progress either in your console or via TensorBoard. The logs are written into the (possibly not yet existing) folder "code/contrastive approach/runs"; you may need to create that folder yourself before running so logging does not crash.
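
If the folder does not exist yet, a one-liner creates it (run from "code/contrastive approach/", since the path is relative):

```python
import os

# Create the TensorBoard log directory before training starts.
os.makedirs("runs", exist_ok=True)
```

TensorBoard can then be pointed at it with tensorboard --logdir runs.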
