jnadal14/Detoxification_LLM_Dataflow

COLX 565 Project – Agentic Detoxification Dataflow

Group Repository: Tim and Jacob

This repository contains the code, data, and analysis for our COLX 565 final project on agentic workflows for text sentiment analysis and detoxification. The project explores and evaluates how well an agentic dataflow approach reduces toxicity in text while preserving meaning and sentiment.

Project Structure

detoxification_agentic_dataflow/
├── agentic_workflow.ipynb              # Main notebook outlining the detoxification workflow
├── Annotation_Statstics_notebook.ipynb # Analysis of annotation statistics
├── Detoxification Annotations.csv      # Annotated detoxification dataset 
├── results/
│   ├── sentiment_results.csv           # Sentiment analysis results
│   └── toxicity_results.csv            # Toxicity analysis results
├── data/
│   ├── multilingual-sentiment-test-solutions.csv  # Gold standard test set for multilingual sentiment
│   └── toxic-test-solutions.csv                   # Gold standard test set for toxicity classification

Notebooks

  • agentic_workflow.ipynb
    Loads and processes the datasets. Uses UBC-NLP/toucan-base to translate text from African languages to English, and ibm-granite/granite-3.2-2b-instruct for sentiment and toxicity classification, explanation generation, and detoxification.

  • Annotation_Statstics_notebook.ipynb
    Processes Detoxification Annotations.csv and computes several agreement metrics between the two annotators' scores. These are discussed further in Final_Report.pdf.
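
The control flow described for agentic_workflow.ipynb (translate, classify, then detoxify only when needed) can be sketched roughly as below. This is a minimal illustration, not the notebook's actual code: the `Result` class, `run_workflow`, and the three `fake_*` steps are hypothetical stand-ins, and in the real workflow the steps would call UBC-NLP/toucan-base and ibm-granite/granite-3.2-2b-instruct instead.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class Result:
    """One processed sentence (hypothetical container, not from the notebook)."""
    original: str
    english: str
    label: str
    explanation: str
    detoxified: Optional[str] = None


def run_workflow(
    text: str,
    translate: Callable[[str], str],
    classify: Callable[[str], Tuple[str, str]],
    detoxify: Callable[[str], str],
) -> Result:
    """Translate, classify, and detoxify only if the label is "toxic"."""
    english = translate(text)
    label, explanation = classify(english)
    detoxified = detoxify(english) if label == "toxic" else None
    return Result(text, english, label, explanation, detoxified)


# Stub steps so the sketch runs without downloading any model (assumptions).
def fake_translate(text: str) -> str:
    return text  # pretend the input is already English


def fake_classify(text: str) -> Tuple[str, str]:
    if "fool" in text.lower():
        return "toxic", "contains an insult"
    return "non-toxic", "no offensive terms found"


def fake_detoxify(text: str) -> str:
    return text.replace("fool", "person")


toxic_result = run_workflow("You are a fool.", fake_translate, fake_classify, fake_detoxify)
clean_result = run_workflow("Have a nice day.", fake_translate, fake_classify, fake_detoxify)
```

Passing the steps in as callables keeps the branching logic separate from the models, so each step can be swapped or tested on its own.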

Dataset

  • multilingual-sentiment-test-solutions.csv: Ground-truth data for evaluating multilingual sentiment analysis.
  • toxic-test-solutions.csv: Ground-truth data for evaluating toxicity classification accuracy.
  • Detoxification Annotations.csv: A subset of toxicity_results.csv containing the rows identified as "toxic", with three appended columns: an LLM-generated detoxified version of the original sentence, and one detoxification score (1-10) from each of the two project authors.
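
As an illustration of how agreement between the two 1-10 score columns could be measured, here is a small sketch of quadratic-weighted Cohen's kappa using only the standard library. Which metrics the report actually uses is not stated here, so treat this as one common choice for ordinal scores; the function name and the sample scores are made up for the example.

```python
from collections import Counter
from typing import Sequence


def quadratic_weighted_kappa(
    a: Sequence[int], b: Sequence[int], min_score: int = 1, max_score: int = 10
) -> float:
    """Quadratic-weighted Cohen's kappa for two annotators' ordinal scores."""
    if len(a) != len(b) or not a:
        raise ValueError("score lists must be non-empty and of equal length")
    n = len(a)
    observed = Counter(zip(a, b))          # joint counts of (score_a, score_b)
    hist_a, hist_b = Counter(a), Counter(b)
    span = (max_score - min_score) ** 2    # normalises weights to [0, 1]
    scores = range(min_score, max_score + 1)

    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in scores:
        for j in scores:
            weight = (i - j) ** 2 / span
            weighted_observed += weight * observed[(i, j)]
            # Expected count if the two annotators scored independently.
            weighted_expected += weight * hist_a[i] * hist_b[j] / n

    if weighted_expected == 0:
        # Both annotators used one identical score throughout; kappa is
        # undefined, and this sketch reports it as perfect agreement.
        return 1.0
    return 1.0 - weighted_observed / weighted_expected


# Made-up scores, not taken from Detoxification Annotations.csv.
annotator_1 = [8, 7, 9, 3, 6]
annotator_2 = [7, 7, 9, 4, 5]
kappa = quadratic_weighted_kappa(annotator_1, annotator_2)
```

A value of 1 means identical scores, values near 0 mean agreement no better than chance, and negative values mean systematic disagreement.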

Results

  • sentiment_results.csv: Sentences from multilingual-sentiment-test-solutions.csv along with their id, target class label, predicted class label ("positive", "mixed" or "neutral"), and an explanation (generated by the LLM).
  • toxicity_results.csv: Sentences from toxic-test-solutions.csv along with their id, target class label, predicted class label ("toxic" or "non-toxic"), and an explanation (generated by the LLM).
  • Final_Report.pdf: ACL-formatted report discussing our observations and findings.
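
Because each results file stores a target label next to a predicted label, classification accuracy can be recomputed directly from the CSV. The sketch below shows one way to do that with the standard library. The column names `target` and `predicted` and the inline sample rows are assumptions for illustration; check the actual headers in sentiment_results.csv and toxicity_results.csv before using it.

```python
import csv
import io
from typing import Dict, Iterable


def accuracy_from_rows(
    rows: Iterable[Dict[str, str]],
    target_col: str = "target",        # assumed column name
    predicted_col: str = "predicted",  # assumed column name
) -> float:
    """Fraction of rows whose predicted label matches the target label."""
    rows = list(rows)
    if not rows:
        raise ValueError("no rows to evaluate")
    correct = sum(
        row[target_col].strip().lower() == row[predicted_col].strip().lower()
        for row in rows
    )
    return correct / len(rows)


# Placeholder rows standing in for a results file (not real project data).
SAMPLE_CSV = """id,sentence,target,predicted,explanation
1,example a,toxic,toxic,placeholder
2,example b,non-toxic,non-toxic,placeholder
3,example c,toxic,non-toxic,placeholder
4,example d,non-toxic,non-toxic,placeholder
"""

accuracy = accuracy_from_rows(csv.DictReader(io.StringIO(SAMPLE_CSV)))
```

To run it on a real file, open the file with `open(path, newline="", encoding="utf-8")` and pass `csv.DictReader(file_handle)` in place of the in-memory sample.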

About

Code to detect African-language text, translate it to English, perform sentiment and toxicity analysis, and detoxify the text when necessary.
