Unsupervised Learning-based analysis on voter abstention data from Brazilian elections. Using clustering techniques, we aim to identify behavioral patterns in abstention reasons across demographic and regional variables.

heitornolla/Analysis-of-Voter-Abstention-through-Clustering


Analysis of Voter Abstention on Brazil's Presidential Elections through Data Clustering

Open in Colab

Disclaimer: This is a personal project intended for educational purposes only

This project performs an unsupervised learning analysis on official voter abstention justification data from Brazilian elections. Using clustering techniques, it seeks to identify behavioral patterns in abstention reasons across demographic and regional variables. You can read a paper on the work and our findings here (available only in PT-BR):

Read in Overleaf

Overview

The project cleans and preprocesses the data, transforming it into a form suitable for machine learning. It then applies the K-Prototypes clustering algorithm, which handles mixed numerical and categorical features, to discover hidden structure in the data. Visualizations help interpret the results and show the composition of the identified clusters.

Project Steps

1. Data Loading & Cleaning

  • Load the .csv dataset containing official voter justification records.
  • Drop irrelevant or uniform columns (e.g., protocol numbers, identical values).
  • Map binary values (SIM/NAO) to numeric format.
  • Handle missing or ambiguous values like "NAO INFORMADO".
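The cleaning steps above can be sketched with pandas as follows. This is a minimal illustration, not the project's actual code: the sample rows and the column names (`protocol`, `state`, `age_group`, `justified_online`, `country`) are hypothetical, since the README does not list the dataset's schema, and treating "NAO INFORMADO" as a missing value is only one possible way to handle it.

```python
import io

import pandas as pd

# Hypothetical sample standing in for the official .csv file.
RAW = """protocol,state,age_group,justified_online,country
001,SP,25-34,SIM,BRASIL
002,RJ,NAO INFORMADO,NAO,BRASIL
003,SP,45-59,SIM,BRASIL
"""


def clean(df: pd.DataFrame) -> pd.DataFrame:
    # Protocol numbers are identifiers and carry no behavioral signal.
    df = df.drop(columns=["protocol"])
    # Drop columns that hold a single repeated value.
    uniform = [c for c in df.columns if df[c].nunique(dropna=False) <= 1]
    df = df.drop(columns=uniform)
    # Map the binary SIM/NAO answers to 1/0.
    df["justified_online"] = df["justified_online"].map({"SIM": 1, "NAO": 0})
    return df


# na_values turns the "NAO INFORMADO" placeholder into a missing value on load.
raw_df = pd.read_csv(io.StringIO(RAW), dtype=str, na_values=["NAO INFORMADO"])
df = clean(raw_df)
print(df)
```

Whether missing values should then be dropped, imputed, or kept as their own category depends on the column and is left open here.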

2. Encoding & Scaling

  • Ordinal encoding of ordered categories.
  • One-hot encoding of nominal categorical variables.
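As a rough sketch of the two encodings, the snippet below uses an ordered `pd.Categorical` for an ordinal column and `pd.get_dummies` for a nominal one. The columns `education` and `state`, their values, and the category order are assumptions made for illustration; the real dataset may use different fields.

```python
import pandas as pd

# Hypothetical columns; the real dataset's fields may differ.
df = pd.DataFrame({
    "education": ["FUNDAMENTAL", "SUPERIOR", "MEDIO", "MEDIO"],
    "state": ["SP", "RJ", "SP", "BA"],
})

# Ordinal encoding: the list order defines the rank (0, 1, 2).
EDUCATION_ORDER = ["FUNDAMENTAL", "MEDIO", "SUPERIOR"]
df["education"] = pd.Categorical(
    df["education"], categories=EDUCATION_ORDER, ordered=True
).codes

# One-hot encoding: one 0/1 column per nominal category.
encoded = pd.get_dummies(df, columns=["state"], dtype=int)
print(encoded)
```

Ordinal codes preserve the ranking between categories, while one-hot columns avoid implying an order among values such as states.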

3. Clustering

  • Apply the Elbow Method to estimate a suitable number of clusters.
  • KMeans was originally used but was later replaced by KPrototypes.
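The sketch below illustrates the idea behind this step under stated assumptions. `mixed_distance` shows the standard K-Prototypes dissimilarity (squared Euclidean distance on numeric features plus a weight `gamma` times the number of categorical mismatches), which is why it suits mixed data better than KMeans. `elbow_costs` is a hypothetical helper that assumes the third-party `kmodes` package is installed; the README does not say which implementation or parameters the project uses.

```python
def mixed_distance(a_num, b_num, a_cat, b_cat, gamma=1.0):
    """K-Prototypes dissimilarity between two records.

    Squared Euclidean distance on the numeric parts plus gamma times
    the number of mismatching categorical values.
    """
    numeric = sum((x - y) ** 2 for x, y in zip(a_num, b_num))
    mismatches = sum(x != y for x, y in zip(a_cat, b_cat))
    return numeric + gamma * mismatches


def elbow_costs(X, categorical, ks):
    """Fit KPrototypes once per k and return {k: clustering cost}.

    Assumes the `kmodes` package (pip install kmodes). `categorical`
    lists the column indices of X that hold categorical values.
    """
    from kmodes.kprototypes import KPrototypes

    costs = {}
    for k in ks:
        model = KPrototypes(n_clusters=k, init="Cao", random_state=42)
        model.fit(X, categorical=categorical)
        costs[k] = model.cost_
    return costs


# Two records differing in both numeric values and in one category.
d = mixed_distance([1.0, 2.0], [2.0, 4.0], ["SP", "SIM"], ["RJ", "SIM"], gamma=0.5)
print(d)
```

Plotting the returned costs against `k` and looking for the point where the curve flattens gives the elbow; that choice is a visual heuristic rather than an exact criterion.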

4. Cluster Analysis

  • Evaluate and interpret each cluster based on feature distributions.
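One way to inspect feature distributions per cluster is sketched below with pandas. The data frame, its `cluster` labels, and the `state` and `age` columns are made up for illustration and do not come from the project's results.

```python
import pandas as pd

# Hypothetical labeled records; `cluster` would come from the fitted model.
df = pd.DataFrame({
    "cluster": [0, 0, 0, 1, 1],
    "state": ["SP", "SP", "RJ", "BA", "BA"],
    "age": [30, 40, 50, 20, 22],
})

# Share of each state within each cluster (every row sums to 1).
state_share = pd.crosstab(df["cluster"], df["state"], normalize="index")

# Mean of a numeric feature per cluster.
age_mean = df.groupby("cluster")["age"].mean()

print(state_share)
print(age_mean)
```

Comparing these per-cluster shares against the overall distribution helps show which features actually distinguish a cluster.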
