Skip to content

methedjangoguy/detecting-fake-Instagram-accounts

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

6 Commits
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Detecting Fake Instagram Accounts with Data Analysis and Machine Learning

Project Overview

This basic project aims to identify and classify Instagram accounts as either authentic or fake. By leveraging data analysis and machine learning techniques, we develop a model capable of discerning patterns and characteristics typical of fraudulent accounts. The goal is to enhance the integrity of user interactions on the platform by minimizing the presence of spam and malicious entities.

Table of Contents

Motivation

The proliferation of fake accounts on social media platforms has been a growing concern, affecting user experience by promoting spam and disseminating misinformation. Instagram, being one of the largest social networks, is not immune to this challenge. This project is driven by the need to safeguard the community and ensure a trustworthy social environment for users.

Data Source and Collection

The dataset used in this project was compiled from publicly available sources, consisting of various features indicative of account behavior and characteristics. It includes information on follower counts, post frequency, bio descriptions, and other relevant attributes.

Note: Due to privacy considerations, personal information was anonymized and handled in accordance with data protection regulations.

Features Used

  • Number of followers
  • Number of people followed
  • Number of posts
  • Presence of a profile picture
  • Length and content of the bio
  • Username complexity (ratio of numbers to letters)

Methodology

My approach combines exploratory data analysis (EDA) to understand the dataset and feature engineering to prepare the data for modeling. We then apply several machine learning algorithms, including Logistic Regression, Random Forest, and Gradient Boosting, to classify accounts. Model performance is evaluated using accuracy, precision, recall, and F1 score metrics.

Installation

To set up the project environment, follow these steps:

git clone https://github.com/methedjango/detecting-fake-Instagram-accounts.git
cd detecting-fake-Instagram-accounts

Usage

To run the analysis and model training scripts, open the ipynb file:

jupyter notebook

Results and Discussion

The findings indicate that certain features, such as the ratio of followers to following and the presence of a bio, are significant indicators of account authenticity. The Random Forest model demonstrated the highest efficacy, achieving an accuracy of XX%.

Future Work

  • Integrate real-time detection capabilities into the Instagram API.
  • Explore additional features, such as engagement rates and the sentiment of comments.
  • Implement deep learning techniques for more sophisticated pattern recognition.

Contributing

I welcome contributions to improve the accuracy and efficiency of our fake account detection model. Please refer to CONTRIBUTING.md for guidelines on submitting pull requests.

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgments

  • Special thanks to the Instagram community for supporting this project.

About

The goal is to enhance the integrity of user interactions on the platform by minimising the presence of spam and malicious Insta Ids..

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors