This project focuses on detecting fraudulent credit card transactions using machine learning techniques. The dataset used contains anonymized transaction data, and the goal is to build a model that can accurately distinguish between fraudulent and non-fraudulent transactions.
The dataset used for this project can be accessed via Google Drive at the following link:
Credit Card Fraud Detection Dataset
The repository contains the following files:
- CreditCardFraud_Intern.ipynb: Jupyter notebook containing data preprocessing, analysis, visualization, and machine learning model training.
- README.md: Project documentation.
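To run this project, install the following dependencies (google-colab is preinstalled in Google Colab and is only needed there; opening the notebook locally also requires Jupyter):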
pip install pandas numpy matplotlib seaborn scikit-learn google-colab
- Clone the repository:
git clone https://github.com/SAMI-CODEAI/CreditCardFraudDetection.git
- Navigate to the project directory:
cd CreditCardFraudDetection
- Open the Jupyter notebook:
jupyter notebook CreditCardFraud_Intern.ipynb
- Mount Google Drive in the notebook (this step requires the Google Colab runtime) and load the dataset from the provided link.
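The last step can be sketched as follows. This is a minimal sketch and not the notebook's actual code: the helper names `mount_drive_if_colab` and `load_transactions` are hypothetical, and the mount point `/content/drive` is only the conventional Colab default. The mount helper does nothing outside Colab, so the same code also runs against a local copy of the CSV.

```python
from pathlib import Path

import pandas as pd


def mount_drive_if_colab(mount_point: str = "/content/drive") -> bool:
    """Mount Google Drive when running inside Google Colab; do nothing elsewhere."""
    try:
        from google.colab import drive  # only importable inside Colab
    except ImportError:
        return False
    drive.mount(mount_point)
    return True


def load_transactions(csv_path: str) -> pd.DataFrame:
    """Read the transaction CSV into a DataFrame, failing early if the file is missing."""
    path = Path(csv_path)
    if not path.is_file():
        raise FileNotFoundError(f"Dataset not found at {path}")
    return pd.read_csv(path)
```

In Colab, one would call `mount_drive_if_colab()` first and then pass the location of the downloaded file under the mounted drive to `load_transactions`; the exact path depends on where the dataset was saved.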
The notebook performs the following data preprocessing and analysis steps:
- Handling missing values
- Exploratory data analysis (EDA)
- Feature scaling and engineering
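A minimal sketch of the missing-value handling and scaling steps is shown below. It rests on several assumptions: the label column is called `Class`, missing feature values are filled with the column median, and features are standardized with scikit-learn's `StandardScaler`. The notebook may use different choices, and the `preprocess` helper is hypothetical.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler


def preprocess(df: pd.DataFrame, target: str = "Class") -> pd.DataFrame:
    """Drop rows with a missing label, impute missing features, and standardize them."""
    df = df.dropna(subset=[target]).copy()
    feature_cols = [c for c in df.columns if c != target]
    # Median imputation is one simple choice; the notebook may use another strategy.
    df[feature_cols] = df[feature_cols].fillna(df[feature_cols].median())
    # Standardize each feature to zero mean and unit variance.
    df[feature_cols] = StandardScaler().fit_transform(df[feature_cols])
    return df


# Tiny illustrative frame with one missing amount (made-up values).
demo = pd.DataFrame(
    {
        "Time": [0.0, 1.0, 2.0, 3.0],
        "Amount": [10.0, np.nan, 30.0, 50.0],
        "Class": [0, 0, 1, 0],
    }
)
clean = preprocess(demo)
print(clean)
```

When a train/test split is used, fitting the scaler on the training portion only avoids leaking test statistics into the model; the sketch scales everything at once purely for brevity.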
The following models are implemented and evaluated:
- Logistic Regression
- Linear Regression
Model performance is evaluated using accuracy, a confusion matrix, and a classification report.
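The training and evaluation flow could look roughly like the sketch below. It uses synthetic, imbalanced data from `make_classification` as a stand-in for the real dataset, so the printed numbers say nothing about the project's results. Because linear regression produces a continuous score instead of a class label, the sketch thresholds its output at 0.5; that threshold is an assumption and not taken from the notebook.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LinearRegression, LogisticRegression
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix
from sklearn.model_selection import train_test_split

# Synthetic, imbalanced stand-in for the real data (about 5% positives).
X, y = make_classification(
    n_samples=2000, n_features=10, weights=[0.95, 0.05], random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Logistic regression predicts the class label directly.
log_reg = LogisticRegression(max_iter=1000).fit(X_train, y_train)
log_pred = log_reg.predict(X_test)

# Linear regression outputs a continuous score, so it is thresholded at 0.5
# to obtain a label. The threshold is an assumption of this sketch.
lin_reg = LinearRegression().fit(X_train, y_train)
lin_pred = (lin_reg.predict(X_test) >= 0.5).astype(int)

for name, pred in [("Logistic Regression", log_pred), ("Linear Regression", lin_pred)]:
    print(name)
    print("accuracy:", accuracy_score(y_test, pred))
    print(confusion_matrix(y_test, pred))
    print(classification_report(y_test, pred, zero_division=0))
```

On heavily imbalanced data, accuracy alone can look high even when few fraudulent transactions are caught, which is why the confusion matrix and the per-class precision and recall in the classification report are worth reading alongside it.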
The dataset is analyzed through:
- Percentage of fraudulent and non-fraudulent transactions
- Visualization of transaction amounts over time
- Correlation heatmap of dataset features
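The class percentages and the correlation matrix behind the heatmap can be computed with a few pandas calls. The sketch below assumes a label column named `Class` with 1 marking fraud; the `class_percentages` helper and the demo values are made up for illustration.

```python
import pandas as pd


def class_percentages(df: pd.DataFrame, target: str = "Class") -> pd.Series:
    """Return the percentage of rows belonging to each class label."""
    return df[target].value_counts(normalize=True).sort_index() * 100


# Tiny illustrative frame (made-up values): four legitimate rows, one fraud.
demo = pd.DataFrame(
    {
        "Time": [0, 1, 2, 3, 4],
        "Amount": [10.0, 25.0, 5.0, 300.0, 12.0],
        "Class": [0, 0, 0, 1, 0],
    }
)
pct = class_percentages(demo)
corr = demo.corr()  # pairwise Pearson correlations, the input to a heatmap
print(pct)
print(corr.round(2))
```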
The project includes the following visualizations:
- Time vs. Amount plot
- Distribution curve of transaction amounts
- Correlation heatmap