Binary file added project/data/ev_final.xlsx
Binary file not shown.
40 changes: 40 additions & 0 deletions project/exercise/readme.md
@@ -0,0 +1,40 @@
# Learning Goals
- Gain proficiency in **Exploratory Data Analysis (EDA)**.
- Understand and apply **data fraud analysis techniques**.
- Learn to identify anomalies in datasets effectively.

# Exercise Statement
**Objective**: Conduct a comprehensive data fraud analysis on a **battery swap service dataset**.
The dataset contains details of battery swaps across various stations in a city. Your tasks include:
- Identifying potentially fraudulent activity, such as inconsistencies in swap data that lead to revenue loss.
- Proposing effective solutions for detecting and preventing such fraud.

This exercise will not only enhance your analytical skills but also provide practical experience in applying machine learning models for anomaly detection.

# Prerequisites
Before starting this exercise, ensure you have a foundational understanding of:
- Data manipulation techniques using **Python** and **Pandas**.
- Concepts and implementations of **K-Means clustering** and **Isolation Forests** for anomaly detection.

# Dataset Summary
The dataset for this exercise provides real-world data on battery swap activities across city stations. It contains variables such as:
- Swap station ID
- Timestamp of battery swaps
- Battery charge levels before and after swaps
- Revenue details

This data allows you to apply fraud detection techniques and design automated alerts to minimize revenue losses.
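
As a starting point for exploration, a rule-based consistency check could look like the sketch below. The column names (`station_id`, `timestamp`, `charge_before`, `charge_after`, `revenue`), the sample rows, and both rules are assumptions made for illustration; the actual schema of `ev_final.xlsx` may differ.

```python
import pandas as pd

# Hypothetical columns and made-up rows; adapt to the real dataset schema.
swaps = pd.DataFrame(
    {
        "station_id": ["S1", "S1", "S2", "S2", "S3"],
        "timestamp": pd.to_datetime(
            [
                "2024-01-01 08:00",
                "2024-01-01 09:30",
                "2024-01-01 08:15",
                "2024-01-01 10:00",
                "2024-01-01 11:45",
            ]
        ),
        "charge_before": [12.0, 20.0, 15.0, 85.0, 10.0],
        "charge_after": [98.0, 97.0, 99.0, 80.0, 96.0],
        "revenue": [5.0, 5.0, 5.0, 5.0, 0.0],
    }
)


def flag_suspicious(df):
    """Return a copy of df with simple rule-based flags (illustrative rules only)."""
    out = df.copy()
    out["charge_gain"] = out["charge_after"] - out["charge_before"]
    # Rule 1 (assumed): a swap should leave the customer with more charge.
    out["no_gain"] = out["charge_gain"] <= 0
    # Rule 2 (assumed): a swap that adds charge should generate revenue.
    out["unpaid"] = (out["charge_gain"] > 0) & (out["revenue"] <= 0)
    out["suspicious"] = out["no_gain"] | out["unpaid"]
    return out


flagged = flag_suspicious(swaps)

# Per-station summary: total swaps, flagged swaps, and revenue.
station_summary = flagged.groupby("station_id").agg(
    swaps=("suspicious", "size"),
    suspicious=("suspicious", "sum"),
    revenue=("revenue", "sum"),
)
print(station_summary)
```

A per-station summary like this is one possible basis for an automated alert, for example by notifying when a station's flagged count exceeds a chosen threshold.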

# (Optional) Suggested/Proposed Solutions
A potential solution involves the use of **K-Means clustering** to group similar data points and **Isolation Forests** to detect outliers representing anomalies. If required, I can create a pull request with a detailed solution.
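
The combination mentioned above can be sketched with scikit-learn as follows. This is a minimal illustration on synthetic data, not the solution notebook: the two features (charge gain and revenue per swap), the number of clusters, and the `contamination` value are all assumptions chosen for the example.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.ensemble import IsolationForest
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic stand-in features: [charge gain in %, revenue per swap].
normal = np.column_stack([rng.normal(80, 5, 200), rng.normal(5.0, 0.3, 200)])
# Three hand-made odd swaps: unpaid, negative gain, nearly unpaid.
odd = np.array([[85.0, 0.0], [-5.0, 5.0], [90.0, 0.2]])

# Scale features so both contribute comparably to distances.
X = StandardScaler().fit_transform(np.vstack([normal, odd]))

# K-Means groups similar swaps; distance to the assigned centroid
# can serve as a rough "how unusual is this swap" score.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
dist_to_centroid = np.linalg.norm(
    X - kmeans.cluster_centers_[kmeans.labels_], axis=1
)

# Isolation Forest labels outliers as -1; contamination is an assumed
# prior on the share of anomalous swaps, not a value from the dataset.
forest = IsolationForest(contamination=0.02, random_state=0).fit(X)
is_outlier = forest.predict(X) == -1

print("flagged rows:", np.flatnonzero(is_outlier))
```

Rows flagged by the forest, or rows with an unusually large centroid distance, are candidates for manual review rather than confirmed fraud.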

# (Optional) Further Links & Credits
This exercise and solution proposal stemmed from insights shared during a **DL2020 lab session**. Additional resources on fraud analysis techniques can be found [here](#).

879 changes: 879 additions & 0 deletions project/solution/project.ipynb

Large diffs are not rendered by default.