
Commit 7ed0c61

adding a readme for the data analysis
1 parent 46f4203 commit 7ed0c61

1 file changed, +31 -0 lines changed

4_data_analysis/README.md

Lines changed: 31 additions & 0 deletions
@@ -1 +1,32 @@
# Data Analysis

This folder contains the Jupyter notebook used to **model and analyze**
student performance data.

## Dataset

- **File used:** [`cleaned_sed_dataset.csv`](../2_data_preparation/cleaned_data/cleaned_sed_dataset.csv)
- This is the same cleaned dataset produced by the data preparation stage.

## Notebook Contents

- Loads the prepared dataset
- Selects relevant features for modeling
- Splits data into training and testing sets
- Trains a linear regression model
- Evaluates predictions using Mean Squared Error (≈107) and R² score (≈0.69)
- Visualizes actual vs. predicted average marks
- Extracts and plots feature coefficients to interpret importance
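
The steps listed above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code. It uses synthetic data and plain NumPy so it runs without the dataset, and the feature names (`attendance`, `assignments_done`, `forum_posts`), the 80/20 split, and the generated marks are all assumptions made for the example. Plotting is left out.

```python
import numpy as np

# Synthetic stand-in for the cleaned dataset. The real column names are not
# given in this README, so these three "engagement" features are hypothetical.
rng = np.random.default_rng(seed=0)
n_students = 200
feature_names = ["attendance", "assignments_done", "forum_posts"]
X = rng.uniform(0.0, 1.0, size=(n_students, len(feature_names)))
true_coefs = np.array([30.0, 20.0, 5.0])
y = 40.0 + X @ true_coefs + rng.normal(0.0, 5.0, size=n_students)

# Shuffle, then split into training and testing sets (80/20, an assumption).
order = rng.permutation(n_students)
n_test = n_students // 5
test_idx, train_idx = order[:n_test], order[n_test:]
X_train, X_test = X[train_idx], X[test_idx]
y_train, y_test = y[train_idx], y[test_idx]


def add_intercept(features):
    """Prepend a column of ones so the model learns an intercept."""
    return np.column_stack([np.ones(len(features)), features])


# Fit ordinary least squares on the training set only.
weights, *_ = np.linalg.lstsq(add_intercept(X_train), y_train, rcond=None)
intercept, coefs = weights[0], weights[1:]

# Evaluate on the held-out set with Mean Squared Error and R².
y_pred = add_intercept(X_test) @ weights
mse = float(np.mean((y_test - y_pred) ** 2))
ss_res = float(np.sum((y_test - y_pred) ** 2))
ss_tot = float(np.sum((y_test - y_test.mean()) ** 2))
r2 = 1.0 - ss_res / ss_tot

print(f"MSE: {mse:.2f}, R^2: {r2:.2f}")

# List coefficients from largest to smallest magnitude.
for name, coef in sorted(zip(feature_names, coefs), key=lambda p: -abs(p[1])):
    print(f"{name}: {coef:+.2f}")
```

Coefficient magnitudes are only directly comparable when the features are on similar scales, which is why the synthetic features here all lie in the range 0 to 1.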

## Purpose

This analysis moves beyond exploratory data analysis (EDA) to:

- Build a simple, interpretable baseline model
- Identify which engagement features most influence student marks
- Evaluate predictive power and understand limitations
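
As a rough aid to judging predictive power, the reported metrics can be converted into more interpretable quantities. The sketch below assumes that the MSE of about 107 and R² of about 0.69 quoted in this README were measured on the test set and that the target is the average mark. Both values are approximate.

```python
import math

# Metrics as reported in this README (approximate values).
mse = 107.0
r2 = 0.69

# RMSE is in the same units as the target (average marks), unlike MSE.
rmse = math.sqrt(mse)

# R² is the fraction of target variance explained by the model,
# so 1 - R² is the share left unexplained.
unexplained = 1.0 - r2

print(f"RMSE ~ {rmse:.2f} marks")
print(f"Unexplained variance ~ {unexplained:.0%}")
```

Under these assumptions, a typical prediction error is on the order of 10 marks, and roughly a third of the variation in marks is not captured by the linear model.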

## How to Run

Open the notebook in Jupyter or Google Colab and run all cells. Ensure
`cleaned_sed_dataset.csv` is present in the expected path.
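
A quick check like the one below can confirm the file is reachable before running the notebook. It is only a sketch. The path assumes the notebook runs from `4_data_analysis/` and that the dataset sits in a sibling `2_data_preparation/cleaned_data/` folder. The helper name `dataset_status` is hypothetical, and in Colab the file would likely need to be uploaded or mounted first.

```python
from pathlib import Path

# Assumed location relative to the notebook's working directory; adjust it
# to wherever the notebook actually reads the file from.
DATA_PATH = Path("..") / "2_data_preparation" / "cleaned_data" / "cleaned_sed_dataset.csv"


def dataset_status(path: Path) -> str:
    """Return a short message saying whether the dataset file can be found."""
    if path.is_file():
        return f"Found dataset at {path.resolve()}"
    return (
        f"Dataset not found at {path.resolve()}; "
        "check the working directory or upload the file."
    )


print(dataset_status(DATA_PATH))
```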
