Commit 54ecf78

Merge pull request #73 from MIT-Emerging-Talent/milestone4
Milestone 4: Student Engagement Analytics - Complete Communication Strategy
2 parents 3e7a25f + b1695b4 commit 54ecf78

14 files changed, +19048 -179 lines changed

1_datasets/README.md

Lines changed: 71 additions & 0 deletions
@@ -15,6 +15,77 @@ The following raw datasets have been collected and are stored in the
* `Student_grade_aggregated.csv`: Aggregated grade information per student.
* `Student_grade_detailed.csv`: Detailed grade information per student per course.

### Data Dictionaries

Below are the data dictionaries for the files in this dataset, outlining column
names, data types, unique values, and missing values. These serve as initial
documentation for understanding the dataset.

| Column Name | Data Type | Unique Values | Missing Values | Description |
|:------------|:----------|--------------:|---------------:|:------------|
| `Unnamed: 0` | int64 | 12139424 | 0 | Index column from original CSV. |
| `component` | object | 36 | 0 | The Moodle component involved in the activity. |
| `action` | object | 37 | 0 | The specific action performed by the user. |
| `target` | object | 74 | 0 | The specific item or context of the action. |
| `userid` | int64 | 16128 | 0 | Unique identifier for the student. |
| `courseid` | int64 | 2826 | 0 | Unique identifier for the course. |
| `timecreated` | object | 6040037 | 0 | Timestamp of when the activity occurred. |

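A table of this shape (column name, data type, unique values, missing values) can be regenerated from any of the CSV files with pandas. The following is a minimal sketch, not the script the team used: the helper name `build_data_dictionary` and the tiny sample frame are hypothetical and only illustrate the idea.

```python
import pandas as pd


def build_data_dictionary(df: pd.DataFrame) -> pd.DataFrame:
    """Summarise every column: dtype, unique-value count, missing-value count."""
    return pd.DataFrame(
        {
            "Column Name": list(df.columns),
            "Data Type": [str(dtype) for dtype in df.dtypes],
            # nunique() ignores missing values by default
            "Unique Values": [int(df[col].nunique()) for col in df.columns],
            "Missing Values": [int(df[col].isna().sum()) for col in df.columns],
        }
    )


# Made-up sample data; the real input would be one of the dataset CSVs.
sample = pd.DataFrame(
    {"userid": [1, 2, 2], "action": ["viewed", None, "viewed"]}
)
dictionary = build_data_dictionary(sample)
print(dictionary.to_string(index=False))
```

For the real files, the frame would come from `pd.read_csv(...)`; the description column still has to be written by hand.
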
#### `Student_activity_summary.csv` Data Dictionary

| Column Name | Data Type | Unique Values | Missing Values | Description |
|:------------|:----------|--------------:|---------------:|:------------|
| `Unnamed: 0` | int64 | 16128 | 0 | Index column from original CSV. |
| `userid` | float64 | 16128 | 0 | Unique identifier for the student. |
| `number_of_courses` | float64 | 10 | 0 | Number of courses the student is enrolled in. |
| `average_marks` | float64 | 1589 | 0 | Average marks obtained across all courses. |
| `average_login` | float64 | 1024 | 0 | Average number of logins. |
| `weekend_login` | float64 | 291 | 0 | Average number of logins during weekends. |
| `weekday_login` | float64 | 948 | 0 | Average number of logins during weekdays. |
| `midnight_login` | float64 | 344 | 0 | Average number of logins around midnight. |
| `early_morning_login` | float64 | 291 | 0 | Average number of logins in the early morning. |
| `late_morning_login` | float64 | 291 | 0 | Average number of logins in the late morning. |
| `afternoon_login` | float64 | 344 | 0 | Average number of logins in the afternoon. |
| `evening_login` | float64 | 102 | 0 | Average number of logins during the evening. |
| `night_login` | float64 | 102 | 0 | Average number of logins during the night. |
| `no_of_viewed_courses` | float64 | 1148 | 0 | Average number of courses viewed. |
| `no_of_attendance_taken` | float64 | 102 | 0 | Average number of attendances taken. |
| `no_of_all_files_downloaded` | float64 | 102 | 0 | Average number of files downloaded. |
| `no_of_assignments` | float64 | 102 | 0 | Average number of assignments submitted. |
| `no_of_forum_created` | float64 | 102 | 0 | Average number of forum posts created. |
| `number_of_quizzes` | float64 | 102 | 0 | Average number of quizzes available. |
| `no_of_quizzes_completed` | float64 | 102 | 0 | Average number of quizzes completed. |
| `no_of_quizzes_attempt` | float64 | 102 | 0 | Average number of quizzes attempted. |

#### `Student_grade_aggregated.csv` Data Dictionary

| Column Name | Data Type | Unique Values | Missing Values | Description |
|:------------|:----------|--------------:|---------------:|:------------|
| `Unnamed: 0` | int64 | 16128 | 0 | Index column from original CSV. |
| `userid` | int64 | 16128 | 0 | Unique identifier for the student. |
| `number_of_courses` | float64 | 10 | 0 | Number of courses the student is enrolled in. |
| `total_marks` | float64 | 1589 | 0 | Total marks obtained across all courses. |

#### `Student_grade_detailed.csv` Data Dictionary

| Column Name | Data Type | Unique Values | Missing Values | Description |
|:------------|:----------|--------------:|---------------:|:------------|
| `Unnamed: 0` | int64 | 16609 | 0 | Index column from original CSV. |
| `userid` | int64 | 16128 | 0 | Unique identifier for the student. |
| `courseid` | int64 | 2826 | 0 | Unique identifier for the course. |
| `formatted agreed mark` | object | 1589 | 0 | Formatted agreed mark for the course. |
| `actual grade` | object | 10 | 0 | Actual letter grade for the course. |
| `faculty` | object | 10 | 0 | Faculty the course belongs to. |

### Data Cleaning and Preprocessing Notes

* The `SED_Student_log.csv` file is large and will require significant processing
  to extract meaningful engagement features. This will involve timestamp parsing,
  grouping by user and course, and counting specific actions.
* Missing values and data types will need careful handling during the cleaning process.
* The `userid` and `courseid` columns will be crucial for joining the log data
  with the grade and activity summary files.

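The preprocessing steps in these notes could look roughly like the sketch below. It is an illustration under assumptions, not the project's actual pipeline: the function names `count_actions` and `join_with_grades`, the chunk size, and the in-memory sample data are hypothetical, while the column names come from the data dictionaries. The log is read in chunks because of its size, timestamps are parsed, events are counted per user, course, and action, and the result is joined to the grade data on `userid` and `courseid`.

```python
import io

import pandas as pd

KEYS = ["userid", "courseid"]


def count_actions(log_csv, chunksize: int = 1_000_000) -> pd.DataFrame:
    """Count log events per (userid, courseid, action), reading the CSV in chunks."""
    partials = []
    columns = KEYS + ["action", "timecreated"]
    for chunk in pd.read_csv(log_csv, usecols=columns, chunksize=chunksize):
        # Parse timestamps; unparseable values become NaT and are dropped.
        chunk["timecreated"] = pd.to_datetime(chunk["timecreated"], errors="coerce")
        chunk = chunk.dropna(subset=["timecreated"])
        counts = chunk.groupby(KEYS + ["action"]).size().reset_index(name="n_events")
        partials.append(counts)
    # The same key can occur in several chunks, so sum the partial counts.
    totals = (
        pd.concat(partials, ignore_index=True)
        .groupby(KEYS + ["action"], as_index=False)["n_events"]
        .sum()
    )
    # One row per (userid, courseid), one column per action.
    wide = totals.pivot_table(
        index=KEYS, columns="action", values="n_events", aggfunc="sum", fill_value=0
    ).reset_index()
    wide.columns.name = None
    return wide


def join_with_grades(features: pd.DataFrame, grades: pd.DataFrame) -> pd.DataFrame:
    """Left-join engagement features to grades on userid and courseid."""
    # Key dtypes differ between files (userid is float64 in the activity
    # summary), so cast both sides to a common integer type before merging.
    features = features.astype({key: "int64" for key in KEYS})
    grades = grades.astype({key: "int64" for key in KEYS})
    return features.merge(grades, on=KEYS, how="left")


# Made-up sample standing in for SED_Student_log.csv.
log_csv = io.StringIO(
    "userid,courseid,action,timecreated\n"
    "1,10,viewed,2020-01-01 09:00:00\n"
    "1,10,viewed,2020-01-01 10:00:00\n"
    "1,10,submitted,2020-01-02 11:00:00\n"
    "2,10,viewed,not-a-date\n"
)
features = count_actions(log_csv, chunksize=2)
grades = pd.DataFrame({"userid": [1.0], "courseid": [10], "actual grade": ["A"]})
joined = join_with_grades(features, grades)
print(joined)
```

Summing per-chunk counts keeps memory use bounded by the chunk size rather than by the full log, which matters for a file with roughly 12 million rows.
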
### Data Selection Process

Our team evaluated several publicly available datasets to identify the most

5_communication_strategy/README.md

Lines changed: 180 additions & 1 deletion
@@ -1 +1,180 @@
-# Communication Strategy
+# Milestone 4: Communicating Results

## Overview

This folder contains our comprehensive communication strategy for translating
research findings on student engagement patterns into actionable
insights for educational stakeholders.

## Key Research Finding

**Student engagement metrics show a 91.4% correlation with academic performance,
enabling early identification of at-risk students through behavioral data analysis.**
This finding provides educational institutions with a clear intervention
target for improving student outcomes.

## Communication Artifacts

### 📊 Interactive Dashboard

**Live Deployments:**

- **Primary Dashboard:** [https://fixed-dashboard.streamlit.app](https://fixed-dashboard.streamlit.app)
- **Enhanced UI Version:** [https://depth-dashboard.streamlit.app](https://depth-dashboard.streamlit.app)
- **Aurora Theme:** [https://aurora-dashboard.streamlit.app](https://aurora-dashboard.streamlit.app)

**Note:** All three versions offer the same core functionality with different
UI/UX themes and minor feature variations.

## Primary Communication Artifact

A comprehensive Streamlit web application that enables stakeholders to:

- Explore student engagement data interactively
- View risk assessments from a model with a reported 92% accuracy
- Access predictive modeling results produced with data leakage prevention
- Review actionable, evidence-based recommendations

**Target Audiences:**

- Educational Technology Directors
- Academic Affairs Leadership
- Student Success Coordinators
- Faculty Members

### 📖 [Complete User Guide](dashboard_user_guide.md)

Comprehensive documentation explaining every dashboard feature, including:

- Step-by-step navigation guide
- Data interpretation instructions
- Troubleshooting support
- Best practices for each stakeholder type

### 📋 Supporting Documentation

#### [Target Audience Analysis](target_audience_analysis.md)

Detailed analysis of stakeholder needs, capabilities, and constraints, including:

- Primary and secondary audiences
- Communication objectives
- Outreach strategies
- Success metrics

#### [Communication Strategy](communication_strategy.md)

Comprehensive strategy document outlining:

- Message framework
- Distribution approach
- Supporting materials
- Timeline and success metrics

#### [Executive Summary](executive_summary.md)

Concise 2-page brief for leadership containing:

- Key findings and business impact
- Implementation roadmap
- Expected outcomes
- Next steps

## Research Foundation

Our communication strategy is built on robust technical analysis:

- **Dataset:** 16,909 students, 50+ engineered features
- **Analysis:** Comprehensive EDA and machine learning with data leakage prevention
- **Key Finding:** 91.4% correlation between engagement metrics and
  academic performance
- **Model Accuracy:** 92%, using features engineered to avoid data leakage

## Implementation Approach

### Phase 1: Stakeholder Engagement

- Dashboard demonstrations
- One-on-one meetings with decision-makers
- Pilot program discussions

### Phase 2: Pilot Implementation

- Selected institutional deployments
- Intervention protocol testing
- Performance monitoring

### Phase 3: Scale and Optimize

- Broader implementation
- Best practice documentation
- Continuous improvement

## Target Outcomes

**Short-term (3 months):**

- Dashboard adoption by educational institutions
- Pilot program implementation
- Stakeholder engagement and feedback

**Medium-term (6 months):**

- Measurable improvement in student engagement metrics
- Policy changes based on findings
- Additional research collaborations

**Long-term (12 months):**

- Scaled implementation across institutions
- Published best practices
- Sustainable analytics programs

## Communication Channels

### Primary: Interactive Dashboard

- Real-time data exploration
- Hands-on insight validation
- Stakeholder-specific views

### Supporting Channels

- Executive briefings
- Educational conferences
- Academic publications
- Professional networks

## Success Indicators

- **Engagement:** Dashboard usage, stakeholder meetings, pilot requests
- **Adoption:** Policy changes, technology implementations, training programs
- **Impact:** Improved student outcomes, institutional success metrics

## Technical Requirements

**Dashboard Dependencies:**

```
streamlit >= 1.10.0
pandas >= 1.3.0
plotly >= 5.0.0
scikit-learn >= 1.0.0
numpy >= 1.21.0
joblib >= 1.0.0
```

**Data Requirements:**

- `cleaned_sed_dataset.csv` (from `2_data_preparation/cleaned_data/`)
- `best_random_forest_classifier.joblib` (pre-trained model)
- Minimum features: 50+ engagement metrics with zero data leakage

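Before launching the dashboard, it can help to confirm that the environment meets the minimum versions listed above. The sketch below uses only the standard library; `parse_version` and `check_requirements` are hypothetical helpers, and the version parsing is deliberately crude (it keeps only the leading numeric components), so a dedicated tool such as `packaging.version` would be the more robust choice.

```python
from importlib.metadata import PackageNotFoundError, version

# Minimum versions copied from the dependency list above.
MINIMUM_VERSIONS = {
    "streamlit": "1.10.0",
    "pandas": "1.3.0",
    "plotly": "5.0.0",
    "scikit-learn": "1.0.0",
    "numpy": "1.21.0",
    "joblib": "1.0.0",
}


def parse_version(text: str) -> tuple:
    """Turn '1.10.0' into (1, 10, 0), ignoring suffixes such as 'rc1'."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for char in piece:
            if not char.isdigit():
                break
            digits += char
        if not digits:
            break
        parts.append(int(digits))
    # Pad so that '1.10' and '1.10.0' compare as equal.
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def check_requirements(minimums=MINIMUM_VERSIONS) -> dict:
    """Map each package to 'ok', 'too old (<installed>)', or 'missing'."""
    report = {}
    for name, minimum in minimums.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            report[name] = "missing"
            continue
        is_ok = parse_version(installed) >= parse_version(minimum)
        report[name] = "ok" if is_ok else f"too old ({installed})"
    return report


for package, status in check_requirements().items():
    print(f"{package}: {status}")
```
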
## Contact and Collaboration

This communication strategy demonstrates our commitment to translating research
into real-world educational improvements. The combination of robust technical
analysis and stakeholder-focused communication tools provides a foundation for
evidence-based educational interventions.

---
15.3 MB
Binary file not shown.
