Student Question Classification & Routing System

🎯 Project Overview

An ML-powered system that automatically classifies and prioritizes student questions in AI/ML education contexts. The project addresses the practical challenge of managing hundreds of student questions efficiently by categorizing them into technical domains and urgency levels.

πŸ” Problem Statement

In AI education settings, instructors receive numerous questions across various topics (Python basics, ML algorithms, debugging, conceptual understanding). Manually triaging these questions is time-consuming and inconsistent. This system uses NLP and machine learning to:

  1. Classify questions by topic (Python/Programming, Machine Learning, Deep Learning, Data Processing, Conceptual/Theory)
  2. Assess urgency level (Critical/Blocking, High Priority, Normal, Low Priority)
  3. Route to appropriate resources or instructors based on classification (a routing sketch follows this list)
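
In practice, the routing step can be a simple lookup over the two predicted labels. The rules below are hypothetical, purely to illustrate the idea; the repository does not prescribe a specific routing table:

# Hypothetical routing rules: urgency decides who sees the question first,
# category decides which queue or expert it lands in.
ROUTES = {
    "Critical": "instructor",         # blocking issues go straight to the instructor
    "High": "teaching_assistant",
    "Normal": "course_forum",
    "Low": "faq_suggestions",
}

def route_question(category: str, urgency: str) -> str:
    """Map a (category, urgency) prediction to a destination string."""
    destination = ROUTES.get(urgency, "course_forum")
    return f"{destination}/{category.lower().replace('/', '-').replace(' ', '-')}"

print(route_question("Deep Learning", "Critical"))   # instructor/deep-learning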

πŸ› οΈ Technical Approach

Data Collection & Preparation

  • Synthetic dataset generation based on real educational patterns
  • Data includes: question text, category labels, urgency levels
  • Train/test split with stratification to maintain class balance
  • Text preprocessing: lowercasing, tokenization, handling code snippets (see the sketch after this list)
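
A minimal sketch of this preparation step, assuming data/questions.csv carries text, category, and urgency columns (the column names are assumptions, not confirmed by the repository):

import re

import pandas as pd
from sklearn.model_selection import train_test_split

df = pd.read_csv("data/questions.csv")   # assumed columns: text, category, urgency

def preprocess(text: str) -> str:
    """Lowercase and collapse whitespace, leaving code-like tokens intact."""
    return re.sub(r"\s+", " ", text.lower()).strip()

df["text"] = df["text"].map(preprocess)

# Stratify on the category label so the split preserves class balance.
X_train, X_test, y_train, y_test = train_test_split(
    df["text"], df["category"],
    test_size=0.2, stratify=df["category"], random_state=42,
)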

Feature Engineering

  • TF-IDF vectorization for text representation
  • Tuned parameters: max_features=5000, ngram_range=(1,2)
  • Captures both single words and bigrams for better context
  • Handles code-specific terminology and technical vocabulary (see the sketch after this list)
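
With the max_features=5000 and ngram_range=(1, 2) settings quoted above, the vectorizer amounts to a few lines of scikit-learn; a sketch, not the project's exact code:

from sklearn.feature_extraction.text import TfidfVectorizer

# Unigrams and bigrams, capped at 5,000 features as tuned above.
vectorizer = TfidfVectorizer(max_features=5000, ngram_range=(1, 2))
X_train_tfidf = vectorizer.fit_transform(X_train)   # fit on training data only
X_test_tfidf = vectorizer.transform(X_test)         # reuse the fitted vocabulary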

Model Selection & Training

  • Primary Model: Logistic Regression with L2 regularization
  • Alternative explored: Random Forest for comparison
  • Multi-class classification with balanced class weights
  • Hyperparameter tuning via grid search (see the sketch after this list)
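
A sketch of this training step; the C grid is illustrative, since the README does not list the exact search space:

from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

# L2 regularization and balanced class weights, as described above.
base_model = LogisticRegression(penalty="l2", class_weight="balanced", max_iter=1000)

grid = GridSearchCV(
    base_model,
    param_grid={"C": [0.1, 1.0, 10.0]},   # illustrative grid, not the project's
    cv=5,
    scoring="f1_macro",
)
grid.fit(X_train_tfidf, y_train)
category_clf = grid.best_estimator_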

Model Evaluation

  • Classification metrics: Precision, Recall, F1-score
  • Confusion matrix analysis to identify misclassification patterns
  • Cross-validation to ensure generalization
  • Performance analysis across different question types (see the sketch after this list)
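
These checks map directly onto scikit-learn helpers; a minimal sketch continuing from the training step above:

from sklearn.metrics import classification_report, confusion_matrix
from sklearn.model_selection import cross_val_score

y_pred = category_clf.predict(X_test_tfidf)
print(classification_report(y_test, y_pred))   # per-class precision/recall/F1
print(confusion_matrix(y_test, y_pred))        # where misclassifications land

# 5-fold cross-validation on the training split to check generalization.
scores = cross_val_score(category_clf, X_train_tfidf, y_train, cv=5, scoring="f1_macro")
print(f"macro-F1: {scores.mean():.3f} +/- {scores.std():.3f}")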

πŸ“Š Results

Category Classification Performance

                    precision    recall  f1-score   support

 Conceptual/Theory       1.00      1.00      1.00        40
   Data Processing       1.00      1.00      1.00        40
     Deep Learning       1.00      1.00      1.00        40
  Machine Learning       1.00      1.00      1.00        40
Python/Programming       1.00      1.00      1.00        40

          accuracy                           1.00       200
         macro avg       1.00      1.00      1.00       200
      weighted avg       1.00      1.00      1.00       200

Analysis: Perfect classification on the test set, with a mean prediction confidence of 79.8%. The model relies heavily on distinctive technical keywords to separate categories, so scores this clean are expected on keyword-driven synthetic data.

Urgency Classification Performance

              precision    recall  f1-score   support

    Critical       0.47      0.50      0.48        28
        High       0.59      0.58      0.59        50
         Low       0.34      0.67      0.45        18
      Normal       0.77      0.63      0.69       104

    accuracy                           0.60       200
   macro avg       0.54      0.60      0.55       200
weighted avg       0.64      0.60      0.62       200

Analysis: The lower accuracy (60%) reflects the inherent subjectivity of urgency assessment. Most confusion occurs between Normal and High priority questions, which matches the ambiguity instructors face in practice.

πŸš€ Usage

Installation

pip install -r requirements.txt

Training the Model

python train.py

Making Predictions

python predict.py "How do I fix this AttributeError in my neural network?"
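
Internally, predict.py presumably loads the saved artifacts from models/ and runs both classifiers over the question; a sketch of that flow (the actual script may differ):

import pickle
import sys

def load(path):
    with open(path, "rb") as f:
        return pickle.load(f)

# Artifact paths match the project structure shown below.
vectorizer = load("models/vectorizer.pkl")
category_clf = load("models/category_classifier.pkl")
urgency_clf = load("models/urgency_classifier.pkl")

features = vectorizer.transform([sys.argv[1].lower()])
print("Category:", category_clf.predict(features)[0])
print("Urgency: ", urgency_clf.predict(features)[0])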

Running Evaluation

python evaluate.py

πŸ“ Project Structure

student-question-classifier/
β”œβ”€β”€ README.md                 # Project documentation
β”œβ”€β”€ requirements.txt          # Python dependencies
β”œβ”€β”€ data/
β”‚   β”œβ”€β”€ generate_data.py     # Synthetic data generation
β”‚   └── questions.csv        # Generated training data
β”œβ”€β”€ src/
β”‚   β”œβ”€β”€ preprocessing.py     # Text preprocessing utilities
β”‚   β”œβ”€β”€ feature_engineering.py  # TF-IDF and feature extraction
β”‚   └── models.py            # Model definitions and training
β”œβ”€β”€ train.py                 # Main training script
β”œβ”€β”€ evaluate.py              # Model evaluation script
β”œβ”€β”€ predict.py               # Inference script
β”œβ”€β”€ models/                  # Saved model artifacts
β”‚   β”œβ”€β”€ category_classifier.pkl
β”‚   β”œβ”€β”€ urgency_classifier.pkl
β”‚   └── vectorizer.pkl
└── notebooks/
    └── exploratory_analysis.ipynb  # Data exploration and visualization

πŸ”§ Technical Challenges & Solutions

Challenge 1: Class Imbalance

Problem: Not all question categories appear equally frequently in real educational settings.

Solution: Applied class weighting in the model to ensure minority classes receive appropriate attention during training.

Challenge 2: Code Snippet Handling

Problem: Questions containing code snippets have different linguistic patterns than natural language.

Solution: Preserved code structure in preprocessing while still extracting semantic meaning through character n-grams (see the sketch below).
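
One way to realize this is to combine word-level TF-IDF with character n-grams, which survive code tokens like AttributeError or model.fit(X, y). The feature union below sketches the idea and is not the project's confirmed implementation:

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import FeatureUnion

combined = FeatureUnion([
    ("word", TfidfVectorizer(max_features=5000, ngram_range=(1, 2))),
    # char_wb n-grams respect word boundaries, so identifiers and error
    # names still yield stable features after lowercasing.
    ("char", TfidfVectorizer(analyzer="char_wb", ngram_range=(3, 5), max_features=5000)),
])
features = combined.fit_transform([
    "How do I fix this AttributeError in my neural network?",
    "my model.fit(X, y) call raises a ValueError",
])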

Challenge 3: Multi-label Ambiguity

Problem: Some questions span multiple categories (e.g., "How do I implement gradient descent in Python?").

Solution: Built separate models for category and urgency to allow independent classification. Future work could explore multi-label classification.

Challenge 4: Urgency Assessment

Problem: Urgency is contextual and subjective compared to topic classification.

Solution: Trained on lexical cues such as "not working", "error", "urgent", and "deadline", combined with question sentiment analysis (see the sketch below).
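
The lexical cues quoted above can be turned into explicit features and appended to the TF-IDF matrix; a sketch (the cue list and feature layout are illustrative):

import numpy as np

URGENCY_CUES = ["not working", "error", "urgent", "deadline"]

def cue_features(texts):
    """One binary indicator per urgency cue, to append to the TF-IDF features."""
    return np.array([[int(cue in t.lower()) for cue in URGENCY_CUES] for t in texts])

print(cue_features(["My code is not working and the deadline is tonight!"]))
# -> [[1 0 0 1]]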

πŸ“ˆ Future Improvements

  1. Deep Learning Approach: Implement BERT-based classification for better semantic understanding
  2. Active Learning: Incorporate instructor feedback to continuously improve classification
  3. Multi-label Support: Allow questions to belong to multiple categories simultaneously
  4. Confidence Scores: Add probability outputs to flag uncertain classifications for manual review (see the sketch after this list)
  5. Real-time API: Deploy as a REST API for integration with learning management systems
  6. Expanded Features: Include student history, previous questions, and course progress context
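
For item 4, logistic regression already exposes class probabilities, so a first cut could flag low-confidence predictions for manual review:

# Continuing from the training sketch above; the 0.5 threshold is hypothetical.
proba = category_clf.predict_proba(X_test_tfidf)   # shape: (n_samples, n_classes)
confidence = proba.max(axis=1)
needs_review = confidence < 0.5
print(f"{needs_review.sum()} of {len(needs_review)} predictions flagged for review")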

πŸŽ“ Educational Context

This project emerged from real challenges in teaching AI/ML courses where:

  • 50-100+ students generate 200+ questions per course
  • Questions range from basic Python syntax to advanced ML theory
  • Response time directly impacts student learning and retention
  • Instructors need to prioritize high-impact interventions

The classification system enables:

  • Automated routing to teaching assistants based on expertise
  • Priority queuing for critical blocking issues
  • Self-service recommendations by matching to FAQ/documentation
  • Analytics on common confusion points to improve curriculum

🀝 Contributing

This is a learning project, but suggestions and improvements are welcome:

  1. Fork the repository
  2. Create a feature branch
  3. Make your changes with clear commit messages
  4. Submit a pull request with description

πŸ“ License

MIT License - feel free to use this for educational purposes.

πŸ‘€ Author

Christopher Lee

πŸ™ Acknowledgments

  • Built as part of a learning journey in practical ML engineering
  • Inspired by real challenges in AI education delivery
  • Thanks to the open-source ML community for excellent tools and resources
