
Swizknife/AUTO-EIT-Automated-scoring-system-for-elicited-imitation-task-responses-


🚀 AutoEIT: Meaning-Based Scoring of Sentence Transcriptions

📌 Overview

This project implements an automated evaluation system for elicited imitation task (EIT) responses, scoring learner transcriptions against prompt sentences using a meaning-based rubric.

The system focuses on semantic understanding rather than exact matching, combining SBERT embeddings, feature engineering, and a neural network for robust scoring.


🎯 Key Features

  • Semantic similarity using Sentence-BERT (SBERT)
  • Hybrid feature engineering (semantic + lexical + structural)
  • Feature interaction modeling (absolute difference, element-wise product)
  • Attention-based neural network for scoring
  • Class imbalance handling (oversampling + weighted loss)
  • Reproducible pipeline with structured outputs

🧠 Methodology

1. Sentence Representation

  • Used SBERT (all-MiniLM-L6-v2)
  • Generated 384-dimensional embeddings for:
    • Prompt sentences
    • Learner transcriptions

2. Feature Engineering

Embedding-Based Features

  • Prompt embeddings
  • Learner embeddings
  • Absolute difference
  • Element-wise product

Linguistic Features

  • Cosine similarity
  • Edit distance (Levenshtein)
  • Word overlap ratio
  • Length difference

3. Feature Fusion

All features are combined into a 1540-dimensional vector: [384 + 384 + 384 + 384 + 4] = 1540
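The fusion step reduces to a single concatenation; a minimal sketch (function name and ordering are illustrative):

```python
# Sketch: fusing embedding and linguistic features into one 1540-d vector.
import numpy as np

def fuse(p_emb, r_emb, ling):
    # [prompt | response | |p - r| | p * r | 4 linguistic features]
    return np.concatenate([p_emb, r_emb,
                           np.abs(p_emb - r_emb), p_emb * r_emb, ling])

p = np.random.rand(384).astype(np.float32)
r = np.random.rand(384).astype(np.float32)
ling = np.zeros(4, dtype=np.float32)
vec = fuse(p, r, ling)   # 384 + 384 + 384 + 384 + 4 = 1540 dimensions
```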


4. Model Architecture

Custom PyTorch model (EITScorer):

  • Fully connected layers: 1540 → 512 → 256
  • Batch Normalization
  • Attention mechanism
  • Dropout (0.3)
  • Output: 5 classes (scores 0–4)
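A minimal PyTorch sketch matching the layout above; the exact attention form, activations, and layer ordering are assumptions, not the repository's code.

```python
# Sketch: attention-based scorer over the 1540-d fused feature vector.
import torch
import torch.nn as nn

class EITScorer(nn.Module):
    def __init__(self, in_dim=1540, n_classes=5):
        super().__init__()
        self.fc1 = nn.Sequential(nn.Linear(in_dim, 512), nn.BatchNorm1d(512),
                                 nn.ReLU(), nn.Dropout(0.3))
        self.fc2 = nn.Sequential(nn.Linear(512, 256), nn.BatchNorm1d(256),
                                 nn.ReLU(), nn.Dropout(0.3))
        # Feature-wise attention: a learned gate over the 256-d representation
        self.attn = nn.Sequential(nn.Linear(256, 256), nn.Sigmoid())
        self.out = nn.Linear(256, n_classes)

    def forward(self, x):
        h = self.fc2(self.fc1(x))
        return self.out(h * self.attn(h))   # logits over scores 0-4
```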

5. Handling Class Imbalance

  • Stratified train-test split
  • Oversampling minority classes
  • Class-weighted loss function
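The weighted-loss part can be sketched as inverse-frequency class weights fed to cross-entropy; the labels and normalization below are illustrative assumptions.

```python
# Sketch: inverse-frequency class weights for a weighted cross-entropy loss.
import numpy as np
import torch
import torch.nn as nn

labels = np.array([0, 4, 4, 4, 3, 4, 2, 4])          # illustrative score labels
counts = np.bincount(labels, minlength=5).astype(float)
weights = counts.sum() / np.maximum(counts, 1)       # rarer class -> larger weight
weights = weights / weights.mean()                   # normalize around 1

criterion = nn.CrossEntropyLoss(weight=torch.tensor(weights, dtype=torch.float32))
```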

📊 Evaluation Strategy

  • Consistency checks across similar errors
  • Manual validation for semantic correctness
  • Edge case testing (paraphrases, incomplete responses)
  • Feature ablation insights

📂 Project Structure

```
AutoEIT/
├── python.ipynb
├── Soumya_TestResults.csv
├── Soumya_TestResults.pdf
├── auto_eit_clean_dataset.csv
├── README.md
```


⚙️ Tech Stack

  • Python
  • PyTorch
  • Sentence-Transformers (SBERT)
  • Scikit-learn
  • RapidFuzz
  • Pandas / NumPy

▶️ How to Run

```bash
pip install -r requirements.txt
jupyter notebook
```


🚧 Challenges

  • Capturing meaning beyond surface-level matching
  • Handling class imbalance in scoring labels
  • Designing robust multi-feature representations
  • Preventing overfitting

🌱 Future Work

  • Fine-tune SBERT on domain-specific data
  • Add explainable scoring (feature importance)
  • Integrate LLM-based evaluation
  • Deploy as an API

🤝 Acknowledgment

Developed as part of the AutoEIT Evaluation Test for GSoC application (CERN Human-AI Team).


⭐ Contribution

Feel free to fork, improve, or suggest enhancements!

About

Meaning-based evaluation system for sentence transcriptions using SBERT embeddings, feature engineering, and a neural network with attention. Combines semantic, lexical, and structural signals to generate robust rubric-aligned scores.
