
📚 Educational Intervention Skill Tagging

Overview

This application analyzes educational text passages to identify optimal intervention points by mapping content to specific academic skills and providing targeted discussion questions for follow-up learning. It uses large language models (LLMs) to intelligently detect, rate, and explain skill alignment—helping educators personalize instruction and improve learning outcomes.


Sample Output

User Interface Mock

See full output here: output/combined_data_final.xlsx


🧠 System Design & Methodology

🧩 1. Skill-Based Analysis Framework

  • Uses a comprehensive taxonomy of 69 educational competencies from input/skills.csv (a loading sketch follows this list)
  • Skills span a wide range of domains:
    • Science (e.g., life sciences, physics, earth science)
    • Social Studies (e.g., history, geography, civics)
    • Language Arts
    • Mathematics
    • Arts & Physical Education
    • Digital Literacy
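
A minimal sketch of loading the taxonomy with pandas (the column layout of input/skills.csv is an assumption here):

    import pandas as pd

    # Load the 69-skill taxonomy; adjust column handling to the actual CSV header.
    skills_df = pd.read_csv("input/skills.csv")
    print(f"Loaded {len(skills_df)} skills")  # expected: 69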

🤖 2. AI-Powered Skill Assessment

  • Powered by the Groq LLM API (llama3-70b-8192) through llm_service.py
  • Uses structured prompt templates to ensure consistency
  • Low temperature (0.01) for deterministic, repeatable outputs (a call sketch follows the response fields below)

Each model response includes:

  • Identified skill tag(s)
  • Alignment rating (scale of 0–10)
  • Pedagogical explanation
  • Highlighted text excerpt supporting the alignment
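
A minimal sketch of the underlying model call, assuming the official groq Python client; the system prompt text here is illustrative, not the actual template from llm_service.py:

    import os
    from groq import Groq

    client = Groq(api_key=os.environ["GROQ_API_KEY"])
    story_text = "Some days, Dad and I go in the car. Some days, we go on the train."

    response = client.chat.completions.create(
        model="llama3-70b-8192",
        messages=[
            {"role": "system", "content": "Map this story to skills; reply in JSON."},
            {"role": "user", "content": story_text},
        ],
        temperature=0.01,  # near-deterministic, repeatable outputs
        response_format={"type": "json_object"},  # request strict JSON
    )
    print(response.choices[0].message.content)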

🎯 3. Intervention Point Detection

The system pinpoints passages that:

  • Strongly align with specific skills (ratings: 9–10)
  • Show partial alignment or emerging understanding (ratings: 5–6)
  • Offer opportunities for teacher-led discussion or review
  • Map multiple skills to the same passage when relevant
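
Once the per-passage alignments sit in a DataFrame, selecting these rating bands might look like the following sketch (the column names are assumptions):

    import pandas as pd

    # Illustrative rows; in the pipeline these come from the LLM output.
    alignments = pd.DataFrame(
        {"skill": ["Knows about transportation", "Counts to ten"], "rating": [10, 5]}
    )

    strong = alignments[alignments["rating"] >= 9]            # strong alignment (9-10)
    partial = alignments[alignments["rating"].between(5, 6)]  # emerging understanding (5-6)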

💬 4. Follow-Up Discussion

The system generates targeted discussion points that:

  • Reinforce key concepts through guided questioning
  • Connect skills across different subject areas
  • Promote critical thinking with open-ended prompts
  • Support differentiated instruction with varying difficulty levels
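
A hedged sketch of what the question-generation prompt could look like; the production template lives in llm_service.py and is not reproduced in this README, so the wording below is an assumption:

    # Illustrative scaffold only; the real prompt also uses few-shot examples
    # from examples/few_shot_examples_discussion_questions.json.
    DISCUSSION_PROMPT = (
        "Given the story and the aligned skill below, write three discussion\n"
        "questions: one Recall, one Comprehension, and one Application question.\n"
        'Respond in JSON with a top-level "questions" list.\n\n'
        "Story: {story}\n"
        "Skill: {skill}\n"
    )

    prompt = DISCUSSION_PROMPT.format(
        story="Some days, Dad and I go in the car. Some days, we go on the train.",
        skill="Knows about transportation",
    )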

🛠️ Technical Implementation

  • Python-based processing pipeline
  • Structured prompt engineering with JSON output
  • Multiple prompt techniques (few-shot learning, tool use, prompt chaining)
  • LLM output stored and analyzed using pandas DataFrames
  • Embedding-based dataset joins to reduce hallucinations (see the sketch after this list)
  • Final output: Excel reports for easy review and collaboration
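
The README does not name the embedding model behind these joins; as a sketch, assuming sentence-transformers, snapping a free-text skill string from the LLM back to the canonical taxonomy could look like:

    from sentence_transformers import SentenceTransformer, util

    model = SentenceTransformer("all-MiniLM-L6-v2")  # assumed embedding model

    canonical_skills = ["Knows about transportation", "Understands weather patterns"]
    llm_skill = "knowledge of different transport modes"  # free-text model output

    # Embed both sides and keep the nearest canonical entry, so reworded or
    # hallucinated skill names map back to rows that exist in skills.csv.
    canon_emb = model.encode(canonical_skills, convert_to_tensor=True)
    query_emb = model.encode(llm_skill, convert_to_tensor=True)
    best = canonical_skills[int(util.cos_sim(query_emb, canon_emb)[0].argmax())]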

▶️ How to Run

  1. Sign up at Groq and obtain a free API key.

  2. Set up your environment variable:

    export GROQ_API_KEY='your-api-key-here'
  3. Set up the virtual environment:

    # Create a new virtual environment
    python -m venv skill-venv
    
    # Activate the virtual environment
    source skill-venv/bin/activate  # On macOS/Linux
    # or
    .\skill-venv\Scripts\activate  # On Windows
    
    # Install dependencies
    pip install -r requirements.txt
  4. Run the skill alignment script:

    python run_01_align_skills_to_stories.py

    This will process the stories from input/stories.csv and generate skill alignments.

  5. Combine the data:

    python run_02_combine_data.py

    This will generate the final combined output in output/combined_data_final.xlsx.

  6. (Optional) Generate discussion questions:

    python run_03_generate_discussion_questions.py

    This will create additional discussion questions in output/discussion_questions.xlsx.

The final outputs (combined_data_final.xlsx and, if generated, discussion_questions.xlsx) will be available in the output/ directory.

🤖 LLM Service Implementation

The llm_service.py file provides a robust implementation for processing educational content using the Groq LLM API. Here's a detailed breakdown of its functionality:

  1. Key Prompt Components:

    • Skills-Augmented Analysis: Analyzes text passages to identify and rate educational skills
    • Discussion Question Generation: Creates targeted questions based on identified skills
    • Few-Shot Learning: Uses example-based prompting from examples/few_shot_examples_discussion_questions.json (see the sketch after this section)
    • Custom Tooling: Supports OpenAI-style function calling for structured outputs
  2. Output Structure & Sample Output:

    • Skills Analysis Output Structure:
      {
        "skills": [
          {
            "skill": "skill description",
            "explanation": "why it is aligned",
            "story_excerpt": "where in the story to stop to review this skill",
            "rating": 0-10
          }
        ]
      }
      Sample Output:
      {
        "skills": [
          {
            "skill": "Knows about transportation",
            "explanation": "The story mentions going in a car and on a train, showing an understanding of different modes of transportation.",
            "story_excerpt": "Some days, Dad and I go in the car. Dad drives. I ride. Some days, Dad and I go on the train.",
            "rating": 10
          }
        ]
      }
    • Discussion Questions Output Structure:
      {
        "questions": [
          {
            "question": "question text",
            "type": "Recall/Comprehension/Application",
            "instructional_purpose": "purpose of the question"
          }
        ]
      }
      Sample Output:
      {
        "questions": [
          {
            "question": "What are two ways the family travels?",
            "type": "Recall",
            "instructional_purpose": "Assess whether the student can recall the modes of transportation mentioned in the story."
          },
          {
            "question": "Why did the family choose to take the train for their vacation?",
            "type": "Comprehension",
            "instructional_purpose": "Assess whether the student understands the reason behind the family's transportation choice."
          },
          {
            "question": "What other ways can people travel besides cars and trains?",
            "type": "Application",
            "instructional_purpose": "Requires the student to think about other modes of transportation beyond what was mentioned in the story."
          }
        ]
      }
  3. Quality Control:

    • Prompt Templates: Implements structured prompt templates for each task
    • Validation: Uses JSON schema validation (a sketch follows the sample-output note below)
    • Error Handling: Includes comprehensive error handling and retry mechanisms
    • Debugging: Supports debugging through message printing
  4. Sample Usage:

    python llm_service.py
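
As referenced in the Key Prompt Components above, here is a minimal sketch of how the few-shot examples might be wired into the message list; the JSON layout of the examples file is an assumption:

    import json

    # Assumed layout: a list of {"story": ..., "questions": ...} entries.
    with open("examples/few_shot_examples_discussion_questions.json") as f:
        examples = json.load(f)

    new_story = "Some days, Dad and I go in the car. Some days, we go on the train."
    messages = [{"role": "system", "content": "You write discussion questions as JSON."}]
    for ex in examples:
        messages.append({"role": "user", "content": ex["story"]})
        messages.append({"role": "assistant", "content": json.dumps(ex["questions"])})
    messages.append({"role": "user", "content": new_story})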

To see a sample output for one story, check out the output/sample_prompt_chain.txt file, which demonstrates the full processing pipeline from story analysis to question generation.
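
The validation and retry details mentioned under Quality Control are not spelled out here; a minimal sketch using the jsonschema package, with the schema reconstructed from the output structure above:

    import json
    import jsonschema

    SKILLS_SCHEMA = {
        "type": "object",
        "required": ["skills"],
        "properties": {
            "skills": {
                "type": "array",
                "items": {
                    "type": "object",
                    "required": ["skill", "explanation", "story_excerpt", "rating"],
                    "properties": {"rating": {"type": "number", "minimum": 0, "maximum": 10}},
                },
            }
        },
    }

    def parse_with_retry(call_llm, max_retries=3):
        """Re-invoke the model until it returns schema-valid JSON."""
        for attempt in range(max_retries):
            try:
                payload = json.loads(call_llm())
                jsonschema.validate(payload, SKILLS_SCHEMA)
                return payload
            except (json.JSONDecodeError, jsonschema.ValidationError) as err:
                print(f"Attempt {attempt + 1} failed: {err}")
        raise RuntimeError("Model did not produce schema-valid JSON")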


🧪 Planned Human Review & Quality Control

🔍 Evaluation with Human Ratings

  • Randomly sample and review LLM-generated outputs.
  • Human raters evaluate skill alignment, clarity, and pedagogical value.

🧠 Prompt Optimization & Edge Case Analysis

  • Compare human and model ratings to better engineer prompts.
  • Identify skill categories or content formats where the model underperforms.
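
One concrete way to compare the two rating sets (illustrative, not prescribed by this plan) is rank correlation:

    from scipy.stats import spearmanr

    # Paired ratings for the same passages; values are illustrative.
    human_ratings = [9, 5, 10, 6, 2]
    model_ratings = [10, 6, 9, 5, 3]

    rho, p_value = spearmanr(human_ratings, model_ratings)
    print(f"Spearman rho = {rho:.2f} (p = {p_value:.3f})")  # higher rho = closer agreement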

🤖 LLM-as-a-Judge for Scalable Review

  • Use a separate model with a prompt that mimics human evaluation behavior to assess content.
  • Helps reduce reliance on manual reviews for future outputs.
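
A hedged sketch of what such a judge prompt might look like; since this step is still planned, the rubric wording below is an assumption:

    # Illustrative judge prompt; the actual rubric has not been written yet.
    JUDGE_PROMPT = (
        "You are reviewing an AI-generated skill alignment for a children's story.\n"
        "Rate it 1-5 on each of: skill alignment, clarity, pedagogical value.\n"
        "Respond in JSON with keys: alignment, clarity, pedagogical_value, rationale.\n\n"
        "Story: {story}\n"
        "Claimed skill: {skill}\n"
        "Explanation: {explanation}\n"
    )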

🧪 Lightweight A/B Testing

  • Run controlled comparisons of LLM-generated interventions.
  • Use engagement or comprehension metrics to assess effectiveness.

📈 Planned Improvements

  • Improve LLM output validation and error handling
  • Implement a scalable LLM-as-a-Judge system for reviews
  • Add another prompt for skills assessment
  • Add dynamic text highlighting based on skill strength
  • Integrate student engagement metrics for optimization
  • Visualize and track skill dependencies across stories
