feat/mcq-generation #11
Description
Implement the MCQ question generation service. For each slide in a module, this service constructs a generation prompt combining the slide's enriched markdown with the module's style guide fingerprint, sends it to Claude Sonnet 4.6, and parses the structured JSON response into Question and QuestionOption records in the database. Both standard mode (single correct answer, 4 options) and hard mode (multiple correct answers, 4–6 options) questions are generated. Generation is submitted via the Anthropic Batch API for cost efficiency.
Context
Refer to the project context document for the full tech stack, architecture, and design principles. This issue depends on feat/enriched-markdown-store (slides must have enriched_markdown) and feat/past-exam-ingestion (module must have a style_guide) being completed. The batch_jobs table introduced in feat/image-description is reused here with job_type == mcq_generation.
Key design decisions:
- Generation is per slide — one batch request item per slide. This is the natural chunk boundary and ensures questions are correctly attributed to their source slide and lecture.
- The style guide JSON from `feat/past-exam-ingestion` is injected as a prefix in every generation prompt — it is not re-extracted per call.
- The LLM must return only a JSON array of question objects — no preamble, no markdown fences. The prompt enforces this strictly.
- Each slide generates 3–5 questions by default (configurable): a mix of standard and hard mode questions. The exact split is determined by the LLM based on slide content richness.
- Questions are only generated once per slide. Regeneration is supported but requires explicit user action (covered in `feat/regeneration`, Milestone 6).
- Generated questions must be validated before storage — malformed responses are retried once, then marked as failed.
Todos
Backend
- Add to `core/config.py`:
  - `MCQ_GENERATION_MODEL` (default: `"claude-sonnet-4-6"`)
  - `MCQ_GENERATION_MAX_TOKENS` (default: `2000`)
  - `MCQ_QUESTIONS_PER_SLIDE` (default: `4`)
- Create `/backend/app/services/mcq_generation_service.py` with:
  - `build_generation_prompt(slide: Slide, style_guide: StyleGuide, mode: str) -> str`
    - Constructs the full prompt with three sections:
      - Style guide prefix — pastes the style guide's `style_summary` JSON and instructs the model to match this style
      - Slide content — pastes the slide's `enriched_markdown`
      - Generation instructions — instructs the model to generate exactly `MCQ_QUESTIONS_PER_SLIDE` questions
    - The generation instructions must specify:
      - Return only a JSON array — no text before or after, no markdown code fences
      - Each question object must follow this exact schema:

        ```json
        {
          "question_text": "string",
          "question_type": "single" | "multi",
          "options": ["string", "string", "string", "string"],
          "correct_indices": [0],
          "difficulty": "easy" | "medium" | "hard",
          "topic_tags": ["string"]
        }
        ```

      - For standard mode (`mode == "standard"`):
        - `question_type` must be `"single"`
        - `options` must have exactly 4 elements
        - `correct_indices` must have exactly 1 element
      - For hard mode (`mode == "hard"`):
        - Mix `"single"` and `"multi"` types — at least 40% should be `"multi"`
        - `options` array length must vary between 4 and 6 per question — do not make all questions the same length
        - `"multi"` questions must have between 2 and `n_options - 1` correct answers
        - Never reveal the number of correct answers in the `question_text`
        - Distractors must be plausible and closely related to correct answers — not obviously wrong
      - For both modes:
        - Questions must be based only on the provided slide content
        - Difficulty should reflect the `difficulty_distribution` from the style guide
        - `topic_tags` should be 1–3 short lowercase strings (e.g. `["sorting", "time-complexity"]`)
        - Do not repeat the same question concept across questions for the same slide
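The three prompt sections could be assembled roughly like this. This is a sketch, not the final implementation: it takes the slide's markdown and the style summary directly as arguments (rather than ORM objects), and the instruction wording is illustrative.

```python
import json


def build_generation_prompt(enriched_markdown: str, style_summary: dict,
                            mode: str, questions_per_slide: int = 4) -> str:
    """Concatenate style-guide prefix, slide content, and generation instructions."""
    if mode == "standard":
        mode_rules = (
            'Every question must have question_type "single", exactly 4 options, '
            'and exactly 1 correct index.'
        )
    else:
        mode_rules = (
            'Mix "single" and "multi" question types; at least 40% should be "multi". '
            'Vary the options length between 4 and 6 per question. "multi" questions '
            'must have between 2 and n_options - 1 correct indices. Never reveal the '
            'number of correct answers in the question text.'
        )
    return (
        "## Style guide\n"
        "Match the style described by this JSON:\n"
        f"{json.dumps(style_summary, indent=2)}\n\n"
        "## Slide content\n"
        f"{enriched_markdown}\n\n"
        "## Instructions\n"
        f"Generate exactly {questions_per_slide} multiple-choice questions "
        "based only on the slide content above. "
        f"{mode_rules} "
        "Return ONLY a JSON array of question objects - no preamble, "
        "no markdown code fences."
    )
```

The three-section order matters: putting the style prefix first lets every per-slide request share an identical prompt head, which also helps prompt caching.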
  - `validate_question_json(raw_json: str) -> list[dict]`
    - Parses the JSON string — raises `ValidationError` if it is not valid JSON
    - Validates that each question object has all required fields with correct types
    - Validates that all `correct_indices` are valid indices into `options`
    - Validates that `question_type == "single"` has exactly 1 correct index
    - Validates that `question_type == "multi"` has 2 or more correct indices
    - Validates that `n_options` (the length of `options`) is 4, 5, or 6
    - Returns the validated list of question dicts
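A minimal sketch of those validation rules, using a plain `ValueError` in place of the service's `ValidationError` (assumed to be defined elsewhere in the backend):

```python
import json

# Required fields and their expected Python types.
REQUIRED_FIELDS = {
    "question_text": str, "question_type": str, "options": list,
    "correct_indices": list, "difficulty": str, "topic_tags": list,
}


def validate_question_json(raw_json: str) -> list[dict]:
    """Parse and validate the model's JSON array of question objects."""
    try:
        questions = json.loads(raw_json)
    except json.JSONDecodeError as exc:
        raise ValueError(f"response is not valid JSON: {exc}") from exc
    if not isinstance(questions, list):
        raise ValueError("response must be a JSON array")
    for i, q in enumerate(questions):
        for field, ftype in REQUIRED_FIELDS.items():
            if not isinstance(q.get(field), ftype):
                raise ValueError(f"question {i}: missing or mistyped field {field!r}")
        n_options = len(q["options"])
        if n_options not in (4, 5, 6):
            raise ValueError(f"question {i}: n_options must be 4, 5, or 6")
        if not all(isinstance(c, int) and 0 <= c < n_options
                   for c in q["correct_indices"]):
            raise ValueError(f"question {i}: correct_indices out of range")
        if q["question_type"] == "single" and len(q["correct_indices"]) != 1:
            raise ValueError(f"question {i}: 'single' needs exactly 1 correct index")
        if q["question_type"] == "multi" and len(q["correct_indices"]) < 2:
            raise ValueError(f"question {i}: 'multi' needs 2+ correct indices")
    return questions
```

Raising on the first bad question (rather than dropping it) keeps the retry semantics simple: any defect triggers the single corrective retry described below.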
  - `store_questions(slide_id: UUID, questions: list[dict], db: AsyncSession) -> list[Question]`
    - For each validated question dict:
      - Creates a `Question` record with all fields populated
      - Sets `n_options = len(options)`
      - Creates one `QuestionOption` record per option with the correct `index` value
    - Returns the list of created `Question` records
  - `build_batch_requests(module_id: UUID, mode: str, db: AsyncSession) -> list[dict]`
    - Fetches all slides for the module where `enriched_markdown` is not null
    - Fetches the module's `StyleGuide`
    - For each slide, calls `build_generation_prompt()` and constructs an Anthropic batch request dict with `custom_id = str(slide.id)`
    - Returns the full list of batch request dicts, ready for submission
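Each request item would take roughly this shape, following the Anthropic Message Batches request format (a sketch; the defaults shown are the config values named above):

```python
def build_batch_request(slide_id: str, prompt: str,
                        model: str = "claude-sonnet-4-6",
                        max_tokens: int = 2000) -> dict:
    """One Message Batches request item; custom_id ties the result back to its slide."""
    return {
        "custom_id": slide_id,
        "params": {
            "model": model,
            "max_tokens": max_tokens,
            "messages": [{"role": "user", "content": prompt}],
        },
    }
```

Because `custom_id` is the slide UUID, result processing needs no extra bookkeeping table to attribute questions to slides.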
  - `submit_generation_batch(module_id: UUID, mode: str, db: AsyncSession) -> str`
    - Calls `build_batch_requests()` to get all request dicts
    - Submits via `client.messages.batches.create(requests=[...])`
    - Creates a `BatchJob` record with `job_type == mcq_generation` and stores `mode` in a `metadata` JSONB field on the `batch_jobs` table
    - Returns the Anthropic batch ID
  - `process_generation_results(batch_job_id: UUID, db: AsyncSession) -> GenerationResult`
    - Retrieves batch results from Anthropic
    - For each result:
      - Extracts `slide_id` from `custom_id`
      - Calls `validate_question_json()` on the response text
      - If valid: calls `store_questions()` and sets the slide's generation status to `complete`
      - If invalid: attempts one retry by calling the API synchronously with an explicit correction instruction
      - If the retry also fails: logs the raw response and marks the slide's generation status as `failed`
    - Updates the `BatchJob` to `complete`
    - Returns a `GenerationResult` dataclass: `{total_slides, succeeded, failed, total_questions_created}`
- Add `generation_status` and `generation_mode` fields to the `slides` table (new Alembic migration):
  - `generation_status` (enum: `not_started` / `processing` / `complete` / `failed`, default: `not_started`)
  - `generation_mode` (enum: `standard` / `hard`, nullable)
- Add `metadata` (JSONB, nullable) to the `batch_jobs` table (new Alembic migration) to store arbitrary job context (e.g. the generation mode)
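The two migrations could look roughly like this (a sketch, not runnable standalone — revision identifiers, the `downgrade()`, and enum naming follow whatever conventions the project's existing Alembic migrations use):

```python
# Alembic migration sketch; assumes a PostgreSQL backend.
import sqlalchemy as sa
from alembic import op
from sqlalchemy.dialects.postgresql import JSONB


def upgrade() -> None:
    op.add_column("slides", sa.Column(
        "generation_status",
        sa.Enum("not_started", "processing", "complete", "failed",
                name="slide_generation_status"),
        nullable=False, server_default="not_started"))
    op.add_column("slides", sa.Column(
        "generation_mode",
        sa.Enum("standard", "hard", name="slide_generation_mode"),
        nullable=True))
    op.add_column("batch_jobs", sa.Column("metadata", JSONB(), nullable=True))
```

One caveat worth noting: `metadata` is a reserved attribute name on SQLAlchemy declarative models, so the ORM mapping will need a different Python attribute name mapped to the `metadata` column (e.g. `job_metadata`).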
- Create API endpoints in `/backend/app/api/routes/questions.py` (all protected):
  - `POST /api/modules/{module_id}/generate-questions` — accepts a `{"mode": "standard" | "hard"}` body, calls `submit_generation_batch()` as a background task, returns `{"batch_job_id": "...", "total_slides": n}`
  - `GET /api/modules/{module_id}/questions` — returns all questions for the module; supports the query params `?lecture_id=`, `?difficulty=`, `?question_type=`, `?limit=`, and `?offset=`
  - `GET /api/modules/{module_id}/generation-status` — returns a per-slide generation status summary and overall progress
  - `GET /api/questions/{question_id}` — returns a single question with all its options
- Extend the background polling task from `feat/image-description` to also poll `mcq_generation` batch jobs and call `process_generation_results()` when they complete
Frontend
- On the module detail page, add a "Generate Questions" section that:
  - Is enabled only when `module.processing_status == complete` (all slides enriched) and a style guide exists
  - Shows a mode selector: "Standard" / "Hard Mode" toggle
  - Shows a "Generate" button that calls `POST /api/modules/{module_id}/generate-questions`
  - Switches to a progress display after clicking: `X / Y slides processed`, polling `GET /api/modules/{module_id}/generation-status` every 10 seconds
  - Shows a completion summary: `N questions generated across L lectures`
- Create a `QuestionCard` component (`/frontend/components/questions/QuestionCard.tsx`) that:
  - Displays the question text
  - Renders options as radio buttons (standard) or checkboxes (hard mode multi)
  - Shows a difficulty badge (easy / medium / hard) with colour coding (green / yellow / red)
  - Shows topic tags as small pill labels
  - Is used in both the question browser and the study session (Milestone 5)
- Create a question browser page at `/modules/[id]/questions` that:
  - Lists all generated questions for the module
  - Supports filtering by lecture, difficulty, and question type
  - Renders each question using `QuestionCard` in read-only/preview mode (answers not revealed)
Acceptance Criteria
- `POST /api/modules/{module_id}/generate-questions` with `mode: "standard"` creates questions where every question has `question_type == "single"`, exactly 4 options, and exactly 1 correct index
- `POST /api/modules/{module_id}/generate-questions` with `mode: "hard"` creates a mix that includes `"multi"` questions with 4–6 options and 2+ correct indices
- Hard mode questions never reveal the number of correct answers in the question text (spot-check 10 questions)
- Every generated question has `correct_indices` that are valid indices into its `options` array
- Every generated question is correctly attributed to its source `slide_id`, `lecture_id`, and `module_id`
- `topic_tags` and `difficulty` are populated for every question
- A slide with a generation failure does not block other slides from completing — failed slides are marked individually
- `GET /api/modules/{module_id}/questions?lecture_id=X` returns only questions from that lecture
- `GET /api/modules/{module_id}/generation-status` correctly reflects per-slide progress
- The question browser page renders questions correctly, with difficulty badges and topic tags
- Calling `POST /api/modules/{module_id}/generate-questions` when no style guide exists returns HTTP 400
- Calling `POST /api/modules/{module_id}/generate-questions` when not all slides are enriched returns HTTP 400 with a count of how many slides are not yet ready