[Feature] Add structured Socket.IO training event contract (run_id, epoch, metrics, phase) instead of message-index payloads #274

@Shivampal157

Problem

Current training progress events are emitted as loosely structured payloads (message + numeric test index), and the frontend stores them by array index.

This makes the contract fragile and hard to extend:

  • UI behavior depends on parsing free-form strings like "Starting" / "Finish"
  • No stable run_id in the event payload to scope updates to a single run
  • No explicit epoch/batch/metric fields for charting
  • Hard to support multi-run comparison and robust progress visualizations

Why this matters

A stable telemetry contract is required for:

  • real-time training charts
  • run comparison dashboard
  • reliable retry/reconnect handling
  • future tuning/interpretability workflows

String-based progress messages are hard to validate, and a small change to their wording can silently break the UI.

Current behavior (observed)

  • Backend emits generic message events from CustomProgressBar
  • Frontend in Training.jsx maps updates using resultValues[parseInt(resp.test)]
  • Progress state is not strongly typed and not run-scoped
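
To make the fragility concrete, here is a minimal Python sketch of the two storage strategies. It is an illustration only: the real logic lives in Training.jsx, the helper names `apply_by_index` and `apply_by_run` are hypothetical, and the assumption that two runs can report the same test index has not been verified against the backend.

```python
# Hypothetical illustration; the actual frontend code is in Training.jsx.

def apply_by_index(result_values, resp):
    """Mimics resultValues[parseInt(resp.test)]: the list position is the only key."""
    idx = int(resp["test"])
    result_values[idx] = resp["message"]
    return result_values


def apply_by_run(runs, event):
    """Run-scoped alternative: updates are keyed by an explicit run_id."""
    runs.setdefault(event["run_id"], []).append(event["message"])
    return runs


# Assumed scenario: two runs both report test index 0.
# In the index-keyed store the second update overwrites the first.
by_index = ["", ""]
apply_by_index(by_index, {"test": "0", "message": "Starting (run A)"})
apply_by_index(by_index, {"test": "0", "message": "Starting (run B)"})

# With a run_id in the payload the two runs stay separate.
by_run = {}
apply_by_run(by_run, {"run_id": "A", "message": "Starting"})
apply_by_run(by_run, {"run_id": "B", "message": "Starting"})
```

If that scenario can occur, the index-keyed list keeps only the last writer, while the run-keyed mapping preserves both histories.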

Proposed solution

Introduce versioned structured events for training:

  • training_started
  • training_update
  • training_complete
  • training_error
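
One possible way to keep these four names in a single place on the backend is sketched below. The `TrainingEvent` enum and `emit_training_event` helper are hypothetical names, not existing project code. The `emit` callable is injected so the sketch does not assume which Socket.IO server library the backend uses; any function that accepts an event name and a dict payload would fit.

```python
from enum import Enum
from typing import Any, Callable, Dict


class TrainingEvent(str, Enum):
    """The four event names proposed in this issue."""
    STARTED = "training_started"
    UPDATE = "training_update"
    COMPLETE = "training_complete"
    ERROR = "training_error"


# Assumed shape of the transport: emit(event_name, payload_dict).
Emit = Callable[[str, Dict[str, Any]], None]


def emit_training_event(emit: Emit, event: TrainingEvent, run_id: str,
                        **fields: Any) -> Dict[str, Any]:
    """Build a run-scoped payload and send it under a fixed event name."""
    payload = {"run_id": run_id, **fields}
    emit(event.value, payload)
    return payload


# Usage with a stand-in transport that just records what was sent.
sent = []
emit_training_event(lambda name, data: sent.append((name, data)),
                    TrainingEvent.UPDATE, "run-1", epoch=3)
```

Because every call goes through one helper, a payload can never be emitted without a run_id, and the frontend can subscribe to fixed event names instead of parsing message strings.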

Each event should include a typed payload, e.g.:

{
  "run_id": "<uuid-or-id>",
  "phase": "train|eval",
  "epoch": 3,
  "batch": 12,
  "steps": 100,
  "metrics": {
    "loss": 0.42,
    "accuracy": 0.88,
    "val_loss": 0.50,
    "val_accuracy": 0.84
  },
  "message": "optional human-readable message",
  "timestamp": "ISO-8601"
}
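
A minimal sketch of how the backend could type and validate this payload before emitting it, using only the Python standard library. The `TrainingUpdate` class, its validation rules (non-negative integers, numeric metrics, phase limited to "train" or "eval"), and the choice of a UTC default timestamp are assumptions for illustration; the issue does not prescribe them.

```python
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone
from typing import Any, Dict, Optional

VALID_PHASES = {"train", "eval"}


def _utc_now_iso() -> str:
    """Current time as an ISO-8601 string in UTC."""
    return datetime.now(timezone.utc).isoformat()


@dataclass
class TrainingUpdate:
    run_id: str
    phase: str
    epoch: int
    batch: int
    steps: int
    metrics: Dict[str, float] = field(default_factory=dict)
    message: Optional[str] = None
    timestamp: str = field(default_factory=_utc_now_iso)

    def __post_init__(self) -> None:
        # Reject malformed payloads at construction time, before they reach the UI.
        if not self.run_id:
            raise ValueError("run_id must be a non-empty string")
        if self.phase not in VALID_PHASES:
            raise ValueError(f"phase must be one of {sorted(VALID_PHASES)}")
        for name in ("epoch", "batch", "steps"):
            value = getattr(self, name)
            if isinstance(value, bool) or not isinstance(value, int) or value < 0:
                raise ValueError(f"{name} must be a non-negative integer")
        for key, value in self.metrics.items():
            if isinstance(value, bool) or not isinstance(value, (int, float)):
                raise ValueError(f"metric {key!r} must be numeric")

    def to_payload(self) -> Dict[str, Any]:
        """Plain dict, ready to be serialized as the event body."""
        return asdict(self)


update = TrainingUpdate(run_id="run-1", phase="train", epoch=3, batch=12,
                        steps=100, metrics={"loss": 0.42, "accuracy": 0.88})
payload = update.to_payload()
```

Validating on the emitting side means a bad value fails loudly in the training process rather than surfacing as a blank or misleading chart in the frontend.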

Labels: enhancement
