Logging Per-Sample Evaluation Loss During Inference in Swift PT for Qwen3-VL #9518

@somesh2002

Checklist

  • I have searched existing issues, and this is a new question or discussion topic.

Question Description

I am currently using Swift PT (Swift's pre-training mode) to pre-train a Qwen3-VL-2B model.

During inference/evaluation, I would like to leverage Swift's internal loss computation to obtain the loss value for each individual sample in the evaluation dataset. My goal is to analyze model behavior at the sample level and identify examples with particularly high or low loss.

At the moment, Swift reports only the average loss across the entire evaluation dataset. However, I need to generate a log containing the loss value for every sample.

Is there an existing configuration, callback, or recommended approach within Swift to record and export per-sample evaluation losses? If not, what would be the best way to modify the evaluation pipeline to achieve this?
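To make the request concrete, here is a minimal, framework-agnostic sketch of the quantity I am after: one mean token-level loss per sample rather than a single average over the whole dataset. It does not use any Swift API. The function names (`token_nll`, `per_sample_losses`), the `-100` ignore label, and the nested-list input layout are assumptions made for illustration; with real tensors, the equivalent would presumably be `torch.nn.functional.cross_entropy` with `reduction="none"`, followed by a masked mean over each sequence.

```python
import math

IGNORE_INDEX = -100  # assumption: label value marking tokens excluded from the loss


def token_nll(logits, label):
    """Negative log-likelihood of `label` under softmax(logits)."""
    m = max(logits)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_z - logits[label]


def per_sample_losses(batch_logits, batch_labels):
    """Return one mean loss per sample instead of one batch-wide average.

    batch_logits: list of samples, each a list of per-position logit lists
    batch_labels: list of samples, each a list of target token ids
    Causal shift: logits at position t are scored against the label at t + 1.
    """
    losses = []
    for logits, labels in zip(batch_logits, batch_labels):
        nlls = [
            token_nll(step_logits, target)
            for step_logits, target in zip(logits[:-1], labels[1:])
            if target != IGNORE_INDEX
        ]
        # A sample with no scored tokens has no defined loss.
        losses.append(sum(nlls) / len(nlls) if nlls else float("nan"))
    return losses


if __name__ == "__main__":
    uniform = [0.0, 0.0, 0.0, 0.0]
    confident = [10.0, 0.0, 0.0, 0.0]
    logits = [[uniform, uniform, uniform], [uniform, confident, uniform]]
    labels = [[IGNORE_INDEX, 1, 2], [0, IGNORE_INDEX, 3]]
    losses = per_sample_losses(logits, labels)
    # Rank samples from highest to lowest loss to surface outliers.
    for idx, loss in sorted(enumerate(losses), key=lambda p: p[1], reverse=True):
        print(f"sample {idx}: loss={loss:.4f}")
```

Ideally the evaluation loop would emit something like this list alongside a sample identifier, so the results can be written to a log and sorted afterwards.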

Labels: question
