Make `only_unmask_final` flag configurable for SFT.

**Is your feature request related to a problem? Please describe.**
Hey Folks,
I'm working on a multi-turn SFT task and I'd like to train only the final assistant message given the previous messages in context. To achieve this, I believe I have to set the `only_unmask_final` bool flag([Ref.](https://github.com/NVIDIA-NeMo/RL/blob/10374866395aa0a305afcc0b749a4d126b956733/nemo_rl/data/llm_message_utils.py#L147)) to True. However, current implementation of SFT does not support this, [Ref.](https://github.com/NVIDIA-NeMo/RL/blob/10374866395aa0a305afcc0b749a4d126b956733/nemo_rl/algorithms/sft.py#L430).

**Describe the solution you'd like**
The proposal is to make this flag configurable in [SFTConfig](https://github.com/NVIDIA-NeMo/RL/blob/10374866395aa0a305afcc0b749a4d126b956733/nemo_rl/algorithms/sft.py#L62) by introducing `only_unmask_final` boolean and consume that during SFT validation and training.

**Describe alternatives you've considered**
Other alternatives considered does not help wider audience.

**Additional context**

I have a rough implementation and I can submit a PR for review. Please let me know if I'm missing something here or need further clarification.

Thanks


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Make `only_unmask_final` flag configurable for SFT. #2219

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Make only_unmask_final flag configurable for SFT. #2219

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions

Make `only_unmask_final` flag configurable for SFT. #2219