Fix formatting function to ensure correct fine-tuning in gemma-peft.md by dboyker · Pull Request #3260 · huggingface/blog

dboyker · 2026-01-26T20:42:39Z

Thanks for the nice tutorial on Gemma + PEFT!

After following it, the script throws this warning:

You passed a dataset that is already processed (contains an input_ids field) together with a formatting function. Therefore formatting_func will be ignored. Either remove the formatting_func or pass a dataset that is not already processed.

Because formatting_func is ignored, the model is not fine-tuned correctly. The script runs, but there is no guarantee that the model outputs the format Quote: [...] Author: [...].

Line 115 is the cause:
data = data.map(lambda samples: tokenizer(samples["quote"]), batched=True). Executing it adds the input_ids column to the dataset, which, together with the formatting_func argument, triggers the warning, as seen here: https://github.com/huggingface/trl/blob/main/trl/trainer/sft_trainer.py#L938-L944
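The interaction can be illustrated with a minimal pure-Python sketch. This is not TRL's actual code: `is_already_processed` and `resolve_formatting_func` are hypothetical names, and the column lists are assumptions. The only behavior taken from the warning quoted earlier is that a dataset containing an input_ids field causes formatting_func to be ignored.

```python
import warnings


def is_already_processed(column_names):
    """Hypothetical helper: treat a dataset as processed if it has input_ids."""
    return "input_ids" in column_names


def resolve_formatting_func(column_names, formatting_func):
    """Hypothetical helper: return the formatting function that would be used.

    Mirrors the behavior described by the warning: when the dataset is
    already processed, the formatting function is dropped.
    """
    if formatting_func is not None and is_already_processed(column_names):
        warnings.warn(
            "Dataset already contains input_ids; formatting_func will be ignored."
        )
        return None
    return formatting_func


def formatting_func(example):
    # Illustrative formatter; the exact template is an assumption.
    return f"Quote: {example['quote']}\nAuthor: {example['author']}"


# Assumed column names before and after the tokenizer map on line 115.
raw_columns = ["quote", "author"]
tokenized_columns = ["quote", "author", "input_ids", "attention_mask"]

# Without the map, the formatter is kept.
kept = resolve_formatting_func(raw_columns, formatting_func)

# With the map, the formatter is silently dropped (apart from the warning).
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    dropped = resolve_formatting_func(tokenized_columns, formatting_func)
```

In this sketch `kept` is the original function and `dropped` is `None`, which is why training on the pre-tokenized dataset never sees the Quote/Author template.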

Removing line 115 is safe with respect to tokenization, because the tokenizer is inferred in SFTTrainer's __init__: https://github.com/huggingface/trl/blob/main/trl/trainer/sft_trainer.py#L639-L650

In addition, this PR modifies formatting_func to avoid the following error: AttributeError: 'list' object has no attribute 'endswith'. It now returns a string instead of a list, and it no longer slices the quote and author.
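A minimal sketch of a formatting function with this shape follows. It is not the exact code from the PR: the template (newline separator, no end-of-sequence token) and the sample record are assumptions made for illustration. It also shows why a list return value fails when the caller expects a string.

```python
def formatting_func(example):
    """Format one example as a single training string (not a list)."""
    # Use the full fields, with no indexing or slicing of quote and author.
    return f"Quote: {example['quote']}\nAuthor: {example['author']}"


# Made-up record, for illustration only.
sample = {"quote": "An example quote.", "author": "Jane Doe"}

text = formatting_func(sample)

# A string supports str methods such as endswith.
ends_with_author = text.endswith("Jane Doe")

# A list does not, which reproduces the AttributeError mentioned above.
try:
    [text].endswith("Jane Doe")
    list_error = None
except AttributeError as exc:
    list_error = str(exc)
```

Returning a plain string keeps the function compatible with any caller that applies string methods to its result.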

Fix formatting function in gemma-peft.md

3ad83ec

dboyker mentioned this pull request Jan 26, 2026

Gemma-peft tutorial does not perform expected fine-tuning #3261

Open

dboyker wants to merge 1 commit into huggingface:main from dboyker:patch-1
