
Trainer.push_to_hub() with PEFT doesn't work when the base model is loaded from local disk #33922

Closed
@valayDave

Description

System Info

  • transformers version: 4.44.2
  • Platform: Linux-5.15.0-1066-aws-x86_64-with-glibc2.31
  • Python version: 3.12.0
  • Huggingface_hub version: 0.25.1
  • Safetensors version: 0.4.5
  • Accelerate version: 0.34.2
  • Accelerate config: not found
  • PyTorch version (GPU?): 2.4.1+cu121 (False)
  • Tensorflow version (GPU?): not installed (NA)
  • Flax version (CPU?/GPU?/TPU?): not installed (NA)
  • Jax version: not installed
  • JaxLib version: not installed
  • Using distributed or parallel set-up in script?: False

Who can help?

@muellerzr @SunMarc

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

# Imports implied by the snippet (bnb_config, device_map, training_arguments,
# peft_config, callbacks, etc. are defined elsewhere in my script)
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import SFTTrainer

model_path = "path/to/my/model/on/disk"

# Load the quantized base model from local disk
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    quantization_config=bnb_config,
    device_map=device_map,
    use_auth_token=True,
)

tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)

# some code to create PeftConfig
...

# You can even use a regular Trainer instead
trainer = SFTTrainer(
    model=model,
    args=training_arguments,
    tokenizer=tokenizer,
    train_dataset=train_dataset,
    peft_config=peft_config,
    dataset_text_field="text",
    max_seq_length=args.max_seq_length,
    packing=args.packing,
    callbacks=callbacks,
)

trainer.train()
trainer.push_to_hub()

I have hub_model_id set in the TrainingArguments when I run my scripts. If I set push_to_hub=True in TrainingArguments, it ends up throwing a ValueError like the one below:

 File "/home/ob-workspace/metaflow-checkpoint-examples/nim_lora/finetune_hf_peft.py", line 127, in sft
    trainer.push_to_hub(
  File "/home/ob-workspace/micromamba/envs/metaflow/linux-64/51041216a07a03b/lib/python3.12/site-packages/trl/trainer/sft_trainer.py", line 481, in push_to_hub
    return super().push_to_hub(commit_message=commit_message, blocking=blocking, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ob-workspace/micromamba/envs/metaflow/linux-64/51041216a07a03b/lib/python3.12/site-packages/transformers/trainer.py", line 4353, in push_to_hub
    return upload_folder(
           ^^^^^^^^^^^^^^
  File "/home/ob-workspace/micromamba/envs/metaflow/linux-64/51041216a07a03b/lib/python3.12/site-packages/huggingface_hub/utils/_validators.py", line 114, in _inner_fn
    return fn(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^
  File "/home/ob-workspace/micromamba/envs/metaflow/linux-64/51041216a07a03b/lib/python3.12/site-packages/huggingface_hub/hf_api.py", line 1485, in _inner
    return fn(self, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ob-workspace/micromamba/envs/metaflow/linux-64/51041216a07a03b/lib/python3.12/site-packages/huggingface_hub/hf_api.py", line 4972, in upload_folder
    add_operations = self._prepare_upload_folder_additions(
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ob-workspace/micromamba/envs/metaflow/linux-64/51041216a07a03b/lib/python3.12/site-packages/huggingface_hub/hf_api.py", line 9478, in _prepare_upload_folder_additions
    self._validate_yaml(
  File "/home/ob-workspace/micromamba/envs/metaflow/linux-64/51041216a07a03b/lib/python3.12/site-packages/huggingface_hub/hf_api.py", line 9542, in _validate_yaml
    raise ValueError(f"Invalid metadata in README.md.\n{message}") from e
ValueError: Invalid metadata in README.md.
- "base_model" with value "/tmp/metaflow_models_model_reference_c5stocp_" is not valid. Use a model id from https://hf.co/models.

While reading the code I also noticed that if we pass a finetuned_from argument to push_to_hub, the trainer passes it down to create_model_card, but that function then lets PEFT replace the generated card with a new one. The problem is that PeftModel's create_or_update_model_card does not account for the base_model value already set in the card, so the card ends up with invalid metadata and the Hub refuses the push.

I have a fix for this in the PEFT library: huggingface/peft#2124
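Until that fix lands, the workaround I'd expect to work is to skip Trainer.push_to_hub() entirely, repair the card's base_model metadata by hand, and upload the output folder with HfApi.upload_folder. A minimal sketch, assuming hub_model_id is set in training_arguments as described above; the base model id is a placeholder:

import os

from huggingface_hub import HfApi, ModelCard

output_dir = training_arguments.output_dir

# Save the adapter and tokenizer locally instead of pushing directly.
trainer.save_model(output_dir)

# Generate the model card; as noted above, PEFT's create_or_update_model_card
# ends up writing the local path into the card's base_model field.
trainer.create_model_card()

# Overwrite base_model with a real Hub model id so the YAML metadata validates.
readme_path = os.path.join(output_dir, "README.md")
card = ModelCard.load(readme_path)
card.data.base_model = "org/base-model-id"  # placeholder: the actual base model id
card.save(readme_path)

# Upload the folder ourselves, bypassing the push_to_hub() code path.
HfApi().upload_folder(
    repo_id=training_arguments.hub_model_id,
    folder_path=output_dir,
    commit_message="End of training",
)

This is only a sketch of the manual route; the real fix still belongs in create_or_update_model_card.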

Expected behavior

Overall, when the model is loaded from disk and then pushed to the hub, the behavior I expect is:

If a model is loaded from local disk and then trained with PEFT (or any other HF extension/trainer), push to hub should still work. To accommodate this, the library should:

  1. Distinguish between model names and local paths when setting information such as base_model (see the sketch after this list).
  2. Allow passing information like base_model in the TrainingArguments so that push_to_hub=True can work with the trainer. Currently push_to_hub=True won't work in the PEFT scenario because the README.md created by PeftModel's create_or_update_model_card overrides the value with something that can be either a name or a path, and if it is a path (as in this issue), the push just crashes.
  3. Ensure that finetuned_from can be passed down explicitly to extensions of the library like PEFT, or do something like [bug fix] ensure base_model is correctly set in model card peft#2124 (don't write a base_model value in the card if one is already set).
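For point 1, a minimal sketch of the kind of check I mean (illustrative only, not the actual PEFT/transformers code; the function name is made up):

import os

def resolve_base_model(name_or_path: str) -> str | None:
    # A local directory is not a valid Hub model id, so don't write it into
    # the card's base_model metadata; return None and let the caller skip it.
    if os.path.isdir(name_or_path):
        return None
    # Otherwise assume it is a Hub model id such as "org/model".
    return name_or_path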
