eval_loss not found when training a peft model using trainer.py

### System Info

transformers version: 4.43.3
Python 3.10.12
Ubuntu

The issue is in trainer.py: for PEFT models it does not check the base_model to get the label information, so `eval_loss` is missing from the evaluation metrics.

I have fixed this locally, but I am raising it so that it can also be fixed in the branch.
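
To illustrate the cause described above, here is a minimal sketch in plain Python. It does not use the real `transformers` or `peft` code: `BaseModel`, `Wrapper`, `find_label_args` and `find_label_args_unwrapped` are hypothetical stand-ins. It assumes the wrapper's `forward` only accepts `*args, **kwargs`, so inspecting its signature finds no label argument, while inspecting the wrapped base model does.

```python
import inspect


class BaseModel:
    """Hypothetical stand-in for an image classification model."""

    def forward(self, pixel_values=None, labels=None):
        # A loss is only returned when labels are passed in.
        return {"loss": 0.0} if labels is not None else {}


class Wrapper:
    """Hypothetical stand-in for a PEFT-style wrapper around a base model."""

    def __init__(self, base_model):
        self.base_model = base_model

    def forward(self, *args, **kwargs):
        # The generic signature hides the base model's `labels` argument.
        return self.base_model.forward(*args, **kwargs)


def find_label_args(model):
    """Return the forward() argument names that contain 'label'."""
    params = inspect.signature(model.forward).parameters
    return [name for name in params if "label" in name]


def find_label_args_unwrapped(model):
    """Same lookup, but inspect the wrapped base model when there is one."""
    inner = getattr(model, "base_model", model)
    return find_label_args(inner)


wrapped = Wrapper(BaseModel())
print(find_label_args(wrapped))            # []
print(find_label_args_unwrapped(wrapped))  # ['labels']
```

If the trainer sees no label arguments it may treat evaluation batches as unlabeled and skip the loss, which would be consistent with the missing `eval_loss` in the metrics below.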

### Who can help?

_No response_

### Information

- [ ] The official example scripts
- [x] My own modified scripts

### Tasks

- [ ] An officially supported task in the `examples` folder (such as GLUE/SQuAD, ...)
- [ ] My own task or dataset (give details below)

### Reproduction

Run a vision transformer (image classification) training using LoRA, with `metric_for_best_model` set to `eval_loss`.

### Expected behavior

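The evaluation metrics should include `eval_loss` so that the checkpoint can be saved. Instead, training fails with:

Traceback (most recent call last):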
  File "/usr/local/lib/python3.10/dist-packages/optimum/habana/transformers/trainer.py", line 1341, in _save_checkpoint
    metric_value = metrics[metric_to_check]
KeyError: 'eval_loss'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/vision/run_image_classification.py", line 510, in <module>
    main()
  File "/home/vision/run_image_classification.py", line 480, in main
    train_result = trainer.train(resume_from_checkpoint=checkpoint)
  File "/usr/local/lib/python3.10/dist-packages/optimum/habana/transformers/trainer.py", line 553, in train
    return inner_training_loop(
  File "/usr/local/lib/python3.10/dist-packages/optimum/habana/transformers/trainer.py", line 1052, in _inner_training_loop
    self._maybe_log_save_evaluate(tr_loss, _grad_norm, model, trial, epoch, ignore_keys_for_eval)
  File "/usr/local/lib/python3.10/dist-packages/optimum/habana/transformers/trainer.py", line 1269, in _maybe_log_save_evaluate
    self._save_checkpoint(model, trial, metrics=metrics)
  File "/usr/local/lib/python3.10/dist-packages/optimum/habana/transformers/trainer.py", line 1343, in _save_checkpoint
    raise KeyError(
KeyError: "The `metric_for_best_model` training argument is set to 'eval_loss', which is not found in the evaluation metrics. The available evaluation metrics are: ['eval_runtime', 'eval_samples_per_second', 'eval_steps_per_second', 'epoch', 'memory_allocated (GB)', 'max_memory_allocated (GB)', 'total_memory_available (GB)']. Consider changing the `metric_for_best_model` via the TrainingArguments."


eval_loss not found when training a peft model using trainer.py #33420
