DebertaForMaskedLM  amp_bf16/amp_fp16  training/inference accuracy/performance got failed

### 🐛 Describe the bug

python benchmarks/dynamo/huggingface.py --accuracy --amp -d xpu -n10 --amp-dtype bfloat16  --training --only DebertaForMaskedLM --backend=inductor

xpu  eval  DebertaForMaskedLM                 
Traceback (most recent call last):
  File "/home/sdp/actions-runner/_work/torch-xpu-ops/pytorch/benchmarks/dynamo/common.py", line 1905, in validate_model
    self.model_iter_fn(model, example_inputs)
  File "/home/sdp/actions-runner/_work/torch-xpu-ops/pytorch/benchmarks/dynamo/huggingface.py", line 522, in forward_pass
    return mod(**inputs)
  File "/home/sdp/miniforge3/envs/e2e_ci/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1751, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/sdp/miniforge3/envs/e2e_ci/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1762, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/sdp/miniforge3/envs/e2e_ci/lib/python3.10/site-packages/transformers/models/deberta/modeling_deberta.py", line 989, in forward
    outputs = self.deberta(
  File "/home/sdp/miniforge3/envs/e2e_ci/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1751, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/sdp/miniforge3/envs/e2e_ci/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1762, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/sdp/miniforge3/envs/e2e_ci/lib/python3.10/site-packages/transformers/models/deberta/modeling_deberta.py", line 797, in forward
    encoder_outputs = self.encoder(
  File "/home/sdp/miniforge3/envs/e2e_ci/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1751, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/sdp/miniforge3/envs/e2e_ci/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1762, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/sdp/miniforge3/envs/e2e_ci/lib/python3.10/site-packages/transformers/models/deberta/modeling_deberta.py", line 608, in forward
    hidden_states, att_m = layer_module(
  File "/home/sdp/miniforge3/envs/e2e_ci/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1751, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/sdp/miniforge3/envs/e2e_ci/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1762, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/sdp/miniforge3/envs/e2e_ci/lib/python3.10/site-packages/transformers/models/deberta/modeling_deberta.py", line 525, in forward
    attention_output, att_matrix = self.attention(
  File "/home/sdp/miniforge3/envs/e2e_ci/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1751, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/sdp/miniforge3/envs/e2e_ci/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1762, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/sdp/miniforge3/envs/e2e_ci/lib/python3.10/site-packages/transformers/models/deberta/modeling_deberta.py", line 460, in forward
    self_output, att_matrix = self.self(
  File "/home/sdp/miniforge3/envs/e2e_ci/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1751, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/sdp/miniforge3/envs/e2e_ci/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1762, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/sdp/miniforge3/envs/e2e_ci/lib/python3.10/site-packages/transformers/models/deberta/modeling_deberta.py", line 290, in forward
    attention_scores = attention_scores.masked_fill(~(attention_mask), torch.finfo(query_layer.dtype).min)
RuntimeError: value cannot be converted to type at::BFloat16 without overflow

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/sdp/actions-runner/_work/torch-xpu-ops/pytorch/benchmarks/dynamo/common.py", line 3986, in run
    ) = runner.load_model(
  File "/home/sdp/actions-runner/_work/torch-xpu-ops/pytorch/benchmarks/dynamo/huggingface.py", line 458, in load_model
    self.validate_model(model, example_inputs)
  File "/home/sdp/actions-runner/_work/torch-xpu-ops/pytorch/benchmarks/dynamo/common.py", line 1907, in validate_model
    raise RuntimeError("Eager run failed") from e
RuntimeError: Eager run failed


### Versions

Envirnoments:
Device: PVC 1100
torch-xpu-ops: https://github.com/etaf/pytorch-inductor-xpu/commits/xpu_inductor_windows/
python: 3.10
TRITON_COMMIT_ID: c4a79a1960ba1c247c2548cbd3abf6a728b3ce6f
TORCH_COMMIT_ID: 1ba49a78edafa61e2ce4f80d147576e66566eec3
TORCHBENCH_COMMIT_ID: 373ffb19dc470f4423a3176a4133f8f4b3cdb5bd
TORCHVISION_COMMIT_ID: d23a6e1664d20707c11781299611436e1f0c104f
TORCHAUDIO_COMMIT_ID: f084f34bbb743fada85f66b0ed8041387565e69c
DRIVER_VERSION: 1.23.10.49.231129.50
KERNEL_VERSION: 5.15.0-73-generic #80-Ubuntu SMP Mon May 15 15:18:26 UTC 2023
BUNDLE_VERSION: 2025.0.1.20241113
OS_PRETTY_NAME: Ubuntu 22.04.2 LTS
GCC_VERSION: 11

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

DebertaForMaskedLM amp_bf16/amp_fp16 training/inference accuracy/performance got failed #1389

🐛 Describe the bug

Versions

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

DebertaForMaskedLM amp_bf16/amp_fp16 training/inference accuracy/performance got failed #1389

Description

🐛 Describe the bug

Versions

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions