AttributeError: 'FullyShardedDataParallelPlugin' object has no attribute 'activation_checkpointing' #25988

@scissorstail

System Info

Traceback (most recent call last):
  File "/workspace/run/run_llm.py", line 717, in <module>
    main()
  File "/workspace/run/run_llm.py", line 644, in main
    trainer = Trainer(
  File "/opt/conda/lib/python3.10/site-packages/transformers/trainer.py", line 342, in __init__
    self.create_accelerator_and_postprocess()
  File "/opt/conda/lib/python3.10/site-packages/transformers/trainer.py", line 3900, in create_accelerator_and_postprocess
    "activation_checkpointing", fsdp_plugin.activation_checkpointing
AttributeError: 'FullyShardedDataParallelPlugin' object has no attribute 'activation_checkpointing'

# post accelerator creation setup
if self.is_fsdp_enabled:
    fsdp_plugin = self.accelerator.state.fsdp_plugin
    fsdp_plugin.limit_all_gathers = self.args.fsdp_config.get(
        "limit_all_gathers", fsdp_plugin.limit_all_gathers
    )
    fsdp_plugin.activation_checkpointing = self.args.fsdp_config.get(
        "activation_checkpointing", fsdp_plugin.activation_checkpointing
    )
    if fsdp_plugin.activation_checkpointing and self.args.gradient_checkpointing:
        raise ValueError(
            "The activation_checkpointing in FSDP config and the gradient_checkpointing in training arg "
            "can't be set to True simultaneously. Please use FSDP's activation_checkpointing logic "
            "when using FSDP."
        )

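The 'FullyShardedDataParallelPlugin' class in accelerate v0.22.0 does not have an 'activation_checkpointing' attribute, but the main branch does. The Trainer code above reads fsdp_plugin.activation_checkpointing as the default value for the config lookup, so it raises the AttributeError when run against v0.22.0.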

v0.22.0
https://github.com/huggingface/accelerate/blob/6b3e559926afc4b9a127eb7762fc523ea0ea656a/src/accelerate/utils/dataclasses.py#L778

main
https://github.com/huggingface/accelerate/blob/739b135f8367becb67ffaada12fe76e3aa60fefd/src/accelerate/utils/dataclasses.py#L783
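One possible way to tolerate both plugin versions is to check for the attribute before reading it. The following is only a minimal sketch under stated assumptions, not the actual transformers or accelerate code: OldStylePlugin, NewStylePlugin, and apply_fsdp_config are hypothetical stand-ins that mimic the two dataclass shapes described above.

```python
from dataclasses import dataclass


@dataclass
class OldStylePlugin:
    # Hypothetical stand-in for the v0.22.0 plugin: no activation_checkpointing field.
    limit_all_gathers: bool = False


@dataclass
class NewStylePlugin:
    # Hypothetical stand-in for the main-branch plugin, which has the field.
    limit_all_gathers: bool = False
    activation_checkpointing: bool = False


def apply_fsdp_config(plugin, fsdp_config):
    """Copy known keys from fsdp_config onto plugin without assuming every attribute exists."""
    for key in ("limit_all_gathers", "activation_checkpointing"):
        if hasattr(plugin, key):
            # Fall back to the plugin's current value when the key is absent from the config.
            setattr(plugin, key, fsdp_config.get(key, getattr(plugin, key)))
        elif fsdp_config.get(key):
            # The config asks for a feature this plugin version cannot provide.
            raise RuntimeError(
                f"fsdp_config sets {key!r}, but the installed plugin has no such attribute; "
                "a newer accelerate may be required."
            )
    return plugin


old = apply_fsdp_config(OldStylePlugin(), {"limit_all_gathers": True})
new = apply_fsdp_config(NewStylePlugin(), {"activation_checkpointing": True})
```

With this guard, the old-style plugin no longer triggers an AttributeError unless the config explicitly requests activation_checkpointing, in which case the error message points at the version mismatch instead.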

Who can help?

No response

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

Expected behavior
