Bring back `set_epoch` for Accelerate-based dataloaders #26850

muellerzr · 2023-10-16T19:40:13Z

What does this PR do?

This PR brings back the `set_epoch` logic and resolves the last remaining regression from the Accelerate integration. Resuming training should now give exactly identical results. (The regression was introduced by an incomplete implementation in #24028.)
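To illustrate why `set_epoch` matters for resuming, here is a minimal, self-contained sketch. It is not the Trainer or Accelerate code from this PR. `EpochSeededSampler` and `run` are hypothetical names made up for the example; the only thing borrowed from a real library is the idea, used by PyTorch's `DistributedSampler.set_epoch`, of deriving the shuffle from the seed plus the epoch number.

```python
import random


class EpochSeededSampler:
    """Toy sampler (hypothetical): shuffle order depends only on (seed, epoch)."""

    def __init__(self, num_samples, seed=0):
        self.num_samples = num_samples
        self.seed = seed
        self.epoch = 0

    def set_epoch(self, epoch):
        # Record the epoch so the next iteration reshuffles deterministically.
        self.epoch = epoch

    def __iter__(self):
        # A fresh RNG per pass: the order is a pure function of seed + epoch.
        rng = random.Random(self.seed + self.epoch)
        indices = list(range(self.num_samples))
        rng.shuffle(indices)
        return iter(indices)


def run(sampler, start_epoch, num_epochs, call_set_epoch=True):
    """Collect the sample order seen in each epoch of a (possibly resumed) run."""
    orders = {}
    for epoch in range(start_epoch, num_epochs):
        if call_set_epoch:
            sampler.set_epoch(epoch)
        orders[epoch] = list(sampler)
    return orders


# Uninterrupted run over epochs 0..2.
full = run(EpochSeededSampler(8, seed=42), 0, 3)
# Run resumed at epoch 2 with a freshly constructed sampler.
resumed = run(EpochSeededSampler(8, seed=42), 2, 3)
# Same loop without set_epoch: the sampler never leaves epoch 0.
stale = run(EpochSeededSampler(8, seed=42), 0, 3, call_set_epoch=False)

print(resumed[2] == full[2])               # resumed epoch matches the original
print(stale[0] == stale[1] == stale[2])    # every epoch repeats the same order
```

In this sketch, the resumed run sees the same order in epoch 2 as the uninterrupted run because the order depends only on the seed and the epoch. Without the `set_epoch` call, every epoch replays the epoch-0 order, so a resumed run cannot line up with the original one.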

Linked with huggingface/accelerate#2057

fixes #26541

Before submitting

This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
Did you read the contributor guideline, Pull Request section?
Was this discussed/approved via a GitHub issue or the forum? Please add a link to it if that's the case.
Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@LysandreJik @ArthurZucker

HuggingFaceDocBuilderDev · 2023-10-16T19:57:56Z

The documentation is not available anymore as the PR was closed or merged.

ArthurZucker

The tests seem to be erroring out, so I'm waiting on that.
Could you link to the PR that removed it?

…26850)

* Working tests!
* Fix sampler
* Fix
* Update src/transformers/trainer.py

Co-authored-by: Arthur <[email protected]>

* Fix check
* Clean

---------

Co-authored-by: Arthur <[email protected]>

muellerzr added 3 commits October 16, 2023 19:11

Working tests!

cec09c8

Fix sampler

1af2d89

Fix

e280dca

muellerzr requested review from ArthurZucker and LysandreJik October 16, 2023 19:40

ArthurZucker reviewed Oct 17, 2023

src/transformers/trainer.py (resolved)

Update src/transformers/trainer.py

54fea16

Co-authored-by: Arthur <[email protected]>

This was referenced Oct 18, 2023

Let iterable dataset shard have a length if implemented huggingface/accelerate#2066

Merged

Allow for samplers to be seedable and reproducable huggingface/accelerate#2057

Merged

muellerzr added 2 commits October 24, 2023 19:59

Fix check

6ab2f86

Clean

6afe340

muellerzr requested a review from ArthurZucker October 24, 2023 20:11

ArthurZucker approved these changes Oct 25, 2023

LysandreJik approved these changes Oct 26, 2023

LysandreJik merged commit 9041240 into main Oct 26, 2023

LysandreJik deleted the muellerzr-dataloader branch October 26, 2023 09:20
