
Conversation

@null-pointer-access
Contributor

What does this PR do?

In the current GPT2 implementation, the LMHead module processes all tokens during prefill, even though only the final token’s output is used for generation. This PR aligns the behavior with LlamaModel by computing the LMHead output only for the last token, reducing unnecessary computation during prefill.

Fixes #38977
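
For illustration, here is a minimal sketch of the technique (not the exact diff from this PR): following the `logits_to_keep` convention used by `LlamaForCausalLM`, the hidden states are sliced before the LM head so that prefill only projects the final position through the vocabulary matrix. `ToyCausalLMHead` and its parameters are hypothetical names for this sketch, not code from the PR.

```python
import torch
import torch.nn as nn

class ToyCausalLMHead(nn.Module):
    """Hypothetical toy module showing hidden-state slicing before the LM head."""

    def __init__(self, hidden_size: int, vocab_size: int):
        super().__init__()
        self.lm_head = nn.Linear(hidden_size, vocab_size, bias=False)

    def forward(self, hidden_states: torch.Tensor, logits_to_keep: int = 0) -> torch.Tensor:
        # hidden_states: (batch, seq_len, hidden_size).
        # logits_to_keep=0 keeps every position (e.g. for training loss);
        # during prefill the caller passes 1, so only the last token's
        # logits are computed.
        slice_indices = slice(-logits_to_keep, None) if logits_to_keep > 0 else slice(None)
        return self.lm_head(hidden_states[:, slice_indices, :])

# Prefill with a 128-token prompt: only one position goes through the projection.
head = ToyCausalLMHead(hidden_size=16, vocab_size=100)
h = torch.randn(2, 128, 16)
logits = head(h, logits_to_keep=1)
assert logits.shape == (2, 1, 100)
```

For a long prompt this replaces a `(seq_len, hidden_size) @ (hidden_size, vocab_size)` matmul with a single-row projection, which is where the prefill savings come from.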


Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@zucchini-nlp
Member

Perfect, looks good to me!

It would be nice to do a pass over the older common models and update them as well; we'll leave that to later PRs :)

@zucchini-nlp enabled auto-merge (squash) on June 25, 2025 at 08:17
@zucchini-nlp merged commit 7b38073 into huggingface:main on Jun 25, 2025
20 checks passed
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.



Successfully merging this pull request may close these issues.

LMHead is processing redundant tokens in prefill
