Conversation

@gspeter-max

Fixes #38268

Feature Description

This PR implements the feature request to add sampling capabilities (e.g., Top-K, Top-P, temperature) to Group Beam Search, which previously used purely greedy, deterministic token selection.

Problem

Currently, _group_beam_search in the GenerationMixin is implemented as a deterministic process. After applying the diversity penalty to the logits, it always selects the highest-probability tokens using torch.topk. This prevents users from leveraging the creative and diverse outputs that stochastic sampling methods provide, which is especially useful for tasks like biological sequence or code generation.
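
For context, the existing greedy step boils down to a top-k over the penalized scores. A simplified sketch of that selection (not the verbatim transformers source; the helper name is hypothetical):

```python
import torch

def greedy_group_candidates(next_token_scores: torch.Tensor, group_size: int):
    # The greedy path keeps 2 * group_size candidates per batch item so the
    # beam scorer can still fill the group when some candidates hit EOS.
    return torch.topk(next_token_scores, 2 * group_size, dim=-1)
```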

Solution

This implementation modifies _group_beam_search by adding a conditional path that is triggered when generation_config.do_sample=True. The new sampling path includes the following logic (a minimal sketch follows the list):

  1. Applies All Processors & Warpers: It applies all LogitsProcessors (including the HammingDiversityLogitsProcessor, which implements the diversity penalty) and then applies the LogitsWarpers (for Temperature, Top-K, Top-P) to the scores.
  2. Safe Candidate Selection: It calculates the number of candidates to sample as the min() of what the beam_scorer requires and the number of tokens left after warping, so torch.multinomial is never asked to draw more samples (without replacement) than there are non-zero probabilities.
  3. Stochastic Sampling: It uses torch.multinomial to stochastically sample candidate tokens from the resulting probability distribution.
  4. Score Gathering: It gathers the log-scores of the sampled tokens to ensure compatibility with the rest of the beam search algorithm.
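
For illustration, here is a minimal, self-contained sketch of steps 2-4 (the helper name sample_group_candidates and its signature are hypothetical and simplified; the real logic lives inline in _group_beam_search):

```python
import torch

def sample_group_candidates(next_token_scores: torch.Tensor, num_candidates: int):
    # `next_token_scores`: log-scores after all processors and warpers have
    # run, shape (batch_size * group_size, vocab_size); warped-out tokens
    # are -inf, so they get zero probability below.
    probs = torch.nn.functional.softmax(next_token_scores, dim=-1)

    # Step 2: cap the sample size at the number of tokens that survived
    # warping, so multinomial (without replacement) cannot fail.
    available = int((probs > 0).sum(dim=-1).min())
    k = min(num_candidates, available)

    # Step 3: stochastically sample candidate tokens.
    next_tokens = torch.multinomial(probs, num_samples=k)

    # Step 4: gather the log-scores of the sampled tokens so downstream
    # beam bookkeeping is unchanged, and keep candidates sorted by score
    # as the greedy path does.
    sampled_scores = torch.gather(next_token_scores, -1, next_tokens)
    sampled_scores, order = torch.sort(sampled_scores, descending=True, dim=-1)
    next_tokens = torch.gather(next_tokens, -1, order)
    return next_tokens, sampled_scores
```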

Additionally, the validation check in generation/configuration_utils.py that previously raised a ValueError for do_sample=True with group beam search has been removed to enable this new feature.
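
The removed check was along these lines (paraphrased, not the exact diff):

```python
# Previously in GenerationConfig (generation/configuration_utils.py):
if self.num_beam_groups > 1 and self.do_sample:
    raise ValueError(
        "Diverse beam search cannot be used in sampling mode. "
        "Make sure that `do_sample` is set to `False`."
    )
```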

Testing

The feature has been tested locally by running model.generate with do_sample=True and various sampling parameters (temperature, top_k, top_p); a reproduction sketch follows the list below. The tests confirm that:

  1. The code runs without errors.
  2. The generated output is stochastic and differs from the deterministic greedy output.
  3. The generated output changes on subsequent runs, confirming that sampling is active.
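
A script along these lines reproduces the runs below (the model choice and exact sampling parameters are assumptions; the pad_token_id:50256 log suggests GPT-2):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
inputs = tokenizer(
    "The best way to learn about large language models is", return_tensors="pt"
)

# Shared group-beam-search settings (values are illustrative).
common = dict(
    num_beams=4,
    num_beam_groups=2,
    diversity_penalty=1.0,
    max_new_tokens=50,
    num_return_sequences=4,
)

# Existing deterministic behavior.
greedy = model.generate(**inputs, do_sample=False, **common)

# New sampling path (requires this PR's branch).
torch.manual_seed(0)
sampled = model.generate(
    **inputs, do_sample=True, temperature=0.9, top_k=50, top_p=0.95, **common
)

for i, seq in enumerate(sampled):
    print(f"{i}: {tokenizer.decode(seq, skip_special_tokens=True)}")
```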

--- Generating with Greedy Group Beam Search ---
Setting pad_token_id to eos_token_id:50256 for open-end generation.

Greedy Outputs:
0: The best way to learn about large language models is to learn about the language.
1: The best way to learn about large language models is to learn about the language.
2: The best way to learn about large language models is to look at a few examples of how to use them.
The first step is to look at a few examples of how to use them. The second step is to look at a few examples of how to use them. The third step
3: The best way to learn about large language models is to look at a few examples of how to use them.
The first step is to look at a few examples of how to use them. The first step is to look at a few examples of how to use them. The second step

--- Generating with Sampling Group Beam Search ---

Sampling Outputs:
0: The best way to learn about large language models is to look at a few simple examples of how a language can be used to understand languages. For example, a simple example can look at a simple example of how a language can be used to understand languages. For example, a simple example can look at
1: The best way to learn about large language models is to find new ways to work around them.
When you start making your own languages, you should be careful not to think about how you are doing this.
You should not be just focusing on the tools that you
2: The best way to learn about large language models is to look at a few simple examples of how a language can be used to understand languages. For example, a simple example can look at a simple example of how an interpreter can be used to understand languages.
3: The best way to learn about large language models is to find new ways to work around them.
When you start making your own languages, you should be careful not to think about how you are doing this.
You should not be just focusing on the tools that are

--- Generating with MORE RANDOM Sampling Group Beam Search ---

More Random Sampling Outputs:
0: The best way to learn about large language models is through the research paper published in Psychological Science .
I was so lucky (because everyone else I was lucky to be with has been with) that it was very helpful to do some small tasks so I couldn't stop feeling inspired by
1: The best way to learn about large language models is through the research paper published in Psychological Science .
I was so lucky (because everyone else I was lucky to be with has been with) that it was very helpful to do some small tasks so I couldn't stop feeling very good
2: The best way to learn about large language models is to take a look at a few simple language modeling tutorials you can be sure you’ll learn a lot about the languages and frameworks that you’ll be used to writing them. These tutorials can be found in The Cucumber, Python
3: The best way to learn about large language models is to take a look at a few simple language modeling tutorials you can be sure you’ll learn a lot about the languages and frameworks that you’ll be used to writing them. These tutorials can be found in The A Language Modeler blog

@gspeter-max changed the title from "Add sampling support to group beam search #38648" to "Add sampling support to group beam search" on Jun 7, 2025
@Rocketknight1 (Member)

cc @gante
