Fix empty output when temp is too low #2937
Merged
Hi, when the temperature is too low, we found that the generated text is empty in v0.3.0, whereas v0.2.4 raises
RuntimeError: probability tensor contains either `inf`, `nan` or element < 0

Here are some findings after debugging:

v0.3.0 doesn't raise a RuntimeError because vllm implemented its own _multinomial method in v0.3.0 (in an earlier PR), but v0.2.4 still uses torch.multinomial, which checks whether the tensor contains NaN and throws a RuntimeError.

We think it would be confusing if the output kept being empty when users set the temperature too low. It would be nice to clamp the temperature to a minimum of 1e-2 or 1e-3 to avoid the empty output.

This PR:
logits to avoid numerical errors.
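To illustrate the failure mode described above, here is a minimal pure-Python sketch. It assumes the numerical error comes from the scaled logits overflowing a half-precision range (the description only says "numerical errors", so this is an assumption), and it mimics float16 by saturating values above 65504 to infinity. The names `scale_logits`, `softmax`, `clamp_temperature`, and the default floor of 1e-2 are hypothetical and are not vllm's actual code.

```python
import math

FP16_MAX = 65504.0  # largest finite float16 value


def scale_logits(logits, temperature):
    """Divide logits by the temperature, mimicking float16 overflow."""
    scaled = []
    for x in logits:
        y = x / temperature
        # Assumption: values outside the float16 range saturate to +/-inf.
        if abs(y) > FP16_MAX:
            y = math.copysign(math.inf, y)
        scaled.append(y)
    return scaled


def softmax(xs):
    """Softmax with max-subtraction; inf - inf produces nan."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def clamp_temperature(temperature, min_temperature=1e-2):
    """Hypothetical helper: never let the temperature drop below a floor."""
    return max(temperature, min_temperature)


logits = [2.0, 1.0, -1.0]

# A very low temperature overflows the scaled logits, so the
# probabilities become nan and sampling can no longer work.
bad_probs = softmax(scale_logits(logits, 1e-6))

# Clamping the temperature keeps every intermediate value finite.
good_probs = softmax(scale_logits(logits, clamp_temperature(1e-6)))

print(bad_probs)
print(good_probs)
```

With the clamp in place the distribution is still sharply peaked on the largest logit, so the output stays effectively greedy while avoiding `inf` and `nan`.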