[Bug] Models generate whitespace-only output when temperature is in range [1e-4, 1e-5], regardless of model type #3063

@saumya-saran

Description

Setting the temperature to a value in a narrow range causes vLLM to generate whitespace-only outputs; values above or below this range work correctly. I have seen this with facebook/opt-125m, fine-tuned Mistral-7B models, CodeLlama-13B, and several other models, so it appears to be an issue with vLLM itself rather than with any particular model.

To reproduce:
python -m vllm.entrypoints.openai.api_server --model facebook/opt-125m

Send request:

curl http://localhost:8000/v1/completions \
-H "Content-Type: application/json" \
-d '{
"model": "facebook/opt-125m",
"prompt": "San Francisco is a",
"max_tokens": 7,
"temperature": <temperature>
}'

With temperature:
1e-3: Generates " great place to live. I"
1e-4: Generates "<s><s><s><s><s><s><s>" (the `<s>` BOS token repeated)
1e-5: Generates "<s><s><s><s><s><s><s>"
1e-6: Generates " great place to live. I"
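For what it's worth, one plausible failure mode in exactly this range is half-precision overflow during temperature scaling: dividing logits by a tiny T can push them past the fp16 maximum (~65504), after which softmax produces NaNs and sampling degenerates. A pure-Python sketch of the arithmetic (the logit values are made up, and whether vLLM actually scales logits in fp16 is an assumption on my part, not something confirmed in this issue):

```python
import math

FP16_MAX = 65504.0  # largest finite half-precision value

def to_fp16(x):
    # Crude emulation of half-precision overflow: values past the
    # representable range become +/-inf, as they do in real fp16 tensors.
    if x > FP16_MAX:
        return math.inf
    if x < -FP16_MAX:
        return -math.inf
    return x

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

logits = [20.0, 5.0, -3.0]  # made-up values; real model logits are commonly this large
scaled = [to_fp16(l / 1e-4) for l in logits]  # 20 / 1e-4 = 200000 -> inf in fp16
probs = softmax(scaled)
print(scaled)  # [inf, 50000.0, -30000.0]
print(probs)   # all NaN: inf - inf = nan poisons the whole distribution
```

Sampling from an all-NaN distribution could plausibly fall through to token id 0, which for OPT decodes to `<s>`, but that last step is speculation. This would also be consistent with 1e-6 working if vLLM treats temperatures below some epsilon as greedy decoding.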
