Setting the temperature within a particular range causes vLLM to generate whitespace-only outputs, while values above and below this range work correctly. I have seen this with facebook/opt-125m, fine-tuned mistral-7B models, codellama-13B, and several other models, so it looks like an issue with vLLM rather than with any particular model.
To reproduce: python -m vllm.entrypoints.openai.api_server --model facebook/opt-125m
With temperature:
1e-3: Generates " great place to live. I"
1e-4: Generates "<s><s><s><s><s><s><s>"
1e-5: Generates "<s><s><s><s><s><s><s>"
1e-6: Generates " great place to live. I"
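For anyone who wants to check this themselves, here is a rough sketch of a temperature sweep against the server started above. It is not the exact script behind the results listed. It assumes the server is listening on localhost:8000 and exposes /v1/completions. The prompt and max_tokens are placeholders I picked, since the report does not state them, and is_degenerate is a hypothetical helper that flags empty, whitespace-only, or "<s>"-only output.

```python
import json
import urllib.request

API_URL = "http://localhost:8000/v1/completions"  # assumption: default host and port
MODEL = "facebook/opt-125m"
PROMPT = "San Francisco is a"  # assumption: the report does not state the prompt
TEMPERATURES = [1e-3, 1e-4, 1e-5, 1e-6]


def build_payload(temperature, max_tokens=7):
    """Build the JSON body for one completion request (max_tokens is a guess)."""
    return {
        "model": MODEL,
        "prompt": PROMPT,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }


def is_degenerate(text):
    """Return True if text is empty, whitespace-only, or only repeated "<s>" tokens."""
    return text.replace("<s>", "").strip() == ""


def complete(temperature):
    """Send one completion request and return the generated text."""
    request = urllib.request.Request(
        API_URL,
        data=json.dumps(build_payload(temperature)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["choices"][0]["text"]


if __name__ == "__main__":
    # Sweep the temperatures from the report and flag suspicious outputs.
    for temperature in TEMPERATURES:
        text = complete(temperature)
        flag = "DEGENERATE" if is_degenerate(text) else "ok"
        print(f"temperature={temperature:g}: {text!r} [{flag}]")
```

Adding more values between 1e-3 and 1e-6 to TEMPERATURES should help narrow down where the bad range starts and ends.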