Description
The SvelteKit WebUI always overrides sampling settings configured on the server side, such as temperature, top_p, top_k, min_p and many others.
Each model typically recommends its own temperature and sampling settings. For example, gpt-oss can be run as suggested in @ggerganov's guide #15396:
llama-server --temp 1.0 --top-p 1.0 --top-k 0 --min-p 0.001 --model gpt-oss-20b-mxfp4.gguf
Yet with the WebUI settings reset to default, the following parameters are sent to the server:
srv log_server_r: request: {"messages":[{"role":"user","content":"hi"}],"stream":true,"reasoning_format":"auto","temperature":0.8,"max_tokens":-1,"dynatemp_range":0,"dynatemp_exponent":1,"top_k":40,"top_p":0.95,"min_p":0.05,"xtc_probability":0,"xtc_threshold":0.1,"typ_p":1,"repeat_last_n":64,"repeat_penalty":1,"presence_penalty":0,"frequency_penalty":0,"dry_multiplier":0,"dry_base":1.75,"dry_allowed_length":2,"dry_penalty_last_n":-1,"samplers":["top_k","typ_p","top_p","min_p","temperature"],"timings_per_token":true}
After editing the WebUI settings and leaving most of the relevant input fields empty, zeroes are sent to the server, which is probably a bug:
srv log_server_r: request: {"messages":[{"role":"user","content":"hi"}],"stream":true,"reasoning_format":"auto","temperature":0,"max_tokens":-1,"dynatemp_range":0,"dynatemp_exponent":0,"top_k":0,"top_p":0,"min_p":0,"xtc_probability":0,"xtc_threshold":0,"typ_p":0,"repeat_last_n":0,"repeat_penalty":0,"presence_penalty":0,"frequency_penalty":0,"dry_multiplier":0,"dry_base":0,"dry_allowed_length":0,"dry_penalty_last_n":0,"timings_per_token":true}
The expected default behaviour would be empty fields in the WebUI settings and only the minimal required parameters sent to the server, so that the server-side model configuration is respected:
{"messages":[{"role":"user","content":"hi"}],"stream":true,"timings_per_token":true}
Overriding model settings from the WebUI should be a conscious opt-in decision by the user; sending a copy of hardcoded values by default is rather confusing.
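To illustrate the expected behaviour, here is a minimal sketch of how a request body could be built so that unset fields are omitted rather than sent as zeroes. The names `SamplingForm`, `parseOptionalNumber` and `buildRequestBody` are hypothetical and are not taken from the WebUI source. The sketch assumes the form holds raw strings or numbers keyed by the request parameter names seen in the logs above. One possible (unconfirmed) source of the zeroes is that `Number("")` evaluates to `0` in JavaScript, so the sketch filters out empty strings before converting.

```typescript
// Hypothetical shape of the sampling fields in the settings form.
// Keys mirror the request parameters seen in the server log; the type
// itself is an assumption, not the WebUI's actual settings type.
type SamplingForm = Record<string, string | number | null | undefined>;

interface ChatMessage {
  role: string;
  content: string;
}

// Convert a raw form value into a number, or undefined when the field is unset.
// Number("") is 0 in JavaScript, so empty strings are rejected before conversion.
function parseOptionalNumber(
  raw: string | number | null | undefined
): number | undefined {
  if (raw === null || raw === undefined) return undefined;
  if (typeof raw === "number") return Number.isFinite(raw) ? raw : undefined;
  const trimmed = raw.trim();
  if (trimmed === "") return undefined;
  const value = Number(trimmed);
  return Number.isFinite(value) ? value : undefined;
}

// Build the request body: always include the minimal required fields,
// and add a sampling parameter only when the user explicitly set it.
function buildRequestBody(
  messages: ChatMessage[],
  form: SamplingForm
): Record<string, unknown> {
  const body: Record<string, unknown> = {
    messages,
    stream: true,
    timings_per_token: true,
  };
  for (const [key, raw] of Object.entries(form)) {
    const value = parseOptionalNumber(raw);
    if (value !== undefined) body[key] = value;
  }
  return body;
}
```

With every field left empty, this produces only the minimal payload, so the server-side values apply. A field the user explicitly fills in, including an explicit `0` such as `top_k: "0"`, is still sent and overrides the server.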