Closed
Labels
AutoDeploy · Inference runtime (general operational aspects of TRTLLM execution not in other categories) · bug (something isn't working)
Description
For the models listed below, the trtllm runtime generates outputs that are repetitive and less coherent than demollm's outputs:
EleutherAI/pythia-6.9b
HuggingFaceTB/SmolVLM2-2.2B-Instruct
allenai/OLMo-2-1124-7B-SFT
deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B
meta-llama/CodeLlama-7b-Python-hf
meta-llama/Llama-3.2-1B-Instruct
mistralai/Mistral-Large-Instruct-2407
mistralai/Mistral-Nemo-Instruct-2407
nvidia/Llama-3.1-Minitron-4B-Width-Base
Let's take a closer look to see if there is any misconfiguration in the trtllm runtime.
This analysis is based on the June 1st dashboard run. See the 6/1/2025 tab of the coverage Google Sheet for the specific outputs.
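One quick way to triage the affected models is to score each completion for n-gram duplication, so the repetitive trtllm outputs can be separated from coherent ones at a glance. The sketch below is a hypothetical helper, not part of the dashboard tooling; the threshold and n-gram size are assumptions to tune against the sheet's outputs.

```python
from collections import Counter


def ngram_repetition(text: str, n: int = 4) -> float:
    """Return the fraction of n-grams in `text` that are duplicates.

    0.0 means every n-gram is unique; values near 1.0 indicate the
    looping, degenerate completions described in this issue.
    Hypothetical helper -- not part of the TRT-LLM dashboard code.
    """
    tokens = text.split()
    if len(tokens) < n:
        return 0.0
    ngrams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    counts = Counter(ngrams)
    # Each n-gram beyond its first occurrence counts as a repeat.
    repeated = sum(count - 1 for count in counts.values())
    return repeated / len(ngrams)


if __name__ == "__main__":
    looping = "the cat sat on the mat " * 10
    novel = "a completely different sentence with no repeated phrases at all"
    print(f"looping: {ngram_repetition(looping):.2f}")
    print(f"novel:   {ngram_repetition(novel):.2f}")
```

Running this over both runtimes' outputs for the same prompt would make a misconfiguration (e.g. a missing repetition penalty or wrong sampling settings) show up as a systematic gap in scores rather than requiring manual reading of each completion.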