I have a few models that return structured output by utilizing special tokens as delimiters. As of now, vLLM always skips special tokens during decoding. Would it be possible to add `skip_special_tokens` as a generation parameter?
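For illustration, here is a rough sketch of how such a flag might be used if it were added to `SamplingParams`; the flag itself and the model name are assumptions for this request, not existing vLLM API:

```python
# Hypothetical sketch: assumes a `skip_special_tokens` flag were added to
# SamplingParams so decoded text keeps the delimiter tokens.
from vllm import LLM, SamplingParams

llm = LLM(model="my-org/structured-output-model")  # placeholder model name

params = SamplingParams(
    max_tokens=256,
    skip_special_tokens=False,  # proposed parameter (does not exist today)
)

outputs = llm.generate(["Extract the fields from: ..."], params)
for output in outputs:
    # With the proposed flag, the text would retain special-token delimiters,
    # e.g. "<field>value</field>", which downstream parsing relies on.
    print(output.outputs[0].text)
```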
TGI partially supports this by letting you return the individual generated tokens along with their IDs and a boolean indicating whether each one is a special token.
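For comparison, a minimal sketch of querying a TGI server with `details: true` to get that per-token information; the server URL and prompt are placeholders:

```python
# Sketch of requesting per-token details from a local TGI server; each token
# entry includes a `special` boolean, so the caller can keep or drop delimiters.
import requests

resp = requests.post(
    "http://localhost:8080/generate",
    json={
        "inputs": "Extract the fields from: ...",
        "parameters": {"max_new_tokens": 64, "details": True},
    },
)
data = resp.json()

for tok in data["details"]["tokens"]:
    print(tok["id"], repr(tok["text"]), tok["special"])
```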