
Count token should not disallow specials #7539

@pwilkin

Description

App Version

3.26.2

API Provider

Not Applicable / Other

Model Used

Qwen Coder Plus

Roo Code Task Links (Optional)

No response

🔁 Steps to Reproduce

I'm editing a chat parser for llama.cpp where some special tokens appear in the code. Any task regarding those tokens fails with:

2025-08-29 23:59:47.493 [error] Error: The text contains a special token that is not allowed: <|endoftext|>
at E2.exports.__wbindgen_error_new (/home/ilintar/.vscode/extensions/node_modules/.pnpm/[email protected]/node_modules/tiktoken/lite/tiktoken_bg.cjs:366:17)
at null. (wasm://wasm/0040f28e:1:169233)
at null. (wasm://wasm/0040f28e:1:496290)
at s5t.encode (/home/ilintar/.vscode/extensions/node_modules/.pnpm/[email protected]/node_modules/tiktoken/lite/tiktoken_bg.cjs:231:18)
at tiktoken (/home/ilintar/.vscode/extensions/rooveterinaryinc.roo-cline-3.26.2/utils/tiktoken.ts:27:28)
at qpn (/home/ilintar/.vscode/extensions/rooveterinaryinc.roo-cline-3.26.2/utils/countTokens.ts:43:10)

💥 Outcome Summary

The tiktoken code is only supposed to count tokens, so it should not validate the tokens that appear in queries. A disallowed_special=() option or its equivalent should be passed to the encoder to guarantee that this error doesn't happen.
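A rough sketch of what "or its equivalent" could look like on the TypeScript side. The TokenEncoder interface below is an assumption modeled on the encode(text, allowed_special, disallowed_special) method of the tiktoken npm package; countTokensLenient and StubEncoder are hypothetical names used only for illustration, and the stub stands in for the real WASM encoder so the snippet runs on its own. This is not the extension's actual code.

```typescript
// ASSUMPTION: mirrors the shape of `encode(text, allowed_special?,
// disallowed_special?)` in the `tiktoken` npm package. Check the signature
// against the version you have installed.
interface TokenEncoder {
  encode(
    text: string,
    allowedSpecial?: "all" | string[],
    disallowedSpecial?: "all" | string[],
  ): ArrayLike<number>;
}

// Hypothetical helper: count tokens without rejecting special-token text.
// An empty disallowed list asks the encoder to treat strings such as
// "<|endoftext|>" as ordinary text instead of throwing.
function countTokensLenient(encoder: TokenEncoder, text: string): number {
  if (text.length === 0) return 0;
  return encoder.encode(text, undefined, []).length;
}

// Stand-in encoder for demonstration only. It reproduces the default
// "reject special tokens" behavior and splits on whitespace; a real
// encoder would produce BPE token ids instead.
class StubEncoder implements TokenEncoder {
  private static readonly SPECIALS: string[] = ["<|endoftext|>"];

  encode(
    text: string,
    allowedSpecial: "all" | string[] = [],
    disallowedSpecial: "all" | string[] = "all",
  ): number[] {
    const allowed =
      allowedSpecial === "all" ? StubEncoder.SPECIALS : allowedSpecial;
    const disallowed =
      disallowedSpecial === "all"
        ? StubEncoder.SPECIALS.filter((s) => !allowed.includes(s))
        : disallowedSpecial;

    for (const special of disallowed) {
      if (text.includes(special)) {
        throw new Error(
          `The text contains a special token that is not allowed: ${special}`,
        );
      }
    }

    // One fake token id per whitespace-separated word.
    return text
      .split(/\s+/)
      .filter((word) => word.length > 0)
      .map((_, index) => index);
  }
}
```

With the real library, the encoder instance would be passed in place of StubEncoder. The point of the design is that a counting path has no reason to police special tokens, so the empty disallowed list is set once in the helper rather than at each call site.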

📄 Relevant Logs or Errors (Optional)

Labels

Issue/PR - Triage, bug, enhancement

Status

Done
