Report of performance regression
@litone01 reported a performance regression on the spec decode branch (#24322): models run slower than on main even when spec decode is not enabled.
His branch: https://github.com/litone01/vllm/tree/origin/feature/spec-decode-draft-model-debug
This issue tracks progress on investigating the regression.