@shen-shanshan (Collaborator) commented May 5, 2025

What this PR does / why we need it?

Enable speculative decoding with structured outputs, adapted from vllm-project/vllm#14702.

Does this PR introduce any user-facing change?

Find more details at vllm-project/vllm#14702.

How was this patch tested?

TODO:

  • Run tests/v1/entrypoints/llm/test_struct_output_generate.py once spec decode is supported in vllm-ascend V1.

@shen-shanshan shen-shanshan marked this pull request as draft May 5, 2025 01:51
@shen-shanshan shen-shanshan changed the title [V1][Structured Output][Spec Decode] Enable Speculative Decoding with Structured Outputs [V1][Core] Enable Speculative Decoding with Structured Outputs May 5, 2025
@shen-shanshan shen-shanshan marked this pull request as ready for review May 8, 2025 07:38
Signed-off-by: shen-shanshan <[email protected]>
@shen-shanshan shen-shanshan changed the title [V1][Core] Enable Speculative Decoding with Structured Outputs [V1][Structured Output] Enable Speculative Decoding with Structured Outputs May 8, 2025
@shen-shanshan shen-shanshan marked this pull request as draft May 9, 2025 01:23
@wangxiyuan wangxiyuan mentioned this pull request Jun 4, 2025
76 tasks