This page is accessible via roadmap.vllm.ai
This is a living document! For each item here, we intend to link the RFC as well as the discussion channel in the vLLM Slack.
In Q3 2025, we fully removed the V0 code path and made vLLM excel in large-scale serving with mature wide EP and prefill disaggregation. This quarter, our goal is to continue driving down CPU overhead, enhance vLLM on frontier clusters, and strengthen our RL integrations.
We mark help-wanted items with 🙋 in areas where the committer group is seeking more dedicated contributions.
Engine Core
- Async Scheduling on by default
- Optimize Input Preparation (“Persistent Batch V2”)
- Multimodal Processing Simplification
- 🙋Speculative Decoding Enhancements (Suffix Decoding, CUDA Graph/torch.compile support)
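Suffix decoding (listed above) speculates future tokens by matching the tail of the current generation against earlier text and copying what followed the last match. A minimal sketch of the idea, not vLLM's implementation; the function name and parameters are hypothetical:

```python
def propose_draft(history, max_spec=4, max_suffix=8):
    """Propose draft tokens by finding an earlier occurrence of the
    current suffix in `history` and copying the tokens that followed it."""
    n = len(history)
    # Try progressively shorter suffixes of the text generated so far.
    for k in range(min(max_suffix, n - 1), 0, -1):
        suffix = history[n - k:]
        # Scan backwards for an earlier occurrence of the same k-gram.
        for start in range(n - k - 1, -1, -1):
            if history[start:start + k] == suffix:
                follow = history[start + k:start + k + max_spec]
                if follow:
                    return follow  # draft tokens to verify in parallel
    return []  # no match: fall back to normal decoding
```

The verifier then accepts the longest prefix of the draft that the target model agrees with, so a wrong draft costs little while a right one saves several decode steps.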
Large Scale Serving
- Elastic Experts
- Transfer KV Cache through CPU
- 🙋GB200
- 🙋AMD support
- 🙋EPLB (expert parallel load balancer) algorithm testing and enhancements in production
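At its core, expert load balancing places MoE experts on ranks so that per-GPU load is even. A toy greedy sketch of that placement problem, not vLLM's actual EPLB algorithm; the function name is hypothetical:

```python
import heapq

def balance_experts(expert_loads, num_gpus):
    """Greedily assign experts (id -> observed load) to GPUs,
    always placing the next-heaviest expert on the least-loaded GPU."""
    heap = [(0.0, gpu) for gpu in range(num_gpus)]  # (total load, gpu id)
    heapq.heapify(heap)
    placement = {}
    for expert, load in sorted(expert_loads.items(), key=lambda kv: -kv[1]):
        total, gpu = heapq.heappop(heap)
        placement[expert] = gpu
        heapq.heappush(heap, (total + load, gpu))
    return placement
```

Production balancers additionally replicate hot experts and rebalance online as routing statistics drift, which is where the testing work above comes in.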
Reinforcement Learning
- Full determinism and batch invariance
- Add more testing cases for popular integrations
- Custom checkpoint loader, custom model format
- Simple data parallel router for scale out
- 🙋Enhance weight loading speed for syncing and resharding
- 🙋Study a way to enable multi-turn long horizon scheduling to avoid preemption
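The simple data-parallel router listed above can be as basic as cycling requests across engine replicas. A minimal sketch under that assumption; the class and method names are hypothetical, not a vLLM API:

```python
from itertools import cycle

class RoundRobinRouter:
    """Distribute requests across data-parallel engine replicas."""

    def __init__(self, replicas):
        self._next = cycle(list(replicas))

    def route(self, request_id):
        # Pure round-robin: ignores per-replica load, which is
        # acceptable when requests are roughly uniform in cost.
        return next(self._next)
```

Round-robin keeps the router stateless apart from a cursor, which makes scale-out trivial; load-aware routing is a natural follow-up once per-replica queue depth is observable.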
Performance and Experience Enhancement
- Continue to drive down startup time (#feat-startup-ux)
- Refactor tool use parsing to leverage grammar structural tag
- Refactor CI
- Turn on torch compile fusion by default, no extra flags on default case
- Prefix caching for Hybrid models ([Tracking Issue]: Prefix Caching for Hybrid Models #26201)
- 🙋Model Bash: profile and optimize newer model architectures on different hardware (NVIDIA Hopper, Blackwell, AMD MI3xx) (Slack Channel: #sig-model-bash)
- DeepSeek V3.2
- Qwen3MoE, Qwen3 VL, Qwen3 Next
- gpt-oss
If an item you want is not on the roadmap, your suggestions and contributions are strongly welcomed! Please feel free to comment in this thread, open a feature request, or create an RFC.
Historical Roadmap: #20336, #15735, #11862, #9006, #5805, #3861, #2681, #244