Development Roadmap (2025 Q4) #12780

@hnyls2002

SGLang Roadmap — 2025 Q4

Contributions and feedback are welcome. Join Slack.

Focus

  • Feature compatibility & reliability: Full compatibility and production-level reliability across P/D disaggregation, all parallelisms, speculative decoding, HiCache, and load balancing.
  • Usability: Easy installation on NV/AMD/TPU/CPU; simple large-scale deployment (k8s, OME).
  • Kernel optimization for next-gen hardware (GB300/GB200, B300/B200, MI350/MI355, TPU).
  • Reinforcement learning framework integration and training-inference mismatch mitigation.

Base Engine Features

Parallelism

Server Reliability

Kernel

Speculative Decoding

  • General speculative algorithm abstraction to support multiple algorithms
  • Hybrid algorithm combining Eagle and ngram
  • Adaptive algorithm that adjusts speculative parameters during runtime
  • Slack: #spec-decoding
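To make the three items above more concrete, here is a minimal sketch of what a common proposer interface, an ngram-plus-fallback hybrid, and a runtime-adaptive draft length could look like. Every name here (`DraftProposer`, `NgramProposer`, `RepeatLastProposer`, `HybridProposer`, `AdaptiveDraftLength`) is hypothetical and not SGLang's API; the fallback is a toy stand-in for a model-based drafter such as Eagle, and the acceptance-rate thresholds are arbitrary assumptions.

```python
from abc import ABC, abstractmethod
from typing import List, Sequence


class DraftProposer(ABC):
    """Hypothetical interface: propose up to k draft tokens for a context."""

    @abstractmethod
    def propose(self, context: Sequence[int], k: int) -> List[int]:
        ...


class NgramProposer(DraftProposer):
    """Copies the tokens that followed the most recent earlier match of the last n tokens."""

    def __init__(self, n: int = 2) -> None:
        self.n = n

    def propose(self, context: Sequence[int], k: int) -> List[int]:
        if k <= 0 or len(context) <= self.n:
            return []
        suffix = list(context[-self.n:])
        # Scan right to left, skipping the suffix's own position.
        for start in range(len(context) - self.n - 1, -1, -1):
            if list(context[start:start + self.n]) == suffix:
                return list(context[start + self.n:start + self.n + k])
        return []


class RepeatLastProposer(DraftProposer):
    """Toy stand-in for a model-based drafter (assumption, not Eagle itself)."""

    def propose(self, context: Sequence[int], k: int) -> List[int]:
        return [context[-1]] * k if context and k > 0 else []


class HybridProposer(DraftProposer):
    """Uses the primary proposer first and fills any shortfall from the fallback."""

    def __init__(self, primary: DraftProposer, fallback: DraftProposer) -> None:
        self.primary = primary
        self.fallback = fallback

    def propose(self, context: Sequence[int], k: int) -> List[int]:
        draft = self.primary.propose(context, k)
        if len(draft) < k:
            extended = list(context) + draft
            draft = draft + self.fallback.propose(extended, k - len(draft))
        return draft


class AdaptiveDraftLength:
    """Grows or shrinks the draft length k based on the last step's acceptance rate."""

    def __init__(self, k: int = 4, k_min: int = 1, k_max: int = 8,
                 low: float = 0.4, high: float = 0.8) -> None:
        self.k, self.k_min, self.k_max = k, k_min, k_max
        self.low, self.high = low, high

    def update(self, num_proposed: int, num_accepted: int) -> int:
        if num_proposed > 0:
            rate = num_accepted / num_proposed
            if rate >= self.high:
                self.k = min(self.k + 1, self.k_max)
            elif rate < self.low:
                self.k = max(self.k - 1, self.k_min)
        return self.k
```

In this sketch the verifier (the target model) is left out on purpose; only the drafting side is abstracted, so a new algorithm would only need to implement `propose`.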

PD Disaggregation

KV Cache System & Memory Pool

Diffusion (Multimodal Generation)

Multimodal Models

Quantization

Multi-LoRA Serving

RL Framework Integration

Slack: #reinforcement-learning, #slime-rl-framework

Hardware

Model Coverage

Model Gateway & API Layer

  • Support multimodality and image processor in gRPC mode

  • Support PII and classify API for classifying intent and complexity of the input

  • Semantic Routing Support

  • Allow the Gateway to actively listen to the SGLang server's KV cache events so it can make better routing decisions in gRPC mode

  • Allow SGLang server to start with both gRPC and HTTP server

  • Model Gateway terminal UI

  • Reactive UI to launch workers on both local and remote machines

  • Natively support the Anthropic Messages API in gRPC mode instead of wrapping it around chat completion

  • Gateway SDK supporting Go, Python, and Node.js for every Rust crate (policies, tokenizer, parsers, etc.)

  • Metrics enhancements, including tracing and model-specific metrics (TTFT, TPOT, etc.)

  • PoC: @slin1237 @CatherineSue
    Issue: SGLang Autonomous Model Gateway Roadmap #13098
    Slack: #router-sig
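As an illustration of the KV cache event item above, the following toy sketch shows one way a gateway could use cache events to route a request to the worker that already holds the longest matching prefix. It is a sketch under stated assumptions: `CacheAwareRouter`, the `"stored"`/`"removed"` event kinds, the block size of 4, and the in-flight tie-break are all hypothetical and do not describe the actual Gateway or SGLang event format.

```python
from typing import Dict, List, Sequence, Set, Tuple

BLOCK_SIZE = 4  # assumed block granularity, for illustration only

PrefixKey = Tuple[int, ...]


def block_keys(tokens: Sequence[int], block_size: int = BLOCK_SIZE) -> List[PrefixKey]:
    """One key per full block; key i identifies the prefix tokens[:(i + 1) * block_size]."""
    return [tuple(tokens[:end])
            for end in range(block_size, len(tokens) + 1, block_size)]


class CacheAwareRouter:
    """Hypothetical router that mirrors each worker's cached prefixes from events."""

    def __init__(self, workers: Sequence[str]) -> None:
        self.cached: Dict[str, Set[PrefixKey]] = {w: set() for w in workers}
        self.in_flight: Dict[str, int] = {w: 0 for w in workers}

    def on_kv_event(self, worker: str, kind: str, tokens: Sequence[int]) -> None:
        """Apply a cache event; 'removed' drops every block of the given sequence."""
        keys = block_keys(tokens)
        if kind == "stored":
            self.cached[worker].update(keys)
        elif kind == "removed":
            self.cached[worker].difference_update(keys)
        else:
            raise ValueError(f"unknown event kind: {kind}")

    def matched_blocks(self, worker: str, tokens: Sequence[int]) -> int:
        """Count leading blocks of the request that the worker already caches."""
        count = 0
        for key in block_keys(tokens):
            if key not in self.cached[worker]:
                break
            count += 1
        return count

    def route(self, tokens: Sequence[int]) -> str:
        """Pick the longest cached prefix; break ties by fewest in-flight requests."""
        best = max(
            self.cached,
            key=lambda w: (self.matched_blocks(w, tokens), -self.in_flight[w]),
        )
        self.in_flight[best] += 1
        return best
```

A real gateway would more likely keep a radix tree or hashed block IDs instead of full token tuples, and would decrement `in_flight` when a request completes; both are omitted to keep the sketch short.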

CI / Release / Maintenance

  • Improve CI monitor workflow

    • Automatically track accuracy & performance metrics with standard format
    • Regression detection & alerts
  • Improve nightly tests

    • Add more models (DeepSeek, GPT-OSS, Qwen3-Next)
  • Full feature coverage CI with all combinations (every two days)

Slack: #ci-cd-build-release, #help-desk
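The regression detection item above could, as a rough sketch, compare each run's metrics against a stored baseline with a per-metric tolerance. The `MetricSpec` type, the `find_regressions` function, the metric names, and the tolerance values below are assumptions made for illustration; they are not the project's CI format.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class MetricSpec:
    """Hypothetical per-metric rule: direction and allowed relative slack."""
    higher_is_better: bool
    rel_tolerance: float


def find_regressions(
    baseline: Dict[str, float],
    current: Dict[str, float],
    specs: Dict[str, MetricSpec],
) -> List[str]:
    """Return one alert line per metric that got worse by more than its tolerance."""
    alerts: List[str] = []
    for name, spec in specs.items():
        if name not in baseline or name not in current:
            continue  # metric missing from one run; nothing to compare
        base, cur = baseline[name], current[name]
        if base == 0:
            continue  # relative change is undefined for a zero baseline
        change = (cur - base) / abs(base)
        # Positive "worse" means the metric moved in the bad direction.
        worse = -change if spec.higher_is_better else change
        if worse > spec.rel_tolerance:
            alerts.append(f"{name}: {base:g} -> {cur:g} ({change:+.1%})")
    return alerts


# Example with made-up numbers: accuracy drops beyond 1%, TTFT stays within 5%.
specs = {
    "gsm8k_accuracy": MetricSpec(higher_is_better=True, rel_tolerance=0.01),
    "ttft_ms": MetricSpec(higher_is_better=False, rel_tolerance=0.05),
}
baseline = {"gsm8k_accuracy": 0.90, "ttft_ms": 100.0}
current = {"gsm8k_accuracy": 0.85, "ttft_ms": 103.0}
alerts = find_regressions(baseline, current, specs)
```

Keeping the direction and tolerance per metric lets accuracy and latency share one check, which fits the goal of tracking both with a standard format.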
