### Your current environment

vLLM Version: 0.7.0

### Model Input Dumps

_No response_

### 🐛 Describe the bug
Issue: V1 engine ignores custom logits processors and does not implement min-p sampling
Problem
- Custom logits processors: In the new V1 engine, specifying a `logits_processor` in `SamplingParams` for `LLM.generate()` has no effect. The code in `gpu_model_runner.py` never passes any sampling metadata into `self.model.compute_logits(...)`, so the logits processor is silently ignored.
- Min-p: Similarly, `min_p` (a sampling parameter supported in V0, akin to `top_k` and `top_p`) is not applied at all in V1. The `sampler.py` for the new engine appears to skip it entirely, so it never factors into the final token selection.
If these features are not yet supported in V1, consider at least raising a warning or error to avoid silent failures.
### Possible Fix for Logits Processor Issue

- Create a new data class to hold relevant metadata for `self.model.compute_logits(...)`.
  - Could simply hold request IDs and request states (`CachedRequestState`).
- Collate metadata inside `GPUModelRunner.execute_model(...)`.
- Patch `LogitsProcessor.forward()` inside `logits_processor.py` to handle the new V1 metadata class alongside the old V0 `SamplingMetadata` class.
- Define `LogitsProcessor._apply_logits_processor_v1(...)` (or something similar) to properly handle the preprocessed `hidden_states` tensor in the V1 model runner, as opposed to reusing the V0 version.
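The metadata container and dispatch described above could look roughly like the following pure-Python sketch. `V1LogitsMetadata`, `apply_logits_processors_v1`, and the per-request `logits_processors` attribute are illustrative names for this proposal, not vLLM's actual API; the real code would operate on torch tensors.

```python
from dataclasses import dataclass

# Hypothetical metadata class for self.model.compute_logits(...) in V1,
# collated inside GPUModelRunner.execute_model(...).
@dataclass
class V1LogitsMetadata:
    req_ids: list    # request ids, in batch order
    req_states: dict  # req_id -> CachedRequestState-like object

def apply_logits_processors_v1(logits_rows, metadata, output_token_ids):
    """Apply each request's custom logits processors to its logits row.

    logits_rows[i] is the logits row for metadata.req_ids[i];
    output_token_ids[i] is that request's generated token ids so far.
    """
    for i, req_id in enumerate(metadata.req_ids):
        state = metadata.req_states[req_id]
        for proc in getattr(state, "logits_processors", None) or []:
            # V0-style processor signature: (output_token_ids, logits) -> logits
            logits_rows[i] = proc(output_token_ids[i], logits_rows[i])
    return logits_rows
```

This keeps the V0 processor call signature intact while routing per-request state through the new V1 batch layout.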
### Possible Fix for Min-p Issue

- Add a `min_p` attribute to `InputBatch` in `gpu_input_batch.py`.
- Add a `min_p` field to the `SamplingMetadata` data class in `metadata.py`.
- Modify the forward function of `Sampler` in `sampler.py` to apply min-p filtering.
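For reference, min-p filtering keeps only tokens whose probability is at least `min_p` times the probability of the most likely token. A minimal pure-Python sketch of the step `Sampler.forward()` would need (the actual change would be vectorized over a batch of torch tensors):

```python
def apply_min_p(probs, min_p):
    """Zero out tokens below min_p * max(probs), then renormalize.

    probs: one row of token probabilities (list of floats summing to 1).
    min_p: threshold scale in [0, 1]; 0 disables filtering, matching
    the V0 behavior where min_p=0.0 is the default.
    """
    if min_p <= 0.0:
        return probs
    threshold = min_p * max(probs)  # scaled by the top token's probability
    kept = [p if p >= threshold else 0.0 for p in probs]
    total = sum(kept)               # > 0, since the argmax always survives
    return [p / total for p in kept]
```

Because the threshold scales with the top probability, min-p prunes aggressively when the model is confident and permissively when the distribution is flat, which is what distinguishes it from a fixed `top_p` cutoff.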
### Before submitting a new issue...

- Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.