Conversation

@ZJY0516 (Contributor) commented Aug 29, 2025

Purpose

Enable encoder DP for MiniCPM-V

Related to: #22743

Test Plan

MiniCPM-V-2_6

vllm serve ./hf/MiniCPM-V-2_6/ -tp 2 --trust-remote-code --mm_encoder_tp_mode data

vllm bench serve \
--endpoint-type openai-chat \
--endpoint /v1/chat/completions \
--model "./hf/MiniCPM-V-2_6/" \
--dataset-name random-mm \
--random-mm-base-items-per-request 3 \
--tokenizer ~/hf/MiniCPM-V-2_6/ --trust-remote-code \
--max-concurrency 1 --num-prompts 50

Test Result

Encoder DP
============ Serving Benchmark Result ============
Successful requests:                     50        
Maximum request concurrency:             1         
Benchmark duration (s):                  92.09     
Total input tokens:                      51034     
Total generated tokens:                  3652      
Request throughput (req/s):              0.54      
Output token throughput (tok/s):         39.66     
Total Token throughput (tok/s):          593.85    
---------------Time to First Token----------------
Mean TTFT (ms):                          987.47    
Median TTFT (ms):                        955.18    
P99 TTFT (ms):                           1295.46   
-----Time per Output Token (excl. 1st token)------
Mean TPOT (ms):                          11.86     
Median TPOT (ms):                        11.86     
P99 TPOT (ms):                           11.96     
---------------Inter-token Latency----------------
Mean ITL (ms):                           11.69     
Median ITL (ms):                         11.83     
P99 ITL (ms):                            12.59     
==================================================

Default
============ Serving Benchmark Result ============
Successful requests:                     50        
Maximum request concurrency:             1         
Benchmark duration (s):                  95.14     
Total input tokens:                      51034     
Total generated tokens:                  3730      
Request throughput (req/s):              0.53      
Output token throughput (tok/s):         39.21     
Total Token throughput (tok/s):          575.64    
---------------Time to First Token----------------
Mean TTFT (ms):                          1025.98   
Median TTFT (ms):                        1021.61   
P99 TTFT (ms):                           1637.95   
-----Time per Output Token (excl. 1st token)------
Mean TPOT (ms):                          11.90     
Median TPOT (ms):                        11.87     
P99 TPOT (ms):                           12.15     
---------------Inter-token Latency----------------
Mean ITL (ms):                           11.74     
Median ITL (ms):                         11.85     
P99 ITL (ms):                            12.62     
==================================================
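The headline deltas between the two runs above can be computed directly from the tables (an illustrative script, numbers copied verbatim from the benchmark output):

```python
# Compare headline metrics from the two benchmark runs above.
encoder_dp = {"duration_s": 92.09, "mean_ttft_ms": 987.47,
              "p99_ttft_ms": 1295.46, "total_tok_per_s": 593.85}
default = {"duration_s": 95.14, "mean_ttft_ms": 1025.98,
           "p99_ttft_ms": 1637.95, "total_tok_per_s": 575.64}

for key in encoder_dp:
    base, dp = default[key], encoder_dp[key]
    change = (dp - base) / base * 100
    print(f"{key}: {base} -> {dp} ({change:+.1f}%)")
```

With these numbers, encoder DP reduces mean TTFT by about 3.8% and P99 TTFT by about 20.9%, while total token throughput improves by about 3.2%; TPOT and ITL are essentially unchanged, as expected for a change that only affects the vision encoder.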


@gemini-code-assist (bot) left a comment
Code Review

This pull request correctly enables encoder data parallelism for MiniCPM-V models by passing a use_data_parallel flag to the Idefics2VisionTransformer. However, there are two key issues. First, the supports_encoder_tp_data flag is only set for MiniCPMV4_0, which means this feature will likely not be enabled for versions 2.5, 2.6, and 4.5, despite code changes suggesting they should support it. This is a correctness issue. Second, as detailed in a specific comment, there is significant code duplication in init_vision_module across multiple classes, which impacts maintainability.

Comment on lines +1355 to +1360
model = Idefics2VisionTransformer(
    config.vision_config,
    quant_config=quant_config,
    prefix=prefix,
    use_data_parallel=self.use_data_parallel,
)
Severity: high

This init_vision_module implementation is nearly identical across MiniCPMV2_5, MiniCPMV2_6, MiniCPMV4_0, and MiniCPMV4_5. The only significant difference is the conditional logic for quant_config in the v4.x models. This duplication increases maintenance effort. Consider refactoring this into a shared method in a base class to improve code reuse and maintainability.
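The suggested refactor can be sketched as a single base-class helper, assuming the only per-version difference is whether quant_config is forwarded. The stub class and the `include_quant_config` knob are hypothetical illustrations, not the actual vLLM code:

```python
class Idefics2VisionTransformer:
    """Stub standing in for vLLM's real vision transformer."""
    def __init__(self, vision_config, quant_config=None, prefix="",
                 use_data_parallel=False):
        self.vision_config = vision_config
        self.quant_config = quant_config
        self.prefix = prefix
        self.use_data_parallel = use_data_parallel


class MiniCPMVBaseModel:
    # In the real model this would be derived from --mm_encoder_tp_mode.
    use_data_parallel = True

    def init_vision_module(self, config, quant_config, prefix,
                           include_quant_config=True):
        # Shared implementation: v2.5/v2.6 always forward quant_config,
        # while the v4.x variants forward it only conditionally, hence
        # the hypothetical include_quant_config flag.
        return Idefics2VisionTransformer(
            config.vision_config,
            quant_config=quant_config if include_quant_config else None,
            prefix=prefix,
            use_data_parallel=self.use_data_parallel,
        )
```

Each MiniCPMV subclass would then either inherit this method unchanged or call it with its own conditional, removing the four near-identical copies.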

Signed-off-by: zjy0516 <[email protected]>
@ZJY0516 ZJY0516 requested a review from hmellor as a code owner August 29, 2025 16:13
@mergify mergify bot added the documentation Improvements or additions to documentation label Aug 29, 2025
@ZJY0516 (Contributor, Author) commented Aug 30, 2025

@DarkLight1337 Could you please take a look at this pr?

@DarkLight1337 (Member) left a comment
Thanks, can you move the flag introduced by #23325 from the V4 model to the base model so that this is actually enabled for the other models?

Signed-off-by: zjy0516 <[email protected]>
@ZJY0516 (Contributor, Author) commented Aug 30, 2025

> Thanks, can you move the flag introduced by #23325 from the V4 model to the base model so that this is actually enabled for the other models?

Done
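The change being confirmed here amounts to hoisting the class attribute. A minimal sketch, using the class and flag names that appear in this conversation (the class bodies are placeholders, not the merged diff):

```python
# Before this commit, only the v4.0 subclass advertised encoder-DP
# support; moving the flag to the shared base class lets v2.5, v2.6,
# and v4.5 inherit it as well.

class MiniCPMVBaseModel:
    supports_encoder_tp_data = True  # moved here from MiniCPMV4_0


class MiniCPMV2_6(MiniCPMVBaseModel):
    pass  # now inherits the flag


class MiniCPMV4_0(MiniCPMVBaseModel):
    pass  # no longer needs to set the flag itself
```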

Co-authored-by: Cyrus Leung <[email protected]>
Signed-off-by: Jiangyun Zhu <[email protected]>
@DarkLight1337 DarkLight1337 enabled auto-merge (squash) August 30, 2025 06:03
@github-actions github-actions bot added the ready ONLY add when PR is ready to merge/full CI is needed label Aug 30, 2025
@vllm-bot vllm-bot merged commit 3a6acad into vllm-project:main Aug 30, 2025
37 of 41 checks passed
@ZJY0516 ZJY0516 deleted the encoder-dp branch August 30, 2025 13:32
eicherseiji pushed a commit to eicherseiji/vllm that referenced this pull request Sep 9, 2025
Signed-off-by: zjy0516 <[email protected]>
Signed-off-by: Jiangyun Zhu <[email protected]>
Co-authored-by: Cyrus Leung <[email protected]>
FeiDaLI pushed a commit to FeiDaLI/vllm that referenced this pull request Sep 25, 2025
Signed-off-by: zjy0516 <[email protected]>
Signed-off-by: Jiangyun Zhu <[email protected]>
Co-authored-by: Cyrus Leung <[email protected]>

Labels

documentation (Improvements or additions to documentation), ready (ONLY add when PR is ready to merge/full CI is needed)

3 participants