
Conversation

@yechank-nvidia (Collaborator) commented Jul 29, 2025

Add multimodal models to the support-matrix doc.

Summary by CodeRabbit

  • Documentation
    • Updated the support matrix to include new AI models and revised modality support for existing ones.
    • Expanded the modality legend to clearly define language, image, video, and audio capabilities.

@yechank-nvidia self-assigned this Jul 29, 2025
@coderabbitai bot (Contributor) commented Jul 29, 2025

📝 Walkthrough

"""

Walkthrough

The support matrix documentation for PyTorch backend models was updated to add several new models, revise the modality support for existing models, and expand the modality legend to explicitly define abbreviations for language, image, video, and audio.

Changes

| Cohort / File(s) | Change Summary |
| --- | --- |
| Support Matrix Documentation (`docs/source/reference/support-matrix.md`) | Added new models, updated modality support for existing models, and expanded the modality legend in the support matrix. |

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~2 minutes

Possibly related PRs

  • [Doc][Qwen3] update qwen3 into support-matrix #6161: Both PRs update the same documentation file, support-matrix.md, to add new models and revise modality support, focusing on expanding the list of supported models and their modalities without changing any code.

Suggested labels

Documentation

Suggested reviewers

  • litaotju
  • amukkara
    """



📜 Recent review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7c93b74 and afb7657.

📒 Files selected for processing (1)
  • docs/source/reference/support-matrix.md (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • docs/source/reference/support-matrix.md
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Pre-commit Check


@coderabbitai bot requested a review from litaotju July 29, 2025 04:56
@coderabbitai bot (Contributor) left a comment


Actionable comments posted: 0

🧹 Nitpick comments (3)
docs/source/reference/support-matrix.md (3)

14-16: Verify new Gemma 3 / HCXVision rows aren’t duplicating existing entries

We now list three Gemma-related rows (Gemma/Gemma2 in the TRT section, and the two new Gemma3* rows here). Please make sure:

  1. There is no pre-existing Gemma3ForCausalLM/Gemma3ForConditionalGeneration elsewhere in the file.
  2. The ordering of architectures in this table (alphabetical by class name) still holds after insertion.

A quick grep across the repo (or the rendered docs) will catch accidental duplication.
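
For instance, a minimal Python sketch of such a duplicate check (assumptions: the path is the one named in this review, and the table keeps one architecture class name in the first column of each row):

```python
# Flag architecture names that appear in more than one table row.
# A sketch only: assumes Markdown rows like "| Gemma3ForCausalLM | ... |".
from collections import Counter
from pathlib import Path
import re

text = Path("docs/source/reference/support-matrix.md").read_text()
names = re.findall(r"^\|\s*`?(\w+)`?\s*\|", text, flags=re.MULTILINE)
dupes = [name for name, count in Counter(names).items() if count > 1]
print("duplicated architectures:", dupes or "none")
```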


27-32: Consistency pass on new multimodal rows

Phi4MM, Qwen2VL, and Qwen2_5_VL use three-letter modality strings.
Consider normalising the whitespace and ordering so every row is L + I + V (+ A) in the same left-to-right order (L, I, V, A) to avoid scanning errors.
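
A small sketch of what that normalisation could look like (the canonical L, I, V, A order is the one suggested above; the helper name is hypothetical):

```python
# Rewrite a modality cell into the canonical "L + I + V + A" order.
CANONICAL = ("L", "I", "V", "A")

def normalize_modality(cell: str) -> str:
    flags = {part.strip() for part in cell.split("+")}
    unknown = flags - set(CANONICAL)
    if unknown:
        raise ValueError(f"unexpected modality flag(s): {sorted(unknown)}")
    return " + ".join(flag for flag in CANONICAL if flag in flags)

assert normalize_modality("L + V + I") == "L + I + V"  # reordered
assert normalize_modality("L+I+A") == "L + I + A"      # whitespace fixed
```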


37-40: Render the legend with a Sphinx admonition for uniform styling

The plain “Note:” header followed by a bullet list renders differently from the {note} admonitions used elsewhere (e.g. lines 175-177). Switching keeps the docs consistent and guarantees proper theming.

````diff
-Note:
-- L: Language
-- I: Image
-- V: Video
-- A: Audio
+```{note}
+Modality legend
+* **L** – Language
+* **I** – Image
+* **V** – Video
+* **A** – Audio
+```
````
📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e58afa5 and 1100595.

📒 Files selected for processing (1)
  • docs/source/reference/support-matrix.md (1 hunks)
🧰 Additional context used
🧠 Learnings (1)
docs/source/reference/support-matrix.md (2)

Learnt from: moraxu
PR: #6303
File: tests/integration/test_lists/qa/examples_test_list.txt:494-494
Timestamp: 2025-07-28T17:06:08.621Z
Learning: In TensorRT-LLM testing, it's common to have both CLI flow tests (test_cli_flow.py) and PyTorch API tests (test_llm_api_pytorch.py) for the same model. These serve different purposes: CLI flow tests validate the traditional command-line workflow, while PyTorch API tests validate the newer LLM API backend. Both are legitimate and should coexist.

Learnt from: yechank-nvidia
PR: #6254
File: tensorrt_llm/_torch/pyexecutor/model_engine.py:1201-1204
Timestamp: 2025-07-22T09:22:14.726Z
Learning: In TensorRT-LLM's multimodal processing pipeline, shared tensor recovery using from_shared_tensor() is only needed during the context phase. Generation requests reuse the already-recovered tensor data and only need to call strip_for_generation() to remove unnecessary multimodal data while preserving the recovered tensors. This avoids redundant tensor recovery operations during generation.
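
As an illustration only, the control flow this learning describes might look roughly like the sketch below; the request and handle attributes are invented for the example, and only the two method names come from the learning itself:

```python
# Schematic of the phase-dependent multimodal tensor handling (not the
# actual TensorRT-LLM implementation; types and signatures are assumed).
def prepare_multimodal(request):
    if request.is_context_phase:
        # Context phase: recover the tensor from the shared buffer once.
        request.mm_data = request.mm_handle.from_shared_tensor()
    else:
        # Generation phase: the tensor is already recovered, so only strip
        # the multimodal payload that generation no longer needs.
        request.mm_data = request.mm_data.strip_for_generation()
    return request
```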

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Pre-commit Check
🔇 Additional comments (1)
docs/source/reference/support-matrix.md (1)

18-22: Cross-check modality changes for LLaVA-NeXT, Llama 4 & Mistral 3

The modality column was changed to:

  • LLaVA-NeXT → L + I
  • Llama 4 → L + I + V
  • Mistral 3 → L + I

Before publishing, confirm that these labels match the current upstream model cards / configs; the support matrix is frequently consulted and discrepancies cause confusion.
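
One quick, partial way to do that check programmatically (a sketch: config inspection only indicates an image/vision tower, while video or audio support usually has to be confirmed from the model card; the checkpoint id is the LLaVA-NeXT example cited later in this thread):

```python
# Probe whether a checkpoint's config declares a vision tower,
# a rough proxy for image ("I") support.
from transformers import AutoConfig

for repo_id in ["llava-hf/llava-v1.6-mistral-7b-hf"]:
    cfg = AutoConfig.from_pretrained(repo_id)
    print(repo_id, "-> vision_config present:", hasattr(cfg, "vision_config"))
```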

@jaedeok-nvidia self-requested a review July 29, 2025 04:58
@jaedeok-nvidia (Collaborator) left a comment


This PR distinguishes between the image and video support of multimodal models, which had previously been mixed together. However, multimodal capabilities vary across models; some support only images but not videos. Clarifying this in the support matrix will therefore help readers. Thank you, @yechank-nvidia, for your continuous support in advancing multimodal support for TensorRT-LLM.

Signed-off-by: yechank <[email protected]>
@coderabbitai bot (Contributor) left a comment


Actionable comments posted: 0

🧹 Nitpick comments (1)
docs/source/reference/support-matrix.md (1)

14-17: Keep the “Architecture” column alphabetically sorted for easier scanning
The new Gemma 3 / EXAONE 4 / HCXVision rows look good, but their insertion breaks the previously alphabetical ordering of the table (e.g., Deepseek → DeciLM → Exaone → Gemma). Consider re-ordering the rows to preserve lexicographic order so users can locate architectures quickly.
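
A quick sortedness check along those lines (a sketch, under the same assumption of one class name per row in the first column):

```python
# Report adjacent Architecture entries that break alphabetical order.
from pathlib import Path
import re

text = Path("docs/source/reference/support-matrix.md").read_text()
names = re.findall(r"^\|\s*`?(\w+)`?\s*\|", text, flags=re.MULTILINE)
for prev, cur in zip(names, names[1:]):
    if prev.lower() > cur.lower():
        print(f"out of order: {prev!r} is listed before {cur!r}")
```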

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1100595 and 7c93b74.

📒 Files selected for processing (1)
  • docs/source/reference/support-matrix.md (1 hunks)
🧰 Additional context used
🧠 Learnings (1)
docs/source/reference/support-matrix.md (2): the same two learnings (moraxu, PR #6303; yechank-nvidia, PR #6254) quoted in the review above.

🔇 Additional comments (4)
docs/source/reference/support-matrix.md (4)

19-23: Double-check the updated modality flags & HF example links
LLaVA-NeXT, Llama 4, and Mistral 3 rows were updated/added with new modality combinations. Please verify:
• That the cited HF checkpoints indeed expose image (and video) inputs as declared.
• That the example repository names are still live and not superseded (e.g., “llava-hf/llava-v1.6-mistral-7b-hf”).
A quick check avoids stale links or incorrect capability claims.
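
The liveness half of that check is easy to script (a sketch using huggingface_hub's HfApi.repo_exists; the repo id is the one cited in this comment, and further ids can be appended):

```python
# Verify that cited Hugging Face checkpoints still resolve.
from huggingface_hub import HfApi

api = HfApi()
for repo_id in ["llava-hf/llava-v1.6-mistral-7b-hf"]:
    status = "live" if api.repo_exists(repo_id) else "missing or renamed"
    print(f"{repo_id}: {status}")
```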


28-28: 👍 New Phi-4-multimodal entry is clear and consistent
The addition correctly reflects language + image + audio support and follows the existing format.


32-33: Confirm Qwen-VL modality expansion to include images
Both Qwen2-VL rows now show “L + I + V”. Ensure that image support is truly available in the current backend implementation; otherwise, mark as “L + V” to avoid misleading users.


38-41: Legend update looks good
Adding I, V, A clarifies the new modality abbreviations and keeps the table self-contained.

@brb-nv (Collaborator) left a comment


LGTM.

Signed-off-by: yechank <[email protected]>
@coderabbitai bot requested reviews from amukkara and litaotju July 30, 2025 01:52
@yechank-nvidia (Collaborator, Author) commented:

/bot run --stage-list "A10-Build_Docs"

@tensorrt-cicd (Collaborator) commented:

PR_Github #13506 [ run ] triggered by Bot

@tensorrt-cicd (Collaborator) commented:
PR_Github #13506 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #10118 (Partly Tested) completed with status: 'SUCCESS'

@litaotju merged commit 83621e4 into NVIDIA:main Jul 31, 2025
2 of 3 checks passed
lancelly pushed a commit to lancelly/TensorRT-LLM that referenced this pull request Aug 6, 2025
jain-ria pushed a commit to jain-ria/TensorRT-LLM that referenced this pull request Aug 7, 2025