Description
Your current environment
🐛 Describe the bug
Reproduce
- Just run `examples/offline_inference_vision_language.py` with `tensor_parallel_size=2` (a minimal invocation is sketched after this list).
- The inference with `tensor_parallel_size=1` works normally.
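For reference, a minimal sketch of the failing configuration, assuming an `internlm2`-backed InternVL2 checkpoint; the model name and prompt are illustrative placeholders, not copied from the example script:

```python
from vllm import LLM, SamplingParams

# Minimal sketch, assuming the OpenGVLab/InternVL2-8B checkpoint; any
# InternVL2 model with an internlm2 backbone should hit the same path.
llm = LLM(
    model="OpenGVLab/InternVL2-8B",  # assumed model name
    trust_remote_code=True,
    tensor_parallel_size=2,          # broken; tensor_parallel_size=1 is fine
)

outputs = llm.generate(
    ["Describe the image."],
    SamplingParams(temperature=0.0, max_tokens=64),
)
print(outputs[0].outputs[0].text)  # garbage tokens with TP=2
```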
Outputs (corrupted generation with `tensor_parallel_size=2`)
Processed prompts: 100%|████████████████████████████████████████| 1/1 [00:01<00:00, 1.52s/it, est. speed input: 1192.65 toks/s, output: 26.96 toks/s]
1.
1.
1/2
3/2
1
for example
�2/定有了一个iSA诉快的队/纳厄/否化,4.
INFO 08-30 03:22:42 multiproc_worker_utils.py:136] Terminating local vLLM worker processes
(VllmWorkerProcess pid=9476) INFO 08-30 03:22:42 multiproc_worker_utils.py:237] Worker exiting
The root issue
- This is broken by the `split_qkv` function for the `internlm2` backbone, introduced in [Model] Add AWQ quantization support for InternVL2 model #7187 to make the model compatible with AWQ checkpoints. See the sketch below for the suspected failure mode.
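A hedged sketch of the suspected failure mode. The layout below follows InternLM2's fused `wqkv` convention; the argument names and the per-rank head counts are my assumption, not a quote of the vLLM code. The point is that the split must use the head counts of the local tensor-parallel shard, not the global ones; otherwise each rank slices the wrong q/k/v regions and generation degrades into the garbage shown above.

```python
import torch

def split_qkv(qkv: torch.Tensor, num_heads: int, num_kv_heads: int, head_dim: int):
    """Split InternLM2's fused wqkv output into q, k, v.

    Hypothetical sketch: `num_heads` / `num_kv_heads` must be the PER-RANK
    counts (global count // tensor_parallel_size). Using the global counts
    with tensor_parallel_size=2 misaligns the slices on every rank.
    """
    seq_len = qkv.shape[0]
    group = num_heads // num_kv_heads  # query heads per kv head
    # Fused layout: [seq, num_kv_heads, group + 2, head_dim]
    qkv = qkv.view(seq_len, num_kv_heads, group + 2, head_dim)
    q = qkv[:, :, :group, :].reshape(seq_len, -1)
    k = qkv[:, :, group, :].reshape(seq_len, -1)      # second-to-last slot
    v = qkv[:, :, group + 1, :].reshape(seq_len, -1)  # last slot
    return q, k, v
```

With `tensor_parallel_size=1` the global and per-rank counts coincide, which matches the observation that single-GPU inference is unaffected.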