[Feature]: Phi-3 vision -- allow multiple images as Microsoft shows can be done

### 🚀 The feature, motivation and pitch

i.e. instead of this:
https://github.com/vllm-project/vllm/blob/main/vllm/entrypoints/openai/serving_chat.py#L138-L140

allow multiple images.

Idea is that many models trained for 1 image actually work well with multiple, and blocking usage inhibits exploration of what models are capable of.

E.g. would be good for microsoft/Phi-3-vision-128k-instruct

In HF transformers, Phi-3 handles multiple images just fine.  I've used it just fine as well.

It's also an officially supported task from Microsoft:

https://github.com/microsoft/Phi-3CookBook/blob/main/md/03.Inference/Vision_Inference.md#3-comparison-of-multiple-images

### Alternatives

None

### Additional context

```
openai.BadRequestError: Error code: 400 - {'object': 'error', 'message': "Multiple 'image_url' input is currently not supported.", 'type': 'BadRequestError', 'param': None, 'code': 400}

```

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

[Feature]: Phi-3 vision -- allow multiple images as Microsoft shows can be done #5820

🚀 The feature, motivation and pitch

Alternatives

Additional context

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Uh oh!

[Feature]: Phi-3 vision -- allow multiple images as Microsoft shows can be done #5820

Description

🚀 The feature, motivation and pitch

Alternatives

Additional context

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions