Your current environment
Model Input Dumps
No response
🐛 Describe the bug
It seems that xgrammar does not support some structured output. The test code is:
```python
import json
import logging

from pydantic import BaseModel, conint

logger = logging.getLogger(__name__)


class Animals(BaseModel):
    location: str
    activity: str
    animals_seen: conint(ge=1, le=5)  # type: ignore  # Constrained integer type
    animals: list[str]


user_input = "I saw a puppy, a cat and a raccoon during my bike ride in the park"
messages = [
    {
        "role": "system",
        "content": "You are a helpful which converts user input to JSON object. Respond in JSON format.",
    },
    {
        "role": "user",
        "content": f"convert to JSON according to provided schema: '{user_input}'",
    },
]
logger.info(f"Sending Chat API request to {model_name}")
completion = client.client.chat.completions.create(
    model=model_name,
    messages=messages,
    temperature=0.1,
    max_tokens=250,
    extra_body=dict(
        guided_json=json.dumps(Animals.model_json_schema()),
        guided_decoding_backend="lm-format-enforcer",
    ),
)
assert completion is not None
logger.warning(f"{completion=}")
# Check that the output JSON has keys matching the schema; asserting on values
# is too brittle (e.g. "park" vs "the park").
assert set(json.loads(completion.choices[0].message.content).keys()) == set(
    Animals.model_fields.keys()
)
```
If I set `guided_decoding_backend` to `outlines` or `lm-format-enforcer`, the test passes. However, if I set it to `xgrammar`, the test fails. The completion is:
```
completion=ChatCompletion(id='chatcmpl-425b0ae7-02eb-467e-89bf-83080494182c', choices=[Choice(finish_reason='length', index=0, logprobs=None, message=ChatCompletionMessage(content='{\n "location": "park",\n "activity": "bike ride",\n "animals_seen": \n \t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t', refusal=None, role='assistant', audio=None, function_call=None, tool_calls=[]), stop_reason=None)], created=1736835123, model='dsp.llama-3.1-8b-instruct', object='chat.completion', service_tier=None, system_fingerprint=None, usage=CompletionUsage(completion_tokens=250, prompt_tokens=80, total_tokens=330, completion_tokens_details=None, prompt_tokens_details=None), prompt_logprobs=None)
```
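Note the failure mode in the dump above: after `"animals_seen":` the model emits only whitespace until `max_tokens` is hit (`finish_reason='length'`), so the content is not parseable JSON. A stdlib-only sketch of the check (the tab run is abbreviated here):

```python
import json

# Truncated completion content as returned with finish_reason='length';
# the long tab run after "animals_seen": is shortened for readability.
content = (
    '{\n "location": "park",\n "activity": "bike ride",\n "animals_seen": \n'
    + "\t" * 40
)

try:
    json.loads(content)
    parsed = True
except json.JSONDecodeError as exc:
    # The integer value was never emitted, so parsing fails mid-object.
    parsed = False
    print(f"invalid JSON: {exc.msg}")

assert not parsed
```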
I tested with llama3.1-8b-instruct, quantized to gpt-w8a8 with llm-compressor. The deployment arguments are:
```yaml
- '--tensor-parallel-size=1'
- '--max-num-batched-tokens=4096'
- '--enable-chunked-prefill'
- '--gpu-memory-utilization=0.96'
- '--enable-auto-tool-choice'
- '--tool-call-parser=llama3_json'
- '--chat-template=/mnt/models/tool_chat_template_llama3.1_json.jinja'
```
Thanks a lot!
Before submitting a new issue...
- Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.