Description
Your current environment

The output of `python collect_env.py`:

```text
Your output of `python collect_env.py` here
```
🐛 Describe the bug
I'm following the examples exactly as given in vLLM's structured-output example code, found here: https://github.com/vllm-project/vllm/blob/main/examples/online_serving/openai_chat_completion_structured_outputs.py, and getting errors in many of them. Using `response_format` has also produced many errors, which have varied from release v0.6.3.post1 through v0.8.1; I've tested every version between those two with Llama 3.3 and found inconsistencies. The unit tests for guided generation do not seem robust, especially given that the examples provided in the examples directory fail when run without modification.

The issues start with the following code in openai_chat_completion_structured_outputs.py; all subsequent examples in that file also throw errors:
```python
from enum import Enum

from openai import OpenAI
from pydantic import BaseModel

# Client setup as in the earlier part of the example file
client = OpenAI(base_url="http://localhost:8000/v1", api_key="-")


# Guided decoding by JSON using Pydantic schema
class CarType(str, Enum):
    sedan = "sedan"
    suv = "SUV"
    truck = "Truck"
    coupe = "Coupe"


class CarDescription(BaseModel):
    brand: str
    model: str
    car_type: CarType


json_schema = CarDescription.model_json_schema()

prompt = ("Generate a JSON with the brand, model and car_type of"
          "the most iconic car from the 90's")
completion = client.chat.completions.create(
    model="meta-llama/Llama-3.3-70B-Instruct",
    messages=[{
        "role": "user",
        "content": prompt,
    }],
    extra_body={"guided_json": json_schema},
)
print(completion.choices[0].message.content)
```
The error I get is:

```text
BadRequestError: Error code: 400 - {'object': 'error', 'message': 'The provided JSON schema contains features not supported by xgrammer.', 'type': 'BadRequestError', 'param': None, 'code': 400}
```
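For context on what the server might be objecting to: the schema Pydantic generates for this model uses `$defs`, `$ref`, and `enum`, and presumably one of those is the feature xgrammar rejects. A minimal local sketch (assuming Pydantic v2; `collect_keys` is a hypothetical helper, not a vLLM API) that dumps every key appearing in the generated schema, to help narrow down the unsupported feature:

```python
from enum import Enum

from pydantic import BaseModel


class CarType(str, Enum):
    sedan = "sedan"
    suv = "SUV"
    truck = "Truck"
    coupe = "Coupe"


class CarDescription(BaseModel):
    brand: str
    model: str
    car_type: CarType


def collect_keys(node, found=None):
    """Recursively collect every dict key in the schema (JSON Schema
    keywords plus property/definition names)."""
    if found is None:
        found = set()
    if isinstance(node, dict):
        for key, value in node.items():
            found.add(key)
            collect_keys(value, found)
    elif isinstance(node, list):
        for item in node:
            collect_keys(item, found)
    return found


keywords = collect_keys(CarDescription.model_json_schema())
print(sorted(keywords))
```

Running this shows the schema relies on `$defs`/`$ref` indirection and an `enum` for the `CarType` field, which are the obvious candidates for the backend's rejection.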
As mentioned previously, I'm unable to run any of the subsequent examples in /examples/online_serving/openai_chat_completion_structured_outputs.py without receiving some sort of error; those are "Guided decoding by Grammar" and "Extra backend options".

I'm also running into many issues using `response_format`. For example, when running this specific code from the OpenAI cookbook: https://github.com/openai/openai-cookbook/blob/main/examples/Leveraging_model_distillation_to_fine-tune_a_model.ipynb, where `response_format` is defined as:
```python
response_format = {
    "type": "json_schema",
    "json_schema": {
        "name": "grape-variety",
        "schema": {
            "type": "object",
            "properties": {
                "variety": {
                    "type": "string",
                    "enum": varieties.tolist()
                }
            },
            "additionalProperties": False,
            "required": ["variety"],
        },
        "strict": True
    }
}
```
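As a possible workaround for this case: since the schema only constrains a single field to a fixed set of strings, the same constraint can be expressed with vLLM's `guided_choice` extra-body parameter, sidestepping the enum-in-JSON-schema path entirely (at the cost of getting a bare string back instead of a JSON object). A sketch, with a hypothetical list standing in for `varieties.tolist()` from the notebook:

```python
# Stand-in for varieties.tolist() from the cookbook notebook.
varieties = ["Chardonnay", "Pinot Noir", "Riesling"]

# Constrain the completion to exactly one of the allowed strings.
extra_body = {"guided_choice": varieties}

# The request would then be (not executed here):
# completion = client.chat.completions.create(
#     model="meta-llama/Llama-3.3-70B-Instruct",
#     messages=[{"role": "user", "content": prompt}],
#     extra_body=extra_body,
# )
print(extra_body)
```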
This results in the following error on v0.8.1 when running the cell which contains:

```python
answer = call_model('gpt-4o', generate_prompt(df_france_subset.iloc[0], varieties))
answer
```

(note: `gpt-4o` was replaced with "meta-llama/Llama-3.3-70B-Instruct")

```text
BadRequestError: Error code: 400 - {'object': 'error', 'message': 'The provided JSON schema contains features not supported by xgrammar.', 'type': 'BadRequestError', 'param': None, 'code': 400}
```
And it results in the following, different, error on v0.7.3:

```text
BadRequestError: Error code: 400 - {'object': 'error', 'message': "[{'type': 'extra_forbidden', 'loc': ('body', 'metadata'), 'msg': 'Extra inputs are not permitted', 'input': {'distillation': 'wine-distillation'}}, {'type': 'extra_forbidden', 'loc': ('body', 'store'), 'msg': 'Extra inputs are not permitted', 'input': True}]", 'type': 'BadRequestError', 'param': None, 'code': 400}
```
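The v0.7.3 traceback points at OpenAI-only request fields (`metadata`, `store`) that the notebook passes and that vLLM's request schema rejected at the time. A hypothetical wrapper that drops those fields before sending the request (the field list is an assumption based on the error above, not an exhaustive set):

```python
# OpenAI-specific request fields that the v0.7.3 error flags as forbidden.
UNSUPPORTED_FIELDS = {"metadata", "store"}


def strip_unsupported(kwargs: dict) -> dict:
    """Return a copy of the request kwargs without OpenAI-only fields."""
    return {k: v for k, v in kwargs.items() if k not in UNSUPPORTED_FIELDS}


request = {
    "model": "meta-llama/Llama-3.3-70B-Instruct",
    "messages": [{"role": "user", "content": "..."}],
    "store": True,
    "metadata": {"distillation": "wine-distillation"},
}
print(sorted(strip_unsupported(request)))  # → ['messages', 'model']
```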
Edit: I'm also seeing issues with some of the examples found in: https://docs.vllm.ai/en/latest/features/structured_outputs.html

This example, which used to work in v0.7.3, now returns an error:
```python
from pydantic import BaseModel
from openai import OpenAI


class Info(BaseModel):
    name: str
    age: int


client = OpenAI(base_url="http://0.0.0.0:8000/v1", api_key="dummy")
completion = client.beta.chat.completions.parse(
    model="meta-llama/Llama-3.1-8B-Instruct",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "My name is Cameron, I'm 28. What's my name and age?"},
    ],
    response_format=Info,
    extra_body=dict(guided_decoding_backend="outlines"),
)
message = completion.choices[0].message
print(message)
assert message.parsed
print("Name:", message.parsed.name)
print("Age:", message.parsed.age)
```
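For reference, the schema this example asks the backend to enforce is about as simple as structured output gets, with plain string/integer properties and no `enum` or `$ref`, which makes the regression surprising. A local check of what `parse` sends (assuming Pydantic v2, no server needed):

```python
from pydantic import BaseModel


class Info(BaseModel):
    name: str
    age: int


# This is the JSON schema that response_format=Info is built from.
schema = Info.model_json_schema()
print(schema)
```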
Before submitting a new issue...

- [x] Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.