Skip to content

Conversation

@kuntal1461
Copy link
Contributor

Summary

This PR contributes to issue #326 by improving the API design of org.springframework.ai.image:
Introduces a new ImageResponseFormat enum for strongly typed response formats.
Aligns ImageOptions across providers (OpenAI, Azure OpenAI, Stability, ZhiPuAi).
Updates tests accordingly.
Enhances documentation to reflect the new type-safe format property.
Ensures observation and response metadata consistency.
This aligns image models with improved usability and type safety.

Testing

  • ./mvnw -q -DskipITs -DskipNpm -Dgpg.skip=true -Dmaven.javadoc.skip=true test

Copy link
Member

@ericbottard ericbottard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While I would typically be in favor of such strongly typed changes, I wonder if this is really desirable.
Do all providers really agree on values? For example, you changed one test value from base64 to b64_json. Was it intentional? Or was it an oversight exactly because a strong type gives you confidence?

Moreover, do all providers really support (and will they always) support all values of the enum?

.width(1920)
.style("sketch")
.responseFormat("base64")
.responseFormat(ImageResponseFormat.B64_JSON)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like a change of semantics here

@Override
public String getResponseFormat() {
return this.responseFormat;
return (this.responseFormat != null) ? this.responseFormat.getValue() : null;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Assuming we carry through with this change (see my overall comment on the benefits of this PR), could we try to push the null check inside the enum (for example having ImageResponseFormat.NULL a value, and setting all options to that)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @ericbottard I wouldn’t introduce a ImageResponseFormat.NULL sentinel. Today, if the caller never sets a response format we just return null, and Jackson simply omits response_format from the payload—which is exactly what the provider expects. If we replaced that with an enum constant whose getValue() returns "null", we’d end up serializing response_format: "null" unless every caller remembered to special-case the sentinel, which is both error-prone and not part of the provider’s spec. Keeping the null check where we materialize the request is much clearer; if we want to tighten the API surface, a better direction would be to expose an Optional rather than distort the enum domain.

Can you confirm if this approach matches what you’d expect?

@kuntal1461
Copy link
Contributor Author

Hi @ericbottard Thanks for calling that out—yes, the base64 → b64_json change was intentional. OpenAI (and Azure’s wrapper) return the literal b64_json token, so the enum would never round‑trip if we left the old string. The enum isn’t asserting that every provider supports every value; each integration only ever sets the members its API understands (e.g. Stability AI uses the PNG/JSON entries, OpenAI uses URL/b64_json). I do share the concern about future formats—if someone ships a new token we’ll need to extend the enum quickly—but the stronger typing gives us an earlier signal when that happens.

OpenAI’s image API docs list the response_format choices explicitly: url or b64_json. You can see it here (scroll to the request parameters table): https://platform.openai.com/docs/api-reference/images/create

Azure OpenAI mirrors OpenAI’s parameters. The CreateImage endpoint documents response_format taking url or b64_json as well: https://learn.microsoft.com/azure/ai-services/openai/reference#image-generation-create.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants