Skip to content

JSON API #1

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 4 commits into
base: main
Choose a base branch
from
Open

JSON API #1

wants to merge 4 commits into from

Conversation

Marcono1234
Copy link
Owner

No description provided.

Json.java Outdated
Comment on lines 175 to 176
var unescapedKey = Utils.unescape(
strKey.toCharArray(), 0, strKey.length());
Copy link
Owner Author

@Marcono1234 Marcono1234 Jun 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why does this unescape keys? This seems pretty unexpected to me and I am not sure if many users actually want that behavior. With this approach if they wanted a literal \ in their key they would manually have to do escaping themselves beforehand.

It is also inconsistent that this does not disallow ASCII control characters, while the JSON specification demands this for strings.

Copy link
Owner Author

@Marcono1234 Marcono1234 Jun 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might be a general issue with this API: Most existing JSON libraries differentiate between the JSON input data (e.g. String, Reader) and the parsed representation, e.g. a dedicated JsonArray, JsonObject, ... or List and Map.

This API here mixes this up where the parsed (or rather 'untyped') data also represents unparsed JSON data to some extent. This will likely cause confusion. I guess it is fine that it internally uses the unescaped representation, but it should not expose that to the user (unless JsonValue#toString() or similar is called), and neither should it require the user to provide already escaped values.

This becomes evident for example when you try to construct a JsonObject from JsonObject#members again:

JsonObject o = (JsonObject) Json.parse("{\"\\\\ test\": 1}");
// IllegalArgumentException: Illegal escape sequence
JsonObject.of(o.members());

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The parser should probably enforce a default nesting limit of JSON arrays and objects. See related google/gson#2588.

(See also https://mail.openjdk.org/pipermail/core-libs-dev/2025-May/146012.html)

return theNumber.orElseSet(() -> {
var str = toString();
// Check if integral (Java literal format)
boolean integerOnly = true;
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this convert -0 to a Double? It seems currently it would return a Long, losing the sign.

@Marcono1234 Marcono1234 changed the title Json api JSON API Jun 28, 2025
@Marcono1234
Copy link
Owner Author

Hello @naotoj and @justin-curtis-lu,
I have added here feedback for the JSON API you are currently working on. Hopefully it is helpful for you. Feel free to comment if you have any questions or add reactions to track which notes are already resolved (in case you want to adjust the implementation accordingly).

Copy link
Owner Author

@Marcono1234 Marcono1234 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hello @naotoj and @justin-curtis-lu,
I have updated this PR now with your latest changes, marked the comments which you have addressed as resolved and added a few additional comments.


Personally I think the current handling of unescaping and escaping is not ideal and possibly more complex than it has to be.

As a JSON library user, how often did you have to deal with escaping or unescaping in the past?
It is the job of the JSON library to do that. When reading a JSON document it performs unescaping, and when writing a JSON document it performs escaping. As a user you should not have to do either of it; it should be transparent to the user.

That is why I think the current API which requires users to provide already escaped values is unfamiliar and inconvenient. If you insist that users must provide already escaped values, then please at least provide a Json#escape(String) function or similar. You cannot expect that users additionally implement escaping themselves.

There might be cases where you need to have control over the escaped output, but those use cases are pretty rare. If you want to support them with the API, then in my opinion that should be achieved via an explicit API and it should not be the default case. For example have a JsonString#of which expects an unescaped value, and then a separate JsonString#ofEscaped or similar which expects an already escaped value1.

The implementation could also be simplified:

  • JsonStringImpl could have two fields:

    • value (returned by #value()): The unescaped represented value, e.g. for "some \" text" it would be some " text
    • jsonValue2 (returned by #toString()): The escaped JSON value, e.g. "some \" text"

    When the parser creates a JsonStringImpl, initially only jsonValue would be available (and the parser would have already validated that it is valid).
    When the user creates a JsonStringImpl (through JsonString#of) then initially only value would be available (no validation is necessary since any string value is allowed).
    The corresponding unavailable value would then be calculated on demand. E.g. JsonString.of("some \0 text").toString() would initialize jsonValue with "some \u0000 text" once toString() is called.

  • JsonObjectImpl would always store the unescaped values as keys, and escape them on demand when toString is called to produce JSON for the complete object. It would not provide any API to provide the escaped keys or entries separately.

That is only my personal opinion on this though. It would be good though if you also asked others for opinions on this, and it would probably also be good if the JSON API only gets added as Preview feature first (in case that wasn't the plan anyway) to find any potential pain points for users.


Anyway, this 'conversation' is rather one-sided at the moment anyway if you don't really engage with any of the comments here. So I might not look again at any further changes.

It is up to you what you make with this feedback. I hope it was useful nonetheless. And thanks for your work on adding an official JSON API to Java.

Footnotes

  1. But honestly I don't really think there is much demand for creating a JSON string from an already escaped value. If necessary you can still add this later.

  2. Or any other similar name.

@@ -105,6 +105,7 @@ public static JsonValue parse(String in) {
*/
public static JsonValue parse(char[] in) {
Objects.requireNonNull(in);
// Defensive copy on input. Ensure source is immutable.
return new JsonParser(Arrays.copyOf(in, in.length)).parseRoot();
Copy link
Owner Author

@Marcono1234 Marcono1234 Jul 4, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still, can't this be simplified by using clone?

Suggested change
return new JsonParser(Arrays.copyOf(in, in.length)).parseRoot();
return new JsonParser(in.clone()).parseRoot();

* {@return the {@code String} value represented by this {@code JsonString}
* surrounded by quotation marks} Any escaped characters in the original JSON
* string are converted to their unescaped form in the returned {@code String}.
* {@return the {@code String} value represented by this {@code JsonString}}
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The return value is not really the "value represented", that is what #value() returns, i.e. the value of "some \" text" is some " text.

case 'u' -> {
if (offset + 4 < doc.length) {
escapeLength = 4;
offset++; // Move to first char in sequence
c = codeUnit();
// Move to the last hex digit, since outer loop will increment offset
offset += 3;
c = switch (c) {
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably good to add a comment explaining why this double escaping is done here.

I am also not sure if this is really worth it. Why no just internally store JsonObjectImpl as Map<unescaped, JsonValue> instead of Map<escaped, JsonValue>. It seems that would greatly simplify this and the other code (and might also improve performance).

Then only JsonObject#toString would have to do escaping again.

}

private String source() {
// getSource throws on quotes, so bypass and re-insert
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does getSource refer to? Should this be getCompliantString?

if (offset + 4 < endOffset) {
c = codeUnit(doc, offset + 1);
length = 4;
c = switch (c) {
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar to #1 (comment), would be good to add a comment explaining why this performs (seemingly) double unescaping.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant