-
Notifications
You must be signed in to change notification settings - Fork 0
JSON API #1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
JSON API #1
Conversation
Json.java
Outdated
var unescapedKey = Utils.unescape( | ||
strKey.toCharArray(), 0, strKey.length()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why does this unescape keys? This seems pretty unexpected to me and I am not sure if many users actually want that behavior. With this approach if they wanted a literal \
in their key they would manually have to do escaping themselves beforehand.
It is also inconsistent that this does not disallow ASCII control characters, while the JSON specification demands this for strings.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This might be a general issue with this API: Most existing JSON libraries differentiate between the JSON input data (e.g. String
, Reader
) and the parsed representation, e.g. a dedicated JsonArray
, JsonObject
, ... or List
and Map
.
This API here mixes this up where the parsed (or rather 'untyped') data also represents unparsed JSON data to some extent. This will likely cause confusion. I guess it is fine that it internally uses the unescaped representation, but it should not expose that to the user (unless JsonValue#toString()
or similar is called), and neither should it require the user to provide already escaped values.
This becomes evident for example when you try to construct a JsonObject
from JsonObject#members
again:
JsonObject o = (JsonObject) Json.parse("{\"\\\\ test\": 1}");
// IllegalArgumentException: Illegal escape sequence
JsonObject.of(o.members());
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The parser should probably enforce a default nesting limit of JSON arrays and objects. See related google/gson#2588.
(See also https://mail.openjdk.org/pipermail/core-libs-dev/2025-May/146012.html)
return theNumber.orElseSet(() -> { | ||
var str = toString(); | ||
// Check if integral (Java literal format) | ||
boolean integerOnly = true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this convert -0
to a Double
? It seems currently it would return a Long
, losing the sign.
Hello @naotoj and @justin-curtis-lu, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hello @naotoj and @justin-curtis-lu,
I have updated this PR now with your latest changes, marked the comments which you have addressed as resolved and added a few additional comments.
Personally I think the current handling of unescaping and escaping is not ideal and possibly more complex than it has to be.
As a JSON library user, how often did you have to deal with escaping or unescaping in the past?
It is the job of the JSON library to do that. When reading a JSON document it performs unescaping, and when writing a JSON document it performs escaping. As a user you should not have to do either of it; it should be transparent to the user.
That is why I think the current API which requires users to provide already escaped values is unfamiliar and inconvenient. If you insist that users must provide already escaped values, then please at least provide a Json#escape(String)
function or similar. You cannot expect that users additionally implement escaping themselves.
There might be cases where you need to have control over the escaped output, but those use cases are pretty rare. If you want to support them with the API, then in my opinion that should be achieved via an explicit API and it should not be the default case. For example have a JsonString#of
which expects an unescaped value, and then a separate JsonString#ofEscaped
or similar which expects an already escaped value1.
The implementation could also be simplified:
-
JsonStringImpl
could have two fields:value
(returned by#value()
): The unescaped represented value, e.g. for"some \" text"
it would besome " text
jsonValue
2 (returned by#toString()
): The escaped JSON value, e.g."some \" text"
When the parser creates a
JsonStringImpl
, initially onlyjsonValue
would be available (and the parser would have already validated that it is valid).
When the user creates aJsonStringImpl
(throughJsonString#of
) then initially onlyvalue
would be available (no validation is necessary since any string value is allowed).
The corresponding unavailable value would then be calculated on demand. E.g.JsonString.of("some \0 text").toString()
would initializejsonValue
with"some \u0000 text"
oncetoString()
is called. -
JsonObjectImpl
would always store the unescaped values as keys, and escape them on demand whentoString
is called to produce JSON for the complete object. It would not provide any API to provide the escaped keys or entries separately.
That is only my personal opinion on this though. It would be good though if you also asked others for opinions on this, and it would probably also be good if the JSON API only gets added as Preview feature first (in case that wasn't the plan anyway) to find any potential pain points for users.
Anyway, this 'conversation' is rather one-sided at the moment anyway if you don't really engage with any of the comments here. So I might not look again at any further changes.
It is up to you what you make with this feedback. I hope it was useful nonetheless. And thanks for your work on adding an official JSON API to Java.
Footnotes
@@ -105,6 +105,7 @@ public static JsonValue parse(String in) { | |||
*/ | |||
public static JsonValue parse(char[] in) { | |||
Objects.requireNonNull(in); | |||
// Defensive copy on input. Ensure source is immutable. | |||
return new JsonParser(Arrays.copyOf(in, in.length)).parseRoot(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Still, can't this be simplified by using clone
?
return new JsonParser(Arrays.copyOf(in, in.length)).parseRoot(); | |
return new JsonParser(in.clone()).parseRoot(); |
* {@return the {@code String} value represented by this {@code JsonString} | ||
* surrounded by quotation marks} Any escaped characters in the original JSON | ||
* string are converted to their unescaped form in the returned {@code String}. | ||
* {@return the {@code String} value represented by this {@code JsonString}} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The return value is not really the "value represented", that is what #value()
returns, i.e. the value of "some \" text"
is some " text
.
case 'u' -> { | ||
if (offset + 4 < doc.length) { | ||
escapeLength = 4; | ||
offset++; // Move to first char in sequence | ||
c = codeUnit(); | ||
// Move to the last hex digit, since outer loop will increment offset | ||
offset += 3; | ||
c = switch (c) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably good to add a comment explaining why this double escaping is done here.
I am also not sure if this is really worth it. Why no just internally store JsonObjectImpl
as Map<unescaped, JsonValue>
instead of Map<escaped, JsonValue>
. It seems that would greatly simplify this and the other code (and might also improve performance).
Then only JsonObject#toString
would have to do escaping again.
} | ||
|
||
private String source() { | ||
// getSource throws on quotes, so bypass and re-insert |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What does getSource
refer to? Should this be getCompliantString
?
if (offset + 4 < endOffset) { | ||
c = codeUnit(doc, offset + 1); | ||
length = 4; | ||
c = switch (c) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar to #1 (comment), would be good to add a comment explaining why this performs (seemingly) double unescaping.
No description provided.