Skip to content

Conversation

@hXtreme
Copy link
Contributor

@hXtreme hXtreme commented Feb 27, 2025

Attempts to fix the jupyter_server/issues/1503 issue

For details please see the linked issue. TLDR is that if the log_message contents have the literal '%' those are getting converted to '%%' when no formatting args are specified. When HTTPError.__str__ is called, log_message is always %-formatted even when no formatting args are specified, and the '%' literal replacement is likely to avoid issues here.

This pr performs %-formatting of log_message only when required (args are specified) to avoid having to do the literal replacement.

While not completely a parallel example this is more or less what python's logging module does when creating a string out of a LogRecord:

https://github.com/python/cpython/blob/e06bebb87e1b33f7251196e1ddb566f528c3fc98/Lib/logging/__init__.py#L391-L402

If this is intentionally done for some special reason such as security then maybe this issue could be tackled solely on the jupyter side.

related to: #1393

cc: @ptch314

Attempts to fix the [jupyter_server/issues/1503](jupyter-server/jupyter_server#1503) issue
@hXtreme
Copy link
Contributor Author

hXtreme commented Feb 27, 2025

It would be great if someone could point me to instruction on how to run tests/lints on the code :)

@bdarnell
Copy link
Member

I'm not sure we have this documented, but the simplest way is to install tox and run something like tox -e lint,docs,py313-full (For simple changes like this you can usually omit docs and -full, and feel free to change py313 to a different version if you have something else installed)

@hXtreme
Copy link
Contributor Author

hXtreme commented Feb 28, 2025

@bdarnell I've updated the pr to pass tox -e py312-full

some tests were skipped with the following output:

python -bb -m tornado.test
Ran 1209 tests in 15.091s

OK (skipped=3)
Some tests were skipped because: needs fix, no testable future
imports, py312 has its own check for test case returns
python -O -m tornado.test
Ran 1209 tests in 17.613s

OK (skipped=3)
Some tests were skipped because: needs fix, no testable future
imports, py312 has its own check for test case returns
python -m tornado.test --httpclient=tornado.curl_httpclient.CurlAsyncHTTPClient
Ran 1209 tests in 21.064s

OK (skipped=4)
Some tests were skipped because: curl client accepts invalid headers,
needs fix, no testable future imports, py312 has its own check for
test case returns
python -m tornado.test --resolver=tornado.platform.caresresolver.CaresResolver
Ran 1209 tests in 17.034s

OK (skipped=4)
Some tests were skipped because: localhost does not resolve to ipv4,
needs fix, no testable future imports, py312 has its own check for
test case returns

I couldn't run docs test

@hXtreme
Copy link
Contributor Author

hXtreme commented Mar 6, 2025

The pr should be ready to review, please let me know if you have any feedback or would like me to make additional changes!

Thank you :)

@bdarnell
Copy link
Member

This looks good and is probably the way we should have done it from the beginning. My only hesitation is that it's a little backwards-incompatible. If someone were using log_message and args in their own code they'd have to be aware of this change. One place this comes up is if they override log_exception (copying the default implementation as a starting point) to send things to a different logger.

What is jupyter doing with these exceptions that's showing unprocessed HTTPError.log_message to users?

@hXtreme
Copy link
Contributor Author

hXtreme commented Mar 12, 2025

What is jupyter doing with these exceptions that's showing unprocessed HTTPError.log_message to users?

I'll defer this to @ptch314, I am too new a contributor to jupyter to be able to answer this fully; I'll also ask someone that's a lot more involved with jupyter to answer this.
(technically this contribution to tornado is my first contribution to jupyter :))

Here's the gist of what I understand, a user of jupyter server wants to return a url encoded url in certain error cases, they noticed that the '%' chars in the encoded url gets converted to '%%' which causes the string that's returned to not be a valid url-encoded url. (wow that was a lot of 'url's)

In more lower-level detail, jupyter uses f-strings to construct the log message that's passed into HTTPError's constructor. Jupyter server could use '%' formatted string to workaround this issue. However, given that it is a bit of an unexpected behaviour for a message content to get updated, a more systematic solution from tornado would be nicer.


For backwards compatibility concerns, if I'm understanding this correctly if downstream someone is overriding log_exception and using log_message and args in their application, then after this change is introduced they might see regression of issue #1393 .
I think I have a work around for that, I'll add a commit to try and maintain this compatibility.

@hXtreme
Copy link
Contributor Author

hXtreme commented Mar 12, 2025

Let me know if this approach of using a log_message property on HTTPError for providing backwards compatibility looks good to you. I would also appreciate if you could give me some guidance on the following:

  • Updates to docstring for HTTPError: should I change anything in either the class docstring or for the log_message property?
  • Do we want to add any deprecation notice/warnings on the HTTPError.log_message property asking users to prefer using get_message function?

I'm currently away from any machine where I can test my most recent changes but will be able to test them by Monday 17th.

@ptch314
Copy link

ptch314 commented Mar 13, 2025

What is jupyter doing with these exceptions that's showing unprocessed HTTPError.log_message to users?

I'll defer this to @ptch314, I am too new a contributor to jupyter to be able to answer this fully

Thanks @hXtreme, I think you have given a good summary of our jupyter use case. For clarification, we are simply an outside user of jupyter. The issue we ran into is that jupyter does not properly pass error messages we create that include % symbols. I think we as well as jupyter expect error message strings (that may contain %s) to be passed through as-is, but the code in tornado over-processes the log_message and replaces %s with %%s when it does not seem necessary.

@ojarjur
Copy link

ojarjur commented Mar 13, 2025

What is jupyter doing with these exceptions that's showing unprocessed HTTPError.log_message to users?

Here's my understanding of it:

Jupyter defines a base class for REST API handlers.

That base class overrides the write_error method and converts the HTTPError into a JSON object that is written to the HTTP response body.

The log_message from the HTTPError gets stored inside of that JSON object in a field named message

When the JupyterLab UI receives the HTTP response from one of these API handlers with an error status code, it displays the message field from the response body as-is to the user via a pop-up dialog.


Additional background:

The place we first hit this was here.

That's invoked from here, which is a handler that receives requests from the JupyterLab UI running in a browser.

The base class of that handler is the APIHandler referenced above.

@ojarjur
Copy link

ojarjur commented Mar 13, 2025

@hXtreme

Let me know if this approach of using a log_message property on HTTPError for providing backwards compatibility looks good to you.

This looks reasonable to me, but it means we also need to change Jupyter in order to fix the original bug.

The APIHandler in Jupyter uses the log_message field/property directly, so maintaining backwards compatibility on that property means that the Jupyter server codebase needs to change to use the new get_message method instead.

I assume that is what you were thinking, but I wanted to make sure we're all on the same page.

@hXtreme
Copy link
Contributor Author

hXtreme commented Mar 18, 2025

@bdarnell I've rerun the tests, lint and py312 passes without any errors


@ojarjur: yes. If this pr gets merged and released then on the jupyter side I will create a pr to use the get_message method instead. I do have some free time so could probably keep the pr ready.

@bdarnell
Copy link
Member

For backwards compatibility concerns, if I'm understanding this correctly if downstream someone is overriding log_exception and using log_message and args in their application, then after this change is introduced they might see regression of issue #1393 .

Yes, exactly. (I'm not sure if anyone is actually doing this - it's possible that anyone using HTTPError.log_message is doing so without using args, just like jupyter is)

That base class overrides the write_error method and converts the HTTPError into a JSON object that is written to the HTTP response body.

The log_message from the HTTPError gets stored inside of that JSON object in a field named message

Jupyter is using log_message without args, which is why it is not affected by #1393. But it would have missing data if any HTTPErrors were to get raised that attempted to use args. (Tornado generates some HTTPErrors that use args internally, but currently only in StaticFileHandler, so they're not going to reach your base class write_error).

This looks reasonable to me, but it means we also need to change Jupyter in order to fix the original bug.

Yes, and this is unfortunate. The change looks good and is fully backwards-compatible, but if it still requires changes in jupyter to use the new interfaces, is it worth the complexity? The alternative would be a fix solely on jupyter's side to do what log_exception does and format with args unconditionally: reply["message"] = (e.log_message % e.args) if e.log_message else message (and then I think we would still want a docs-only change to Tornado to document what's going on here). Is it worth going through a new interface (and deprecating the old one) to slightly clean up this error class (and, significantly, make it work more like logging.LogRecord)?

@bdarnell
Copy link
Member

With the approval of PEP 750 Template Strings, I think a change to HTTPError is warranted, to encapsulate the message formatting and allow template strings to eventually replace the use of a printf-style string and *args.

I'm not ready to make any final changes here since template strings aren't available yet, but I think this is a good start. get_message will let us swap out the % operation in a backwards-compatible way. Eventually the log_message property will probably be deprecated but I'll wait on that until template strings have landed and we settle on the replacement interface for format strings and arguments.

@bdarnell bdarnell merged commit 5e4fff4 into tornadoweb:master Apr 12, 2025
15 checks passed
@hXtreme hXtreme deleted the jupyter-1503 branch April 17, 2025 15:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants