-
Notifications
You must be signed in to change notification settings - Fork 1.8k
Docs: fix taint tracking query #20620
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for fixing!
docs/codeql/codeql-language-guides/analyzing-data-flow-in-python.rst
Outdated
Show resolved
Hide resolved
import RemoteToFileFlow::PathGraph | ||
module RemoteToFileFlow = TaintTracking::Global<RemoteToFileConfiguration>; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I normally put these two lines the other way round, so you define the module above where you import its submodule.
import RemoteToFileFlow::PathGraph | |
module RemoteToFileFlow = TaintTracking::Global<RemoteToFileConfiguration>; | |
module RemoteToFileFlow = TaintTracking::Global<RemoteToFileConfiguration>; | |
import RemoteToFileFlow::PathGraph |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did a quick search for Go and python and the import comes after the module in all the cases I looked at (where they're both in the same file).
Thanks for fixing this @sylwia-budzynska . I will look into the docs for the other languages. |
@sylwia-budzynska Actually, now I read the whole file, I see that it doesn't make sense to just fix this example. It's all been explained earlier on in terms of data flow nodes rather than path nodes. I think it would be better to keep most of it as it is, then at the end have an extra section on path problems which explains how to convert what you've already written into a path problem query, which shows you the whole path. What do you think? If you agree, I'm happy to take this over and do that consistently for all the languages. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In fact, I just found that we have good docs on making path queries, and that page links to all the language-specific "analyzing data flow" pages. I think the idea is to have the language-specific parts in separate pages and then explain the path query part once. So I don't think this example should be updated. However, I do think it would be good to have a short section at the end of each explaining that you can get information about the whole path and linking to the docs for that. I will do this in a new PR.
Here is my suggested approach: #20622 |
Currently, the results of this specific query show as
I was a bit worried that we don't have a taint tracking path query example for each language in the docs. That sounds like a good solution 👍 |
Well, if you are using global flow as part of something else (like filtering out results if a safe value flows there) then you would just use |
Makes sense. I'll close this PR then. I think adding path query examples to the global taint tracking in [language] docs would be very beneficial, especially for the times when teaching CodeQL to new users who might not know the differences between using a path query or global data flow without path. Thank you for taking the time to fix it! |
👋 Several query examples in the CodeQL docs for analyzing global dataflow seem to be wrong, and don't work as expected. E.g. in docs for Python the query:
Results of running the query:

This won’t show a path query, but a #select because:
@kind path-problem
is missingimport RemoteToFileFlow:PathGraph
RemoteToFileFlow:PathNode
types, notDataFlow::Node
where RemoteToFileFlow::flowPath(input, fileAccess)
instead ofRemoteToFileFlow::flow(input, fileAccess)
select fileAccess.getNode(), input, fileAccess, "This file access uses data from $@.", input, "user-controllable input."
Making the changes fixes the query to display like in screenshot:
This PR fixes this specific query. However, I noticed that very similar issues, like the ones I listed above, seem to exist in docs for other languages too, in the Analyzing data flow in [language] pages.