[SYCL RTC] Introduce --auto-pch support #20226
rocking!
LGTM
cc @tahonermann for awareness.
I've sketched the changes to make the cache persistent:

    llvm::StringRef PrecompiledPreamble::memoryContents() const {
      return Storage->memoryContents();
    }

and the SYCL RTC code would do something like this:

    if (!llvm::sys::fs::exists(PCHPath)) {
      auto PCHPreamble = PrecompiledPreamble::Build(...);
      raw_fd_ostream{PCHPath, EC} << PCHPreamble->memoryContents();
    }
    adjustInvocation(PCHPath); // Impl mostly copy-pasted from PrecompiledPreamble.
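The build-if-missing pattern in that sketch can be tried outside of clang. Below is a minimal, standalone illustration that assumes the PCH is just an opaque byte blob; `load_or_build` and the `build` callback are hypothetical names standing in for `PrecompiledPreamble::Build(...)` plus `memoryContents()`, and the write-then-rename step is an addition of this sketch, not something the PR does.

```cpp
#include <filesystem>
#include <fstream>
#include <functional>
#include <ios>
#include <iterator>
#include <string>

namespace fs = std::filesystem;

// Hypothetical helper: returns the blob stored at `path`, calling `build`
// (a stand-in for the real PCH generation) only when the file is missing.
std::string load_or_build(const fs::path &path,
                          const std::function<std::string()> &build) {
  if (!fs::exists(path)) {
    std::string blob = build();
    // Assumption of this sketch: write to a temporary file and rename it,
    // so a concurrent reader never sees a partially written file.
    fs::path tmp = path;
    tmp += ".tmp";
    {
      std::ofstream out(tmp, std::ios::binary | std::ios::trunc);
      out.write(blob.data(), static_cast<std::streamsize>(blob.size()));
    }
    fs::rename(tmp, path);
    return blob;
  }
  // Cache hit: read the stored bytes back unchanged.
  std::ifstream in(path, std::ios::binary);
  return std::string(std::istreambuf_iterator<char>(in),
                     std::istreambuf_iterator<char>());
}
```

A real persistent cache would also have to decide when a stored PCH is stale (changed headers, different compiler build), which this sketch does not attempt.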
@gmlueck, your approval is the only remaining blocker.
@premanandrao should review this. This seems like useful functionality we should consider upstreaming if we can demonstrate it to be a robust solution.
This looks nice and clean. I noted one FIXME comment that should ideally be addressed/removed. Otherwise, this looks good to me.
- No persistency between invocations
- Currently there is no eviction mechanism, so application is expected to use
the option only when number of preambles is limited.
This isn't an immediate concern, but future support for C++20 modules might require changes to include module imports in the preamble. Module imports don't look like preprocessor control lines, but they are. The language has rules that prevent use of macros that expand to import declarations so that recognition of them isn't dependent on preprocessing.
import std; // Named module import
import <vector>; // Header unit import
Cool! This seems like a great feature. I have some comments / questions on the documentation and exact behavior, though. See below.
Enable auto-detection of the preamble and use it as a pre-compiled header to
speed up subsequent compilations of TUs matching the preamble/compilation
options. Example of the code that can benefit from this:
I think we need some more description here in order for people to understand what this does. I suggest something like:

Automatically create and use a precompiled header (PCH) to speed up compilation. The first time this option is passed, the compiler finds the initial set of `#include` directives in the compiled source string (the preamble) and creates a PCH from these header files. On subsequent compilations, if the compiled source string has the same preamble, the PCH is used instead of the header files, which speeds up compilation. If the compiled source string has a different preamble, a new PCH is generated, and that PCH can also be used to speed up subsequent compilations. These PCH files are stored internally in memory, so they do not persist from one execution of the application to the next.

The preamble ends with the first statement that is not a preprocessing directive. For example, in the code below, the preamble ends immediately before the `namespace syclext =` statement because this is the first statement that is not a preprocessing directive.

[Example here. I suggest an example that has a `#define` before the `#include` to illustrate that this is legal. For example, you could `#define SYCL_SIMPLE_SWIZZLES`.]

The compiler uses the following factors when deciding whether a previously generated PCH can be used:
- The preamble must exactly match (including whitespace and comments).
- The compilation options must match (including the same order).

There are also certain restrictions that the user must avoid:
- The content of each header file in the preamble must not change from one compilation to another.
- The header files in the preamble must not use the `__DATE__` or `__TIME__` macros.

I guessed at some of the details above like whitespace and the order of options. If I guessed wrong, please correct.
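To make the proposed matching rules concrete, here is a small sketch of an in-memory cache keyed on the exact preamble text and the ordered option list. It only illustrates the rules as guessed in this comment (byte-exact preamble, order-sensitive options); `PchKey`, `PchCache`, and the `build` callback are hypothetical names and do not come from the PR.

```cpp
#include <cstddef>
#include <functional>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical cache key: exact preamble bytes plus options in their given order.
using PchKey = std::pair<std::string, std::vector<std::string>>;

class PchCache {
public:
  // Returns the cached PCH for (preamble, options), building it on a miss.
  // `build` stands in for the real PCH generation step.
  const std::string &get(const std::string &preamble,
                         const std::vector<std::string> &options,
                         const std::function<std::string()> &build) {
    PchKey key{preamble, options};
    auto it = cache_.find(key);
    if (it == cache_.end())
      it = cache_.emplace(std::move(key), build()).first;
    return it->second;
  }

  std::size_t size() const { return cache_.size(); }

private:
  // No eviction: entries live until the cache object is destroyed.
  std::map<PchKey, std::string> cache_;
};
```

With this keying, a preamble that differs only in whitespace, or the same options passed in a different order, produces a separate entry, which is why the unbounded growth noted in the limitations matters.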
// Auto-detected preamble ends in the middle of `#else` and would fail to compile.
void foo() {}
#endif
----
This limitation isn't clear to me. Does the limitation exist whenever there is a `#if` / `#endif`? Or, is it a limitation because the preamble ends inside the body of an `#if` / `#endif`?

How is this handled? Does the user get an obvious error, so they know it's related to `--auto-pch`? Can we instead make this work by ending the preamble at the `#if` that encloses the non-directive statement?
Seems the upstream preamble support just works for that, probably

llvm/clang/include/clang/Lex/PreprocessorOptions.h
Lines 137 to 143 in 2dedcee

    /// True indicates that a preamble is being generated.
    ///
    /// When the lexer is done, one of the things that need to be preserved is the
    /// conditional #if stack, so the ASTWriter/ASTReader can save/restore it when
    /// processing the rest of the file. Similarly, we track an unterminated
    /// #pragma assume_nonnull.
    bool GeneratePreamble = false;
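To illustrate what "preserving the conditional `#if` stack" means for a preamble that ends inside an `#if`/`#else` block, here is a deliberately naive, line-based sketch. It is not how clang's lexer computes the preamble: it ignores block comments, line continuations, and many other details, and `PreambleInfo` and `scan_preamble` are hypothetical names used only here.

```cpp
#include <algorithm>
#include <cstddef>
#include <sstream>
#include <string>

struct PreambleInfo {
  std::size_t length = 0;     // Bytes of source that belong to the preamble.
  int open_conditionals = 0;  // #if/#ifdef/#ifndef not yet closed by #endif.
};

// Naive scan: the preamble is the leading run of blank lines, `//` comment
// lines, and preprocessor directives. The depth of still-open conditionals
// at its end is the state that would have to be saved and restored.
PreambleInfo scan_preamble(const std::string &src) {
  PreambleInfo info;
  std::istringstream in(src);
  std::string line;
  std::size_t offset = 0;
  int depth = 0;
  while (std::getline(in, line)) {
    std::size_t first = line.find_first_not_of(" \t");
    bool blank = first == std::string::npos;
    bool comment = !blank && line.compare(first, 2, "//") == 0;
    bool directive = !blank && line[first] == '#';
    if (!blank && !comment && !directive)
      break; // First non-directive line ends the preamble.
    if (directive) {
      std::size_t name = line.find_first_not_of(" \t", first + 1);
      if (name != std::string::npos) {
        if (line.compare(name, 2, "if") == 0) // #if, #ifdef, #ifndef
          ++depth;
        else if (line.compare(name, 5, "endif") == 0)
          --depth;
      }
    }
    offset += line.size() + 1; // +1 for the newline consumed by getline.
    info.length = std::min(offset, src.size());
    info.open_conditionals = depth;
  }
  return info;
}
```

For the documentation example above, such a scan would stop right after `#else` with one conditional still open; the quoted `GeneratePreamble` comment suggests upstream records that state so the remainder of the file can be processed correctly.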
* No support (including not reporting any errors) for `+__DATE__+`/`+__TIME__+`
macros inside auto-detected preamble (transitively in regards to the
includes).
What does "no support" mean? Does this generate an error? If so, does the error message tell the user it's related to `--auto-pch`? Does it generate wrong code? Does it cache the value of the date / time in the PCH and use the cached values?
I planned that as UB at this stage.
Silently generating bad code is not a good option. I think any of these would be acceptable:
- Issue an error and abort compilation with a clear message.
- Cache the value of the date / time in the PCH and use that in subsequent compilations that use the PCH.
- Do not generate the PCH if it contains these macros, optionally issuing a warning message.
I saw that you requested my review, but this comment is not resolved. It looks like clang diagnoses a warning if the PCH contains these macros. Perhaps another option is to implicitly pass `-Werror=pch-date-time`, which would cause an obvious failure if these macros are used when compiling with `--auto-pch`.
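A minimal sketch of that suggestion, assuming the build options are available as a list of strings before they reach the compiler. The helper name `with_pch_date_time_error` is hypothetical; only the two flags themselves come from this thread.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical helper: when --auto-pch is requested, also promote the
// pch-date-time warning to an error so that __DATE__/__TIME__ in the
// preamble fails loudly instead of being silently baked into the PCH.
std::vector<std::string>
with_pch_date_time_error(std::vector<std::string> opts) {
  const std::string flag = "-Werror=pch-date-time";
  bool auto_pch =
      std::find(opts.begin(), opts.end(), "--auto-pch") != opts.end();
  bool present = std::find(opts.begin(), opts.end(), flag) != opts.end();
  if (auto_pch && !present)
    opts.push_back(flag);
  return opts;
}
```

Options without `--auto-pch` pass through unchanged, so regular compilations keep their current behavior.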
@tahonermann I assume your team is the one investigating this? I think this needs to be fixed before we can make a release with the |
Compilation of `#include <sycl/sycl.hpp>` is slow, and that's especially problematic for SYCL RTC (run-time compilation). One way to overcome this is fine-grained includes, which are being pursued separately. Another way is to employ clang's precompiled headers support, which is what this PR does. The two approaches can be combined, and this PR adds `test-e2e/PerformanceTests/KernelCompiler/auto-pch.cpp`, which gives some idea of the PCH impact. The test shows PCH benefits when compiling some of the fine-grained includes on top of the absolute minimum required to compile SYCL RTC's "Hello world". From one of the CI runs:

It misses the `sycl/sycl.hpp` line because that currently crashes the FE when reading the generated PCH; the crash is being investigated/fixed separately.

Implementation-wise, I'm reusing the existing upstream `clang::PrecompiledPreamble` with one minor modification. It seems that `PrecompiledPreamble`'s main usage is for things like `clangd`, so it ignores errors in the code. I've modified it so that those errors break PCH generation the same way they would break a normal compilation. I'm also not sure if we'd want that long-term, because it seems that making such "auto-pch" persistent would deviate from the upstream version of `PrecompiledPreamble` even more. I can imagine that in the near future we'd need to "fork" it into a separate utility. Still, it seems to be fine for a first step.

Driver modifications are for the `--auto-pch` option support, which should only be present on the SYCL RTC path and not for regular `clang` invocations from the command line. I'm relatively confident those will stay in the future.