Use llvm-dialects as specification layer for Julia LLVM IR #52945
The LLVM IR we emit in codegen uses pseudo-intrinsics to represent the additional language-specific
semantics needed to optimize Julia code correctly. Since Julia uses a precise GC, we need to
track values in the generated code. We could do so early, but that would clutter the code quite a bit,
so we decided to take a late-lowering approach. We represent enough semantics with our own LLVM
dialect that, at the end of the optimization pipeline, we can legalize/lower our Julia LLVM dialect
to the general LLVM dialect that the backends can emit code for.
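
To make the late-lowering idea concrete, here is a toy sketch in Python. It is not Julia's codegen and not the llvm-dialects API: the `Op` type, the `toy.gc_root` operation, the `gcframe.slot` operand, and the lowering table are all hypothetical names invented for illustration. It only shows the shape of the approach: dialect operations survive the optimization pipeline, and a final pass rewrites them into plain operations that a backend would accept.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Op:
    """A toy instruction: an operation name plus its operands."""
    name: str
    args: tuple = ()


def lower_gc_root(op):
    # Hypothetical rule: "track this value for the GC" becomes a plain
    # store of the value into a (made-up) GC frame slot.
    return [Op("store", (op.args[0], "gcframe.slot"))]


# Hypothetical lowering table: dialect op name -> rewrite rule.
LOWERINGS = {"toy.gc_root": lower_gc_root}


def lower(ops):
    """Rewrite every dialect op into plain ops; leave other ops untouched."""
    out = []
    for op in ops:
        rule = LOWERINGS.get(op.name)
        out.extend(rule(op) if rule else [op])
    return out


def is_legal(ops):
    """A program is legal for the backend once no dialect ops remain."""
    return not any(op.name.startswith("toy.") for op in ops)


program = [
    Op("call", ("alloc",)),
    Op("toy.gc_root", ("%obj",)),  # dialect op kept until late lowering
    Op("ret", ("%obj",)),
]
lowered = lower(program)
```

In this sketch `is_legal(program)` is false and `is_legal(lowered)` is true. A single lowering table like this is the kind of "one place" a shared dialect definition would provide, though the real mechanism would be defined in the dialect specification rather than in ad hoc code.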
So far we have only an informal specification of this Julia LLVM dialect, scattered across codegen,
the optimization passes, and other producers such as the Enzyme code generator. The https://github.com/GPUOpen-Drivers/llvm-dialects
project provides tools for specifying a custom dialect on the LLVM substrate, using an approach similar to MLIR.
This PR is mostly meant to open up the discussion of whether we want this. My goals for it are to make it
easier for producers like Enzyme (and other GPUCompiler users) to emit our dialect correctly,
to unify the definition across codegen and the optimization passes, and to have one place to document and specify
the behaviour of our dialect operations.
(Side note: technically we have at least two dialects: one produced by codegen and lowered by late-lowering, and a second that exists between late-lowering and final-lowering.)