Merged
4 changes: 4 additions & 0 deletions NEWS.md
@@ -21,6 +21,10 @@ Language changes
that significantly improves load and inference times for heavily overloaded methods that
dispatch on Types (such as traits and constructors).
* The "h bar" `ℏ` (`\hslash` U+210F) character is now treated as equivalent to `ħ` (`\hbar` U+0127).
+* The `@simd` macro now has a more limited and clearer semantics, it only enables reordering and contraction
+of floating-point operations, instead of turning on all "fastmath" optimizations.
+If you observe performance regressions due to this change, you can recover previous behavior with `@fastmath @simd`,
+if you are OK with all the optimizations enabled by the `@fastmath` macro. ([#49405])
* When a method with keyword arguments is displayed in the stack trace view, the textual
representation of the keyword arguments' types is simplified using the new
`@Kwargs{key1::Type1, ...}` macro syntax ([#49959]).
2 changes: 1 addition & 1 deletion base/simdloop.jl
@@ -100,7 +100,7 @@ The object iterated over in a `@simd for` loop should be a one-dimensional range
By using `@simd`, you are asserting several properties of the loop:

* It is safe to execute iterations in arbitrary or overlapping order, with special consideration for reduction variables.
-* Floating-point operations on reduction variables can be reordered, possibly causing different results than without `@simd`.
+* Floating-point operations on reduction variables can be reordered or contracted, possibly causing different results than without `@simd`.
Member Author

It's actually slightly broader, I think: it applies to the entire reduction chain, not just the reduction operations themselves.

Member

Any suggestions for the wording?
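
To make "reordered" concrete for the wording discussion, here is a minimal C++ sketch (not code from this PR) of why a reassociated reduction can give a different result. The function names and the input are hypothetical; `two_lane_sum` only imitates what a 2-wide vectorized reduction might do.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Strict left-to-right summation: the order a plain scalar loop uses.
double sequential_sum(const std::vector<double>& xs) {
    double acc = 0.0;
    for (double x : xs) acc += x;
    return acc;
}

// Two-lane summation: even and odd elements accumulate separately and are
// combined at the end. This is only legal if the additions may be reordered.
double two_lane_sum(const std::vector<double>& xs) {
    assert(xs.size() % 2 == 0);  // sketch only: no remainder loop
    double lane0 = 0.0, lane1 = 0.0;
    for (std::size_t i = 0; i < xs.size(); i += 2) {
        lane0 += xs[i];
        lane1 += xs[i + 1];
    }
    return lane0 + lane1;
}

// Hypothetical input on which the two orders disagree.
std::vector<double> example_input() { return {1e17, 1.0, -1e17, 1.0}; }
```

On `example_input()`, `sequential_sum` returns `1.0` (the first `1.0` is absorbed by `1e17`), while `two_lane_sum` returns `2.0`. Neither is wrong as a floating-point evaluation; they are different association orders of the same sum.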


In many cases, Julia is able to automatically vectorize inner for loops without the use of `@simd`.
Using `@simd` gives the compiler a little extra leeway to make it possible in more situations. In
8 changes: 4 additions & 4 deletions src/llvm-muladd.cpp
@@ -40,10 +40,10 @@ STATISTIC(TotalContracted, "Total number of multiplies marked for FMA");
* Combine
* ```
* %v0 = fmul ... %a, %b
-* %v = fadd fast ... %v0, %c
+* %v = fadd contract ... %v0, %c
* ```
* to
-* `%v = call fast @llvm.fmuladd.<...>(... %a, ... %b, ... %c)`
+* `%v = call contract @llvm.fmuladd.<...>(... %a, ... %b, ... %c)`
* when `%v0` has no other use
*/

@@ -87,13 +87,13 @@ static bool combineMulAdd(Function &F) JL_NOTSAFEPOINT
it++;
switch (I.getOpcode()) {
case Instruction::FAdd: {
-if (!I.isFast())
+if (!I.hasAllowContract())
continue;
modified |= checkCombine(I.getOperand(0), ORE) || checkCombine(I.getOperand(1), ORE);
break;
}
case Instruction::FSub: {
-if (!I.isFast())
+if (!I.hasAllowContract())
continue;
modified |= checkCombine(I.getOperand(0), ORE) || checkCombine(I.getOperand(1), ORE);
break;
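
As an illustration of what the `contract` flag permits (a sketch, not part of this PR), the C++ below compares `a*b + c` computed with two roundings against a fused multiply-add with a single rounding. The function names and input values are hypothetical. Whether `llvm.fmuladd` actually fuses depends on the target, so the fused result is one that contraction may produce, not one it must produce.

```cpp
#include <cassert>
#include <cmath>

// a*b + c with two roundings: the product is rounded to double first.
// `volatile` keeps the compiler from fusing the two operations itself.
double mul_then_add(double a, double b, double c) {
    volatile double product = a * b;
    return product + c;
}

// a*b + c with a single rounding at the end.
double fused_mul_add(double a, double b, double c) {
    return std::fma(a, b, c);
}

// Self-check on an input where the two results differ.
void check_contraction_example() {
    const double eps = std::ldexp(1.0, -30);    // 2^-30
    const double a = 1.0 + eps, b = 1.0 - eps;  // exact product: 1 - 2^-60
    assert(mul_then_add(a, b, -1.0) == 0.0);    // product rounds to 1.0 first
    assert(fused_mul_add(a, b, -1.0) == -std::ldexp(1.0, -60));
}
```

The unfused version loses the `2^-60` term when the product is rounded to `1.0`; the fused version keeps it. This is why contraction is gated behind an explicit flag rather than applied unconditionally.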
3 changes: 2 additions & 1 deletion src/llvm-simdloop.cpp
@@ -149,7 +149,8 @@ static void enableUnsafeAlgebraIfReduction(PHINode *Phi, Loop *L, OptimizationRe
return OptimizationRemark(DEBUG_TYPE, "MarkedUnsafeAlgebra", *K)
<< "marked unsafe algebra on " << ore::NV("Instruction", *K);
});
-(*K)->setFast(true);
+(*K)->setHasAllowReassoc(true);
+(*K)->setHasAllowContract(true);
Comment on lines +152 to +153
Member

Could we use `setFastMathFlags` to set multiple flags in one go?

Member

We can probably use `setHasNoSignedZeros()` as well. It can help when you initialize loops at zero.

Member

That might be going a bit too far, though I kind of want to just expose all of the fastmath flags separately. `@fastmath` implies too much, and most of it is unsafe and useless.

Member

IMO no-signed-zeros is reasonable. Signed zeros don't do much, and they prevent a lot of useful transformations (e.g. `0 - x` to `-x`).
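
For context on the signed-zero point, a small C++ sketch (not from this PR; names are hypothetical) of why these rewrites are blocked under strict IEEE 754 semantics. The `0.0 + x` case is an assumed reading of "initialize loops at zero": the first step of a reduction that starts from `0.0`.

```cpp
#include <cassert>
#include <cmath>

// 0 - x under strict IEEE 754 rules (round-to-nearest).
double zero_minus(double x) { return 0.0 - x; }

// -x simply flips the sign bit.
double negate(double x) { return -x; }

// 0.0 + x, as in the first iteration of an accumulator initialized to 0.0.
double add_zero(double x) { return 0.0 + x; }

// Self-check: the "obvious" simplifications change the sign of zero.
void check_signed_zero_example() {
    assert(!std::signbit(zero_minus(0.0)));  // 0.0 - 0.0 is +0.0
    assert(std::signbit(negate(0.0)));       // -(0.0) is -0.0
    assert(!std::signbit(add_zero(-0.0)));   // 0.0 + (-0.0) is +0.0, not -0.0
}
```

So `0 - x` and `-x` differ only when `x` is `+0.0`, and `0.0 + x` differs from `x` only when `x` is `-0.0`. A no-signed-zeros flag tells the optimizer it may ignore exactly these cases.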

++length;
}
ReductionChainLength += length;
6 changes: 3 additions & 3 deletions test/llvmpasses/loopinfo.jl
@@ -29,10 +29,10 @@ function simdf(X)
acc += x
# CHECK: call void @julia.loopinfo_marker(), {{.*}}, !julia.loopinfo [[LOOPINFO:![0-9]+]]
# LOWER-NOT: llvm.mem.parallel_loop_access
-# LOWER: fadd fast double
+# LOWER: fadd reassoc contract double
# LOWER-NOT: call void @julia.loopinfo_marker()
# LOWER: br {{.*}}, !llvm.loop [[LOOPID:![0-9]+]]
-# FINAL: fadd fast <{{(vscale x )?}}{{[0-9]+}} x double>
+# FINAL: fadd reassoc contract <{{(vscale x )?}}{{[0-9]+}} x double>
end
acc
end
@@ -46,7 +46,7 @@ function simdf2(X)
# CHECK: call void @julia.loopinfo_marker(), {{.*}}, !julia.loopinfo [[LOOPINFO2:![0-9]+]]
# LOWER: llvm.mem.parallel_loop_access
# LOWER-NOT: call void @julia.loopinfo_marker()
-# LOWER: fadd fast double
+# LOWER: fadd reassoc contract double
# LOWER: br {{.*}}, !llvm.loop [[LOOPID2:![0-9]+]]
end
acc
4 changes: 2 additions & 2 deletions test/llvmpasses/simdloop.ll
@@ -40,7 +40,7 @@ loop:
; CHECK: llvm.mem.parallel_loop_access
%aval = load double, double *%aptr
%nextv = fsub double %v, %aval
-; CHECK: fsub fast double %v, %aval
+; CHECK: fsub reassoc contract double %v, %aval
%nexti = add i64 %i, 1
call void @julia.loopinfo_marker(), !julia.loopinfo !3
%done = icmp sgt i64 %nexti, 500
@@ -59,7 +59,7 @@ loop:
%aptr = getelementptr double, double *%a, i64 %i
%aval = load double, double *%aptr
%nextv = fsub double %v, %aval
-; CHECK: fsub fast double %v, %aval
+; CHECK: fsub reassoc contract double %v, %aval
%nexti = add i64 %i, 1
call void @julia.loopinfo_marker(), !julia.loopinfo !2
%done = icmp sgt i64 %nexti, 500