Description
I was updating my performance-optimization lecture notes from last year to Julia 1.0, which start with a comparison of C, Python, and Julia sum functions, and I noticed something odd:
Both the Julia sum(::Vector{Float64}) function and the NumPy sum function are faster than last year (yay for compiler improvements?). Last year, Julia and NumPy sum had almost identical speed, but the NumPy sum function is now about 30% faster than Julia's.
I'm running a 2016 Intel Core i7, the same as last year. So apparently the NumPy sum function has gotten some new optimization that we don't have? (I did switch from Python 2 to Python 3; I'm using the Conda Python.) Some kind of missing SIMD optimization?
I'm not so concerned about sum per se, but this is a pretty basic function — if we are leaving 30% on the table here, then we might be missing performance opportunities in many other places too.
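To make the SIMD question above concrete, here is a minimal C sketch (C being one of the three languages compared in the notes). The function names sum_naive and sum_unrolled4 are hypothetical, and this is not a claim about what NumPy or Julia actually implement. It only illustrates why a floating-point sum has to be reassociated into independent partial sums before it can be vectorized.

```c
#include <stddef.h>

/* Straightforward left-to-right sum. Under strict IEEE semantics (no
   -ffast-math or similar), a compiler generally has to preserve this exact
   order of additions, which prevents a SIMD reduction. */
double sum_naive(const double *a, size_t n) {
    double s = 0.0;
    for (size_t i = 0; i < n; i++)
        s += a[i];
    return s;
}

/* Illustrative variant with four independent accumulators. The additions
   are reassociated by hand, so the four partial sums no longer depend on
   each other; this is the same kind of reordering a SIMD reduction does. */
double sum_unrolled4(const double *a, size_t n) {
    double s0 = 0.0, s1 = 0.0, s2 = 0.0, s3 = 0.0;
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        s0 += a[i];
        s1 += a[i + 1];
        s2 += a[i + 2];
        s3 += a[i + 3];
    }
    /* Combine the partial sums, then handle the 0-3 leftover elements. */
    double s = (s0 + s1) + (s2 + s3);
    for (; i < n; i++)
        s += a[i];
    return s;
}
```

Because the two functions add the elements in a different order, their results can differ in the last few bits for general inputs. Whether the second version is actually faster depends on the compiler, flags, and CPU, so it would need to be benchmarked rather than assumed.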