-
-
Notifications
You must be signed in to change notification settings - Fork 5.7k
change Array(T, dims...) deprecation message to Array{T}(dims) #20217
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…ms...) and not Array{T,N}(dims...)
|
I see where you're coming from with this and I do think this syntax is more convenient, but supplying |
|
|
||
| """ | ||
| Array{T,N}(dims) | ||
| Array{T}(dims) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might be nice to have both signatures at the top, i.e.
Array{T}(dims)
Array{T,N}(dims)
Construct...I should have done that in my previous PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
at the top
was not addressed
|
How can it improve performance? The number of arguments (or the length of a type-inferred tuple argument) is known at compile-time, and so if you type (Put another way, if it improves performance compared to (Or do you mean it improves compilation time, rather than runtime performance? I'm pretty skeptical that this has a significant impact on compilation time; I can't see any effect in a quick benchmark.) |
|
The only case in which would definitely improve performance is for something like |
|
Yes I believe that's right; there has to be some type info missing elsewhere for specifying the number of dimensions explicitly to help. |
|
I think we are talking about compile time. I know |
|
I'm rather amazed at how large an effect this has on compile time/memory. We lean really heavily on inlining. Heck, look at |
|
Seems pretty small to me; there are much bigger compiler performance issues. The LLVM code for both |
|
Great. Then let's not recommend a premature optimization here and instead just make the replacement easy to read and write. |
|
I'll have to find the commits, but leaving out the N was responsible for a significant amount of the initial compile time slowdown between 0.4 and 0.5 before Jameson put many of them back. I'm against telling people to do the slow thing when it's only a few characters longer to do the better thing. Could we at least recommend the faster Vector and Matrix type aliases for 1 and 2 dims? |
|
That might be the case for often-used functions in Base like For comparison, I recently tried running the tests with |
|
While compile-time is somewhat interesting, I'm much more interested in the runtime case: julia> f(T::DataType) = Array{T}(0);
julia> g(T::DataType) = Vector{T}(0);
julia> f(Int); g(Int);
julia> @time for i = 1:10^6; g(Int); end
0.049566 seconds (1000.00 k allocations: 76.294 MiB, 11.37% gc time)
# this time is indistinguishable from `h() = Vector{Int}(0)`
julia> @time for i = 1:10^6; f(Int); end
0.263999 seconds (1000.00 k allocations: 76.294 MiB, 78.99% gc time)
# 5x runtime penaltyThis can matter for some cases (like |
|
I get very close timings for |
|
Looks like you hit a GC for the second one? |
|
While OTOH, there is a difference, but only when it's not inlined. We compile the loop even in toplevel so julia> @noinline f(T::DataType) = Array{T}(0);
julia> @noinline g(T::DataType) = Vector{T}(0);
julia> @time for i in 1:1000_000; f(Int); end
1.292829 seconds (8.00 M allocations: 473.051 MiB, 15.37% gc time)
julia> @time for i in 1:1000_000; f(Int); end
1.111166 seconds (8.00 M allocations: 473.022 MiB, 2.41% gc time)
julia> @time for i in 1:1000_000; g(Int); end
0.290434 seconds (1.00 M allocations: 76.315 MiB, 1.10% gc time)
julia> @time for i in 1:1000_000; g(Int); end
0.281773 seconds (1000.00 k allocations: 76.294 MiB, 0.56% gc time)
julia> @time for i in 1:1000_000; g(Int); end
0.275566 seconds (1000.00 k allocations: 76.294 MiB, 0.53% gc time)Feels more like an argument to recommend putting types into parameters if perfomance matters rather than anything else though... |
|
It's not straightforward in a deprecation message to tell the difference between a literal in a test vs an occurrence in a package library function where performance would matter, and if it's simple to help the compiler out a bit then why shouldn't we recommend that? |
Ah, oops. I missed that since the relative difference was correct :P |
|
At least we can recommend this for |
|
I don't like recommending uglier code (that requires the programmer to supply the same information twice) that is unlikely to be faster except in corner cases where it isn't inlined or there is some other inference failure, and is even less likely to be performance-critical (as @JeffBezanson says, this probably only matters for very low-level functions in Base). Also, the deprecation message might give the impression that the ugly/redundant method is the only way to do it. |
|
Furthermore, if we deprecate |
|
In @vtjnash's example, with BenchmarkTools I get essentially the same time: julia> @btime f(Int); @btime g(Int);
21.032 ns (1 allocation: 80 bytes)
20.905 ns (1 allocation: 80 bytes) |
|
I agree completely. Let's fix #10988 to make it easier for folks to find traps like this. If it's performance sensitive, they'll run |
|
Recommending the |
|
Recommending Recommending |
|
If we're not going to recommend it in the deprecation message, then it should be at least added in the performance tips that explicitly specifying dimensionality can help the compiler out. It's not obvious that |
|
By the way, case of Regarding including this in the performance tips, I disagree. We tell them to run |
|
I'm fine with whatever folks think is best, but it would be great if you could add |
|
@ararslan, I've updated the docs to list both signatures. |
|
|
||
| """ | ||
| Array{T}(dims) | ||
| Array{T,N}(dims) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could make this Array{T[,N]} to be consistent with the syntax you used for SharedArray below
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
or drop the brackets from the SharedArray one, that isn't julia syntax for optional arguments
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought about it, but it seemed clearer to just write it twice. I would have done the same thing for SharedArray, but SharedArray has so many other arguments that writing it twice seemed to make it harder to read, because it takes a second to realize that all the other arguments are the same.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tkelman, we use brackets in lots of other places in the documentation for optional arguments. Can you suggest a clearer way to write the SharedArray doc that doesn't involve duplicating the long call-signature line?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe just write:
SharedArray{T}(dims::NTuple; init=false, pids=Int[])
SharedArray{T,N}(...)
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we use brackets in lots of other places in the documentation for optional arguments
And there's been a gradual move towards getting rid of them and using actual Julia syntax in more places. The suggestion with ... seems clear enough - also not exactly valid syntax but obvious what it means.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, will do the ... version.
This modifies the deprecation message and docstring from #19989 to recommend
Array{T}(dims)and notArray{T,N}(dims...). SupplyingNexplicitly is unnecessary, and I don't think that we should recommend typing redundant information by default. (Also, it may mislead people into thinking that theNis required.)