change Array(T, dims...) deprecation message to Array{T}(dims) #20217

stevengj · 2017-01-24T16:50:21Z

This modifies the deprecation message and docstring from #19989 to recommend Array{T}(dims) and not Array{T,N}(dims...). Supplying N explicitly is unnecessary, and I don't think that we should recommend typing redundant information by default. (Also, it may mislead people into thinking that the N is required.)

…ms...) and not Array{T,N}(dims...)

ararslan · 2017-01-24T16:57:30Z

I see where you're coming from with this and I do think this syntax is more convenient, but supplying N isn't entirely redundant; IIUC it can improve performance since the compiler doesn't have to determine the number of dimensions itself. (cc @tkelman, who has been recommending using the {T,N} syntax in deprecation fixes across the ecosystem)

ararslan · 2017-01-24T17:00:11Z

base/array.jl


 """
-    Array{T,N}(dims)
+    Array{T}(dims)


It might be nice to have both signatures at the top, i.e.

Array{T}(dims) Array{T,N}(dims) Construct...

I should have done that in my previous PR.

at the top

was not addressed

stevengj · 2017-01-24T17:03:11Z

How can it improve performance? The number of arguments (or the length of a type-inferred tuple argument) is known at compile-time, and so if you type Array{T}(m,n) it gets inlined to Array{T,2}(m,n) anyway.

(Put another way, if it improves performance compared to Array{T}(m,n), then this would seem to indicate a serious problem in either the compiler or in how these constructors are written. However, in a quick benchmark of foo(n,m) = Array{Int}(n,m) vs bar(n,m) = Array{Int,2}(n,m), I find no difference between @benchmark foo(3,4) and @benchmark bar(3,4), as expected.)

(Or do you mean it improves compilation time, rather than runtime performance? I'm pretty skeptical that this has a significant impact on compilation time; I can't see any effect in a quick benchmark.)

stevengj · 2017-01-24T17:19:32Z

The only case in which would definitely improve performance is for something like Array{Int}([3,4]...), which is type-unstable since the length of the splatted array is not known at compile time, but I don't think this usage is very common (and probably shouldn't be used in performance-critical code anyway).

JeffBezanson · 2017-01-24T17:23:02Z

Yes I believe that's right; there has to be some type info missing elsewhere for specifying the number of dimensions explicitly to help.

KristofferC · 2017-01-24T17:26:30Z

I think we are talking about compile time. I know @vtjnash has mentioned it.

mbauman · 2017-01-24T17:34:27Z

I'm rather amazed at how large an effect this has on compile time/memory. We lean really heavily on inlining. Heck, look at 1 + 2.0. Can we just work on reducing this cost everywhere instead of drawing a line in the sand here?

julia> f() = Array{Int}(0,0)
       g() = Array{Int,2}(0,0)
       f(); g(); # Ensure any machinery that is shared between f and g is pre-compiled
       f1() = Array{Int}(0,0)
       g1() = Array{Int,2}(0,0)
       @time f1()
       @time g1()
  0.003308 seconds (1.21 k allocations: 64.724 KiB)
  0.002871 seconds (453 allocations: 23.067 KiB)

JeffBezanson · 2017-01-24T17:49:18Z

Seems pretty small to me; there are much bigger compiler performance issues. The LLVM code for both f1 and g1 consists of a single call to :jl_alloc_array_2d, so LLVM codegen will be a small fraction of the time there, which is known to be atypical.

mbauman · 2017-01-24T17:56:02Z

Great. Then let's not recommend a premature optimization here and instead just make the replacement easy to read and write.

tkelman · 2017-01-24T18:17:33Z

I'll have to find the commits, but leaving out the N was responsible for a significant amount of the initial compile time slowdown between 0.4 and 0.5 before Jameson put many of them back. I'm against telling people to do the slow thing when it's only a few characters longer to do the better thing. Could we at least recommend the faster Vector and Matrix type aliases for 1 and 2 dims?

JeffBezanson · 2017-01-24T18:23:56Z

That might be the case for often-used functions in Base like similar and the Array constructors themselves, which are called and inlined into hundreds of places. I would be very surprised if a call in a higher level of the stack had the same effect.

For comparison, I recently tried running the tests with -O0 and it took about half the time. That's where the real issues are.

vtjnash · 2017-01-24T18:36:01Z

While compile-time is somewhat interesting, I'm much more interested in the runtime case:

julia> f(T::DataType) = Array{T}(0);
julia> g(T::DataType) = Vector{T}(0);
julia> f(Int); g(Int);

julia> @time for i = 1:10^6; g(Int); end
  0.049566 seconds (1000.00 k allocations: 76.294 MiB, 11.37% gc time)
  # this time is indistinguishable from `h() = Vector{Int}(0)`

julia> @time for i = 1:10^6; f(Int); end
  0.263999 seconds (1000.00 k allocations: 76.294 MiB, 78.99% gc time)
  # 5x runtime penalty

This can matter for some cases (like similar in Base) where the caller may not be a constant type. It doesn't matter for literal cases like Array{Int} vs. Vector{Int}, where the resulting code is the same.

JeffBezanson · 2017-01-24T18:39:45Z

I get very close timings for f and g.

yuyichao · 2017-01-24T18:41:34Z

Looks like you hit a GC for the second one?

yuyichao · 2017-01-24T18:44:27Z

While OTOH, there is a difference, but only when it's not inlined. We compile the loop even in toplevel so f and g actually makes no difference here. This does make a difference

julia> @noinline f(T::DataType) = Array{T}(0);

julia> @noinline g(T::DataType) = Vector{T}(0);

julia> @time for i in 1:1000_000; f(Int); end
  1.292829 seconds (8.00 M allocations: 473.051 MiB, 15.37% gc time)

julia> @time for i in 1:1000_000; f(Int); end
  1.111166 seconds (8.00 M allocations: 473.022 MiB, 2.41% gc time)

julia> @time for i in 1:1000_000; g(Int); end
  0.290434 seconds (1.00 M allocations: 76.315 MiB, 1.10% gc time)

julia> @time for i in 1:1000_000; g(Int); end
  0.281773 seconds (1000.00 k allocations: 76.294 MiB, 0.56% gc time)

julia> @time for i in 1:1000_000; g(Int); end
  0.275566 seconds (1000.00 k allocations: 76.294 MiB, 0.53% gc time)

Feels more like an argument to recommend putting types into parameters if perfomance matters rather than anything else though...

tkelman · 2017-01-24T18:49:46Z

It's not straightforward in a deprecation message to tell the difference between a literal in a test vs an occurrence in a package library function where performance would matter, and if it's simple to help the compiler out a bit then why shouldn't we recommend that?

vtjnash · 2017-01-24T18:53:47Z

While OTOH, there is a difference, but only when it's not inlined

Ah, oops. I missed that since the relative difference was correct :P

nalimilan · 2017-01-24T21:10:01Z

At least we can recommend this for Vector and Matrix, which are the most common cases anyway.

stevengj · 2017-01-24T23:58:59Z

I don't like recommending uglier code (that requires the programmer to supply the same information twice) that is unlikely to be faster except in corner cases where it isn't inlined or there is some other inference failure, and is even less likely to be performance-critical (as @JeffBezanson says, this probably only matters for very low-level functions in Base). Also, the deprecation message might give the impression that the ugly/redundant method is the only way to do it.

stevengj · 2017-01-25T00:01:53Z

Furthermore, if we deprecate Array(T, n,m) to Array{T}(n,m), then at the very least it shouldn't get any slower, so we won't be creating any regressions.

stevengj · 2017-01-25T00:03:58Z

In @vtjnash's example, with BenchmarkTools I get essentially the same time:

julia> @btime f(Int); @btime g(Int);
  21.032 ns (1 allocation: 80 bytes)
  20.905 ns (1 allocation: 80 bytes)

mbauman · 2017-01-25T00:06:21Z

I agree completely. Let's fix #10988 to make it easier for folks to find traps like this. If it's performance sensitive, they'll run @code_warntype on it.

tkelman · 2017-01-25T00:26:55Z

Recommending the Vector and Matrix type aliases is arguably more explicit and fixes the issue for the most common cases. Inlining can be difficult to predict in complicated library code.

stevengj · 2017-01-26T17:29:30Z

Recommending Vector or Matrix doesn't work if someone has code like Array(T, dims...). You'll get a deprecation message of Vector or Matrix depending on whether dims has 1 or 2 elements at the time it happens to run, but if someone actually makes that change the code will become less generic.

Recommending Array{T}(dims...) will not result in a slowdown from Array(T, dims...). Yes, there might be ways to speed it up (unlikely to be significant in real code, as Jeff said), but the faster way is uglier and/or less generic. Deprecation messages generally tell you how to translate your code to the direct equivalent in the latest version of Julia, not how you might potentially optimize it further.

tkelman · 2017-01-26T17:33:32Z

If we're not going to recommend it in the deprecation message, then it should be at least added in the performance tips that explicitly specifying dimensionality can help the compiler out. It's not obvious that Array{T} is an incompletely specified type and has compile-time, and occasionally run-time, costs.

stevengj · 2017-01-26T17:51:25Z

By the way, case of Array(T, dims...) is also a problem with the current deprecation message: Array{T,whatever N occurred at deprecation time}(dims...) is simply the wrong way to upgrade such code.

Regarding including this in the performance tips, I disagree. We tell them to run @code_warntype, and that will let them know if there is an inference failure for an array allocation where it can't determine N. We don't try to give exhaustive examples of all possible cases of type instabilities, especially not ones that seem as unlikely to matter as this. (Giving performance tips that only help in very rare cases can actually be harmful to the reader, because it wastes their time.)

ararslan · 2017-01-26T18:17:14Z

I'm fine with whatever folks think is best, but it would be great if you could add Array{T,N} signature to the docs at the top and make the same changes you're making here to the SharedArray docs/deprecations.

stevengj · 2017-01-26T18:23:40Z

@ararslan, I've updated the docs to list both signatures.

ararslan · 2017-01-26T18:25:14Z

base/array.jl


 """
    Array{T}(dims)
+    Array{T,N}(dims)


Could make this Array{T[,N]} to be consistent with the syntax you used for SharedArray below

or drop the brackets from the SharedArray one, that isn't julia syntax for optional arguments

I thought about it, but it seemed clearer to just write it twice. I would have done the same thing for SharedArray, but SharedArray has so many other arguments that writing it twice seemed to make it harder to read, because it takes a second to realize that all the other arguments are the same.

@tkelman, we use brackets in lots of other places in the documentation for optional arguments. Can you suggest a clearer way to write the SharedArray doc that doesn't involve duplicating the long call-signature line?

Maybe just write:

SharedArray{T}(dims::NTuple; init=false, pids=Int[]) SharedArray{T,N}(...)

?

we use brackets in lots of other places in the documentation for optional arguments

And there's been a gradual move towards getting rid of them and using actual Julia syntax in more places. The suggestion with ... seems clear enough - also not exactly valid syntax but obvious what it means.

Okay, will do the ... version.

change Array(T, dims...) deprecation message to recommend Array{T}(di…

fe886ea

…ms...) and not Array{T,N}(dims...)

ararslan added collections Data structures holding multiple items, e.g. sets deprecation This change introduces or involves a deprecation labels Jan 24, 2017

ararslan reviewed Jan 24, 2017

View reviewed changes

ararslan requested a review from tkelman January 24, 2017 17:00

mbauman approved these changes Jan 24, 2017

View reviewed changes

mbauman mentioned this pull request Jan 26, 2017

Add highlighting to code_warntype for non-leaf types (such as ::Type) #10988

Closed

doc clarifications

8b062e4

ararslan reviewed Jan 26, 2017

View reviewed changes

tweak to SharedArray docs

1fcf866

StefanKarpinski approved these changes Jan 30, 2017

View reviewed changes

ararslan approved these changes Jan 30, 2017

View reviewed changes

stevengj merged commit 509c775 into JuliaLang:master Feb 4, 2017

stevengj deleted the arraydepr branch February 4, 2017 00:17

Uh oh!

change Array(T, dims...) deprecation message to Array{T}(dims) #20217

change Array(T, dims...) deprecation message to Array{T}(dims) #20217

Uh oh!

Conversation

stevengj commented Jan 24, 2017 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

ararslan commented Jan 24, 2017

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

stevengj commented Jan 24, 2017 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

stevengj commented Jan 24, 2017 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

JeffBezanson commented Jan 24, 2017

Uh oh!

KristofferC commented Jan 24, 2017

Uh oh!

mbauman commented Jan 24, 2017 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

JeffBezanson commented Jan 24, 2017

Uh oh!

mbauman commented Jan 24, 2017

Uh oh!

tkelman commented Jan 24, 2017

Uh oh!

JeffBezanson commented Jan 24, 2017

Uh oh!

vtjnash commented Jan 24, 2017

Uh oh!

JeffBezanson commented Jan 24, 2017

Uh oh!

yuyichao commented Jan 24, 2017

Uh oh!

yuyichao commented Jan 24, 2017 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

tkelman commented Jan 24, 2017

Uh oh!

vtjnash commented Jan 24, 2017

Uh oh!

nalimilan commented Jan 24, 2017

Uh oh!

stevengj commented Jan 24, 2017 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

stevengj commented Jan 25, 2017 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

stevengj commented Jan 25, 2017 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

mbauman commented Jan 25, 2017

Uh oh!

tkelman commented Jan 25, 2017

Uh oh!

stevengj commented Jan 26, 2017 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

tkelman commented Jan 26, 2017

Uh oh!

stevengj commented Jan 26, 2017 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

ararslan commented Jan 26, 2017

Uh oh!

stevengj commented Jan 26, 2017

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

stevengj commented Jan 24, 2017 •

edited

Loading

stevengj commented Jan 24, 2017 •

edited

Loading

stevengj commented Jan 24, 2017 •

edited

Loading

mbauman commented Jan 24, 2017 •

edited

Loading

yuyichao commented Jan 24, 2017 •

edited

Loading

stevengj commented Jan 24, 2017 •

edited

Loading

stevengj commented Jan 25, 2017 •

edited

Loading

stevengj commented Jan 25, 2017 •

edited

Loading

stevengj commented Jan 26, 2017 •

edited

Loading

stevengj commented Jan 26, 2017 •

edited

Loading

stevengj Jan 26, 2017 •

edited

Loading

stevengj Jan 26, 2017 •

edited

Loading