Skip to content

Conversation

@bjarthur
Copy link
Member

not battle tested yet. and might as well add an init kwarg too

src/mapreduce.jl Outdated
Comment on lines 80 to 86
out_dims = [size(a)...]
out_dims[_dims] .= 1
out = fill($init(eltype(a)), out_dims...)
for c in eachchunk(a)
out_c = [c...]
out_c[_dims] .= Ref(1:1)
out[out_c...] .= $acum.(out[out_c...], Base.$_fname(f, a[c...], dims))
Copy link
Collaborator

@rafaqz rafaqz Mar 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comments for each line would help for understanding the logic.

Also splatting a vector isn't type stable, maybe better to convert back to tuple and pass it through a function barrier before the loop? We actually know the length too so you could use ntuple to make it.

src/mapreduce.jl Outdated
@eval function Base.$_fname(f::Function, a::AbstractDiskArray, dims)
_dims = typeof(dims)<:Tuple ? [dims...] : typeof(dims)<:Number ? [dims] : dims
out_dims = [size(a)...]
out_dims[_dims] .= 1
Copy link
Collaborator

@rafaqz rafaqz Mar 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

out_dims = ntuple(Val{N}()) do i
    i in _dims ? 1 : size(a)[i]
end 

Something like this should be the same but type stable for splats. N needs a where N on the array ndims type parameter.

(actually the one below is more important as its in the loop, but theyre kind of the same)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added a commit with this code. the out array is still type unstable. not sure how to fix that given it's eltype is calculated.

@bjarthur
Copy link
Member Author

also need to search for something other than "DiskGenerator" in the output of @trace as that is not present on main either.

src/mapreduce.jl Outdated
out_dims = ntuple(Val(N)) do i
i in _dims ? 1 : size(a)[i]
end
out = fill($init(Base.return_types(f, (T,))[1]), out_dims...)
Copy link
Collaborator

@rafaqz rafaqz Mar 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
out = fill($init(Base.return_types(f, (T,))[1]), out_dims...)
T1 = Base.promote_op(f, T)
out = fill($init(T1), out_dims...)

I think this is more likely to be stable? (untested)

(also seems you have 8 space indenting turned on somewhere instead of 4)

@meggart
Copy link
Collaborator

meggart commented Mar 25, 2025

Why exactly do we need these special implementations? Has something in the Julia mapreduce internals changed so that all of these don't hit the mapreducedim! implementation anymore. See also this PR for a similar discussion.

@bjarthur
Copy link
Member Author

Why exactly do we need these special implementations? Has something in the Julia mapreduce internals changed so that all of these don't hit the mapreducedim! implementation anymore. See also this PR for a similar discussion.

which PR? and no, nothing falls back to mapreduce now as the internals have changed. #246

@rafaqz
Copy link
Collaborator

rafaqz commented Mar 25, 2025

Right yes we have this code already:

https://github.com/JuliaIO/DiskArrays.jl/blob/main/src/mapreduce.jl#L13-L26

Probably we need to hook this up again? It could also do with some comments and cleanup so its clear what its all for

@bjarthur
Copy link
Member Author

oops, i take that back. maximum is indeed falling back to https://github.com/JuliaIO/DiskArrays.jl/blob/main/src/mapreduce.jl#L13, it's count that isn't falling back to https://github.com/JuliaIO/DiskArrays.jl/blob/main/src/mapreduce.jl#L28. quick benchmarks show that this PR is no faster. shoot.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants