Imprecision of sum(::Generator) #30421

@cstjean

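It seems that sum uses the naive sequential summation algorithm for generators, and mean shows the same behavior. With large Float32 vectors, the running total eventually saturates: once it reaches 2^24 = 16777216, adding a value in [0, 1) no longer changes it. The result is an incorrect answer (below, 16777216 / N = 0.16777216):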

julia> N = 100000000; aa = rand(Float32, N);

julia> mean((x for x in aa))
0.16777216f0

julia> mean(aa)
0.500059f0
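
For anyone who wants to see the saturation in isolation, here is a minimal sketch. The helper name naive_sum is made up for illustration and is not a Base function; it only mimics a plain left-to-right accumulation in Float32, which is what the generator path appears to be doing.

```julia
# Float32 has a 24-bit significand, so at 2^24 neighbouring values are 2 apart.
cap = 2.0f0^24                 # 1.6777216f7
@show eps(cap)                 # spacing between Float32 values at this magnitude
@show cap + 0.999f0 == cap     # an increment below half the spacing is rounded away

# Hypothetical helper (not in Base): plain sequential accumulation in Float32.
function naive_sum(itr)
    s = 0.0f0
    for x in itr
        s += x
    end
    return s
end

# Summing twenty million ones sequentially gets stuck at 2^24
# instead of reaching 2.0f7.
@show naive_sum(1.0f0 for _ in 1:20_000_000)
```

Once the running total reaches 2^24, every further addition of a value up to 1 rounds back to the same number, which would explain why the mean above comes out as exactly 16777216 / N.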

I have a real-world case where it causes an alarmingly large difference:

julia> mean(skipmissing(Umat))
1.0638367f0 V

julia> mean(collect(skipmissing(Umat)))
3.1320891f0 V

As @simonbyrne pointed out on Discourse, sum(::Array) already uses a smarter algorithm (pairwise summation). It could presumably be used for generators, too.
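
To make the suggestion concrete, here is a rough sketch of how a pairwise-style summation could work on an arbitrary iterator without collecting it first. This is not how Base implements anything; the names stream_pairwise_sum and push_block!, and the blocksize default of 1024, are assumptions chosen for illustration. The idea is to sum short blocks sequentially, then merge block sums of equal "level" like a binary counter, so that values of similar magnitude get added together.

```julia
# Hypothetical helper (not in Base): push a block sum onto the stack and merge
# it with earlier partial sums that cover the same number of blocks.
function push_block!(sums, levels, s)
    level = 0
    while !isempty(levels) && levels[end] == level
        s += pop!(sums)        # merge two partial sums of equal size
        pop!(levels)
        level += 1
    end
    push!(sums, s)
    push!(levels, level)
    return nothing
end

# Hypothetical helper (not in Base): pairwise-style sum over any iterator.
function stream_pairwise_sum(itr; blocksize::Integer = 1024)
    next = iterate(itr)
    next === nothing && throw(ArgumentError("cannot sum an empty iterator"))
    x, state = next
    S = typeof(x + x)          # accumulator type taken from the first element
    sums = S[]                 # stack of partial sums
    levels = Int[]             # level k means the sum covers 2^k blocks
    acc = convert(S, x)
    n = 1
    next = iterate(itr, state)
    while next !== nothing
        x, state = next
        if n == blocksize
            push_block!(sums, levels, acc)   # current block is full
            acc = convert(S, x)
            n = 1
        else
            acc += x
            n += 1
        end
        next = iterate(itr, state)
    end
    push_block!(sums, levels, acc)           # last, possibly partial, block
    # Combine what is left on the stack, smallest partial sums first.
    total = pop!(sums)
    while !isempty(sums)
        total += pop!(sums)
    end
    return total
end

@show stream_pairwise_sum(1.0f0 for _ in 1:20_000_000)
```

The extra storage is only the stack of partial sums, which grows logarithmically with the number of blocks, so something along these lines should stay cheap for generators and for skipmissing.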

Labels: fold (sum, maximum, reduce, foldl, etc.), maths (Mathematical functions)