Skip to content

Conversation

@vchuravy
Copy link

@vchuravy vchuravy commented Mar 9, 2020

I started making the internals of TimerOutputs threadsafe, but paused when I started thinking about ordering of operations races.

As an example

@timeit "root" begin
   @sync begin
       for i in 1:100
       Threads.@spawn @timeit "iteration" begin
       end
       end
       Threads.@spawn @timeit "leaf2" begin
       end
   end
end

Running this with threads currently gives:

ERROR: TaskFailedException:
UndefRefError: access to undefined reference
Stacktrace:
 [1] getindex at ./array.jl:744 [inlined]
 [2] pop! at ./array.jl:1080 [inlined]
 [3] pop! at /home/vchuravy/.julia/packages/TimerOutputs/7Id5J/src/TimerOutput.jl:74 [inlined]
 [4] macro expansion at /home/vchuravy/.julia/packages/TimerOutputs/7Id5J/src/TimerOutput.jl:218 [inlined]
 [5] (::var"#3#5")() at ./threadingconstructs.jl:113

...and 1 more exception(s).

Stacktrace:
 [1] sync_end(::Array{Any,1}) at ./task.jl:300
 [2] top-level scope at task.jl:319
 [3] top-level scope at REPL[2]:2
 [4] top-level scope at /home/vchuravy/.julia/packages/TimerOutputs/7Id5J/src/TimerOutput.jl:214

With this PR you get:

julia> TimerOutputs.get_defaulttimer()
 ──────────────────────────────────────────────────────────────────────────
                                   Time                   Allocations
                           ──────────────────────   ───────────────────────
     Tot / % measured:          12.0s / 0.28%            210MiB / 0.96%

 Section           ncalls     time   %tot     avg     alloc   %tot      avg
 ──────────────────────────────────────────────────────────────────────────
 root                   1   33.6ms   100%  33.6ms   2.01MiB  100%   2.01MiB
   iteration           99   6.80μs  0.02%  68.7ns   3.33KiB  0.16%    34.4B
     leaf2              1    281ns  0.00%   281ns     0.00B  0.00%    0.00B
       iteration        1    111ns  0.00%   111ns     0.00B  0.00%    0.00B
 ──────────────────────────────────────────────────────────────────────────

Which while certainly an improvement we can see that the nesting is broken.

It is also possible to observe data-races, see the TODO in the code. Below we should have seen 1000 calls to iteration instead of 994

julia> TimerOutputs.get_defaulttimer()
 ──────────────────────────────────────────────────────────────────────────
                                   Time                   Allocations
                           ──────────────────────   ───────────────────────
     Tot / % measured:           283s / 0.01%            174MiB / 1.58%

 Section           ncalls     time   %tot     avg     alloc   %tot      avg
 ──────────────────────────────────────────────────────────────────────────
 root                   1   34.7ms   100%  34.7ms   2.74MiB  100%   2.74MiB
   iteration          993    105μs  0.30%   106ns   68.2KiB  2.43%    70.3B
     leaf2              1    445ns  0.00%   445ns     0.00B  0.00%    0.00B
       iteration        1    100ns  0.00%   100ns     0.00B  0.00%    0.00B
 ──────────────────────────────────────────────────────────────────────────

I am not a 100% sure what the right approach here is. Ideally we maintain a tree, but identifying which branch we are on across function calls and task-switches is hard.

cc: @NHDaly, @maleadt

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant