On the use of @sync during benchmarking in the documentation

The documentation suggests: "you need to ensure the GPU is synchronized at the end of every sample, e.g. by calling synchronize()". However, this is generally overkill. The overhead from @sync can be of the same order of magnitude as the actual cost of the kernel call, or even higher, which makes the measurement highly inaccurate. To mitigate this, I usually call @sync only once every N kernel calls, so the synchronization cost is amortized over the batch. Also, @benchmark generally gives a good estimate without the sync if you ignore the minimum time and use the median.
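For illustration, here is a minimal sketch of the "sync every N kernel calls" approach. It assumes CUDA.jl and BenchmarkTools.jl; the kernel `scale_kernel!`, the helper `launch_batch!`, the array size, and the batch size `N = 100` are all made up for this example and do not come from the documentation.

```julia
using CUDA, BenchmarkTools, Statistics

# Hypothetical kernel, used only for illustration: scales a vector in place.
function scale_kernel!(x, a)
    i = (blockIdx().x - 1) * blockDim().x + threadIdx().x
    if i <= length(x)
        @inbounds x[i] *= a
    end
    return nothing
end

# Queue n kernel launches back to back, then synchronize once at the end.
function launch_batch!(x, a, n)
    threads = 256
    blocks = cld(length(x), threads)
    CUDA.@sync begin
        for _ in 1:n
            @cuda threads=threads blocks=blocks scale_kernel!(x, a)
        end
    end
    return nothing
end

x = CUDA.ones(Float32, 2^20)
N = 100  # batch size; an arbitrary choice for this sketch

# Each sample covers N launches plus a single synchronization.
trial = @benchmark launch_batch!($x, 1.0f0, $N)

# Median sample time (in nanoseconds) divided by N approximates the
# per-call cost, with the synchronization overhead spread over the batch.
per_call_ns = median(trial).time / N
println("approx. ", per_call_ns, " ns per kernel call")
```

A larger N pushes the synchronization overhead per call toward zero, at the price of longer samples. A suitable value likely depends on the kernel and the device.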

On the use of @sync during benchmarking in the documentation #279

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

On the use of @sync during benchmarking in the documentation #279

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions