On the use of @sync during benchmarking in the documentation #279

@denizyuret


The documentation suggests: "you need to ensure the GPU is synchronized at the end of every sample, e.g. by calling synchronize()". However, this is generally overkill: the overhead from @sync can be of the same order of magnitude as the actual cost of the kernel call, or even higher, which makes the measurement highly inaccurate. I usually end up calling @sync once every N kernel calls to mitigate this. Also, @benchmark generally gives a good estimate without the sync if you ignore the minimum time and use the median.
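To make the comparison concrete, here is a minimal sketch of the three measurement styles mentioned above. It assumes CUDA.jl (where the macro is spelled `CUDA.@sync`) and BenchmarkTools.jl on a machine with a working GPU; the array size, the `sin` kernel, the batch size `n = 100`, and the helper name `batched!` are all hypothetical choices for illustration, not taken from the documentation.

```julia
using CUDA, BenchmarkTools, Statistics

x = CUDA.rand(Float32, 1024)   # small input, so launch and sync overhead dominate
y = similar(x)

# 1. Synchronize at the end of every sample, as the documentation suggests.
#    @belapsed returns the minimum time in seconds.
t_each = @belapsed CUDA.@sync $y .= sin.($x)

# 2. Hypothetical helper: launch `n` kernels back to back and synchronize once,
#    so the cost of the synchronization is amortized over the whole batch.
function batched!(y, x, n)
    for _ in 1:n
        y .= sin.(x)
    end
    CUDA.synchronize()
    return nothing
end

n = 100
t_batched = @belapsed(batched!($y, $x, $n)) / n   # approximate seconds per kernel call

# 3. No explicit synchronization; ignore the minimum and read the median instead.
trial = @benchmark $y .= sin.($x)
t_median = median(trial).time / 1e9   # BenchmarkTools stores times in nanoseconds

println("sync every sample: ", t_each, " s")
println("sync every $n calls: ", t_batched, " s per call")
println("no sync, median: ", t_median, " s")
```

Whether the three numbers agree will depend on the kernel and the hardware; for long-running kernels the synchronization overhead should matter much less than for a tiny one like this.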


Labels: enhancement (New feature or request)
