If I am correct in assuming that the time measured while evaluating the whole dataset (https://github.com/paperswithcode/torchbench/blob/master/torchbench/image_classification/utils.py#L75) is what is used to calculate the speed shown on the leaderboard, then I would like to point out several issues with this approach:
1. Disk read time is included in the measurement, so a model fast enough that disk throughput cannot keep up will report an unfairly low speed.
2. If consistent disk speeds are not ensured between runs (because some other process happened to be accessing the same disk), this further compounds (1), and the evaluated speed will not be reproducible between runs.
I believe that the speed measurement should be done on a chunk of preloaded dummy data, with a note on the leaderboard saying that the actual speeds people can get in practice would depend on the rate at which the model can be fed.
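To illustrate the idea, here is a minimal sketch of such a measurement (the names `measure_throughput` and `run_model` are hypothetical, not part of torchbench, and a real run would use the actual model and tensors): all dummy batches are materialized in memory before the timer starts, so only compute is measured.

```python
import time

def run_model(batch):
    # Stand-in for a model forward pass; any deterministic computation works.
    return sum(x * x for x in batch)

def measure_throughput(model_fn, num_batches=50, batch_size=256):
    # Preload every dummy batch into memory BEFORE timing begins,
    # so disk read latency cannot leak into the measurement.
    batches = [[float(i % 7)] * batch_size for i in range(num_batches)]

    start = time.perf_counter()
    for batch in batches:
        model_fn(batch)
    elapsed = time.perf_counter() - start

    items = num_batches * batch_size
    return items / elapsed  # items per second, compute-bound only

throughput = measure_throughput(run_model)
```

Because the batches already live in memory, two runs on the same hardware should agree regardless of what else is hitting the disk, which addresses both points above.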