[Relay][VM]VM Profiler #3727
Could you add example output to the description, and probably to the unit test as well?
Maybe it could be a table showing the number of trials, min, max, mean, stdev, etc. We can discuss how it should look.
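To make the suggested table concrete, here is a minimal sketch of the per-op statistics it would hold. This is not TVM code: the `summarize` helper and the sample timings are hypothetical and only illustrate the columns proposed above.

```python
import statistics


def summarize(durations_us):
    """Summarize a list of per-trial durations (microseconds) for one op.

    Hypothetical helper: returns the columns proposed for the table.
    """
    n = len(durations_us)
    return {
        "trials": n,
        "min": min(durations_us),
        "max": max(durations_us),
        "mean": statistics.mean(durations_us),
        # Sample standard deviation is undefined for a single trial,
        # so report 0.0 in that case instead of raising.
        "stdev": statistics.stdev(durations_us) if n > 1 else 0.0,
    }


if __name__ == "__main__":
    # Made-up timings, for illustration only.
    print(summarize([10.0, 12.0, 14.0]))
```

Whether stdev should be the sample or the population variant is one of the points that would still need to be agreed on.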
A higher-level comment: this kind of functionality is usually referred to as profiling rather than debugging. What do others think?
Agreed, what I added is closer to profiling. I'm following the naming of the current graph runtime debugger (https://docs.tvm.ai/dev/debugger.html), which supports inspecting intermediate tensors and operator execution metrics. I'm basically replicating that behavior for the Relay VM, hence the name. I expect that eventually it will be more like a debugger than a profiler.
Ok, makes sense now. Thanks.
You should also set USE_VM_DEBUG to OFF by default in config.cmake.
I also agree with @u99127. I don't know why the TVM runtime calls this a debugger; this is way more in profiling land than debugger land. I would imagine a VM debugger to be a typical step debugger rather than just profiling info.
python/tvm/relay/backend/debug_vm.py (outdated)
    "{}".format(type(target)))
    return tgts

    class VMCompilerDebug(vm.VMCompiler):
Do we actually need a different entry point for debugging the VM? Can't we just pass an option into the normal build target?
If we want to pass in an option, then VMCompiler will need to switch between VirtualMachine and VirtualMachineDebug based on the config, which means we have to build VirtualMachineDebug all the time. I think we need the option to not build the VirtualMachineDebug source. Does that make sense?
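The trade-off can be sketched in plain Python. Everything below is hypothetical and is not how TVM wires this up: `create_vm`, `register_optional`, and the registry are stand-ins. The point is that a single entry point taking an option has to handle the case where the debug VM was never built.

```python
class VirtualMachine:
    """Stand-in for the always-built VM."""


class VirtualMachineDebug(VirtualMachine):
    """Stand-in for the optional VM that is only built when enabled."""


# Hypothetical registry of components that are present in this build.
_AVAILABLE = {"vm": VirtualMachine}


def register_optional(name, cls):
    """Record an optional component, mimicking a build with it enabled."""
    _AVAILABLE[name] = cls


def create_vm(profile=False):
    """Single entry point that switches on an option (hypothetical)."""
    key = "vm_debug" if profile else "vm"
    if key not in _AVAILABLE:
        # With one entry point, a missing optional build surfaces at run time.
        raise RuntimeError(
            "{} was not built; enable USE_VM_DEBUG and rebuild".format(key))
    return _AVAILABLE[key]()
```

A separate entry point avoids this runtime check, because the debug-specific compiler class is only importable when its source was built.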
Yeah, we can probably rename it to profiler, since a debugger is different from what we are doing. As a side note, we can probably have a follow-up PR to add memory stats for each op as well, through memory_manager.
+1 to calling this a profiler. "Debugger" makes no sense to me.
Thanks. I renamed the PR to profiler. Luckily I don't think we need to change sources since I haven't used the term
Do we need to change the file name and class name as well?
* [Relay][VM]VM debugger
* Report mean/min/max for op duration
* Typos
* Lint
* Lint
* Lint
* Support build debug VM in CMake
* Lint
* Enable VM debug in unit test
* Disable debug vm test until new docker image is built
* Add device sync code
* Fix qnn unit test
* Disable vm debug by default
* Rename files
* Rename classes
* Fix comment
* Fix comment
Basic profiler functionality to get operator execution time.
Example output:
#OpName                      #InvokeCount  #Duration(us): Sum/Mean/Min/Max
fused_nn_softmax             1             12.035/12.035/12.035/12.035
fused_nn_bias_add            1             1.149/1.149/1.149/1.149
fused_nn_dense               1             119.539/119.539/119.539/119.539
fused_nn_batch_flatten       1             5.081/5.081/5.081/5.081
fused_nn_global_avg_pool2d   1             11.423/11.423/11.423/11.423
fused_add_19                 2             73.225/36.6125/36.438/36.787
fused_nn_conv2d_11           1             391.84/391.84/391.84/391.84
fused_nn_conv2d_10           3             19419.7/6473.22/5625.45/8143.65
fused_nn_relu_4              4             217.312/54.328/30.015/102.374
fused_add_18                 4             164.761/41.1903/32.124/66.364
Total Duration 107960 us

cc @zhiics @icemelon9 @jroesch
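For readers who want to see how the Sum/Mean/Min/Max columns relate to individual invocations, here is a minimal Python sketch of the aggregation. It is not the profiler's implementation: `OpProfile` and its methods are hypothetical. The two sample durations are the min and max reported for fused_add_19 above, which together account for its sum.

```python
from collections import defaultdict


class OpProfile:
    """Hypothetical accumulator of per-invocation op durations (microseconds)."""

    def __init__(self):
        self._durations = defaultdict(list)

    def record(self, op_name, duration_us):
        """Store the duration of one invocation of op_name."""
        self._durations[op_name].append(duration_us)

    def report_rows(self):
        """Return one 'name count sum/mean/min/max' row per op."""
        rows = []
        for name, times in self._durations.items():
            total = sum(times)
            # The 'g' format keeps six significant digits.
            rows.append("{} {} {:g}/{:g}/{:g}/{:g}".format(
                name, len(times), total, total / len(times),
                min(times), max(times)))
        return rows


if __name__ == "__main__":
    profile = OpProfile()
    profile.record("fused_add_19", 36.438)
    profile.record("fused_add_19", 36.787)
    print("\n".join(profile.report_rows()))
```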