Description
Follow-up to the discussion in PR8509.
A Resnet50/float32 model crashes during compilation on Hexagon. The changes from the PR enable a code path that hits an assertion in the function Array<te::Tensor> ScheduleBuilder::VisitExpr_(const ConstantNode* op), because that function does not handle scalar constants of type `float16`. This is a change in behavior from before the PR.
Cause
The FuseOps pass gets the link_params flag from the Executor attribute of the IRModule, instead of taking it from the current target. If the current target has link_params=False while the executor has link_params=True, FuseOps behaves in a way the current target does not expect.
In particular, when compiling a model that uses float16 constants, having link_params=True in FuseOps prevents it from extracting these constants into parameters (which was the original behavior). As a result, more relay code is presented with ConstantNodes of type float16, which that code may not yet handle (as is the case with ScheduleBuilder).
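The effect of the flag can be illustrated with a small toy model in plain Python. This is only a sketch of the behavior described above, not TVM code: `Const`, `fuse_ops`, `visit_constant`, `compile_constants`, and the set of supported dtypes are all hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Const:
    """Toy stand-in for a relay ConstantNode holding a scalar."""
    dtype: str
    value: float


# Assumption: an illustrative set of scalar dtypes the toy schedule builder
# accepts. The real list lives in ScheduleBuilder and may differ.
SUPPORTED_SCALAR_DTYPES = {"float32", "int32", "bool"}


def fuse_ops(constants, link_params):
    """Toy FuseOps: returns (inline_constants, lifted_params).

    With link_params=False the constants are extracted into parameters.
    With link_params=True they stay inline as constants.
    """
    if link_params:
        return list(constants), []
    return [], list(constants)


def visit_constant(const):
    """Toy ScheduleBuilder visitor: asserts on unsupported scalar dtypes."""
    assert const.dtype in SUPPORTED_SCALAR_DTYPES, (
        "unhandled scalar constant dtype: " + const.dtype
    )
    return (const.dtype, const.value)


def compile_constants(constants, link_params):
    """Run the toy FuseOps, then visit every constant that stayed inline."""
    inline, params = fuse_ops(constants, link_params)
    for const in inline:
        visit_constant(const)
    return params


if __name__ == "__main__":
    consts = [Const("float16", 0.5)]
    # Constants are lifted into parameters, so the visitor never sees them.
    print(compile_constants(consts, link_params=False))
    try:
        # Constants stay inline, so the visitor hits the float16 scalar.
        compile_constants(consts, link_params=True)
    except AssertionError as err:
        print("assertion:", err)
```

In this model the float16 constant is harmless as long as it is lifted into a parameter; it only reaches the asserting visitor when the flag keeps it inline.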
How it happens
The Hexagon target sets link_params=True, while the CPU target has link_params=False. During compilation, the relay optimizations run the FoldConstant pass. This pass runs the relay interpreter with a CPU target, regardless of the original compilation target. During the interpreter execution, the FuseOps pass is executed. At this point the current target is CPU, while the IRModule's Executor settings still correspond to the original "hexagon" target.