[microNPU] Add support for TFLite FULLY_CONNECTED #10345
Conversation
This is primarily a legalization to an NPU Conv2d operator. The legalization target is a Conv2d with a (1, 1, I, O) kernel in HWIO layout.
Test TVM runtime against TFLite for codegen and operator legalization.
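For readers unfamiliar with this kind of legalization, below is a sketch of the general idea using standard Relay ops rather than the PR's actual NPU operator; the function and parameter names are illustrative assumptions, not code from this PR:

# Illustrative sketch only -- standard Relay ops, names assumed for illustration.
from tvm import relay

def fully_connected_as_1x1_conv(ifm, weights_oi, ifm_channels, ofm_channels):
    """Express a fully connected (dense) op as a 1x1 NHWC/HWIO conv2d."""
    # (batch, channels) -> (1, 1, 1, channels) so the input looks like an NHWC feature map
    ifm_nhwc = relay.reshape(ifm, (1, 1, 1, ifm_channels))
    # (O, I) dense weights -> (I, O) -> (1, 1, I, O) HWIO kernel
    kernel_hwio = relay.reshape(
        relay.transpose(weights_oi, axes=(1, 0)), (1, 1, ifm_channels, ofm_channels)
    )
    conv = relay.nn.conv2d(
        ifm_nhwc,
        kernel_hwio,
        kernel_size=(1, 1),
        channels=ofm_channels,
        data_layout="NHWC",
        kernel_layout="HWIO",
    )
    # Back to the 2D fully connected output shape
    return relay.reshape(conv, (1, ofm_channels))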
lhutton1
left a comment
Thanks @dchauhan-arm, this will be a very useful addition. Although this is still WIP I just wanted to offer a couple of suggestions which may help get this in!
    )
    bias_add = is_op("nn.bias_add")(dense, is_constant())
    req = is_op("qnn.requantize")(
        dense | bias_add, is_constant(), is_constant(), is_constant(), is_constant()
Currently the legalization will fall over if there is no bias present. We should make the bias optional in FullyConnectedParams; see QnnTransposeConv2dParams for an idea.
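For reference, a minimal sketch of how the optional bias could be handled, loosely following the QnnTransposeConv2dParams approach mentioned above; the variable names are assumptions for illustration:

# Sketch only: tolerate a pattern where nn.bias_add may be absent.
# Variable names here are assumed, not taken from the PR.
bias_add = None
if str(requantize_op.args[0].op.name) == "nn.bias_add":
    bias_add = requantize_op.args[0]
    qnn_dense = bias_add.args[0]
else:
    qnn_dense = requantize_op.args[0]
# Downstream code can then guard on bias_add being None when reading bias values.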
ack! (and thanks for the pointer on transpose conv2d)
):
    @tf.function
    def fully_connected():
        return tf.keras.layers.Dense(
I'm not too familiar with the Keras API, but I'm not sure this will work. One thing we could do instead is to use tf.matmul, which gets legalized to fully connected in TFLite under the conditions we will use it for. For example, something like this would be a starting point:
@tf.function
def dense_layer(x):
    w = tf.constant(
        np.random.uniform(size=[units, units]),
        dtype=tf.float32,
    )
    return tf.matmul(x, w)

_compare_tvm_with_tflite(dense_layer, [(1, units)], accel_type)
Happy to keep the Keras implementation if we get it working though, just wanted to offer an alternative :)
This is a very welcome change, I'll try and make this work!
    requantize_op.args[RequantArgs.IFM_ZERO_POINT.value],
)
self.ifm = TensorParams(
    qnn_dense.args[QDenseArgs.ifm.value],
ifm should be in capitals, i.e. QDenseArgs.IFM.value
ack!
ekalda
left a comment
Thanks @dchauhan-arm, broadly looks good, I left some suggestions for improvements :)
# IFM reshapes
ifm = post.args[0]
if len(params.ifm.shape) != 4 or not params.ifm.shape[1] == params.ifm.shape[2] == 1:
    ifm = relay.reshape(ifm, (-1, 1, 1, params.ifm.shape[-1]))
Suggested change:
-    ifm = relay.reshape(ifm, (-1, 1, 1, params.ifm.shape[-1]))
+    ifm = relay.reshape(ifm, (1, 1, 1, params.ifm.shape[-1]))
This should be safer, since the NPU doesn't support IFMs with a batch size of anything other than 1, and this kind of fully connected wouldn't be offloaded anyway.
def callback(self, pre, post, node_map):
    params = ethosu_patterns.FullyConnectedParams(post.op.body)
    params.ifm.tensor = post.args[0]
    activation_map = {"clip": "CLIP"}
nit: we don't expect that dict to expand, so we can just do if activation == "clip": etc
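A minimal sketch of what this nit suggests, with names taken from the quoted snippet where possible and otherwise assumed:

# Sketch: a plain conditional instead of the single-entry activation_map dict.
# params.activation and the "NONE" default are assumptions for illustration.
activation = "NONE"
if params.activation and params.activation.op.name == "clip":
    activation = "CLIP"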
if len(params.ofm.shape) != 4 or not params.ofm.shape[1] == params.ofm.shape[2] == 1:
    ethosu_fc = relay.reshape(ethosu_fc, params.ofm.shape)
I suspect there isn't a test case that exercises this case, since on line 1700 this pass runs after the no-op legalizer, so the last reshape won't have a following identity op and will fall over in TE.
@ir.transform.module_pass(opt_level=1)
class LegalizeFullyConnected:
    """This is the pass that wraps the AddRewriter"""
| """This is the pass that wraps the AddRewriter""" | |
| """This is the pass that wraps the FullyConnectedRewriter""" |
| """ | ||
|
|
||
| composite_name = "ethosu.fully_connected" | ||
| activation_map = {"clip": "CLIP"} |
Same nit about the clip dict as before :)
        if activation_function == "RELU":
            assert str(op.attrs.activation) == "CLIP"

    dense_pattern_table = [
nit: it would be better to keep the naming consistent, so maybe rename this to fc_pattern_table or fully_connected_pattern_table
    # check IFM
    ifm = op.args[0].checked_type
    assert list([1, 3, units, 1]) == list([1, 3, units, 1])
This assert doesn't check anything... Some things to potentially check (see the sketch after this list):
- That we have ended up with an ethosu_conv2d op (taking into account that there might be reshape ops before and after the conv2d)
- That the IFM is in a shape of (1, 1, 1, c)
- That the weights are in a shape of (o, 1, 1, c), with o being the output channels of the weights
- That the kernel and dilation are (1, 1)
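A minimal sketch of what those checks could look like in the test, assuming op is the legalized call and that ifm_shape and ofm_channels are defined as in the surrounding test code; treat the exact op and attribute names as assumptions:

# Sketch only -- op/attribute names are assumptions, not taken from the PR.
assert op.op.name == "contrib.ethosu.conv2d"
# IFM reshaped to NHWC with unit batch and spatial dims
assert list(op.args[0].checked_type.shape) == [1, 1, 1, ifm_shape[-1]]
# weights in (o, 1, 1, c) layout
weights_ohwi = op.args[1].data.asnumpy()
assert list(weights_ohwi.shape) == [ofm_channels, 1, 1, ifm_shape[-1]]
# 1x1 kernel and no dilation
assert list(op.attrs.kernel_shape) == [1, 1]
assert list(op.attrs.dilation) == [1, 1]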
Address comments, update codegen test, fix linting.
Address more comments, ensure qnn.dense is lowered to NPU, fix linting
Fix linting, update legalization test and codegen test for completeness.
lhutton1
left a comment
Thanks for the update @dchauhan-arm, looking much better! Mostly just a few stylistic suggestions below
import tvm  # type: ignore
from tvm import relay
-from tvm.relay.expr import Constant, Call  # type: ignore
+from tvm.relay.expr import Constant, Call
Nit: I don't think we need to change this
if not np.all(np.array(self.ifm.shape[:-1]) == 1):
    # As we reshape the ifm from
    # [n0, n1, ... , n_m] to [n0 * n1 * ... * n_{m-1}, n_m]
    # all except the last dims need to be 1.
    return False
I don't think we need this, due to the reasoning in the above comment, and since we already check that the batch size == 1 with check_batch_size above, and we know that the IFM must be 2D.
        return True

    # optional_bias_add = (
    #     is_op("nn.bias_add")(dense, is_constant()) | dense
    # )
Nit: remove comments :)
dense = is_op("qnn.dense")(
    wildcard(), is_constant(), is_constant(), is_constant(), is_constant(), is_constant()
)
optional_bias_add = is_op("nn.bias_add")(dense, is_constant()) | dense
I think this should just be optional_bias_add = is_op("nn.bias_add")(dense, is_constant())
    # check OFM
    ofm = op.checked_type
    assert [ofm.shape[2], ofm.shape[3]] == [1, ofm_channels]
Same as above, let's alter this to check the whole OFM shape: assert ofm.shape == ...
This would need to be assert list(ofm.shape) == [1, 1, 1, ofm_channels]
assert [ofm.shape[2], ofm.shape[3]] == [1, ofm_channels]
# assert list(ofm.shape) == list(expected_ofm_shape)
assert str(ofm.dtype) == dtype
# assert ofm.shape[3] == ofm_channels
Nit: remove some comments
assert weights_ohwi.shape[0] == ofm_channels
assert weights_ohwi.shape[1] == 1
assert weights_ohwi.shape[2] == 1
assert weights_ohwi.shape[3] == ifm_shape[1]
Nit: we could do this with one assert, e.g. assert list(weights_ohwi) == [ofm_channels, 1, 1, ifm_shape[1]]
# (1, 1),
# (1, 1),
# (1, 1),
# )
Nit: comments
# (1, 1),
# (1, 1),
# )
assert list(op.attrs.padding) == [0, 0, 0, 0]
Nit: might also be worth checking the op name is an NPU convolution here as well
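For example, a one-line sketch of that extra check; the exact registered op name is an assumption here:

# Sketch: confirm the legalized call is the NPU convolution (op name assumed).
assert op.op.name == "contrib.ethosu.conv2d"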
Looks like CI only failed because of a flaky GPU test :)
Address comments, fix linting. Certain legalization test assertions were updated. Co-authored-by: Rishabh Jain <[email protected]>
Fix assertion in legalization test.
    # check IFM
    ifm = op.args[0].checked_type
    assert [ifm.shape[2], ifm.shape[3]] == list(ifm_shape)
To get this change to work we would need assert list(ifm.shape) == [1, 1] + list(ifm_shape), since ifm.shape is not a list
    # check OFM
    ofm = op.checked_type
    assert [ofm.shape[2], ofm.shape[3]] == [1, ofm_channels]
This would need to be assert list(ofm.shape) == [1, 1, 1, ofm_channels]
# check weights
weights_ohwi = op.args[1].data.asnumpy()
assert str(weights_ohwi.dtype) == dtype
assert list(weights_ohwi) == [ofm_channels, 1, 1, ifm_shape[1]]
This one's just missing .shape, i.e. assert list(weights_ohwi.shape) == [ofm_channels, 1, 1, ifm_shape[1]]
ack! good spot on the list(ifm.shape)
Address comments, fixing assertion on ifm and ofm shape.
lhutton1
left a comment
Thanks @dchauhan-arm, LGTM!.. Providing CI stays green
manupak
left a comment
LGTM!
Thanks @dchauhan-arm @lhutton1 @ekalda @jainris !
* [microNPU] Add support for TFLite FULLY_CONNECTED
  This is primarily a legalization to an NPU Conv2d operator. The legalization target is Conv2d with 1 1 I O (HWIO)
* [microNPU] Add support for TFLite FULLY_CONNECTED
  Test TVM runtime against TFLite for codegen and operator legalization.
* [microNPU] Add support for TFLite FULLY_CONNECTED
  Fix linting
* [microNPU] Add support for TFLite FULLY_CONNECTED
  Address comments, update codegen test, fix linting.
* [microNPU] Add support for TFLite FULLY_CONNECTED
  Address more comments, ensure qnn.dense is lowered to NPU, fix linting
* [microNPU] Add support for TFLite FULLY_CONNECTED
  Fix linting, update legalization test and codegen test for completeness.
* [microNPU] Add support for TFLite FULLY_CONNECTED
  Address comments, fix linting. Certain legalization test assertions were updated. Co-authored-by: Rishabh Jain <[email protected]>
* [microNPU] Add support for TFLite FULLY_CONNECTED
  Fix assertion in legalization test.
* [microNPU] Add support for TFLite FULLY_CONNECTED
  Address comments, fixing assertion on ifm and ofm shape.

Co-authored-by: Rishabh Jain <[email protected]>