[ET-VK] Deprecate gpu_sizes_ubo() and extents(); also toggle packing layout via specialization constants
Pull Request resolved: #3181
## Context
This changeset cleans up how shaders consume tensor metadata in two ways:
### Pass in Packing Layout via Specialization Constant
The packing layout of a tensor determines how to convert between tensor indices and physical texture coordinates. Currently, each packing layout is handled by generating a completely new variant of a shader, which is expensive in terms of build size.
Support for specialization constants was added a while back, which enables the packing layout to be communicated to the shader via a specialization constant. This is a much better and more natural way for a shader to determine the packing layout of its tensors and vary its behaviour accordingly.
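As a rough illustration of the idea (not the actual shader code), the sketch below models in Python how a single routine can branch on a layout constant instead of existing once per layout. The `PackedDim` enum, the `tensor_idx_to_texel` helper, and the mapping of NCHW dimensions onto texture axes (width to x, height to y, channels and batch folded into z, 4 elements per texel) are assumptions made for the example.

```python
from enum import IntEnum


class PackedDim(IntEnum):
    # Hypothetical names; the integer is what a specialization constant would carry.
    WIDTH = 0
    HEIGHT = 1
    CHANNELS = 2


def div_up(a, b):
    """Integer division rounding up."""
    return (a + b - 1) // b


def tensor_idx_to_texel(idx, sizes, packed_dim):
    """Map an (n, c, h, w) tensor index to ((x, y, z), component).

    Assumes 4 elements per texel along the packed dimension and that
    channels and batch are folded together into the z axis.
    """
    n, c, h, w = idx
    _, num_channels, _, _ = sizes
    if packed_dim == PackedDim.WIDTH:
        return (w // 4, h, n * num_channels + c), w % 4
    if packed_dim == PackedDim.HEIGHT:
        return (w, h // 4, n * num_channels + c), h % 4
    if packed_dim == PackedDim.CHANNELS:
        return (w, h, n * div_up(num_channels, 4) + c // 4), c % 4
    raise ValueError(f"unknown packed dim: {packed_dim}")
```

The same function body serves all three layouts; only the value of `packed_dim` changes, which is the role the specialization constant plays for a compiled shader.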
The primary benefit of this is that we can vastly reduce the number of variants that are generated. Generating shader variants for every combination of dtype and memory layout can lead to a combinatorial explosion of build size.
Note that dtype cannot be passed as a specialization constant since it impacts the types used in the layout portion of a shader.
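To make the build-size argument concrete, here is a small counting sketch. The dtype and layout lists are illustrative placeholders rather than the actual set supported by the codebase, and `variants_for_shader` is a hypothetical helper.

```python
from itertools import product

# Illustrative values only; the real sets of dtypes and layouts may differ.
DTYPES = ["float", "half", "int"]
LAYOUTS = ["width_packed", "height_packed", "channels_packed"]


def variants_for_shader(dtypes, layouts, layout_is_spec_constant):
    """List the variants that must be generated for one shader template.

    dtype always needs its own variant; layout only does when it is not
    supplied through a specialization constant.
    """
    if layout_is_spec_constant:
        return [(dtype,) for dtype in dtypes]
    return list(product(dtypes, layouts))


before = variants_for_shader(DTYPES, LAYOUTS, layout_is_spec_constant=False)
after = variants_for_shader(DTYPES, LAYOUTS, layout_is_spec_constant=True)
print(len(before), len(after))
```

With these placeholder lists, the number of generated variants per shader template drops from the product of the two list lengths to the number of dtypes alone.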
### Deprecate GPU sizes and Extents
Currently there are 3 representations of a tensor's sizes: `cpu_sizes()`, `gpu_sizes()`, and `extents()`. The GPU sizes are a simple modification of the CPU sizes in which the packed dim is aligned up to the next multiple of 4. The extents represent the physical extents of the image texture used to store the tensor.
However, shaders often need to reference the original sizes of the tensor, so we end up passing two different representations of the tensor sizes. The CPU sizes and extents are used to detect out-of-bounds elements, and the GPU sizes are used to convert between logical tensor indices and physical texture coordinates.
Since the GPU sizes and extents are easily determined from the CPU sizes given the packing layout, deprecate the GPU sizes and use the CPU sizes exclusively as the canonical tensor sizes. Hence `cpu_sizes()` is renamed to simply `sizes()`.
The primary benefits of this change are:
1. Less confusion over how to reference the tensor sizes
2. Fewer descriptors to bind when constructing compute pipelines
3. Fewer uniform buffers to update when resizing tensors between inferences
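The claim that the GPU sizes and extents follow directly from the sizes and the packing layout can be sketched as below. This is a minimal Python model, not the library's implementation: the helper names are hypothetical, sizes are assumed to be in (N, C, H, W) order, `packed_axis` is an index into that tuple, and the texture is assumed to place width on x, height on y, and channels times batch on z.

```python
def align_up(value, multiple):
    """Round value up to the next multiple of `multiple`."""
    return ((value + multiple - 1) // multiple) * multiple


def derive_gpu_sizes(sizes, packed_axis):
    """Pad the packed dimension of (N, C, H, W) sizes to a multiple of 4."""
    padded = list(sizes)
    padded[packed_axis] = align_up(padded[packed_axis], 4)
    return tuple(padded)


def derive_extents(sizes, packed_axis):
    """Compute assumed (x, y, z) texture extents from (N, C, H, W) sizes.

    Each texel holds 4 elements along the packed dimension, so that
    dimension shrinks by a factor of 4 after padding.
    """
    n, c, h, w = derive_gpu_sizes(sizes, packed_axis)
    dims = {1: c, 2: h, 3: w}  # axis index -> padded size
    dims[packed_axis] //= 4
    return (dims[3], dims[2], dims[1] * n)


sizes = (2, 6, 3, 5)
print(derive_gpu_sizes(sizes, packed_axis=1), derive_extents(sizes, packed_axis=1))
print(derive_gpu_sizes(sizes, packed_axis=3), derive_extents(sizes, packed_axis=3))
```

Because both derived quantities are pure functions of the sizes and the packed dimension, only the sizes need to be bound and updated when a tensor is resized.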
Differential Revision: [D56377775](https://our.internmc.facebook.com/intern/diff/D56377775/)
ghstack-source-id: 223317313