
Conversation

@Stonepia
Contributor

The un-templated data_ptr() performs neither the type check nor the storage_initialized() check, so it may introduce subtle bugs.

By using the templated data_ptr, such misuse is caught early with an error like the one below:

torch._dynamo.exc.TorchRuntimeError: Failed running call_function torchvision.roi_align(*(FakeTensor(..., device='xpu:0', size=(1, 1024, 50, 75), dtype=torch.bfloat16), FakeTensor(..., device='xpu:0', size=(1000, 5), dtype=torch.bfloat16), 0.0625, 14, 14, 0, True), **{}):
The tensor has a non-zero number of elements, but its data is not allocated yet. Caffe2 uses a lazy allocation, so you will need to call mutable_data() or raw_mutable_data() to actually allocate memory.

@Stonepia Stonepia requested a review from toyxu July 26, 2024 07:18
@fengyuan14 fengyuan14 changed the title from "Use safer data_ptr access" to "Use safe API to access Tensor data pointer" Jul 26, 2024
@Stonepia Stonepia marked this pull request as draft July 26, 2024 07:33
@Stonepia Stonepia force-pushed the tongsu/data_ptr branch 2 times, most recently from 95dccc4 to 91fe767 Compare July 26, 2024 09:05
@Stonepia Stonepia marked this pull request as ready for review July 26, 2024 09:39
@Stonepia
Contributor Author

Note that currently, no bug caused by this unsafe data_ptr has been found in stock PyTorch (but we found one in IPEX).

This PR doesn't address the data_ptr calls wrapped in static_cast. Those occurrences cannot use the templated data_ptr. Similarly, the data_ptr there can't be changed to const_data_ptr.

static_cast<TLMetaForAddressScalar<scalar_vals_t, depth>*>(
          addressStorage.data_ptr());

// Could not be changed to the typed accessor: the storage's element type is not int64_t
static_cast<TLMetaForAddressScalar<scalar_vals_t, depth>*>(
          addressStorage.mutable_data_ptr<int64_t>());

// Set as mutable_data_ptr(). Could not use const_data_ptr() here, since the result is written through
static_cast<TLMetaForAddressScalar<scalar_vals_t, depth>*>(
          addressStorage.mutable_data_ptr());

To fix this, we may need to further change these static_cast usages (to align with the PyTorch implementation).

@Stonepia Stonepia requested a review from fengyuan14 July 26, 2024 09:55
Stonepia added 4 commits July 26, 2024 13:44
The un-templated data_ptr() performs neither the type check nor the storage_initialized() check, so it may introduce subtle bugs.
@fengyuan14 fengyuan14 added this pull request to the merge queue Jul 29, 2024
Merged via the queue into main with commit 0608225 Jul 29, 2024
@fengyuan14 fengyuan14 deleted the tongsu/data_ptr branch July 29, 2024 03:20