
Conversation

@Stonepia
Contributor

The un-templated data_ptr() performs neither the type check nor the storage_initialized() check, so it may introduce subtle bugs.

By using the templated data_ptr, such misuse is caught early with an error like the one below:

torch._dynamo.exc.TorchRuntimeError: Failed running call_function torchvision.roi_align(*(FakeTensor(..., device='xpu:0', size=(1, 1024, 50, 75), dtype=torch.bfloat16), FakeTensor(..., device='xpu:0', size=(1000, 5), dtype=torch.bfloat16), 0.0625, 14, 14, 0, True), **{}):
The tensor has a non-zero number of elements, but its data is not allocated yet. Caffe2 uses a lazy allocation, so you will need to call mutable_data() or raw_mutable_data() to actually allocate memory.

@Stonepia Stonepia requested a review from toyxu July 26, 2024 07:18
@fengyuan14 fengyuan14 changed the title from "Use safer data_ptr access" to "Use safe API to access Tensor data pointer" Jul 26, 2024
@Stonepia Stonepia marked this pull request as draft July 26, 2024 07:33
@Stonepia Stonepia force-pushed the tongsu/data_ptr branch 2 times, most recently from 95dccc4 to 91fe767 Compare July 26, 2024 09:05
@Stonepia Stonepia marked this pull request as ready for review July 26, 2024 09:39
@Stonepia
Contributor Author

Note that currently, no bug caused by this unsafe data_ptr has been found in stock PyTorch (but we found one in IPEX).

This PR doesn't address the data_ptr calls wrapped in static_cast. Those occurrences cannot use the templated data_ptr. Similarly, the data_ptr there can't be changed to const_data_ptr.

static_cast<TLMetaForAddressScalar<scalar_vals_t, depth>*>(
          addressStorage.data_ptr());

// Could not be changed to the typed accessor: the storage's element type is not int64_t
static_cast<TLMetaForAddressScalar<scalar_vals_t, depth>*>(
          addressStorage.mutable_data_ptr<int64_t>());

// Set as mutable_data_ptr(). Could not use const_data_ptr() here, since the result is written through
static_cast<TLMetaForAddressScalar<scalar_vals_t, depth>*>(
          addressStorage.mutable_data_ptr());

To fix this, we may need to further change these static_cast usages (to align with the PyTorch implementation).

@Stonepia Stonepia requested a review from fengyuan14 July 26, 2024 09:55
Stonepia added 4 commits July 26, 2024 13:44
The un-templated data_ptr() performs neither the type check nor the storage_initialized() check, so it may introduce subtle bugs.
@fengyuan14 fengyuan14 added this pull request to the merge queue Jul 29, 2024
Merged via the queue into main with commit 0608225 Jul 29, 2024
@fengyuan14 fengyuan14 deleted the tongsu/data_ptr branch July 29, 2024 03:20