Closed
Description
System Info
python-3.9.10
transformers-4.31.0
pytorch-2.0.1
Who can help?
Information
- The official example scripts
- My own modified scripts
Tasks
- An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
- My own task or dataset (give details below)
Reproduction
The following example (based on the examples from the docs) gives consistent results with transformers-4.27.2 whether the image is kept as uint8 or converted to float32. With 4.31.0, however, the result is wrong when the input is float32:
import torch
import numpy as np
from transformers import AutoImageProcessor, UperNetForSemanticSegmentation
from PIL import Image
from huggingface_hub import hf_hub_download
image_processor = AutoImageProcessor.from_pretrained("openmmlab/upernet-convnext-tiny")
model = UperNetForSemanticSegmentation.from_pretrained("openmmlab/upernet-convnext-tiny")
filepath = hf_hub_download(
    repo_id="hf-internal-testing/fixtures_ade20k",
    filename="ADE_val_00000001.jpg",
    repo_type="dataset",
)
image = Image.open(filepath).convert("RGB")
image = np.array(image)
# Comment out the line below to get the right result in 4.31.0
image = image.astype(np.float32)/255.0
inputs = image_processor(images=image, return_tensors="pt").pixel_values
outputs = model(inputs)
sizes = [np.array(image).shape[:2]]
seg = torch.stack(image_processor.post_process_semantic_segmentation(outputs, target_sizes=sizes))
torch.unique(seg)
Expected behavior
Expected result (observed behaviour in 4.27.2 regardless of whether the float conversion is commented out):
tensor([ 0, 1, 2, 4, 6, 9, 17, 25, 52, 53])
Actual result in 4.31.0 (unless the float conversion is commented out and the input image is kept as uint8):
tensor([2])
(i.e., the whole image is perceived as one class)
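A possible stopgap, based only on the observation above that uint8 input still gives the expected classes in 4.31.0, is to convert a float image back to uint8 before it reaches the image processor. The sketch below is not a fix for the regression itself. The helper name `to_uint8_image` is hypothetical (it is not part of transformers), and it assumes that float images are scaled to the range [0, 1], as in the reproduction script.

```python
import numpy as np


def to_uint8_image(image: np.ndarray) -> np.ndarray:
    """Hypothetical helper: return the image as uint8 in [0, 255].

    Assumption: floating-point inputs are scaled to [0, 1], as produced by
    `image.astype(np.float32) / 255.0` in the reproduction above.
    """
    if image.dtype == np.uint8:
        # Already in the format that works in both 4.27.2 and 4.31.0.
        return image
    if np.issubdtype(image.dtype, np.floating):
        # Undo the [0, 1] scaling; round before casting to avoid truncation.
        scaled = np.rint(image * 255.0)
        return np.clip(scaled, 0, 255).astype(np.uint8)
    raise TypeError(f"Unsupported image dtype: {image.dtype}")
```

In the reproduction script this would be used as `image_processor(images=to_uint8_image(image), return_tensors="pt")`, which hands the processor the same uint8 array as when the float conversion line is commented out.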