I want to move the image by 1 or 2 pixels, since I specified small translation values (1.25, 1.9) in the affine matrix.
But the image is moved far, far away, by what looks like hundreds of pixels
(my input image is fully filled with yellow pineapples).
Below is a working example.
```python
import torch
import numpy as np
import matplotlib.pyplot as plt
from torchvision import datasets, transforms
import torch.nn.functional as F

# Identity transform plus a translation of (1.25, 1.9)
rotation_simple = np.array([[1, 0, 1.25],
                            [0, 1, 1.9]])

# Load images
transform = transforms.Compose([transforms.Resize(255),
                                transforms.CenterCrop(224),
                                transforms.ToTensor()])
dataloader = torch.utils.data.DataLoader(
    datasets.ImageFolder('/home/Pictures', transform=transform),
    shuffle=True)

dtype = torch.FloatTensor

# Add the batch dimension once, outside the loop
rotation_simple = torch.as_tensor(rotation_simple)[None]

i = 0
while i < 3:
    img, labels = next(iter(dataloader))
    img = img  # .double()  # sometimes the tensor has to be converted to double, sometimes not
    grid = F.affine_grid(rotation_simple, img.size()).type(dtype)
    x = F.grid_sample(img, grid)
    plt.imshow(x[0].permute(1, 2, 0))  # drop the batch dimension before plotting
    plt.show()
    i += 1
```
I wonder why the function moves the image so far away instead of moving it by just 1 pixel in the x and y directions.
P.S. Setting `align_corners=True` didn't help in this case.
P.P.S. My PyTorch version is 1.4.0+cu100.
The “unit of measure” for the grid and the affine transformation is not pixels, but rather normalized coordinates:
> `grid` specifies the sampling pixel locations normalized by the input spatial dimensions. Therefore, it should have most values in the range of [-1, 1]. For example, values x = -1, y = -1 is the left-top pixel of input, and values x = 1, y = 1 is the right-bottom pixel of input.
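You can verify this convention with an identity transform: the corners of the generated grid land exactly on (-1, -1) and (1, 1). A minimal sketch (the 4×4 size is an arbitrary choice for illustration):

```python
import torch
import torch.nn.functional as F

# Identity transform: no rotation, no translation.
theta = torch.tensor([[[1.0, 0.0, 0.0],
                       [0.0, 1.0, 0.0]]])

# Build a sampling grid for a hypothetical 1x3x4x4 image.
grid = F.affine_grid(theta, (1, 3, 4, 4), align_corners=True)

print(grid[0, 0, 0])    # top-left corner     -> tensor([-1., -1.])
print(grid[0, -1, -1])  # bottom-right corner -> tensor([1., 1.])
```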
Therefore, translating by [1.25, 1.9] is actually translating by almost the entire image size. To get a pixel-wise translation, you need to divide the pixel offsets by half the image size, since one pixel corresponds to 2 / img.shape in normalized units.
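For example, to shift the content by 1.25 pixels in x and 1.9 pixels in y on your 224×224 crop, the offsets can be rescaled like this (a sketch under those assumptions; `tx_px`, `ty_px`, and the random stand-in image are hypothetical, not from your code):

```python
import torch
import torch.nn.functional as F

H, W = 224, 224            # spatial size of the input (your CenterCrop)
tx_px, ty_px = 1.25, 1.9   # desired shift in pixels (hypothetical names)

# One pixel == 2 / size in normalized units, so rescale the offsets.
theta = torch.tensor([[[1.0, 0.0, 2 * tx_px / W],
                       [0.0, 1.0, 2 * ty_px / H]]])

img = torch.rand(1, 3, H, W)  # stand-in for a loaded image batch
grid = F.affine_grid(theta, img.size(), align_corners=False)
out = F.grid_sample(img, grid, align_corners=False)
# Note: a positive offset moves the *sampling* grid right/down, so the
# visible image content shifts left/up by the same number of pixels.
```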
See the documentation of `grid_sample` for more information.