Convert image files into tensors and back with FastAi, PIL, Torchvision and vanilla PyTorch
Some basic processes may seem obvious to most people, but for me as a beginner, it takes time and experimentation to become familiar with some topics. Image processing for deep learning is one of them. This post is an attempt to understand the process of opening an image file, converting it to a tensor and, after training, converting it back into an image. My favorite library, FastAi, makes all of this possible without much effort, but sometimes, especially when playing with the data, it is very useful to understand the basics.
- This is where my data (an image) is located (a path to a folder):
- This is how I open a particular image in that folder:
- Now we have a PILImage.
- This is how we view it.
- This is how a PILImage is converted into a PyTorch tensor manually.
- Now we will use the Torchvision ToTensor class for the same process.
- Finally, the fast.ai ToTensor class.
- Transform a tensor (possibly from a prediction) back into a PILImage:
import fastbook
fastbook.setup_book()
from fastai.vision.all import *
from fastbook import *
# The line below disables Jedi autocompletion, which doesn't work well on my computer.
# Comment it out if that is not the case for you.
%config Completer.use_jedi = False
Note: This image was created with NightCafe Creator from a one-word text prompt: "FastAi".
Note: Understanding the "Path" class is another topic. In our case we can think of a Path as a location on a computer.
path = Path('images/image_explorations')
path
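To make the idea of a Path as "a location" a little more concrete, here is a minimal sketch using Python's standard pathlib module, which is where the Path class exported by FastAi comes from. The variable names are only illustrative, and the file does not need to exist for this to run, because nothing is opened yet.

```python
from pathlib import Path

# Folder and file name taken from this post; no file access happens here.
folder = Path('images/image_explorations')
image_path = folder / 'Fastai.jpg'   # the "/" operator joins path parts

print(image_path.name)               # Fastai.jpg
print(image_path.suffix)             # .jpg
print(image_path.parent == folder)   # True
```

The same `/` syntax is what makes `path/'Fastai.jpg'` work when opening the image.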
Note: PIL is the Python Imaging Library by Fredrik Lundh and contributors. Its best-known fork is Pillow; if you want to look deeper, see the Pillow documentation.
Im = PILImage.create(path/'Fastai.jpg')
type(Im)
Tip: There is no need to install PIL separately if you have FastAi. It is installed along with the library.
Im.show()
Note: Just pass the PILImage to the tensor function.
Im_tensor = tensor(Im)
Im_tensor.shape
Im_tensor[220][220]
Tip: Sometimes (almost every time) it is important to keep the tensor dimensions in a particular order. In this case it is "C H W" (channel x height x width). This is easy with the "permute" method.
Im_tensor = Im_tensor.permute(2,0,1)
Im_tensor.shape
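To see what permute(2,0,1) does to the indices, here is a small pure-Python sketch that uses nested lists instead of tensors. The tiny 2 x 3 "image" and its values are made up for illustration; only the index mapping matters.

```python
# A tiny 2 x 3 "image" in H W C order: each pixel is an [R, G, B] list.
hwc = [
    [[10, 11, 12], [20, 21, 22], [30, 31, 32]],
    [[40, 41, 42], [50, 51, 52], [60, 61, 62]],
]
height, width, channels = len(hwc), len(hwc[0]), len(hwc[0][0])

# Rearrange to C H W so that chw[c][h][w] == hwc[h][w][c],
# the same index mapping that permute(2, 0, 1) applies to a tensor.
chw = [[[hwc[h][w][c] for w in range(width)]
        for h in range(height)]
       for c in range(channels)]

shape = (len(chw), len(chw[0]), len(chw[0][0]))
print(shape)          # (3, 2, 3)
print(chw[0][1][2])   # red value of the pixel at row 1, column 2 -> 60
```

After the rearrangement the channel index comes first, which is why the red value of a pixel is read as `[0][row][column]`.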
Important: At location (220, 220), the value of the "red" channel is 33. Please keep this in mind. We will meet this number again with the other methods.
Im_tensor[0][220][220]
Note: The Torchvision library is part of the PyTorch project. It is used for transforming images and contains pretrained models, datasets, etc. Below we import transforms from the library.
import torchvision.transforms as T
torchvis_transform = T.ToTensor()
Img_torchvis = torchvis_transform(Im)
Important: With Torchvision, we do not need to rearrange the tensor dimensions. They are already in the expected order: "C H W" (channel x height x width).
Img_torchvis.shape
Note: The result is not 33 but 0.1294: same location, different value. Why?
Img_torchvis[0][220][220]
Note: The type is the same as before: a PyTorch tensor.
type(Img_torchvis)
doc(ToTensor)
Note: There is also a ToTensor class in the FastAi library, but I believe its behaviour is slightly different from the previous examples.
transform = ToTensor()
Img_tensor = transform(Im)
Img_tensor.shape
Note: The shape of the tensor is right, but the value is 33, not 0.1294.
Img_tensor[0][220][220]
Note: In fact the result is the same; it just needs to be scaled into the range 0 to 1 by dividing by 255.
Img_tensor=Img_tensor/255
Img_tensor[0][220][220]
Im_tensor[0][220][220]
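The link between the two numbers is plain division by 255, the largest value an 8-bit pixel can hold. The arithmetic can be checked without any tensor library; the variable names below are only illustrative.

```python
raw_value = 33                     # 8-bit red value read from the integer tensor
scaled_value = raw_value / 255     # the same pixel expressed in the range 0 to 1

print(round(scaled_value, 4))      # 0.1294
```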
Note: We use the Torchvision library to convert the tensor into an image.
transform_to_im = T.ToPILImage()
image_from_tensor = transform_to_im(Im_tensor)
Note: This is useful when debugging your code during development, while feeding a model with real data.
image_from_tensor
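If the tensor comes from a prediction, its values may be floats around the range 0 to 1 rather than 8-bit integers. A rough sketch of the reverse mapping is below. The helper `to_uint8` is hypothetical, and the choice to clamp and round is an assumption of this sketch; libraries may follow a different convention (for example truncating instead of rounding).

```python
def to_uint8(value: float) -> int:
    """Map a float in [0, 1] to an integer in [0, 255] (hypothetical helper)."""
    clamped = min(max(value, 0.0), 1.0)   # keep out-of-range predictions inside [0, 1]
    return round(clamped * 255)           # rounding is this sketch's own choice

print(to_uint8(33 / 255))   # 33
print(to_uint8(1.2))        # 255
print(to_uint8(-0.1))       # 0
```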
Note: The new size must be a tuple of (width, height).
resizeTo = (224, 224)
image_from_tensor = image_from_tensor.resize(resizeTo)
image_from_tensor