While trying to build a letter classifier in ML, I wrote the following code to create the image data and labels from a folder of images using PIL.
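    import os
    import numpy as np
    from PIL import Image

    IMG_HEIGHT = 32  # images are 32 x 32, as described below
    IMG_WIDTH = 32

    def create_dataset_PIL(img_folder):
        img_data_array = []
        class_name = []
        # Each sub-folder name is used as the class label.
        for dir1 in os.listdir(img_folder):
            print(dir1)
            for file in os.listdir(os.path.join(img_folder, dir1)):
                image_path = os.path.join(img_folder, dir1, file)
                image = np.array(Image.open(image_path))
                image = np.resize(image, (IMG_HEIGHT, IMG_WIDTH, 3))
                image = image.astype('float32')
                image /= 255  # scale pixel values to the range 0..1
                img_data_array.append(image)
                class_name.append(dir1)
        return img_data_array, class_name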
Each image in the dataset is already 32 x 32 pixels, and I am resizing it to an array of shape 32 x 32 x 3.
But I don't understand what this 3rd dimension is, when all I need is 32 x 32 pixels.
I stumbled upon "Numpy Resize/Rescale Image", where I learned this may be an interpolation parameter. From YouTube I also learned that interpolation is required when resizing images. But I don't know what to do with this extra data. Should the input layer of my neural network now be 32 x 32 x 3 instead of just 32 x 32?
The 3 represents the RGB (red, green, blue) channels. Each pixel of the image is represented by 3 values instead of one. In a black-and-white (grayscale) image, each pixel would be represented by [value]; in an RGB image, each pixel is represented by [R, G, B].
In fact, each pixel of the image has 3 RGB values. For 8-bit images these range between 0 and 255 and represent the intensity of red, green, and blue. A higher value stands for higher intensity and a lower value for lower intensity. For instance, one pixel can be represented as a list of these three values, [78, 136, 60]. Black would be represented as [0, 0, 0] and white as [255, 255, 255].
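To make the channel layout concrete, here is a minimal sketch that builds a tiny 2 x 2 RGB image by hand. The array name `image` and the pixel values are made up for illustration; only the [78, 136, 60] pixel comes from the text above.

```python
import numpy as np

# A hand-made 2 x 2 RGB image: the last axis holds the R, G, B values of a pixel.
image = np.array(
    [
        [[0, 0, 0], [255, 255, 255]],   # black, white
        [[255, 0, 0], [78, 136, 60]],   # pure red, the example pixel
    ],
    dtype=np.uint8,
)

print(image.shape)            # (2, 2, 3): height, width, channels
print(image[1, 1].tolist())   # [78, 136, 60]

# Same scaling as in the question: values now lie between 0.0 and 1.0.
scaled = image.astype("float32") / 255
print(scaled.min(), scaled.max())
```

A 32 x 32 RGB image has the same layout, just with shape (32, 32, 3).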
And yes: your input layer should match this 32 x 32 x 3 shape (3072 values per image if you flatten it).
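One caveat worth checking: `np.resize` does not interpolate and does not convert colour modes. It only repeats or truncates the flattened data to fill the requested shape. If the files are already RGB at 32 x 32, the call changes nothing; if they are grayscale, the result has the right shape but scrambled pixels. The sketch below illustrates this with a synthetic grayscale array (the `gray` array is a stand-in, not data from the question) and shows one possible alternative, stacking the single channel three times. With PIL, calling `Image.open(image_path).convert("RGB")` before `np.array(...)` should give a comparable result.

```python
import numpy as np

# Synthetic 32 x 32 grayscale image standing in for one file from the dataset.
gray = (np.arange(32 * 32) % 256).astype(np.uint8).reshape(32, 32)

# np.resize just refills the new shape from the flattened data, so the
# "channels" of pixel (0, 0) are really three neighbouring gray pixels.
resized = np.resize(gray, (32, 32, 3))
print(resized.shape)            # (32, 32, 3)
print(resized[0, 0].tolist())   # [0, 1, 2]

# Stacking the same channel three times keeps every pixel where it was.
stacked = np.stack([gray, gray, gray], axis=-1)
print(stacked.shape)            # (32, 32, 3)
print(stacked[0, 1].tolist())   # [1, 1, 1]

# Number of inputs if the image is flattened before a dense layer.
print(stacked.size)             # 3072
```

If colour carries no information for letter classification, keeping the images as 32 x 32 (or 32 x 32 x 1) and sizing the input layer to match is also a reasonable choice.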