Computer Vision: UpSampling2D & Conv2DTranspose layers in TensorFlow
Upsampling means increasing the spatial dimensions (height and width) of an image, and with them its resolution. It is used in the decoders of segmentation networks, in GANs that create an image from a random vector sample, in models that improve an image's quality, and so on.
There are two ways of upsampling. Either we scale the original image up with interpolation, or we use a transposed convolution (often loosely called deconvolution).
SCALING
When an image is scaled up, the new pixels are generated with some kind of interpolation method.
Three common methods among the many available are:
1. nearest neighbor interpolation: each upsampled pixel is a copy of the closest pixel in the original image.
2. bilinear interpolation: each new pixel is a weighted average of the four nearest original pixels, computed by interpolating linearly along each axis, so that closer pixels get larger weights.
3. bicubic interpolation: this also works from the pixels nearest to the new one, but it uses a larger neighborhood (the 16 nearest pixels) and third-order polynomial interpolation instead of linear interpolation. It takes somewhat more time than the other two methods but generally gives smoother results.
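To make the first two methods concrete, here is a minimal pure-Python sketch. The function names `upsample_nearest` and `upsample_bilinear` are made up for illustration, and the bilinear version assumes that the corners of the output map exactly onto the corners of the input, which may differ from the sampling grid a given library uses.

```python
def upsample_nearest(img, factor):
    """Nearest neighbor: repeat every pixel `factor` times along both axes."""
    out = []
    for row in img:
        wide = [p for p in row for _ in range(factor)]
        out.extend([list(wide) for _ in range(factor)])
    return out


def upsample_bilinear(img, out_h, out_w):
    """Bilinear: blend the four nearest input pixels (assumed corner-aligned grid)."""
    in_h, in_w = len(img), len(img[0])
    out = []
    for i in range(out_h):
        # Position of output row i expressed in input coordinates.
        y = i * (in_h - 1) / (out_h - 1) if out_h > 1 else 0.0
        y0 = int(y)
        y1 = min(y0 + 1, in_h - 1)
        wy = y - y0
        row = []
        for j in range(out_w):
            x = j * (in_w - 1) / (out_w - 1) if out_w > 1 else 0.0
            x0 = int(x)
            x1 = min(x0 + 1, in_w - 1)
            wx = x - x0
            # Interpolate along the width first, then along the height.
            top = img[y0][x0] * (1 - wx) + img[y0][x1] * wx
            bottom = img[y1][x0] * (1 - wx) + img[y1][x1] * wx
            row.append(top * (1 - wy) + bottom * wy)
        out.append(row)
    return out


image = [[0.0, 10.0],
         [20.0, 30.0]]

nearest = upsample_nearest(image, 2)       # 4x4 image made of 2x2 blocks
bilinear = upsample_bilinear(image, 3, 3)  # 3x3 image with blended values

for row in nearest:
    print(row)
for row in bilinear:
    print(row)
```

Nearest neighbor only ever reuses values that already exist in the input, which gives a blocky result, while bilinear creates new in-between values such as the 15.0 at the center of the 3x3 output.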
TensorFlow API: UpSampling2D is a simple and fast Keras layer, typically used to scale feature maps back up after pooling layers in a convnet have shrunk them. It is added as a layer to the model.
tf.keras.layers.UpSampling2D(
size=(2, 2), data_format=None, interpolation='nearest', **kwargs
)
In code, upsampling is specified as a layer with a number of parameters.
The size parameter is the upsampling factor for rows and columns. For example, with size=(2, 2) as shown here, every pixel is upsampled to a two-by-two block of four pixels.
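As a small illustration of the size parameter, the sketch below runs the layer on a hand-made 2x2 image. It assumes TensorFlow 2.x with eager execution and the default channels_last layout; the variable names are illustrative only.

```python
import numpy as np
import tensorflow as tf

# One 2x2 grayscale image in (batch, height, width, channels) layout.
image = np.array([[1.0, 2.0],
                  [3.0, 4.0]], dtype=np.float32).reshape(1, 2, 2, 1)

# size=(2, 2) doubles both the height and the width.
layer = tf.keras.layers.UpSampling2D(size=(2, 2), interpolation="nearest")
upsampled = layer(tf.constant(image)).numpy()

print(upsampled.shape)        # (1, 4, 4, 1)
print(upsampled[0, :, :, 0])  # each input pixel becomes a 2x2 block
```

With nearest interpolation the layer has nothing to learn: it simply repeats rows and columns, so the output contains only values that were already present in the input.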
The data_format parameter tells Keras how the dimensions of the input are ordered.
- The option channels_first means that, when listing out the dimensions, the channel dimension comes before the height and the width: (batch, channels, height, width).