Computer Vision: UpSampling2D & Conv2DTranspose layers in TensorFlow

A Basic Introduction

Rahul S
3 min read · Nov 23, 2022

Upsampling means increasing the spatial dimensions of an image. It is used in the decoders of segmentation networks, in GANs that create an image from a random vector sample, and in models that improve an image’s quality, among other places. Upsampling increases the size of the image in pixels, and with it the resolution.

There are two ways to upsample. Either we scale the original image up with interpolation, or we use a learned layer called a transpose convolution (often loosely called deconvolution).
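To make the contrast concrete, here is a minimal sketch that puts the two routes side by side. The input shape, the 3x3 kernel and the stride of 2 are illustrative assumptions, not values from this article. Both layers double the height and width, but only the transpose convolution has weights to learn.

```python
import numpy as np
import tensorflow as tf

# One 4x4 single-channel image, channels_last: (batch, height, width, channels).
x = np.random.rand(1, 4, 4, 1).astype("float32")

# Route 1: scaling by interpolation. Nothing is learned.
scale = tf.keras.layers.UpSampling2D(size=(2, 2))
# Route 2: transpose convolution. The 3x3 kernel and the bias are trainable.
# (filters, kernel_size and strides are assumed values for this sketch.)
deconv = tf.keras.layers.Conv2DTranspose(
    filters=1, kernel_size=3, strides=2, padding="same"
)

y_scale = scale(x)
y_deconv = deconv(x)

print(tuple(y_scale.shape), scale.count_params())    # (1, 8, 8, 1) 0
print(tuple(y_deconv.shape), deconv.count_params())  # (1, 8, 8, 1) 10
```

The 10 parameters are the 9 kernel weights plus 1 bias for a single input channel and a single filter.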

SCALING

Scaling adds new pixels whose values are generated by an interpolation method.

Three common methods among the many available are:
1. nearest neighbor interpolation: each upsampled pixel is a copy of the closest pixel in the original image.
2. bilinear interpolation: each new pixel is a weighted average of the four nearest original pixels, with weights that come from linear interpolation on the distance along each axis.
3. bicubic interpolation: each new pixel is also computed from nearby original pixels, but over a larger neighborhood (typically 4×4) and with 3rd order polynomial interpolation instead of linear interpolation. This takes somewhat more time than the other two methods but usually gives smoother results.
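The first two methods can be sketched in a few lines of NumPy. This is an illustration only: the function names are hypothetical, and the bilinear version assumes an "aligned corners" coordinate mapping, which is not necessarily the convention TensorFlow uses internally. Bicubic is left out to keep the sketch short.

```python
import numpy as np

def upsample_nearest(image, factor):
    """Repeat every pixel `factor` times along height and width."""
    return np.repeat(np.repeat(image, factor, axis=0), factor, axis=1)

def upsample_bilinear(image, factor):
    """Blend the four surrounding pixels, weighted by distance.

    Assumes aligned corners: the corner pixels of the input and the
    output coincide.
    """
    h, w = image.shape
    out_h, out_w = h * factor, w * factor
    # Map each output coordinate back to a fractional source coordinate.
    ys = np.linspace(0, h - 1, out_h)
    xs = np.linspace(0, w - 1, out_w)
    y0 = np.floor(ys).astype(int)
    x0 = np.floor(xs).astype(int)
    y1 = np.minimum(y0 + 1, h - 1)
    x1 = np.minimum(x0 + 1, w - 1)
    wy = (ys - y0)[:, None]  # vertical weights
    wx = (xs - x0)[None, :]  # horizontal weights
    # Interpolate horizontally on the upper and lower rows, then vertically.
    top = image[y0][:, x0] * (1 - wx) + image[y0][:, x1] * wx
    bottom = image[y1][:, x0] * (1 - wx) + image[y1][:, x1] * wx
    return top * (1 - wy) + bottom * wy

image = np.array([[0.0, 10.0],
                  [20.0, 30.0]])

nearest = upsample_nearest(image, 2)
bilinear = upsample_bilinear(image, 2)
print(nearest)
print(bilinear)
```

Nearest neighbor produces flat 2×2 blocks, while bilinear produces a smooth ramp between the original pixel values.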

TensorFlow API: UpSampling2D is a simple and fast Keras layer for upsampling feature maps, typically ones that were reduced in size by pooling after convolutional layers. It is added as a layer to the model.

tf.keras.layers.UpSampling2D(
size=(2, 2), data_format=None, interpolation='nearest', **kwargs
)

In code, upsampling is specified as a layer with a number of parameters.

The size parameter is the upsampling factor for rows and columns. For example, with size=(2, 2) as shown here, every pixel is upsampled to a two-by-two array of four pixels.
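A small sketch of this behaviour, assuming a made-up 2×2 single-channel input and the default nearest interpolation:

```python
import numpy as np
import tensorflow as tf

# A single 2x2 grayscale image in channels_last layout:
# (batch, height, width, channels). The pixel values are illustrative.
image = np.array([[1.0, 2.0],
                  [3.0, 4.0]], dtype="float32").reshape(1, 2, 2, 1)

layer = tf.keras.layers.UpSampling2D(size=(2, 2), interpolation="nearest")
upsampled = np.asarray(layer(image))

print(upsampled.shape)  # (1, 4, 4, 1)
print(upsampled[0, :, :, 0])
```

Each input pixel becomes a 2×2 block with the same value, so the pixel 1 fills the top-left four positions of the output, the pixel 2 fills the top-right four, and so on.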

The data_format parameter tells Keras the order of the dimensions in the input tensor.

  • The option channels_first means that the channel dimension is listed before the height and the width: (batch, channels, height, width). The other option, channels_last, lists it after them: (batch, height, width, channels).
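The effect of data_format on shapes can be checked with a short sketch. The tensor sizes below are arbitrary assumptions chosen for illustration.

```python
import numpy as np
import tensorflow as tf

batch, height, width, channels = 1, 4, 4, 3

# channels_last: (batch, height, width, channels).
x_last = np.zeros((batch, height, width, channels), dtype="float32")
# channels_first: (batch, channels, height, width).
x_first = np.transpose(x_last, (0, 3, 1, 2))

up_last = tf.keras.layers.UpSampling2D(size=(2, 2), data_format="channels_last")
up_first = tf.keras.layers.UpSampling2D(size=(2, 2), data_format="channels_first")

out_last = up_last(x_last)
out_first = up_first(x_first)

print(tuple(out_last.shape))   # (1, 8, 8, 3)
print(tuple(out_first.shape))  # (1, 3, 8, 8)
```

In both cases only the height and the width are doubled. The data_format value just tells the layer which axes those are.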
