Computer Vision: U-Net

Convolutional Networks for Biomedical Image Segmentation

Rahul S
5 min read · Nov 25, 2022

U-Net is one of the most famous image segmentation architectures. It was proposed in 2015 by Olaf Ronneberger, Philipp Fischer, and Thomas Brox (University of Freiburg, Germany). [1]

One should read the full paper to really appreciate the architecture. What follows is an outline treatment, suitable for beginners.

U-Net is an end-to-end segmentation technique: it takes a raw image in and outputs a segmentation map of the image.

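The U-Net architecture is a U-shaped, symmetric convolutional network with a down-sampling contraction path and an up-sampling expansion path. In the original paper, the segmented output is smaller than the raw input (a 572 x 572 input tile yields a 388 x 388 segmentation map) because the convolutions are unpadded and each one trims the border. U-Net is fully convolutional: it has no fully connected layers.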
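To make the size difference concrete, here is a minimal sketch that traces only the spatial side length of an image through the network. It assumes the layout of the original paper (4 down-sampling blocks, unpadded 3 x 3 convolutions, 2 x 2 pooling, and 2 x 2 up-convolutions); the function name `unet_output_size` is made up for this illustration and does not come from any library.

```python
def unet_output_size(input_size: int, depth: int = 4) -> int:
    """Trace the spatial side length through a U-Net with unpadded convolutions.

    Assumes the layout of the original paper: every block has two unpadded
    3x3 convolutions (each trims 2 pixels), the contracting path halves the
    size with 2x2 pooling, and the expanding path doubles it again.
    """
    size = input_size

    # Contracting path: two 3x3 convolutions (-4 pixels), then 2x2 pooling (/2).
    for _ in range(depth):
        size -= 4
        if size % 2 != 0:
            raise ValueError(f"side length {size} is odd, cannot pool 2x2 cleanly")
        size //= 2

    # Bottom of the "U": two more 3x3 convolutions, no pooling.
    size -= 4

    # Expanding path: up-sampling doubles the size, two 3x3 convolutions trim 4.
    for _ in range(depth):
        size = size * 2 - 4

    # The final 1x1 convolution does not change the spatial size.
    return size


print(unet_output_size(572))  # 388
```

This is also why the input size cannot be arbitrary in the unpadded setup: every pooling step needs an even side length.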

The input image is fed into the network and propagated through it, producing a segmentation map as output.

Source: Ronneberger O., Fischer P., Brox T. (2015) U-Net: Convolutional Networks for Biomedical Image Segmentation. In: Navab N., Hornegger J., Wells W., Frangi A. (eds) Medical Image Computing and Computer-Assisted Intervention — MI

Contraction/down sampling path (Encoder Path):

The encoder path captures the context of the image. It is just a stack of convolution and max pooling layers.

The encoding path has 4 blocks. Each block consists of:
1) Two 3 x 3 convolution layers, each followed by a ReLU activation function (batch normalization is a common addition in later implementations; the original paper does not use it).
2) One 2 x 2…
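The sketch below walks through one encoder block on a single-channel image using plain Python lists, so each operation stays visible. It is an illustration under stated assumptions, not the paper's implementation: the second list item is cut off above, and the code assumes it refers to 2 x 2 max pooling with stride 2, as in the original paper. The helper names (`conv3x3_relu`, `max_pool2x2`, `encoder_block`) and the averaging kernel are hypothetical, and batch normalization is left out.

```python
def conv3x3_relu(image, kernel):
    """Unpadded 3x3 convolution followed by ReLU on a 2D list of floats.

    As in most deep learning libraries, this is technically a
    cross-correlation: the kernel is not flipped.
    """
    height, width = len(image), len(image[0])
    output = []
    for i in range(height - 2):
        row = []
        for j in range(width - 2):
            acc = sum(
                image[i + di][j + dj] * kernel[di][dj]
                for di in range(3)
                for dj in range(3)
            )
            row.append(max(0.0, acc))  # ReLU: clamp negatives to zero
        output.append(row)
    return output


def max_pool2x2(image):
    """2x2 max pooling with stride 2 (assumed form of the truncated step)."""
    height, width = len(image), len(image[0])
    return [
        [
            max(image[i][j], image[i][j + 1], image[i + 1][j], image[i + 1][j + 1])
            for j in range(0, width - 1, 2)
        ]
        for i in range(0, height - 1, 2)
    ]


def encoder_block(image, kernel_a, kernel_b):
    """Two conv + ReLU steps, then pooling. Returns (features, pooled)."""
    features = conv3x3_relu(conv3x3_relu(image, kernel_a), kernel_b)
    return features, max_pool2x2(features)


# Toy 12x12 input and a simple averaging kernel (both made up for the demo).
image = [[float((i + j) % 5) for j in range(12)] for i in range(12)]
mean_kernel = [[1.0 / 9.0] * 3 for _ in range(3)]

features, pooled = encoder_block(image, mean_kernel, mean_kernel)
print(len(features), len(features[0]))  # 8 8
print(len(pooled), len(pooled[0]))      # 4 4
```

Each convolution trims one pixel from every border (12 to 10 to 8), and the pooling step halves what is left (8 to 4). A real network would learn the kernels and work on many channels at once.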
