An Intuition of Neural Style Transfer

Part 1 on Neural Style Transfer

Rahul S
3 min read · Aug 11


Neural Style Transfer (NST) deals with two images: a content image and a style image. It recreates the content image in the style of the style image.

Here are the required inputs to the model for image style transfer:

  1. A Content Image: the image whose content we want to preserve
  2. A Style Image: the image whose style we want to transfer to the content image
  3. A Generated Image: the final blend of content and style

NST employs a pre-trained Convolutional Neural Network with added loss functions to transfer style from one image to another and synthesize a newly generated image with the features we want to add.
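The "added loss functions" can be sketched in a few lines. The sketch below is a minimal illustration in NumPy: a content loss (mean squared error between feature maps) combined with a style loss via weights `alpha` and `beta`, which are illustrative placeholders, not values from any specific implementation.

```python
import numpy as np

def content_loss(content_feats, generated_feats):
    # Mean squared error between the feature maps of the content
    # image and those of the generated image.
    return np.mean((content_feats - generated_feats) ** 2)

def total_loss(c_loss, s_loss, alpha=1.0, beta=1e3):
    # Weighted combination of content and style losses.
    # alpha and beta trade off content fidelity against stylization;
    # these defaults are arbitrary, for illustration only.
    return alpha * c_loss + beta * s_loss
```

Minimizing this combined loss with respect to the pixels of the generated image is what drives the synthesis.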

With a deep CNN, we can separate the representations of content and style. The VGG network is a prominent choice here because of its ability to build robust semantic representations.

It is our feature extractor.

To extract the content representation, we execute the following steps:

  1. We feed images through VGG and select the feature maps from a designated layer.
  2. These feature maps capture the elements that constitute the content.
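The two steps above can be sketched as follows. Since a full VGG is heavy, this toy version uses a stack of convolution + ReLU layers with random kernels as a stand-in for the pre-trained filters; the `extract_features` helper and the naive convolution are illustrative assumptions, not VGG's actual API.

```python
import numpy as np

def conv2d_valid(image, kernel):
    # Naive 'valid' 2-D convolution: no padding, stride 1.
    kh, kw = kernel.shape
    h, w = image.shape
    out = np.empty((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def extract_features(image, kernels, layer):
    # Run the image through a stack of conv + ReLU layers and return
    # the feature map at the designated layer (0-indexed). In real NST,
    # `kernels` would be the frozen, pre-trained VGG filters.
    x = image
    for depth, k in enumerate(kernels):
        x = np.maximum(conv2d_valid(x, k), 0.0)  # ReLU activation
        if depth == layer:
            return x
    return x
```

In practice one would use a framework's pre-trained VGG and read off the activations of a named intermediate layer rather than hand-rolling the convolutions.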

These feature maps are image-like representations whose character depends on the layer's depth within the network. This yields a spectrum ranging from low-level details like edges in early layers to high-level semantic features in deeper ones.

We know that higher layers of the model focus on the overall content of the image. Thus, images with the same content should have similar activations in the higher layers. So we extract feature maps from chosen intermediate layers and use them to describe the content and style of the input images.
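For the style side, the standard NST formulation (Gatys et al.) summarizes each chosen layer's feature maps with a Gram matrix: channel-to-channel correlations that discard spatial layout and therefore capture texture rather than content. A minimal NumPy sketch, assuming feature maps shaped `(channels, height, width)`:

```python
import numpy as np

def gram_matrix(feature_maps):
    # feature_maps: array of shape (channels, height, width).
    # Flatten the spatial dimensions and compute correlations between
    # channels. Spatial arrangement is discarded, which is why the Gram
    # matrix describes style (texture statistics) rather than content.
    c, h, w = feature_maps.shape
    f = feature_maps.reshape(c, h * w)
    return f @ f.T / (c * h * w)

def style_loss(style_feats, generated_feats):
    # Mean squared error between the two Gram matrices.
    return np.mean((gram_matrix(style_feats) - gram_matrix(generated_feats)) ** 2)
```

The normalization constant varies across implementations; dividing by `c * h * w` is one common choice.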

Taking a step towards the creation of stylized imagery, we initialize the generated image with Gaussian noise. Passing it through the VGG network yields an initial, rudimentary representation. The objective is to iteratively align this representation with the content representation.
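This alignment is done by gradient descent on the loss with respect to the generated image's pixels. The toy sketch below makes that loop concrete: it matches pixels directly (so the gradient of the MSE loss has a closed form), whereas real NST differentiates the loss through the frozen VGG network.

```python
import numpy as np

def stylize_toy(content, steps=100, lr=0.1, seed=0):
    # Toy illustration of the NST optimization loop: start from Gaussian
    # noise and gradient-descend on a pixel-space squared-error "content
    # loss". Real NST backpropagates through VGG feature maps instead of
    # matching pixels directly.
    rng = np.random.default_rng(seed)
    generated = rng.standard_normal(content.shape)
    for _ in range(steps):
        grad = 2.0 * (generated - content)  # gradient of sum((x - c)**2)
        generated = generated - lr * grad   # gradient-descent update
    return generated
```

After enough steps the noise image converges toward the content image; with the full content + style loss, it instead converges to a blend of the two.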


