Think Upon AI

Generative Adversarial Networks (GANs): Explained

Photo by Pat Moin on Unsplash

GANs were introduced by Ian Goodfellow and his colleagues in June 2014.

A Generative Adversarial Network (GAN) is a type of architecture that enables generative modeling with deep learning methods. The name can be broken into three parts:

  • Generative – The model can generate new data instances using noise/random distributions.
  • Adversarial – The training is done in an adversarial (attack/defense) setting.
  • Networks – Deep neural networks are deployed in the architecture for training purposes.
 
Let us first look at the idea behind generative modeling, in contrast to the better-known discriminative models.

Generative vs. Discriminative models

As most of us probably know, discriminative models discriminate between different kinds of data instances. Models of this type include logistic regression, decision trees, linear regression and so on. In contrast, generative models generate new data instances. These include Bayesian networks, Markov random fields, and GANs.

Although both generative and discriminative models learn a distribution, generative models have to tackle a more difficult task. A discriminative model only needs the distribution of the label given the data, while a generative model has to capture the distribution of the data itself. They have to learn “more”.

Example

For instance, if a discriminative model has to classify between a car and a bike, it has to learn patterns such as “a car has 4 wheels while a bike has only 2 wheels”. To generate new car or bike instances, however, the generative model has to learn correlations like “the car wheels have a specific shape and sit at a particular distance from each other” or “a car is most likely to appear near things that look like a road or a parking lot”. That is a much more complex distribution than the one a discriminative model has to learn.

This is because discriminative models only have to draw boundaries in the data space, while generative models have to learn how the data is distributed throughout that space.
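
To make the contrast concrete, here is a minimal sketch on made-up one-dimensional data. The feature (a noisy wheel count), the threshold classifier, and the single-Gaussian generative model are all assumptions chosen for illustration. They are far simpler than anything a GAN learns, but they show the difference between drawing a boundary and modeling where the data lives.

```python
import random
import statistics

random.seed(0)

# Hypothetical toy data: one noisy feature per vehicle (think "wheel count").
bikes = [random.gauss(2.0, 0.3) for _ in range(200)]
cars = [random.gauss(4.0, 0.3) for _ in range(200)]

def fit_threshold(bikes, cars):
    """Discriminative view: pick the boundary with the fewest training errors."""
    def errors(t):
        return sum(b > t for b in bikes) + sum(c <= t for c in cars)
    return min(sorted(bikes + cars), key=errors)

boundary = fit_threshold(bikes, cars)

def classify(x):
    return "car" if x > boundary else "bike"

# Generative view: model how the car data is spread over the space
# (here with a single Gaussian) so that new instances can be sampled.
car_mu = statistics.mean(cars)
car_sigma = statistics.stdev(cars)

def generate_car():
    return random.gauss(car_mu, car_sigma)

new_cars = [generate_car() for _ in range(5)]
print("boundary:", boundary)
print("labels:", classify(3.8), classify(2.1))
print("generated car features:", new_cars)
```

The classifier only stores a single number (the boundary), whereas the generative model has to store a description of the whole class distribution before it can produce anything new.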

Now that we know the difference, we can look more closely at how GANs use their architecture for generative modeling.

GANs Anatomy

A Generative Adversarial Network consists of two networks:

a) Generator (G): The generator turns noise into an imitation of the data in order to trick the discriminator.

b) Discriminator (D): The discriminator tries to tell real data apart from the fakes created by the generator.

Here, both the generator and the discriminator are deep neural networks and hence can be trained using backpropagation.

Figure: Generative Adversarial Networks (GANs) architecture

The interesting part is that the model is trained in an adversarial, or two-player game, setting. The two players, the generator and the discriminator, are trained simultaneously on the input data.

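To train with backpropagation, the discriminator and generator losses are calculated as follows. They are written here in the standard form from the original GAN paper, where x is a real sample, z is a noise sample, G(z) is the fake sample produced by the generator, and D(·) is the discriminator's estimated probability that its input is real.

  • Discriminator loss function (maximization problem)

$$\max_D \; \underbrace{\mathbb{E}_{x}\left[\log D(x)\right]}_{(1)} + \underbrace{\mathbb{E}_{z}\left[\log\left(1 - D(G(z))\right)\right]}_{(2)}$$

  • Generator loss function (minimization problem)

$$\min_G \; \underbrace{\mathbb{E}_{z}\left[\log\left(1 - D(G(z))\right)\right]}_{(2)}$$

Here, term (1) is the log-probability that the discriminator correctly classifies the real input data, and term (2) is the log-probability that the discriminator correctly classifies the fake data coming from the generator. The discriminator tries to maximize both term (1) and term (2) in order to correctly separate real from fake data. The generator has the opposing objective of minimizing term (2), so that its realistic fake data fools the discriminator.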
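
As a small numerical sketch of these two objectives, the snippet below assumes the standard formulation, in which term (1) is the average of log D(x) over real samples and term (2) is the average of log(1 - D(G(z))) over fake samples. The function names and the discriminator outputs are made up for illustration.

```python
import math

def discriminator_objective(d_real, d_fake):
    """Term (1) + term (2): the quantity the discriminator tries to maximize."""
    term1 = sum(math.log(p) for p in d_real) / len(d_real)
    term2 = sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return term1 + term2

def generator_objective(d_fake):
    """Term (2) only: the quantity the generator tries to minimize."""
    return sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)

# Hypothetical discriminator outputs (estimated probability that an input is real).
confident = discriminator_objective(d_real=[0.9, 0.8], d_fake=[0.1, 0.2])
fooled = discriminator_objective(d_real=[0.5, 0.5], d_fake=[0.5, 0.5])

print("discriminator objective when it separates well:", confident)
print("discriminator objective when it is guessing:", fooled)
print("generator objective when fakes are caught:", generator_objective([0.1, 0.2]))
print("generator objective when fakes pass as real:", generator_objective([0.9, 0.8]))
```

A discriminator that separates the two kinds of data well scores higher than one that outputs 0.5 everywhere, and the generator's objective drops as more of its fakes are scored as real.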

Intuition behind GANs

The overall idea of GANs is to sample from something as simple as a noise distribution. During training, the generator network learns a transformation, or mapping, from that noise to the data space. This network is then used to generate new fake instances.

Now, let us look at how this mapping is learned in an adversarial setting. As an analogy, we can imagine the generator network as a counterfeiter and the discriminator network as a police investigator.

At the start of training, the weights of the generator are locked. During this period, the discriminator network is trained to distinguish between real and fake input data. It does so by learning the distribution through backpropagation. Just to be clear, the real data comes from the real-world domain, while the fake data is produced by passing noise through the fixed generator weights.

Figure: GANs discriminator

Once the discriminator is able to distinguish between the real and fake data, its weights are locked in turn. During this phase, the generator network is trained to produce fake data that the current discriminator fails to tell apart from the real data.

The discriminator then relearns the distribution so that it can again separate the real data from the new fake data. This competition goes on until the discriminator is unable to differentiate the real data from the fake data. At that point training stops, as the model cannot improve any further.

So, the stronger the discriminator is at separating real data from fake data, the more realistic the data generated by the generator network will be.
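
The alternating procedure can be sketched in a few lines of plain Python. Everything here is an assumption made for illustration: the real data is a one-dimensional Gaussian, the generator is a scale and shift of the noise, the discriminator is a single logistic unit, and the gradients are derived by hand for the standard objective (the discriminator ascends on log D(x) + log(1 - D(G(z))), the generator descends on log(1 - D(G(z)))). Each round takes one step per player rather than training the discriminator to convergence first, and a model this small is not guaranteed to settle at a clean equilibrium.

```python
import math
import random

random.seed(0)

def sigmoid(s):
    # Numerically stable logistic function.
    if s >= 0:
        return 1.0 / (1.0 + math.exp(-s))
    e = math.exp(s)
    return e / (1.0 + e)

def generate(gen, noise):
    a, b = gen  # hypothetical generator: G(z) = a * z + b
    return [a * z + b for z in noise]

def discriminate(disc, xs):
    w, c = disc  # hypothetical discriminator: D(x) = sigmoid(w * x + c)
    return [sigmoid(w * x + c) for x in xs]

def discriminator_step(disc, real, fake, lr):
    """Generator locked: one gradient-ascent step on the discriminator objective."""
    w, c = disc
    d_real, d_fake = discriminate(disc, real), discriminate(disc, fake)
    grad_w = (sum((1 - d) * x for d, x in zip(d_real, real)) / len(real)
              - sum(d * x for d, x in zip(d_fake, fake)) / len(fake))
    grad_c = sum(1 - d for d in d_real) / len(real) - sum(d_fake) / len(fake)
    return (w + lr * grad_w, c + lr * grad_c)

def generator_step(gen, disc, noise, lr):
    """Discriminator locked: one gradient-descent step on log(1 - D(G(z)))."""
    a, b = gen
    w, _ = disc
    d_fake = discriminate(disc, generate(gen, noise))
    grad_a = sum(-d * w * z for d, z in zip(d_fake, noise)) / len(noise)
    grad_b = sum(-d * w for d in d_fake) / len(noise)
    return (a - lr * grad_a, b - lr * grad_b)

def train(rounds=500, batch=64, lr=0.05, real_mean=4.0):
    gen, disc = (1.0, 0.0), (0.0, 0.0)
    for _ in range(rounds):
        real = [random.gauss(real_mean, 1.0) for _ in range(batch)]
        noise = [random.gauss(0.0, 1.0) for _ in range(batch)]
        disc = discriminator_step(disc, real, generate(gen, noise), lr)
        noise = [random.gauss(0.0, 1.0) for _ in range(batch)]
        gen = generator_step(gen, disc, noise, lr)
    return gen, disc

gen, disc = train()
samples = generate(gen, [random.gauss(0.0, 1.0) for _ in range(5)])
print("generator (scale, shift):", gen)
print("discriminator (weight, bias):", disc)
print("new fake samples:", samples)
```

Real GANs replace the two tiny models with deep networks and let an automatic differentiation library compute the gradients, but the lock-one-train-the-other structure is the same.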

Let us look at a few extensions of GANs that made them popular for various potential applications in today’s world.

Advances and applications

There have been many extensions to the concept of GANs, driven by a variety of applications. A few important ones are listed below.

1) Conditional GANs (CGAN) [Paper] 

     This type of architecture allows the network to generate new instances conditioned on a class label. For instance, if we want to generate images of flowers but only of a specific class, a plain GAN offers no way to request that class. To solve this problem, the class labels are provided as a conditioning input to both the generator and the discriminator network.
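
One common way to provide the label, shown here as a minimal sketch, is to append a one-hot encoding of the class to the generator's noise vector and to the discriminator's input. The helper names, vector sizes, and the three flower classes are assumptions for illustration only.

```python
import random

random.seed(0)

def one_hot(label, num_classes):
    return [1.0 if i == label else 0.0 for i in range(num_classes)]

def generator_input(noise, label, num_classes):
    # Conditioning by concatenation: noise vector followed by the label encoding.
    return noise + one_hot(label, num_classes)

def discriminator_input(sample, label, num_classes):
    # The discriminator sees the sample together with the label it should match.
    return sample + one_hot(label, num_classes)

NUM_CLASSES = 3  # hypothetical classes, e.g. rose, tulip, daisy
noise = [random.gauss(0.0, 1.0) for _ in range(4)]

g_in = generator_input(noise, label=1, num_classes=NUM_CLASSES)
d_in = discriminator_input([0.5, 0.5], label=2, num_classes=NUM_CLASSES)
print("generator input length:", len(g_in))
print("label part of generator input:", g_in[-NUM_CLASSES:])
print("discriminator input:", d_in)
```

Because both networks receive the label, the discriminator can reject a realistic image of the wrong class, which pushes the generator to respect the requested class.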

 

Figure: Conditional GANs

2) Super Resolution GANs (SRGAN) [Paper]

     These networks are particularly useful for upscaling low-resolution images into detailed higher-resolution images. They do so by filling in the fine detail that would otherwise appear blurry when the image is upscaled.

Figure: Super-resolved image (left), almost indistinguishable from the original (right) [4x upscaling]. Image credit: C. Ledig et al.

3) CycleGANs [Paper]

     These are among the most popular types of GANs today. They can transform data from one domain to another. For instance, a CycleGAN can turn a winter image into a summer one. Many of us have likely experimented with face-editing apps such as FaceApp, which alter human faces to simulate different age groups, a similar kind of image-to-image translation. The popular “deepfakes” are a related application of generative models, although they are not necessarily built on CycleGAN itself. I would recommend watching the famous fake Obama video created with such AI tools to appreciate the immense potential of this technology.

4) Deep Convolutional GANs (DCGAN) [Paper]

     Convolutional networks are the usual choice when dealing with images. DCGANs use deep convolutional networks in both the generator and the discriminator to generate realistic images. One application is augmenting image data, which increases the size of the dataset available for training a good model. DCGANs can also be used to generate anime characters, something that is otherwise a manual and time-consuming process.

As an example, the image below comes from a later convolutional GAN, described by Tero Karras et al. in their paper titled “Progressive Growing of GANs for Improved Quality, Stability, and Variation”. It shows generated photos of realistic human faces, and the results are simply remarkable.

 

Figure: Examples of photorealistic GAN-generated faces. Taken from Progressive Growing of GANs for Improved Quality, Stability, and Variation, 2017.

Summary

In this blog, we covered:

   – generative modeling in contrast to discriminative models

   – the basic anatomy of the GAN architecture

   – the intuition behind GANs, to understand how training works in an adversarial setting

   – a few extensions to GANs, along with their applications in current scenarios.

Finally, I would like to thank you all for taking the time to read this blog to the end.

Hope you learned something valuable today.
