List of 11 GAN Architectures for Computer Vision Tasks



What are generative adversarial networks (GANs)?

A generative adversarial network (GAN) is a type of generative model, usually trained without labels, that is built from two competing neural networks: a generator and a discriminator. The generator turns random noise into synthetic data samples and tries to make them indistinguishable from real data, while the discriminator tries to tell the synthetic samples from the real ones. Training succeeds when the generator's output is realistic enough that the discriminator can do no better than guessing.
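The two competing goals can be written as two loss functions. The sketch below assumes the discriminator outputs a probability that a sample is real and uses the common non-saturating form of the generator loss; the function names are illustrative and not taken from any library.

```python
import math

def discriminator_loss(d_real, d_fake):
    """Cross-entropy loss for the discriminator.

    d_real: discriminator outputs on real samples, each in (0, 1)
    d_fake: discriminator outputs on generated samples, each in (0, 1)
    """
    real_term = -sum(math.log(p) for p in d_real) / len(d_real)
    fake_term = -sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return real_term + fake_term

def generator_loss(d_fake):
    """Non-saturating generator loss: low when the discriminator scores fakes as real."""
    return -sum(math.log(p) for p in d_fake) / len(d_fake)

# A discriminator that can only guess outputs 0.5 for every sample.
print(discriminator_loss([0.5, 0.5], [0.5, 0.5]))  # 2 * ln(2), about 1.386
print(generator_loss([0.5, 0.5]))                   # ln(2), about 0.693
```

The generator loss falls as the discriminator's scores for generated samples rise, which is exactly the tension between the two players.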


The following 11 GAN architectures and closely related generative models are used for computer vision:

1. Vanilla GANs

A vanilla GAN is composed of two neural networks: a generator and a discriminator. The generator is responsible for generating new data, and the discriminator is responsible for discriminating between generated data and real data.

The two networks are trained in alternation rather than one after the other. In each iteration, the discriminator is updated on a batch of real and generated samples so that it separates them better, and then the generator is updated so that its samples are more likely to be classified as real. The goal is for the generator to generate data that the discriminator cannot distinguish from real data.

The advantage of vanilla GANs is that they are relatively simple to implement and form the basis for the other architectures in this list. The disadvantage is that training is relatively unstable: it can fail to converge, and the generator can collapse to producing only a few kinds of samples (mode collapse).
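A minimal sketch of alternating generator and discriminator updates, assuming a deliberately tiny setup: one-dimensional data drawn from a normal distribution with mean 4, a generator that only learns a shift `b`, and a logistic-regression discriminator. The gradients are written by hand so the example needs no deep learning library; all names and hyperparameters are illustrative.

```python
import math
import random

def sigmoid(t):
    # Numerically stable logistic function.
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)

def train_toy_gan(steps=200, batch=64, lr=0.05, seed=0):
    rng = random.Random(seed)
    b = 0.0          # generator parameter: G(z) = z + b
    w, c = 0.0, 0.0  # discriminator parameters: D(x) = sigmoid(w * x + c)
    for _ in range(steps):
        # Discriminator step: push D(real) toward 1 and D(fake) toward 0.
        real = [rng.gauss(4.0, 1.0) for _ in range(batch)]
        fake = [rng.gauss(0.0, 1.0) + b for _ in range(batch)]
        grad_w = grad_c = 0.0
        for x in real:   # gradient of -log D(x)
            s = sigmoid(w * x + c)
            grad_w += -(1.0 - s) * x
            grad_c += -(1.0 - s)
        for x in fake:   # gradient of -log(1 - D(x))
            s = sigmoid(w * x + c)
            grad_w += s * x
            grad_c += s
        w -= lr * grad_w / batch
        c -= lr * grad_c / batch

        # Generator step: push D(G(z)) toward 1 (gradient of -log D(G(z)) w.r.t. b).
        fake = [rng.gauss(0.0, 1.0) + b for _ in range(batch)]
        grad_b = sum(-(1.0 - sigmoid(w * x + c)) * w for x in fake) / batch
        b -= lr * grad_b
    return b, w, c

b, w, c = train_toy_gan()
print("learned generator shift:", b)
```

Even in this toy setting the shift tends to drift toward the real mean and then oscillate rather than settle, which gives a small taste of the instability described above.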

2. CycleGAN

A CycleGAN translates images from one domain to another, for example horses to zebras or photos to paintings, without requiring paired training examples. It uses two generators, one for each translation direction, and two discriminators, one for each domain. In addition to the usual adversarial losses, it adds a cycle-consistency loss: an image that is translated to the other domain and back again should match the original. This constraint is what allows the model to learn a meaningful mapping from unpaired data, and it is where the name comes from.
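The cycle-consistency idea used by CycleGAN can be sketched without any neural network. Here images are flat lists of pixel values, and `double` and `halve` are hypothetical stand-ins for the two learned generators (domain X to Y and Y to X); only the loss computation is meant to be representative.

```python
def l1_distance(a, b):
    """Mean absolute difference between two equally sized pixel lists."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)

def cycle_consistency_loss(x, y, g_xy, g_yx):
    """Penalize images that do not survive a round trip between domains."""
    forward = l1_distance(g_yx(g_xy(x)), x)   # X -> Y -> X
    backward = l1_distance(g_xy(g_yx(y)), y)  # Y -> X -> Y
    return forward + backward

# Hypothetical stand-ins for the two generators.
def double(img):
    return [p * 2.0 for p in img]

def halve(img):
    return [p * 0.5 for p in img]

x = [1.0, 2.0]  # sample from domain X
y = [2.0, 4.0]  # sample from domain Y
print(cycle_consistency_loss(x, y, double, halve))  # 0.0: halve undoes double
```

In the full model this term is added to the adversarial losses of both generators, so each generator must fool its discriminator while remaining invertible by the other.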

3. Pix2pix GAN

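Pix2pix is a conditional GAN for image-to-image translation, such as turning label maps into photos or edge sketches into images. Unlike CycleGAN, it is trained on paired examples of input and target images. The generator is a convolutional encoder-decoder with skip connections (a U-Net), and the discriminator is a convolutional network (PatchGAN) that judges whether each local patch of an image looks real, given the input image.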
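As published, pix2pix trains its generator with an adversarial term plus an L1 term that keeps the output close to the paired target image. The sketch below assumes the patch-based discriminator returns one "real" probability per image patch and that images are flat pixel lists; the weight of 100 is the value commonly quoted for pix2pix and should be treated as an assumption here.

```python
import math

LAMBDA_L1 = 100.0  # assumed weight of the L1 term

def pix2pix_generator_loss(patch_probs, generated, target, lambda_l1=LAMBDA_L1):
    """Adversarial loss averaged over patches, plus weighted L1 distance to the target.

    patch_probs: discriminator outputs in (0, 1], one per image patch
    generated, target: flat lists of pixel values of equal length
    """
    adversarial = -sum(math.log(p) for p in patch_probs) / len(patch_probs)
    l1 = sum(abs(g - t) for g, t in zip(generated, target)) / len(target)
    return adversarial + lambda_l1 * l1

# Every patch half-convincing, output off by 0.01 per pixel on average.
print(pix2pix_generator_loss([0.5, 0.5, 0.5, 0.5], [0.51, 0.49], [0.50, 0.50]))
```

The large L1 weight means the paired target dominates the coarse structure, while the adversarial term mainly sharpens local texture.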

4. StyleGAN

StyleGAN is a GAN architecture used to generate high-resolution, photorealistic images, most famously human faces. Like other GANs, it is composed of two parts: a generator network and a discriminator network. The generator network is used to generate images, and the discriminator network is used to determine whether the images are real or fake.

What sets StyleGAN apart is its generator. A mapping network first transforms the random latent vector into an intermediate "style" vector. The synthesis network then builds the image from low resolution to high resolution, and the style vector is injected at every layer to modulate that layer's feature maps. Styles applied at the coarse, low-resolution layers control high-level attributes such as pose and overall shape. Styles applied at the middle layers control smaller features, such as facial features and hairstyle. Styles applied at the fine, high-resolution layers control the color scheme and microstructure. Separate noise inputs add stochastic detail such as the exact placement of hairs.

The discriminator is a conventional convolutional network. It progressively downsamples the image and outputs a single score indicating whether the image is real or fake; its layers are not assigned separate tasks such as judging detail or quality.

StyleGAN is trained on a large dataset of images from a single category, such as faces or cars. The generator network is trained to generate images that look like real examples of that category, and the discriminator network is trained to determine whether the images are real or fake. The resulting images can be difficult to distinguish from real photographs.
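In the original StyleGAN generator, the style vector acts on each layer through adaptive instance normalization (AdaIN): every feature channel is normalized and then rescaled and shifted by style-dependent values. A minimal sketch for a single channel, assuming the channel is a flat list of activations and that `style_scale` and `style_bias` have already been computed from the style vector (later StyleGAN versions replace this operation, so treat it as illustrative of the first version only):

```python
import math

def adain(features, style_scale, style_bias, eps=1e-8):
    """Adaptive instance normalization for one feature channel.

    features: activations of one channel, as a flat list
    style_scale, style_bias: per-channel values derived from the style vector
    """
    n = len(features)
    mean = sum(features) / n
    var = sum((f - mean) ** 2 for f in features) / n
    std = math.sqrt(var + eps)
    # Normalize to zero mean and unit variance, then apply the style.
    return [style_scale * (f - mean) / std + style_bias for f in features]

out = adain([1.0, 2.0, 3.0, 4.0], style_scale=2.0, style_bias=5.0)
print(out)  # channel statistics now follow the style: mean 5, standard deviation about 2
```

Because the normalization erases the channel's previous statistics, each layer's style overrides the one before it, which is why different layers end up controlling different levels of detail.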

5. Deep Convolutional GAN (DCGAN)

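A DCGAN is a deep convolutional GAN: a GAN in which both networks, the Generator and the Discriminator, are deep convolutional neural networks. Its design guidelines, such as replacing pooling layers with strided convolutions and using batch normalization, made GAN training on images noticeably more stable.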

The Generator is responsible for generating new images that are similar to those in the training dataset. The Discriminator is responsible for discriminating between images generated by the Generator and images from the training dataset.

Training a DCGAN repeats the following 3 steps:

1) Generate a batch of new images from random noise with the Generator.

2) Update the Discriminator so that it better discriminates between images generated by the Generator and images from the training dataset.

3) Update the Generator so that the Discriminator is more likely to classify its images as real. The steps are repeated until the Discriminator can no longer reliably distinguish between images generated by the Generator and images from the training dataset. This indicates that the Generator has learned to generate realistic images.
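DCGAN generators are commonly implemented as a stack of transposed convolutions that upsample a small noise map into a full image. The arithmetic below shows how the spatial size grows layer by layer. The layer settings (kernel 4, stride 2, padding 1, starting from a 1x1 noise map) are a common configuration assumed for illustration, not something specified in this article.

```python
def transposed_conv_output_size(size, kernel, stride, padding):
    """Output height/width of a transposed convolution (no dilation or output padding)."""
    return (size - 1) * stride - 2 * padding + kernel

# (kernel, stride, padding) for each generator layer; assumed values.
LAYERS = [
    (4, 1, 0),  # 1x1 noise map -> 4x4
    (4, 2, 1),  # each remaining layer doubles the resolution
    (4, 2, 1),
    (4, 2, 1),
    (4, 2, 1),
]

def generator_sizes(layers, start=1):
    """Spatial size after each layer, beginning with the input size."""
    sizes = [start]
    for kernel, stride, padding in layers:
        sizes.append(transposed_conv_output_size(sizes[-1], kernel, stride, padding))
    return sizes

print(generator_sizes(LAYERS))  # [1, 4, 8, 16, 32, 64]
```

The Discriminator typically mirrors this with strided convolutions that halve the resolution at each layer until a single real-or-fake score remains.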

6. Conditional GAN (CGAN)

A conditional GAN is a GAN in which both the generator and the discriminator are conditioned on some extra information, such as a class label or a vector of feature values. The generator receives this information together with the random noise, and the discriminator receives it together with the image it has to judge.

One advantage of a conditional GAN is control: the extra information determines what the generator produces. For example, a model trained on labeled handwritten digits can be asked to generate a specific digit instead of a random one.

Another advantage is that the extra information can improve the quality of the generated samples. The generator only has to model the data distribution for the given condition, and the discriminator can penalize samples that are realistic but do not match the condition.
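A common and simple way to condition both networks is to append a one-hot encoding of the label to their inputs. The sketch below assumes flat lists for the noise vector and the image pixels; the helper names and sizes are hypothetical.

```python
import random

def one_hot(label, num_classes):
    """Encode a class index as a one-hot list of floats."""
    return [1.0 if i == label else 0.0 for i in range(num_classes)]

def generator_input(noise, label, num_classes):
    """The generator sees the noise vector followed by the label encoding."""
    return noise + one_hot(label, num_classes)

def discriminator_input(image_pixels, label, num_classes):
    """The discriminator sees the image followed by the same label encoding."""
    return image_pixels + one_hot(label, num_classes)

rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(100)]
g_in = generator_input(noise, label=7, num_classes=10)
print(len(g_in))   # 110: 100 noise values plus 10 label values
print(g_in[-10:])  # one-hot encoding of the digit 7
```

At generation time, changing only the label while keeping the noise fixed lets you request a specific class from the trained generator.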

7. Pixel Recurrent Neural Network (PixelRNN)

Strictly speaking, a PixelRNN is not a GAN: it has no generator and discriminator pair. It is an autoregressive generative model that produces an image one pixel at a time, row by row, using recurrent layers to predict the distribution of each pixel given all previously generated pixels. It is trained by maximizing the likelihood of the training images rather than by an adversarial game. It appears in lists like this one because it addresses the same image generation tasks.

The advantage of a PixelRNN over a GAN is that it provides an explicit likelihood for every image and trains stably. The disadvantage is that sampling is slow, because the pixels must be generated sequentially, and the samples are often less sharp than those of a well-trained GAN.
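PixelRNN itself models an image autoregressively: the probability of the whole image is the product of per-pixel probabilities, each conditioned on the pixels generated before it. The sketch below shows that factorization on a flattened binary image. The function `prob_pixel_on` is a hypothetical toy rule standing in for the recurrent network that a real PixelRNN would learn.

```python
import math
import random

def prob_pixel_on(previous):
    """Toy conditional: probability that the next pixel is 1, given earlier pixels."""
    if not previous:
        return 0.5
    return 0.9 if previous[-1] == 1 else 0.1  # pixels tend to copy their neighbour

def log_likelihood(pixels):
    """Chain rule: sum of log p(pixel_i | pixels before i)."""
    total = 0.0
    for i, value in enumerate(pixels):
        p_on = prob_pixel_on(pixels[:i])
        total += math.log(p_on if value == 1 else 1.0 - p_on)
    return total

def sample(num_pixels, rng):
    """Generate pixels one at a time, each conditioned on those already drawn."""
    pixels = []
    for _ in range(num_pixels):
        pixels.append(1 if rng.random() < prob_pixel_on(pixels) else 0)
    return pixels

print(log_likelihood([1, 1, 0]))  # log(0.5) + log(0.9) + log(0.1)
print(sample(16, random.Random(0)))
```

The loop in `sample` is the reason generation is slow: no pixel can be drawn before all of its predecessors are known.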

8. DiscoGAN

DiscoGAN is a generative adversarial network (GAN) proposed by Kim et al. in 2017. It is a variant of the original GAN that is designed to discover relations between two image domains, for example matching handbags to shoes of a similar style, from unpaired data.

DiscoGAN couples two GANs. One generator maps images from the first domain to the second, and another generator maps them back. Each domain has its own discriminator, which is responsible for distinguishing between generated images and real images of that domain.

The networks are trained simultaneously with ordinary gradient-based optimization. Each generator tries to produce images that are indistinguishable from real images of the target domain, and each discriminator tries to distinguish between real and fake images. In addition, a reconstruction loss requires that an image mapped to the other domain and back again resembles the original, which ties the two mappings together.

DiscoGAN has been shown to transfer style from one domain to another while preserving key attributes such as orientation and face identity.

9. Super Resolution GAN (SRGAN)

SRGANs are a type of generative adversarial network (GAN) used for single-image super-resolution, that is, for image generation rather than image recognition.

SRGANs are used to create high-resolution images from low-resolution images. They work by having two neural networks compete against each other: a generator network, which creates high-resolution images, and a discriminator network, which tries to distinguish between the high-resolution images created by the generator and actual high-resolution images. The generator is trained with a perceptual loss that combines a content term, computed on features from a pretrained image classification network, with the adversarial term.

The advantage of SRGANs is that they can synthesize plausible fine texture that is missing from the low-resolution input. This is in contrast to interpolation-based upscaling, such as bicubic interpolation, which can only produce a smooth approximation of the original high-resolution image.

SRGANs can produce images that are perceptually close to the original high-resolution image and that show sharper detail than interpolated images. The added detail is synthesized rather than recovered, so it is plausible but not guaranteed to match the original.
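The SRGAN generator is usually described as minimizing a perceptual loss: a content term measured in the feature space of a pretrained network plus a lightly weighted adversarial term. The sketch below assumes the feature vectors have already been extracted (in a real system by a pretrained network such as VGG) and takes the weight of 0.001 as an assumption; the function name is illustrative.

```python
import math

ADVERSARIAL_WEIGHT = 1e-3  # assumed weight of the adversarial term

def srgan_generator_loss(features_sr, features_hr, d_fake):
    """Perceptual loss for one super-resolved image.

    features_sr: feature vector of the generated high-resolution image
    features_hr: feature vector of the true high-resolution image
    d_fake: discriminator's probability in (0, 1] that the generated image is real
    """
    # Content term: mean squared error between feature vectors.
    content = sum((a - b) ** 2 for a, b in zip(features_sr, features_hr)) / len(features_hr)
    # Adversarial term: small when the discriminator is fooled.
    adversarial = -math.log(d_fake)
    return content + ADVERSARIAL_WEIGHT * adversarial

print(srgan_generator_loss([0.2, 0.8], [0.25, 0.75], d_fake=0.5))
```

Measuring the content term on features instead of raw pixels is what lets the generator produce sharp texture without being punished for small pixel-level misalignments.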

10. InfoGAN

InfoGAN is an extension of the original GAN. In addition to random noise, the generator receives a small set of latent codes, and an extra term in the loss function maximizes the mutual information between those codes and the generated images. This encourages each code to control a distinct, interpretable property of the output, such as the digit class, rotation, or stroke width in images of handwritten digits, without requiring any labels.
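In InfoGAN, the extra loss term is computed with an auxiliary network, often called Q, that tries to recover the latent code from the generated image. For a categorical code this reduces to a cross-entropy, and minimizing it maximizes a lower bound on the mutual information between code and image. The sketch below assumes Q's output is already available as a list of probabilities; the dimensions and names are illustrative.

```python
import math
import random

def sample_latent(noise_dim, num_categories, rng):
    """Build the generator input: unstructured noise plus a one-hot categorical code."""
    noise = [rng.gauss(0.0, 1.0) for _ in range(noise_dim)]
    code = rng.randrange(num_categories)
    code_one_hot = [1.0 if i == code else 0.0 for i in range(num_categories)]
    return noise + code_one_hot, code

def info_loss(q_probs, code):
    """Cross-entropy between the sampled code and Q's prediction from the image."""
    return -math.log(q_probs[code])

rng = random.Random(0)
latent, code = sample_latent(noise_dim=62, num_categories=10, rng=rng)
print(len(latent), code)

# If Q recovers the code confidently, the loss is small.
print(info_loss([0.05, 0.9, 0.05], code=1))  # about 0.105
```

The loss is added to the generator's objective, so the generator is rewarded for making the code visibly affect the image.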

11. StackGAN

A StackGAN is a type of generative adversarial network (GAN) for generating images from text descriptions that stacks two generator and discriminator pairs. The Stage-I generator takes an embedding of the text together with random noise and produces a low-resolution image that captures the rough shape and colors of the described object. The Stage-II generator takes that low-resolution image and the text embedding again, corrects defects, and produces a higher-resolution image with finer details. Each stage has its own discriminator, which judges both whether an image looks real and whether it matches the text.

This article brought to you by images.cv
images.cv provides you with an easy way to build image datasets for your next computer vision project.