AI Lab projects

Learn about breakthrough AI innovation with hands-on labs, code resources, and deep dives.

CoModGAN: AI-Powered Image Completion

CoModGAN is an image completion tool that uses AI to complete an image that is missing significant amounts of visual information. Two neural networks work together to generate and verify a completed image: a generator tasked with filling in missing information, and a discriminator that analyzes the realism of the new image.

Try the CoModGAN demo

The need

Generative Adversarial Networks (GANs) are powerful neural networks used for image generation. While they can complete small missing regions of an image, existing GANs fail when asked to fill in large missing regions.

The idea

GAN development has taken divergent approaches that have enabled a variety of image completion tasks. But we argue that filling in large missing regions requires greater generative capability than current methods provide.

The solution

We have developed a new co-modulated GAN, or CoModGAN, architecture that generates realistic imagery based on small amounts of visual information and does so better than previous models.

Using neural networks to fill in missing image information

Learn how CoModGAN combines image-conditional and unconditional modalities so that GANs generate more realistic images.

Figure: Diagram showing how the CoModGAN solution works. The process is described in the following text.

CoModGAN Architecture

CoModGAN uses a generator that completes the image, and a discriminator that evaluates the “realness” of the generator’s outputs. When the discriminator detects an image that it considers fake, the generator can use that new information to get a little bit better at fooling the discriminator. And the discriminator gets better at telling the difference between what’s real and what’s fake.

Get started with CoModGAN on GitHub

Technical details for CoModGAN

Generative Adversarial Networks execute image completion tasks by pitting two neural networks—a generator and discriminator—against each other such that this competitive relationship helps train both networks. While the generator is tasked with creating or completing an image, the discriminator analyzes how realistic the output image is relative to a dataset of real images.
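
The competitive relationship described above is usually expressed as a pair of losses, one per network. The following is a minimal sketch, not the CoModGAN training code: it assumes the non-saturating logistic loss, a common choice for GANs, and takes the discriminator's outputs as raw logits where higher means "more real". The function names and sample logits are illustrative assumptions.

```python
import math


def softplus(x: float) -> float:
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def discriminator_loss(real_logits, fake_logits):
    # The discriminator is rewarded for high logits on real images
    # and low logits on generated images.
    real_term = sum(softplus(-r) for r in real_logits) / len(real_logits)
    fake_term = sum(softplus(f) for f in fake_logits) / len(fake_logits)
    return real_term + fake_term


def generator_loss(fake_logits):
    # Non-saturating form (assumed here): the generator is rewarded when
    # the discriminator assigns high logits to its outputs.
    return sum(softplus(-f) for f in fake_logits) / len(fake_logits)


# Hypothetical discriminator outputs for a small batch.
real_logits = [2.0, 1.5, 3.0]
fake_logits = [-1.0, -2.5, 0.5]

print("discriminator loss:", discriminator_loss(real_logits, fake_logits))
print("generator loss:", generator_loss(fake_logits))
```

In training, the two losses are minimized in alternation: a discriminator step lowers the first loss, then a generator step lowers the second, so each network's progress raises the bar for the other.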

More recently, we have seen the development of two different approaches. Image-conditional GANs can fill in small missing areas of an image, and unconditional GANs can generate completely new images. But when either one is asked to fill in a large missing region of an image, it fails.

We developed a co-modulated GAN, or CoModGAN, that combines the generative capability of unconditional generators with the image-completion ability of image-conditional generators.
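
One way to picture co-modulation is as a single style vector computed jointly from the conditional input (the encoded masked image) and the unconditional input (a mapped random latent), which then scales the generator's convolution weights. The following is a rough sketch under stated assumptions, not the released implementation: it assumes a StyleGAN2-style modulate-and-demodulate step, uses a 1x1 convolution stored as plain lists, and substitutes hand-made vectors for the image encoder and mapping network. All names and sizes are hypothetical.

```python
import math
import random


def affine(weights, bias, x):
    """Compute W x + b for plain Python lists."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def co_modulated_style(encoded_image, mapped_latent, weights, bias):
    # Joint affine transform over the concatenation of the conditional
    # vector (from the known pixels) and the stochastic vector (from noise).
    return affine(weights, bias, encoded_image + mapped_latent)


def modulate_demodulate(conv_weight, style, eps=1e-8):
    # conv_weight[o][i] is a 1x1 kernel: o = output channel, i = input channel.
    # Scale each input channel by its style, then renormalize each output row.
    result = []
    for row in conv_weight:
        scaled = [w * s for w, s in zip(row, style)]
        norm = math.sqrt(sum(v * v for v in scaled) + eps)
        result.append([v / norm for v in scaled])
    return result


rng = random.Random(0)
encoded_image = [0.2, -0.4, 0.9]                      # stand-in for the encoder output
mapped_latent = [rng.gauss(0, 1) for _ in range(2)]   # stand-in for the mapped latent

n_in = 4  # input channels of the toy convolution
A_w = [[rng.uniform(-1, 1) for _ in range(5)] for _ in range(n_in)]
A_b = [1.0] * n_in

style = co_modulated_style(encoded_image, mapped_latent, A_w, A_b)
conv = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(3)]
modulated = modulate_demodulate(conv, style)
print("style:", style)
```

Because the style depends on both inputs, the known pixels constrain the output while resampling the latent yields different plausible completions of the same missing region.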

CoModGAN, like other GANs, is trained with discriminator losses, but unlike others it can get better at filling in large missing regions of an image. We also propose a new Paired/Unpaired Inception Discriminative Score (P-IDS/U-IDS) metric to robustly measure the realness of generated images compared to the real ones.
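
The two scores can be illustrated with a small sketch. It assumes, following our reading of the metric's definition, that a linear classifier has been fitted on Inception features of real and generated images and returns a decision score per image, with positive meaning "real". Under that assumption, the unpaired score (U-IDS) is the classifier's misclassification rate, and the paired score (P-IDS) is the fraction of real/generated pairs in which the generated image scores higher than its real counterpart. The function names and scores below are hypothetical, and the feature extraction and classifier fitting are omitted.

```python
def u_ids(real_scores, fake_scores):
    """Unpaired score: average misclassification rate over both classes."""
    real_wrong = sum(1 for s in real_scores if s < 0) / len(real_scores)
    fake_wrong = sum(1 for s in fake_scores if s > 0) / len(fake_scores)
    return 0.5 * real_wrong + 0.5 * fake_wrong


def p_ids(real_scores, fake_scores):
    """Paired score: how often the generated image outscores its real pair."""
    if len(real_scores) != len(fake_scores):
        raise ValueError("paired scores must have the same length")
    wins = sum(1 for r, f in zip(real_scores, fake_scores) if f > r)
    return wins / len(real_scores)


# Hypothetical decision scores; index i of each list refers to the same
# source image (the real one and its completed counterpart).
real_scores = [1.2, 0.4, -0.3, 2.0]
fake_scores = [-0.8, 0.9, -0.1, -1.5]

print("U-IDS:", u_ids(real_scores, fake_scores))  # 0.25
print("P-IDS:", p_ids(real_scores, fake_scores))  # 0.5
```

Higher values mean the classifier has a harder time separating generated images from real ones.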

We hope that CoModGAN can help create AI that can more naturally and successfully execute image completion tasks.

Gen Studio at The Met

Gen Studio uses a GAN to create dreamlike images from real artworks at the Metropolitan Museum of Art. This interactive visual search allows everyone to see and explore the collection in new ways.

Pix2Story uses Natural Language Processing (NLP) for storytelling. AI scans a picture, applies a writing style, and generates a story, demonstrating how AI can drive creativity.

Learn about Pix2Story Explore Gen Studio

Snip Insights

Snip Insights helps users find intelligent insights from a snip or screenshot. AI services convert a captured image to translated text, automatically detecting and tagging image content.

Learn about Snip Insights

Sketch2Code

Sketch2Code converts hand-drawn sketches to HTML prototypes. Designers share ideas on a whiteboard, then changes are shown instantly in the browser, helping improve collaboration between the designer, developer, and customer.

Learn about Sketch2Code

Create innovative AI solutions

Discover Azure AI—a portfolio of AI services designed for developers and data scientists. Take advantage of the decades of breakthrough research, responsible AI practices, and flexibility that Azure AI offers to build and deploy your own AI solutions.

Start using Azure AI