CoModGAN uses AI to complete an image that is missing significant amounts of visual information. Two neural networks (a generator tasked with filling in missing information and a discriminator that analyzes the realism of the new image) work together to generate and verify a completed image.
Generative Adversarial Networks (GANs) are powerful neural networks used for image generation. While they can execute image completion tasks for small regions, GANs fail when asked to generate large-scale missing regions of an image.
GAN development has taken divergent approaches that have enabled a variety of image completion tasks. But we argue that successfully filling in large missing regions requires more generative capability than current methods provide.
We have developed a new co-modulated GAN, or CoModGAN, architecture that generates realistic imagery based on small amounts of visual information and does so better than previous models.
Using neural networks to fill in missing image information
Learn how CoModGAN combines image-conditional and unconditional modalities to improve how GANs generate more realistic images.
CoModGAN uses a generator that completes the image, and a discriminator that evaluates the “realness” of the generator’s outputs. When the discriminator detects an image that it considers fake, the generator can use that new information to get a little bit better at fooling the discriminator. And the discriminator gets better at telling the difference between what’s real and what’s fake.
Technical details for CoModGAN
Generative Adversarial Networks execute image completion tasks by pitting two neural networks—a generator and discriminator—against each other such that this competitive relationship helps train both networks. While the generator is tasked with creating or completing an image, the discriminator analyzes how realistic the output image is relative to a dataset of real images.
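The competitive relationship described above can be sketched as a pair of loss functions. This is a minimal illustration, assuming the common non-saturating logistic GAN loss; the text does not say which loss CoModGAN actually uses, and the names `discriminator_loss` and `generator_loss` are hypothetical. The inputs are the discriminator's raw scores (logits), where a higher score means "looks more real".

```python
import math


def softplus(x: float) -> float:
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def discriminator_loss(real_logits, fake_logits):
    """Low when real images score high and generated images score low.

    Assumption: non-saturating logistic loss, not confirmed by the text.
    """
    real_term = sum(softplus(-r) for r in real_logits) / len(real_logits)
    fake_term = sum(softplus(f) for f in fake_logits) / len(fake_logits)
    return real_term + fake_term


def generator_loss(fake_logits):
    """Low when the discriminator scores generated images as real."""
    return sum(softplus(-f) for f in fake_logits) / len(fake_logits)


if __name__ == "__main__":
    # A discriminator that is fooled (high score on a fake) is good news
    # for the generator and bad news for the discriminator.
    print(generator_loss([5.0]), generator_loss([-5.0]))
    print(discriminator_loss([3.0], [-3.0]), discriminator_loss([3.0], [3.0]))
```

In training, each network would take gradient steps on its own loss in alternation, which is how both improve over time.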
More recently, we have seen two different kinds of GAN develop. Image-conditional GANs can fill in small missing areas of an image, and unconditional GANs can generate completely new images. But if you ask either one to fill in a large region of an image, it fails.
We developed a co-modulated GAN, or CoModGAN, that combines the generative capability of unconditional generators with the image-completion strengths of image-conditional generators.
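One way to picture co-modulation is as a single style vector computed jointly from two sources: a code that summarizes the visible part of the image (the image-conditional side) and a random latent code (the unconditional side). The sketch below is an assumption about the general idea, not the actual CoModGAN architecture; `co_modulate`, the toy weights, and the vector sizes are all hypothetical.

```python
def affine(weights, bias, x):
    """Apply a dense layer: one output per weight row."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def co_modulate(image_code, latent_code, weights, bias):
    """Build one style vector from both the image code and the latent code.

    Assumption: the two codes are concatenated and passed through a
    joint affine layer; the real model may combine them differently.
    """
    joint = list(image_code) + list(latent_code)
    return affine(weights, bias, joint)


if __name__ == "__main__":
    # Toy weights: 3 style values from a 2-d image code and a 2-d latent.
    weights = [[1, 0, 0, 0],
               [0, 0, 1, 0],
               [0, 1, 0, 1]]
    bias = [0.0, 0.0, 0.5]
    image_code = [2.0, 3.0]  # stands in for an encoding of the known pixels
    print(co_modulate(image_code, [5.0, 7.0], weights, bias))
    print(co_modulate(image_code, [1.0, 1.0], weights, bias))
```

Holding the image code fixed and changing the latent code changes the style vector, which is the intuition for how one masked input could lead to several different plausible completions.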
CoModGAN, like other GANs, is trained with discriminator losses, but unlike others it can get smarter at filling in large missing regions of an image. At the same time, we propose a new Paired/Unpaired Inception Discriminative Score metric to robustly measure the realness of generated images compared to the real ones.
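The text above does not define the Paired/Unpaired Inception Discriminative Score, so the following is only one plausible reading, stated as an assumption: a separate classifier is fitted on image features to tell real from generated images, the unpaired score is its misclassification rate, and the paired score is how often a generated image is rated more real than its matching real image. The sketch starts from the classifier's decision scores (positive meaning "real"); the function names and sample scores are hypothetical.

```python
def u_ids(real_scores, fake_scores):
    """Unpaired score: average misclassification rate over both classes.

    Assumption: score > 0 means the classifier says "real".
    """
    real_wrong = sum(s < 0 for s in real_scores) / len(real_scores)
    fake_wrong = sum(s > 0 for s in fake_scores) / len(fake_scores)
    return 0.5 * (real_wrong + fake_wrong)


def p_ids(real_scores, fake_scores):
    """Paired score: share of pairs where the fake outscores its real twin."""
    if len(real_scores) != len(fake_scores):
        raise ValueError("paired scores need equal-length inputs")
    wins = sum(f > r for r, f in zip(real_scores, fake_scores))
    return wins / len(real_scores)


if __name__ == "__main__":
    # Illustrative scores only, not measured results.
    real = [1.0, 0.5, -0.2, 2.0]
    fake = [-1.0, 0.7, 0.1, -0.5]
    print(u_ids(real, fake))
    print(p_ids(real, fake))
```

Under this reading, higher values mean the generated images are harder to tell apart from real ones.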
We hope that CoModGAN can help create AI that can more naturally and successfully execute image completion tasks.