GauGAN releases its second generation! Trained on more than 10 million images, it can generate a landscape painting from just a few words

Reprinted from: Xinzhiyuan

Nvidia's tool for artists, GauGAN, recently released its second generation, and its ability to generate landscape paintings has improved further. Previously, you had to specify materials to synthesize an image. Now a single sentence is enough to generate the landscape you want, and the model even understands common sense such as the seasons!

Recently, NVIDIA released the second generation of its real-time painting tool GauGAN. Its main new feature is generating images from text input.

In the new version, GauGAN2 combines segmentation mapping, inpainting, and text-to-image generation in a single model, so users can generate landscapes that do not exist in real life.

The goal of GauGAN2 is to create photorealistic images from a mix of text and drawings!

The neural network model behind GauGAN2 can produce more diverse and higher-quality images than state-of-the-art models dedicated to text-to-image or segmentation-map-to-image applications.

Instead of drawing every element of an imagined scene, users can quickly generate the key features and theme of an image by entering a short phrase. Typing "snow mountain", for example, produces a sketch of a snow-capped mountain. That sketch then serves as the starting point for further edits, such as making the mountain higher, adding a few trees, or changing the sky. It is very convenient!

The name GauGAN was inspired by the Post-Impressionist painter Paul Gauguin, whose work only gained fame after his death. He is a leading figure of Post-Impressionism. In addition to painting, he also worked in sculpture, pottery, printmaking, and writing. His use of color led to Synthetism, which, coupled with the influence of Cloisonnism, also paved the way for Primitivism.

Since 2019, Nvidia has fed the GauGAN system more than 1 million public Flickr images for model training.

In March 2019, at the GPU Technology Conference (GTC) in San Jose, California, Nvidia unveiled GauGAN, a generative adversarial AI system that lets users create photorealistic images of landscapes that don't actually exist. In the first month after the beta version of GauGAN was released on the Playground platform, 500,000 images were generated with it, including concept art for movies and video games.

Nvidia says GauGAN is already being used by a healthcare organization as an exploratory therapy tool, and is also being used by animation modeler Colie Wertz, whose work includes Star Wars, Transformers, and The Avengers, among others.

GauGAN followed GANPaint Studio, a publicly available AI tool that lets users upload any photo and edit the appearance of the buildings, flora, and fixtures it depicts. Elsewhere, generative machine learning models have been used to produce photorealistic videos by watching YouTube clips, to create images and storyboards from natural language captions, and to animate and synchronize facial movements from audio clips of human speech.

Like the first generation of GauGAN, GauGAN2 understands the relationships between objects such as snow, trees, water, flowers, shrubs, hills, and mountains. For example, it preserves the common-sense fact that the type of precipitation changes with the seasons when it generates an image.

Both GauGAN and GauGAN2 are based on a generative adversarial network (GAN), which consists of a generator and a discriminator. The generator takes input samples (for example, text paired with an image), learns which words correspond to which elements of a landscape image, and produces images accordingly.

The generator is trained by trying to fool the discriminator, which attempts to tell generated images apart from real-world ones. Although a GAN's output is very poor early in training, the generator keeps improving with the discriminator's feedback.
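
The generator/discriminator game described above can be illustrated with a toy example. The sketch below is not GauGAN2's code: it is a one-dimensional GAN written in plain Python, in which the "real data" are numbers near 4.0 instead of landscape photos. The function names (`gan_step`, `train`), the linear generator, and the logistic discriminator are all assumptions chosen only to keep the example small.

```python
import math
import random


def sigmoid(t):
    """Numerically stable logistic function."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def gan_step(params, x_real, z, lr=0.01):
    """One alternating update of a toy 1-D GAN (illustrative only).

    Generator:      G(z) = a * z + b
    Discriminator:  D(x) = sigmoid(w * x + c), the estimated P(x is real)
    """
    a, b, w, c = params
    x_fake = a * z + b

    # Discriminator update: minimise -[log D(x_real) + log(1 - D(x_fake))].
    d_real = sigmoid(w * x_real + c)
    d_fake = sigmoid(w * x_fake + c)
    grad_w = -(1.0 - d_real) * x_real + d_fake * x_fake
    grad_c = -(1.0 - d_real) + d_fake
    w -= lr * grad_w
    c -= lr * grad_c

    # Generator update: minimise -log D(G(z)), i.e. try to fool the
    # freshly updated discriminator. The gradient flows through x_fake.
    d_fake = sigmoid(w * x_fake + c)
    grad_x = -(1.0 - d_fake) * w
    a -= lr * grad_x * z
    b -= lr * grad_x
    return a, b, w, c


def train(steps=500, real_mean=4.0, seed=0):
    """Play the game for a number of steps; 'real' samples lie near real_mean."""
    rng = random.Random(seed)
    params = (1.0, 0.0, 0.0, 0.0)  # a, b, w, c
    for _ in range(steps):
        x_real = rng.gauss(real_mean, 0.5)
        z = rng.gauss(0.0, 1.0)
        params = gan_step(params, x_real, z)
    return params
```

In the very first step, the discriminator learns that larger values look more "real", and the generator responds by shifting its output upward. Real image GANs follow the same alternating pattern with deep networks in place of the two small functions here.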

Building on the first generation, GauGAN2 was trained on more than 10 million images and converts natural language into landscape images very well.

For example, entering "sunset on the beach" generates a corresponding landscape. Adding modifiers, as in "sunset on a rocky mountain beach", or replacing "sunset" with "afternoon" or "rainy day" immediately generates a modified image.

Using GauGAN2, users can generate a segmentation map, a high-level outline that shows the location of objects in the scene. From there they can switch to drawing mode, adjust the scene with rough sketches using labels like sky, tree, rock, and river, and let the tool's brush incorporate the doodles into the image.
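
To make the idea of a segmentation map concrete, here is a minimal sketch in plain Python. It assumes a map is simply a grid in which every cell holds a label ID. The label table, the function names, and the rectangular "brush" are hypothetical and are not taken from GauGAN2, whose internal format the article does not describe.

```python
# Hypothetical label IDs; the real tool's label set is not given in the article.
LABELS = {0: "sky", 1: "tree", 2: "rock", 3: "river"}


def blank_map(width, height, label=0):
    """Create a height x width grid filled with one label ID."""
    return [[label for _ in range(width)] for _ in range(height)]


def paint_rect(seg_map, label, top, left, bottom, right):
    """Paint a rectangular stroke (bottom/right exclusive), clipped to the map."""
    height, width = len(seg_map), len(seg_map[0])
    for y in range(max(0, top), min(height, bottom)):
        for x in range(max(0, left), min(width, right)):
            seg_map[y][x] = label
    return seg_map


def label_counts(seg_map):
    """Count how many cells carry each label name."""
    counts = {}
    for row in seg_map:
        for label in row:
            name = LABELS[label]
            counts[name] = counts.get(name, 0) + 1
    return counts


scene = blank_map(8, 6)             # start with sky everywhere
paint_rect(scene, 2, 3, 0, 6, 8)    # rock across the lower half
paint_rect(scene, 3, 5, 2, 6, 6)    # a river along the bottom row
paint_rect(scene, 1, 2, 6, 4, 8)    # a tree on the right
print(label_counts(scene))
```

A generator conditioned on such a grid only has to decide how each labeled region should look, which is why a few rough strokes are enough to describe a whole scene.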

GauGAN2 is similar to OpenAI's DALL-E, which also generates images based on text prompts. Such systems are essentially creators of visual ideas, with potential applications in film, software, video games, products, fashion and interior design.

Nvidia claims that the first version of GauGAN has already been used to create concept art for movies and video games. As with the first release, Nvidia plans to open-source the code for GauGAN2 on GitHub and to host an interactive demo on Playground, Nvidia's web hub for AI and deep learning research.

However, one disadvantage of generative models like GauGAN2 is the potential for model bias.

For DALL-E's generated samples, OpenAI uses a separate model, CLIP, to improve image quality. CLIP ranks the many samples DALL-E generates for each prompt, and only the top-ranked images are shown.
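
The reranking idea can be sketched in a few lines of Python. This is an illustration under simplifying assumptions, not OpenAI's implementation: the prompt and the candidate images are represented by small made-up vectors, whereas a real system would obtain embeddings from trained text and image encoders. The names `cosine` and `rerank` are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def rerank(prompt_vec, candidates, top_k=2):
    """Sort (name, embedding) pairs by similarity to the prompt; keep top_k names."""
    scored = sorted(
        candidates,
        key=lambda item: cosine(prompt_vec, item[1]),
        reverse=True,
    )
    return [name for name, _ in scored[:top_k]]


# Made-up embeddings standing in for encoder outputs.
prompt = [1.0, 0.0, 1.0]
samples = [
    ("sample_a", [0.9, 0.1, 1.1]),
    ("sample_b", [0.0, 1.0, 0.0]),
    ("sample_c", [1.0, 0.5, 0.2]),
    ("sample_d", [-1.0, 0.0, -1.0]),
]
print(rerank(prompt, samples))
```

Because the ranking model decides which samples a user ever sees, any bias in that model carries over directly into the displayed results.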

But one study found that CLIP misclassified photos of Black individuals at a higher rate, and that it associated women with occupations such as nanny and housekeeper.

In its press materials, Nvidia did not say how its R&D team reviewed GauGAN2 for social bias.

But an Nvidia spokesperson said in an email that the model has more than 100 million parameters and was trained for a month on a landscape dataset. This dedicated model focuses entirely on landscapes, and the researchers audited the data to ensure that no people were present in the training images. For now, GauGAN2 is just a research demonstration.

Another application of GauGAN is Nvidia Canvas, which lets creators paint with materials rather than colors. The program shows the result in real time, without waiting for the painting to be finished.

Users first draw simple shapes and lines using real-world materials, such as grass or clouds. The AI model then immediately fills the screen with show-stopping results. Four quick shapes and a stunning mountain range appears. A few more lines create a beautiful field.

NVIDIA Canvas also provides a variety of materials. It offers nine styles that modify the look and feel of a painting and 15 different materials, from sky and mountains to rivers and stones. Users can draw on different layers to keep elements separate, and can start from scratch or open and modify one of the app's pre-made scenes for inspiration.

Draw in a pond and nearby elements such as trees and rocks will appear as reflections in the water. Change the material and turn the snow into grass, and the whole image changes from a winter wonderland to a tropical paradise.

The tool lets artists apply style filters so that the resulting image adopts the style of a particular painter. It does not just stitch together other pictures or cut and paste textures; it creates entirely new images, just like an artist.

With NVIDIA's GauGAN, anyone can be an artist!

References:

https://venturebeat.com/2021/11/22/nvidias-latest-ai-tech-translates-text-into-landscape-images/

Origin blog.csdn.net/qq_15698613/article/details/121668617