Application Scenarios of Generative AI in the Game Industry – Accelerating the Production of Game Art Content


Thank you for reading the "Generative AI Industry Solution Guide" blog series. The series consists of four articles that systematically introduce the generative AI industry solution guide and its application practice in typical scenarios across the e-commerce, game, and pan-entertainment industries.

Background

Since the dawn of humanity, drawing has been an important vehicle for learning, communication, and creation. Even before language and writing appeared, humans used images to record their perception of the world and to exchange ideas. As the saying goes, a picture is worth a thousand words: the amount of information a single image can carry is enormous. From ancient Egyptian murals to the photos and pictures that are produced, stored, and transmitted digitally today, images have always been carriers of information, while the ways we create them keep changing.

Starting with DALL-E, a new way of painting emerged: generative AI painting. AI painting has sparked endless imagination, but, as with most technology curves, in its first few years this new AI technique could not really be used in large-scale industrial production scenarios.

As an industry that depends heavily on concept design and art assets, the game industry has long been searching for AI painting tools that can genuinely help in the game production pipeline, improving efficiency and reducing the cost of game development.

This situation changed dramatically with the launch of Stable Diffusion and Midjourney last year. With the rapid development of the open-source community built around the Stable Diffusion Web UI over recent months, game creators have seen the great potential of integrating AI into the art production pipeline. Game companies large and small are now investing heavily in the generative AI track and have made progress that was unimaginable before.

Game Industry Application Scenarios

As mentioned earlier, the game industry relies heavily on concept design and art resources, and it has some of the highest quality requirements for creative and artistic content. For a game art designer, even the most exquisite 2D image is difficult to use directly for in-game asset production: beyond the overall design style, scene and character designs must account for many details, such as whether a character's pose is natural, whether the details are clear, and whether the lighting is reasonable. AI may produce content beyond human imagination, but "rolling the dice" until a good image appears does not really improve the efficiency of the art pipeline. We need tools that let AI generate images matching expectations more precisely.

At this stage, we can control the output of AI painting in the following ways.

The first is text-to-image (txt2img), which uses text prompts to control what is generated. In the prompt we can define the scene, objects, style, perspective, and so on. As the most general control method, however, prompts are limited by their strong dependence on the base model: the same prompt can perform very differently on different base models.

The second is image-to-image (img2img), which uses a reference image combined with prompts to let the AI redraw parts of the picture. In essence it is not very different from text-to-image, and controllability is still not guaranteed. Generation can also be controlled through model fine-tuning. Commonly used Stable Diffusion fine-tuning methods include Textual Inversion (Embedding), Hypernetworks, DreamBooth, and LoRA, of which LoRA is the most popular. As a fine-tuning method, LoRA makes only small changes to the base model's network, yet it can produce striking results. In the game industry, we have seen LoRA widely used to lock in the style and perspective of character designs.
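To make this concrete, here is a minimal sketch of applying a LoRA on top of a base model with the Hugging Face diffusers library. The model ID, the LoRA directory, and the file name are placeholders; substitute your own base checkpoint and LoRA weights.

```python
# Minimal sketch: load a base Stable Diffusion model and apply a LoRA on top of it.
# Model IDs, the LoRA path, and prompts are placeholders, not part of the original article.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",   # base model (placeholder)
    torch_dtype=torch.float16,
).to("cuda")

# Load LoRA weights trained for a specific character style (placeholder path and file name).
pipe.load_lora_weights("./loras", weight_name="my_character_style.safetensors")

image = pipe(
    "concept art of a game character, full body, clean background",
    negative_prompt="lowres, bad anatomy, watermark",
    num_inference_steps=30,
    guidance_scale=7.0,
).images[0]
image.save("character_lora_sample.png")
```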

Finally, we want to introduce ControlNet in the context of game industry scenarios. ControlNet has drawn a great deal of attention since it appeared in the open-source community in February this year, because it moved Stable Diffusion from an auxiliary tool in the brainstorming stage into the actual workflow of art designers. It can fairly be called an important milestone in AI painting.

First, let's look at how ControlNet works. ControlNet attaches an additional network structure to an existing diffusion model: it creates a trainable copy of the model's encoder blocks and connects that copy back to the original network through zero convolutions. This allows extra conditions, such as edge maps, segmentation maps, and keypoint maps, to be fed into the base model as guidance, enabling precise control over the generated content.

30ee98fc3b90a614c464de956bb81f95.png

Schematic from "Adding Conditional Control to Text-to-Image Diffusion Models": https://arxiv.org/abs/2302.05543
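The key detail is the "zero convolution": a 1x1 convolution whose weights and bias start at zero, so the trainable copy initially contributes nothing and training proceeds without disturbing the frozen base model. Below is a minimal PyTorch sketch of the idea; it is illustrative only, not the authors' implementation, and the way the condition is injected is simplified.

```python
# Illustrative sketch of ControlNet's zero-convolution idea (not the official code).
import torch
import torch.nn as nn

def zero_conv(channels: int) -> nn.Conv2d:
    """1x1 convolution initialized to all zeros, so it initially outputs zero."""
    conv = nn.Conv2d(channels, channels, kernel_size=1)
    nn.init.zeros_(conv.weight)
    nn.init.zeros_(conv.bias)
    return conv

class ControlledBlock(nn.Module):
    """Frozen base block plus a trainable copy whose output is injected via a zero conv."""
    def __init__(self, base_block: nn.Module, trainable_copy: nn.Module, channels: int):
        super().__init__()
        self.base = base_block
        for p in self.base.parameters():
            p.requires_grad_(False)      # the original network stays frozen
        self.copy = trainable_copy       # trainable copy receives the extra condition
        self.zero = zero_conv(channels)  # starts at zero, so no effect at initialization

    def forward(self, x: torch.Tensor, condition: torch.Tensor) -> torch.Tensor:
        # Simplified: in the real model the condition is encoded before entering the copy.
        return self.base(x) + self.zero(self.copy(x + condition))
```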

In the Web UI we can use the ControlNet plug-in to select a preprocessor and load a ControlNet model. The preprocessor (also called an annotator) turns an existing image into the required type of guide map. As shown in the figure, we can take a character sheet with three views, select the openpose_full preprocessor to obtain a whole-body, multi-view OpenPose guide map of the character, and then use this guide map with ControlNet's OpenPose model for more controlled creation.

0bc7df9d2d80771abb0f5b480093d6d3.png
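Outside the Web UI, the same preprocessor-plus-model workflow can be scripted. Below is a minimal sketch using the controlnet_aux annotators together with diffusers; the input image path, prompts, and sampler settings are placeholders to adapt to your own pipeline.

```python
# Minimal sketch: OpenPose guide-map extraction plus ControlNet-guided generation with diffusers.
# Paths, prompts, and settings are placeholders, not values from the original article.
import torch
from PIL import Image
from controlnet_aux import OpenposeDetector
from diffusers import ControlNetModel, StableDiffusionControlNetPipeline, UniPCMultistepScheduler

# 1. Preprocessor (annotator): extract an OpenPose skeleton map from a reference image.
openpose = OpenposeDetector.from_pretrained("lllyasviel/Annotators")
pose_map = openpose(Image.open("character_three_views.png"))

# 2. ControlNet model plus base Stable Diffusion model.
controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/control_v11p_sd15_openpose", torch_dtype=torch.float16
)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnet, torch_dtype=torch.float16
).to("cuda")
pipe.scheduler = UniPCMultistepScheduler.from_config(pipe.scheduler.config)

# 3. Generate: the pose map constrains the composition, the prompt controls the content.
image = pipe(
    "game character concept, kimono, full body, white background",
    image=pose_map,
    negative_prompt="lowres, bad anatomy, watermark",
    num_inference_steps=30,
).images[0]
image.save("openpose_controlled.png")
```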

As of now, the official ControlNet models have grown from 8 in version 1.0 to 14 in version 1.1 (11 production-ready models and 3 experimental ones), with more than 30 preprocessors. They cover a wide range of control methods, which we can roughly classify as follows:

10bc06473032b4192658f9b3215c02e3.png

Below, we combine several ControlNet models to explore how to achieve controllable AI image generation in specific game industry scenarios.

Concept Design and Scene Design

In game production, concept artists and level artists carry very important responsibilities. In the early stage of creation they need to edit maps and terrain according to the game design requirements, set up lighting, and establish the basic style of a map so as to present better game visuals. In the example below, we use ControlNet's Seg (segmentation) model and its guide maps to create a concept design for a game scene. In 3D software such as Blender, we can build a simple white-box blockout of the map and then color it according to the ADE20K color classification standard to define the composition, or take an existing scene image as a reference and run the Segment preprocessor to generate a segmentation guide map. Here we use a pre-prepared segmentation guide map to generate a concept scene.
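As a rough illustration of the blockout-and-color approach mentioned above, the sketch below paints a few rectangles with ADE20K-style class colors and feeds the result to the seg ControlNet model. The RGB values are assumptions and should be double-checked against the ADE20K palette the seg model was trained on; model IDs and prompts are likewise placeholders.

```python
# Illustrative sketch: hand-build an ADE20K-style segmentation guide map and use it with
# the seg ControlNet model. The class colors below are assumptions -- verify them against
# the ADE20K palette before relying on them.
import torch
from PIL import Image, ImageDraw
from diffusers import ControlNetModel, StableDiffusionControlNetPipeline

W, H = 768, 512
seg = Image.new("RGB", (W, H), (6, 230, 230))                # sky (assumed ADE20K color)
draw = ImageDraw.Draw(seg)
draw.rectangle([0, 300, W, H], fill=(4, 250, 7))             # grass / ground (assumed)
draw.rectangle([250, 120, 520, 330], fill=(180, 120, 120))   # building / castle (assumed)
draw.rectangle([40, 220, 160, 330], fill=(4, 200, 3))        # tree (assumed)

controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/control_v11p_sd15_seg", torch_dtype=torch.float16
)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnet, torch_dtype=torch.float16
).to("cuda")

image = pipe(
    "a concept painting for gaming, castle, mountain, waterfall, blue sky, photorealistic",
    image=seg,
    negative_prompt="lowres, blurry, watermark",
    num_inference_steps=30,
).images[0]
image.save("scene_from_segmentation.png")
```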

78c67273c4e3b847ac765901bb770c91.png

b2b04479ffd1cd9d54915fcf5e6cd256.png

The prompts we use are as follows:

Positive prompts:

(masterpiece:1.2), (best quality:1.2), (highres), ultra detailed, photorealistic, a concept painting for gaming, scenery, view from distance, no humans, cloud, waterfall, outdoors, flower, sky, mountain, water, day, pink flower, architecture, petals, castle, cloudy sky, blue sky, tree, landscape, building, (rainbow:0.9)


Negative prompts:

dim, dark, abstract, unclear, repetitive, ugly, monotonous, paintings, sketches, (worst quality:1), (low quality:1), (normal quality:1), lowres, normal quality, ((monochrome)), ((grayscale)), skin spots, acnes, skin blemishes, age spot, glan, nsfw, lowres, bad anatomy, text, error, extra digit, fewer digits, cropped, worst quality, low quality, normal quality, jpeg artifacts, signature, watermark, username, {{bad_construction}}, {bad_structure}, bad_wail, {bad_windows}, {blurry}, cloned_window, cropped, {deformed}, {disfigured}, error, {extra_windows}, {extra_chimney}, {extra_door}, extra_structure, extra_frame, {fewer_digits}, {fused_structure}, gross_proportions, jpeg_artifacts, {{long_roof}}, low_quality, {structure_limbs}, missing_windows, {missing_doors}, missing_roofs, mutated_structure, {mutation}, normal_quality, out_of_frame, owres, poorly_drawn_structure, poorly_drawn_house, signature, text, too_many_windows, {ugly}, username, uta, watermark, worst_quality


For the architectural environment in a game scene, we can also use the Canny model to generate backgrounds in different styles while keeping the main objects consistent.

99f4c9584be6f48957c621c95101b0fd.png

We first use text-to-image with a chosen base model to generate the original concept image.

12bd6c7b1ad7c9a98769ba4da7c8b966.png

Positive prompts:

(masterpiece:1.4), (best quality), (highres), temple in ruines, forest, stairs, columns, cinematic, detailed, atmospheric, epic, concept art, Matte painting, mist, photo-realistic, concept art, volumetric light, cinematic epic + rule of thirds octane render, corona render, movie concept art, octane render, cinematic, trending on artstation, movie concept art, cinematic composition, ultra-detailed, realistic, hyper-realistic, volumetric lighting


Negative prompts:

(EasyNegative:1.4), (lowres), (low quality), (normal quality), watermark, car, cars on the street, human


We then put the image that matches the concept design into ControlNet and select the canny preprocessor to generate a line drawing. After that, we can change the style of the scene simply by modifying the prompts, without changing the main structure of the picture (a minimal scripted version of this loop is sketched after the prompt variants below).

bc5eb8de7140b9d08c5e189c3d2a3fdc.png

Desert effect

Positive prompts:

(masterpiece:1.4), (best quality), (highres), temple in ruines, desert, stairs, columns, cinematic, detailed, atmospheric, epic, concept art, Matte painting, mist, photo-realistic, concept art, volumetric light, cinematic epic + rule of thirds octane render, corona render, movie concept art, octane render, cinematic, trending on artstation, movie concept art, cinematic composition, ultra-detailed, realistic, hyper-realistic,


Negative prompts:

(EasyNegative:1.4), (lowres), (low quality), (normal quality), watermark, car, cars on the street, human, forest, cloud,


Night effect

Positive prompts:

(masterpiece:1.4), (best quality), (highres), temple in ruines,(midnight bliss), (moon:1.2), (star \(sky\)), (dark at night), torch, forest, stairs, columns, cinematic, detailed, atmospheric, epic, concept art, Matte painting, mist, photo-realistic, concept art, volumetric light, cinematic epic + rule of thirds octane render, corona render, movie concept art, octane render, cinematic, trending on artstation, movie concept art, cinematic composition, ultra-detailed, realistic, hyper-realistic,


Negative prompts:

(EasyNegative:1.4), (lowres), (low quality), (normal quality), watermark, car, cars on the street, human, sunlight,


Snow effect

Positive prompts:

(masterpiece:1.4), (best quality), (highres), temple in ruines, forest, winter, snow, stairs, columns, cinematic, detailed, atmospheric, epic, concept art, Matte painting, mist, photo-realistic, concept art, volumetric light, cinematic epic + rule of thirds octane render, corona render, movie concept art, octane render, cinematic, trending on artstation, movie concept art, cinematic composition, ultra-detailed, realistic, hyper-realistic


Negative prompts:

(EasyNegative:1.4), (lowres), (low quality), (normal quality), watermark, car, cars on the street, human, sunlight

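As referenced above, here is a minimal sketch of scripting this reuse: extract a canny edge map once, then loop over the style variants with a fixed seed so that only the prompt changes between images. File names, thresholds, and prompt wording are placeholders.

```python
# Illustrative sketch: one canny guide map, several style variants of the same scene.
# Thresholds, model IDs, file names, and prompts are placeholders.
import cv2
import numpy as np
import torch
from PIL import Image
from diffusers import ControlNetModel, StableDiffusionControlNetPipeline

# Extract the edge (line drawing) guide map from the original concept image.
concept = np.array(Image.open("temple_concept.png").convert("RGB"))
edges = cv2.Canny(concept, 100, 200)
guide = Image.fromarray(np.stack([edges] * 3, axis=-1))

controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/control_v11p_sd15_canny", torch_dtype=torch.float16
)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnet, torch_dtype=torch.float16
).to("cuda")

base = "temple in ruins, stairs, columns, cinematic, concept art, volumetric light"
variants = {
    "desert": base + ", desert, sand, harsh sunlight",
    "night":  base + ", midnight, moon, stars, torch light",
    "snow":   base + ", winter, snow, mist",
}
for name, prompt in variants.items():
    image = pipe(
        prompt,
        image=guide,
        negative_prompt="lowres, watermark, human, car",
        num_inference_steps=30,
        generator=torch.Generator("cuda").manual_seed(42),  # fixed seed across variants
    ).images[0]
    image.save(f"temple_{name}.png")
```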

Game Skins, Props, and Assets

In game production, designing the huge number of in-game items is a very time-consuming and labor-intensive task. There may be thousands of items such as equipment, skins, props, and potions, which can take a long time and a large budget to produce. Here we try to use lineart_anime to extract line art from an anime-style character and create different outfits for the same character.

db296f55d1479fa2fa6826f05c43f02e.png

We again start from our chosen base model and generate the original concept image from the prompts.

Positive prompts:

(masterpiece),(best quality:1.0), (ultra highres:1.0), (bent over), detailed clothes, blunt bangs, braid, wide-sleeved kimono, hair ornament, white japanese clothes, (red obi:1.4), (purple hair:1.4), very long hair, straight hair, detailed face, cool face, (smooth chin:0.85), closed mouth, looking at viewer, beautiful eyes, detailed eyes, (ulzzang-6500:0.7), skirt, (from below:1.1), photon mapping, physically-based rendering, RAW photo, clear background, (white background:1.4), (photo realistic:1.35), high res, perspective


Negative prompts:

(sexy:1.4), 3d, sepia, painting, cartoons, sketch, (worst quality:2), (low quality:2), (normal quality:2), lowres, bad anatomy, bad hands, normal quality, ((monochrome)), ((grayscale)), futanari, full-package_futanari, newhalf, nipplepierces, collapsed eyeshadow, multiple eyeblows, pink hair, (nsfw:1.4)


41c23071a1909c3d721683b00c6d4051.png

Then, using the lineart_anime preprocessor together with the lineart_anime ControlNet model, we can adjust the parts of the prompt that describe the character's clothing to generate the different outfits shown in the example.

f25fb8fe42407aebe396923270ed66ff.png

Three-View Character Sheets

In game concept art, the design of a specific character is usually handed to the modeler in the form of a three-view sheet, because the final character will be expressed in three-dimensional form with full detail. The front, back, and side views in the sheet allow the modeler to quickly understand the original artist's design intent. Using the OpenPose Editor plug-in or other image editing tools, we can draw 3-4 pose guide maps for the character. Note that the width, height, and pixel dimensions of the guide map and the final image should keep the same proportions. Combining this guide map with the prompts, ControlNet's OpenPose model, and a suitable base model can then generate a good three-view character sheet.
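One practical detail is assembling the individual pose maps into a single wide guide image whose aspect ratio matches the output resolution. A minimal sketch of that step is shown below; the file names and the three-across layout are assumptions for illustration.

```python
# Illustrative sketch: stitch three single-pose guide maps into one wide guide image
# and keep the generation resolution at the same aspect ratio. File names are placeholders.
from PIL import Image

views = ["pose_front.png", "pose_side.png", "pose_back.png"]
poses = [Image.open(p).convert("RGB") for p in views]

# Normalize each view to the same height, then paste them side by side.
h = 768
resized = [p.resize((int(p.width * h / p.height), h)) for p in poses]
sheet = Image.new("RGB", (sum(p.width for p in resized), h), (0, 0, 0))
x = 0
for p in resized:
    sheet.paste(p, (x, 0))
    x += p.width
sheet.save("three_view_pose_sheet.png")

# When generating, pass width/height in the same aspect ratio as the sheet
# (rounded to multiples of 8 for Stable Diffusion), e.g.:
# pipe(prompt, image=sheet, width=sheet.width, height=sheet.height, ...)
```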

268c42b9a00ccf103f7e06a39bdb1faa.png

2d1e6788c8828e5aad7279af209291e4.png

Positive prompts:

(masterpiece), (best quality:1.0), (ultra highres:1.0), (bent over), full body, detailed clothes, blunt bangs, braid, wide-sleeved kimono, hair ornament, white japanese clothes, (red obi:1.4), (purple hair:1.4), very long hair, straight hair, detailed face, cool face, (smooth chin:0.85), closed mouth, looking at viewer, beautiful eyes, detailed eyes, (ulzzang-6500:0.7), (long skirt:1.4), (from below:1.1), photon mapping, physically-based rendering, RAW photo, clear background, (white background:1.4), (photo realistic:1.35), high res, perspective, (((full body))), multiple views, <lora:charturnerbetaLora_charturnbetalora:0.1>


Negative prompts:

(sexy:1.4), 3d, sepia, painting, cartoons, sketch, (worst quality:2), (low quality:2), (normal quality:2), lowres, bad anatomy, bad hands, normal quality, ((monochrome)), ((grayscale)), futanari, full-package_futanari, newhalf, collapsed eyeshadow, multiple eyeblows, pink hair, (nsfw:1.4)


Architecture and Working Principle

This article is based on the Generative AI Industry Solution Guide. The solution works as follows:

55e4727009c937d5b04cece424729907.png

In the Generative AI Industry Solution Guide, the front-end Stable Diffusion WebUI is deployed on the Amazon ECS container service, the back end uses the serverless service Amazon Lambda for processing, and the front end and back end communicate through Amazon API Gateway. Model training and deployment are performed through Amazon SageMaker, while Amazon S3, Amazon EFS, and Amazon DynamoDB store model data, temporary files, and usage data respectively. For details, please refer to the first article in this blog series, "Generative AI Industry Solution Guide and Deployment Guide" (https://aws.amazon.com/cn/blogs/china/generative-ai-industry-solutions-guide-and-deployment-practices/).
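As a generic illustration of this front-end/back-end split (not the solution's actual source code), a Lambda handler behind API Gateway might forward a generation request to a SageMaker endpoint roughly as follows; the endpoint name and payload shape are assumptions.

```python
# Generic illustration of the API Gateway -> Lambda -> SageMaker pattern described above.
# Not the solution's actual code; the endpoint name and payload format are assumptions.
import json
import boto3

sm_runtime = boto3.client("sagemaker-runtime")
ENDPOINT_NAME = "sd-inference-endpoint"  # placeholder endpoint name

def lambda_handler(event, context):
    body = json.loads(event.get("body", "{}"))
    payload = {
        "prompt": body.get("prompt", ""),
        "negative_prompt": body.get("negative_prompt", ""),
        "steps": body.get("steps", 30),
    }
    response = sm_runtime.invoke_endpoint(
        EndpointName=ENDPOINT_NAME,
        ContentType="application/json",
        Body=json.dumps(payload),
    )
    result = response["Body"].read().decode("utf-8")
    return {"statusCode": 200, "body": result}
```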

Rapid Deployment Process

This industry solution guide can be deployed with one click using CloudFormation. If you want to use the Generative AI Industry Solution Guide, please refer to the first article in this blog series, "Generative AI Industry Solution Guide and Deployment Guide"; this article does not cover deployment in detail.

Summary

In this article, we briefly introduced how to use the Amazon Web Services Generative AI Industry Solution Guide together with ControlNet to efficiently generate highly controllable image assets in several game industry scenarios. We hope you can use the guide to practice AI painting techniques and make it a powerful tool for creative assistance and efficiency improvement.

References

  1. Generative AI Industry Solution Guide Workshop: https://catalog.us-east-1.prod.workshops.aws/workshops/bae25a1f-1a1d-4f3e-996e-6402a9ab8faa

  2. Stable Diffusion WebUI: https://github.com/AUTOMATIC1111/stable-diffusion-webui

  3. Hugging Face: https://huggingface.co/

  4. Adding Conditional Control to Text-to-Image Diffusion Models: https://arxiv.org/abs/2302.05543

  5. ControlNet: https://github.com/lllyasviel/ControlNet

  6. sd-webui-controlnet: https://github.com/Mikubill/sd-webui-controlnet

  7. Train your ControlNet: https://huggingface.co/blog/train-your-controlnet

About the Authors


Wang Yan

Amazon Web Services game industry solutions architect, responsible for following industry trends and scenario pain points and providing corresponding industry solutions. He focuses on the design and implementation of cloud-based game production, operations, and backend systems, has more than 15 years of experience in designing and developing interactive applications and large-scale distributed systems, and has rich hands-on experience in containerization and distributed system design.


Wang Jingfan

Amazon Web Services industry solutions architect whose main areas include finance, retail, and generative AI. Previously worked at IBM, responsible for customer solutions in the financial industry, and currently focused on promoting user data platform and generative AI solutions in Hong Kong.


Tang Zhe

Amazon Web Services industry solutions architect, responsible for consulting on and designing Amazon-based cloud computing solutions, and committed to spreading and popularizing knowledge of Amazon cloud services. He has practical experience in software development, security protection, and other fields, and currently focuses on the e-commerce and live-streaming sectors.


Wan Xi

Amazon Web Services solutions architect, responsible for consulting on and designing cloud computing solutions based on Amazon Web Services, and a firm believer in the Amazon Builder culture. With more than 12 years of game development experience, he has participated in the management and development of several game projects and has deep understanding of and insight into the game industry.


Zhang Xiaofeng

Amazon Web Services solutions architect manager and game technology expert with very rich practical experience in architecture and development.


