[Source Code & Tutorial] GAN-Based Anime Avatar Generation System

1. Research background

Many of us love anime characters and would like to create customized characters of our own. However, it takes enormous effort to master the painting skills required; only then can we design our own characters. Automatic generation of anime characters bridges this gap, making customized characters possible without professional skills. Beyond the benefits for non-professionals, professional creators can also use automatic generation to find inspiration for animation and game character design, and doujin RPG developers can use copyright-free face images to reduce the design costs of game production.

2. Picture demonstration

1.png

2.png

3.png

3. Video demonstration

GAN-based anime avatar generation system (source code & tutorial) - Bilibili

4. Project difficulties

(1) How can the GAN's generation results be accurately controlled by user-specified attributes? (Note that a GAN is a black-box model.)
(2) The model's output is determined jointly by the user-specified attributes and by random noise (random numbers), so even when the attributes are held fixed, a different avatar is generated each time. How can accurate attribute control and randomness of the generated avatars both be guaranteed? (See the input sketch after this list.)
(3) How can different combinations of attributes be blended seamlessly into the same avatar?
(4) The outputs of previously proposed avatar generation methods suffer from blurry, muddled images, and their quality is very unstable. How can the model be made to generate high-quality avatars, and to output high-quality results with a higher success rate?
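
The post does not show how the attributes and the noise are combined at the generator's input. A common approach for difficulties (1) and (2) is the conditional-GAN-style input: concatenate the user's attribute vector with freshly sampled noise, so the attribute part stays fixed while the noise part changes on every call. The sizes below (100-dim noise, 10 binary attributes) and the helper name are illustrative assumptions, not taken from the original project.

import tensorflow as tf

NOISE_DIM = 100   # assumed latent size (matches the code in section 7)
ATTR_DIM = 10     # assumed number of binary attributes (hair color, eye color, ...)

def make_generator_input(batch_size, attrs):
    # attrs: [ATTR_DIM] multi-hot vector chosen by the user.
    # The noise is re-sampled on every call, so the same attributes
    # still yield a different avatar each time (difficulty 2).
    z = tf.random.normal([batch_size, NOISE_DIM])        # randomness
    c = tf.tile(attrs[tf.newaxis, :], [batch_size, 1])   # user control
    return tf.concat([z, c], axis=1)                     # [b, NOISE_DIM + ATTR_DIM]

# Same attributes twice: the attribute columns match, the noise differs.
attrs = tf.constant([1., 0., 0., 1., 0., 0., 0., 0., 0., 0.])
print(make_generator_input(2, attrs).shape)   # (2, 110)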

5. Network structure

Generator

image.png

Discriminator

image.png

6. Preparation of the data set

6.png

To train a high-quality anime avatar generation model, the necessary prerequisites are a data set of high-quality illustrations with a consistent art style and little noise. Existing large-scale illustration data sets provide huge numbers of illustrations, but they suffer from wildly varying art styles and a great deal of noise. To avoid these problems, this article uses the "game character introduction illustrations" from a game sales website as the data source.

How can the coordinates of the avatar region be found automatically in the batch-downloaded illustrations? The paper uses a face detection algorithm based on Local Binary Pattern (LBP) features [footnote 6]. So that the detected region also includes the hairstyle, the bounding box output by the algorithm is enlarged to 1.5 times its original size. (A rough sketch of this step follows.)
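
As an illustration of this detect-and-crop step, the sketch below uses OpenCV's cascade detector, which supports LBP cascades. The cascade file name (nagadomi's lbpcascade_animeface.xml), the detection parameters, and the file paths are assumptions made for the example; the original post does not specify them.

import cv2

# Assumed cascade: nagadomi's LBP anime-face detector
# (github.com/nagadomi/lbpcascade_animeface); not named in the post.
cascade = cv2.CascadeClassifier('lbpcascade_animeface.xml')

def crop_avatar(img, scale=1.5):
    # Detect a face, enlarge the box by `scale` around its center so the
    # crop keeps the hairstyle, and return the cropped region (or None).
    gray = cv2.equalizeHist(cv2.cvtColor(img, cv2.COLOR_BGR2GRAY))
    faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    if len(faces) == 0:
        return None
    x, y, w, h = faces[0]
    cx, cy = x + w / 2, y + h / 2            # box center
    half = scale * max(w, h) / 2             # enlarged half-width
    x0, y0 = max(int(cx - half), 0), max(int(cy - half), 0)
    x1 = min(int(cx + half), img.shape[1])
    y1 = min(int(cy + half), img.shape[0])
    return img[y0:y1, x0:x1]

img = cv2.imread('illustration.png')         # assumed input file
face = crop_avatar(img)
if face is not None:
    cv2.imwrite('avatar.png', cv2.resize(face, (64, 64)))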

A question worth thinking about: once we have an avatar data set with attribute annotations, why do we still need to train an automatic avatar generation model? Could the data set itself replace the generative model?
(1) In theory, if the data set were effectively infinite, so that every possible attribute combination contained plenty of examples, there might be little need to train an automatic generation model. In practice, though, the number of possible attribute combinations explodes as the number of attributes grows, and designing and drawing high-quality avatars is costly, so a data set of that scale is unlikely to exist.
Beyond that, the author believes this question should be considered from the perspective of "creation" rather than "imitation":
(2) The GAN generator takes random noise as input, so even when the specified attributes are exactly the same, the generated result varies slightly but randomly each time. A data set with a fixed number of images cannot do this.
(3) A GAN can generate attribute combinations that do not exist in the data set. For example, the data set may contain avatars with blue hair and avatars with green eyes, but none with both blue hair and green eyes; after training, the GAN can generate an avatar with blue hair and green eyes (provided it has fully learned the characteristics of blue hair and green eyes).
(4) A GAN can learn the "features" of different images in the training set ("features" include, but are not limited to, the annotated attributes) and seamlessly blend features from different images into one generated result. A GAN can therefore "create" avatars that do not exist in the data set.
(5) A GAN can interpolate between two avatars and produce avatar morphing animations. (A sketch follows this list.)
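
For point (5), the interpolation happens in the latent space rather than in pixel space: walk along the line between two noise vectors and decode each intermediate point. A minimal sketch, assuming the Generator class from section 7 and its 100-dim input; the frame count is arbitrary.

import tensorflow as tf

def interpolate(generator, z1, z2, steps=8):
    # Decode evenly spaced points on the line z1 -> z2; played in order,
    # the outputs form a morphing animation between the two avatars.
    frames = []
    for a in tf.linspace(0.0, 1.0, steps):
        z = (1.0 - a) * z1 + a * z2
        frames.append(generator(z, training=False))
    return tf.concat(frames, axis=0)   # [steps, 64, 64, 3]

# Usage (Generator is defined in section 7):
# g = Generator()
# z1 = tf.random.normal([1, 100])
# z2 = tf.random.normal([1, 100])
# animation = interpolate(g, z1, z2)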

7. Code implementation

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

class Generator(keras.Model):

    def __init__(self):
        super(Generator, self).__init__()

        # z: [b, 100] => [b, 3*3*512] => [b, 3, 3, 512] => [b, 64, 64, 3]
        self.fc = layers.Dense(3*3*512)

        # [b, 3, 3, 512] => [b, 9, 9, 256]
        self.conv1 = layers.Conv2DTranspose(256, 3, 3, 'valid')
        self.bn1 = layers.BatchNormalization()

        # [b, 9, 9, 256] => [b, 21, 21, 128]
        self.conv2 = layers.Conv2DTranspose(128, 5, 2, 'valid')
        self.bn2 = layers.BatchNormalization()

        # [b, 21, 21, 128] => [b, 64, 64, 3]
        self.conv3 = layers.Conv2DTranspose(3, 4, 3, 'valid')

    def call(self, inputs, training=None):
        # [b, 100] => [b, 3*3*512] => [b, 3, 3, 512]
        x = self.fc(inputs)
        x = tf.reshape(x, [-1, 3, 3, 512])
        x = tf.nn.leaky_relu(x)

        # upsample step by step to [b, 64, 64, 3]
        x = tf.nn.leaky_relu(self.bn1(self.conv1(x), training=training))
        x = tf.nn.leaky_relu(self.bn2(self.conv2(x), training=training))
        x = self.conv3(x)
        # squash pixel values into [-1, 1]
        x = tf.tanh(x)

        return x


class Discriminator(keras.Model):

    def __init__(self):
        super(Discriminator, self).__init__()

        # [b, 64, 64, 3] => [b, 20, 20, 64]
        self.conv1 = layers.Conv2D(64, 5, 3, 'valid')

        # [b, 20, 20, 64] => [b, 6, 6, 128]
        self.conv2 = layers.Conv2D(128, 5, 3, 'valid')
        self.bn2 = layers.BatchNormalization()

        # [b, 6, 6, 128] => [b, 1, 1, 256]
        self.conv3 = layers.Conv2D(256, 5, 3, 'valid')
        self.bn3 = layers.BatchNormalization()

        # [b, 1, 1, 256] => [b, 256] => [b, 1]
        self.flatten = layers.Flatten()
        self.fc = layers.Dense(1)


    def call(self, inputs, training=None):

        x = tf.nn.leaky_relu(self.conv1(inputs))
        x = tf.nn.leaky_relu(self.bn2(self.conv2(x), training=training))
        x = tf.nn.leaky_relu(self.bn3(self.conv3(x), training=training))

        # [b, h, w, c] => [b, -1]
        x = self.flatten(x)
        # [b, -1] => [b, 1]
        logits = self.fc(x)

        return logits

def main():

    d = Discriminator()
    g = Generator()


    # random images and latent vectors for a quick shape check
    x = tf.random.normal([2, 64, 64, 3])
    z = tf.random.normal([2, 100])

    logits = d(x)        # raw real/fake scores (unnormalized logits)
    print(logits)
    x_hat = g(z)         # generated images in [-1, 1]
    print(x_hat.shape)   # (2, 64, 64, 3)

if __name__ == '__main__':
    main()
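
The code above only runs a forward pass. For completeness, here is a minimal sketch of one training step for these two networks, using standard DCGAN-style losses computed from logits; the optimizer settings are assumptions, and real images are expected to be scaled to [-1, 1] to match the generator's tanh output.

bce = keras.losses.BinaryCrossentropy(from_logits=True)
g_opt = keras.optimizers.Adam(2e-4, beta_1=0.5)
d_opt = keras.optimizers.Adam(2e-4, beta_1=0.5)

@tf.function
def train_step(g, d, real_images):
    # real_images: [b, 64, 64, 3], scaled to [-1, 1]
    z = tf.random.normal([tf.shape(real_images)[0], 100])
    with tf.GradientTape() as g_tape, tf.GradientTape() as d_tape:
        fake_images = g(z, training=True)
        real_logits = d(real_images, training=True)
        fake_logits = d(fake_images, training=True)
        # discriminator: push real toward 1, fake toward 0
        d_loss = (bce(tf.ones_like(real_logits), real_logits) +
                  bce(tf.zeros_like(fake_logits), fake_logits))
        # generator: try to make the discriminator output 1 on fakes
        g_loss = bce(tf.ones_like(fake_logits), fake_logits)
    d_opt.apply_gradients(zip(d_tape.gradient(d_loss, d.trainable_variables),
                              d.trainable_variables))
    g_opt.apply_gradients(zip(g_tape.gradient(g_loss, g.trainable_variables),
                              g.trainable_variables))
    return d_loss, g_loss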

8. System integration

The full source code, an environment deployment video tutorial, the data set, and a custom UI interface are provided below.
5.png
Reference blog: "Anime Avatar Generation System Based on GAN (Source Code & Tutorial)"

Origin: blog.csdn.net/cheng2333333/article/details/134969133