Using keypoint detection to build Padoodle, a small tool that lets doodle characters dance along with real people


The drawing program that ships with Windows kept me company for several years when I first encountered computers as a child. That simple tool felt magical to me back then, as if I could draw anything with it. I drew a lot of portraits, and I fantasized about making those portraits move along with me (I didn't know Flash existed at the time). Some time ago, I used PaddlePaddle to build a small project that makes doodles move. Today I'd like to share that childhood idea with you: making a doodle character move along with a person.


[Image: a project published by Meta AI]

Keypoint detection

To make the doodle character mimic my movements, we first need a human keypoint detection model. PaddlePaddle provides human keypoint detection models through PaddleHub, its pre-trained model application tool, and PaddleDetection, its object detection suite.

This project uses the human_pose_estimation_resnet50_mpii model from PaddleHub. This model is faster than OpenPose, though slightly less accurate. If you need especially accurate motion for the doodle character, or want the model to return confidence scores for the keypoint coordinates, you can try the openpose_body_estimation model, the HRNet family of models in PaddleDetection, or the latest PP-TinyPose.

PaddleDetection link: https://github.com/PaddlePaddle/PaddleDetection

PaddleHub link: https://github.com/PaddlePaddle/PaddleHub

Since every point in the doodle must be bound to a skeleton point, there should not be too few skeleton points; otherwise, points that are not close to any skeleton point end up bound to unrelated ones. We therefore need to expand the detected set of skeleton keypoints K. This project computes the midpoint of every pair of adjacent skeleton points as a new, extended skeleton point, and applies this operation twice.

The human_pose_estimation_resnet50_mpii model is trained on the MPII dataset, which defines 16 keypoints. We first create a center point between the "thorax" and "pelvis" keypoints to serve as the root of all keypoints, then build a tree over the whole body with the keypoints as nodes; this tree structure will be used later. Starting from these 17 nodes (16 joints plus the root) and 16 edges, inserting midpoints twice gives 17 + 16 = 33 nodes and then 33 + 32 = 65 nodes, so the expanded keypoint set, denoted K', has 65 points.
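As a minimal sketch of this setup (assuming underscore-style MPII joint names; the exact keys returned by the model may differ), the synthetic root and the child-to-parent tree could be built like this:

def build_tree(res):
    # Synthetic root halfway between thorax and pelvis
    res['root'] = [(res['thorax'][0] + res['pelvis'][0]) / 2,
                   (res['thorax'][1] + res['pelvis'][1]) / 2]
    # child -> parent map; the root is its own parent
    FatherAndSon = {
        'root': 'root',
        'thorax': 'root', 'pelvis': 'root',
        'upper_neck': 'thorax', 'head_top': 'upper_neck',
        'left_shoulder': 'thorax', 'left_elbow': 'left_shoulder', 'left_wrist': 'left_elbow',
        'right_shoulder': 'thorax', 'right_elbow': 'right_shoulder', 'right_wrist': 'right_elbow',
        'left_hip': 'pelvis', 'left_knee': 'left_hip', 'left_ankle': 'left_knee',
        'right_hip': 'pelvis', 'right_knee': 'right_hip', 'right_ankle': 'right_knee',
    }
    return res, FatherAndSon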

Encapsulating the keypoint detection model:

import paddlehub as hub

class estUtil():
    # Wrapper class for the keypoint detection model
    def __init__(self):
        super(estUtil, self).__init__()
        # Use the human_pose_estimation_resnet50_mpii model
        self.module = hub.Module(name='human_pose_estimation_resnet50_mpii')

    def do_est(self, frame):
        # Returns a dict mapping each keypoint name to its coordinates
        res = self.module.keypoint_detection(images=[frame], use_gpu=True)
        return res[0]['data']
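A brief usage sketch (the image path is illustrative):

import cv2

est = estUtil()
frame = cv2.imread('person.jpg')  # hypothetical input image
keypoints = est.do_est(frame)     # dict: keypoint name -> coordinates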

The method for expanding the keypoints:

import copy

def complexres(res, FatherAndSon):
    # Expand the keypoints while preserving the logical parent-child order
    cres = copy.deepcopy(res)
    for key, pos in res.items():
        father = FatherAndSon[key]
        if father == key:
            # Stop at the root node (the root is its own parent)
            continue
        if key[0] == 'm' or father[0] == 'm':
            # First naming rule for inserted midpoint nodes
            midkey = 'm' + key + '_' + father
        else:
            kn = ''
            for t in key.split('_'):
                kn += t[0]
            fn = ''
            for t in father.split('_'):
                fn += t[0]
            # Second naming rule for inserted midpoint nodes
            midkey = 'm_' + kn + '_' + fn
        # Compute the midpoint and insert it between child and parent
        midvalue = [(pos[0] + res[father][0]) / 2, (pos[1] + res[father][1]) / 2]
        FatherAndSon[key] = midkey
        FatherAndSon[midkey] = father
        cres[midkey] = midvalue
    return cres, FatherAndSon
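Given the tree above, the expansion described earlier is just two successive calls:

res, FatherAndSon = complexres(res, FatherAndSon)  # 17 -> 33 points
res, FatherAndSon = complexres(res, FatherAndSon)  # 33 -> 65 points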


Left: the keypoint set K before expansion; right: the expanded keypoint set K'.

Doodle recording and optimization

For convenient interaction, I used OpenCV to make a simple drawing board where the user can pick different colors to draw a doodle. To make binding the keypoints to the doodle easier later, I first drew the template's body keypoints on the canvas, giving the user a set of reference coordinates that makes it easier to draw their own doodle character.


While the mouse button is held down, OpenCV continuously records the current brush (mouse) position (Mouse_x, Mouse_y); when the user releases the button, recording stops. Connecting the recorded points in order reproduces the track the mouse just traced. A drawing can be completed in one stroke or in multiple strokes with multiple colors. We denote the doodle obtained here as B.
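A minimal sketch of this recording step (the window name and callback are illustrative, not the project's exact code):

import cv2

lines = []        # all recorded strokes; each stroke is a list of (x, y) points
drawing = False   # whether the mouse button is currently held down

def on_mouse(event, x, y, flags, param):
    # Record a new stroke while the left button is held down
    global drawing
    if event == cv2.EVENT_LBUTTONDOWN:
        drawing = True
        lines.append([(x, y)])       # start a new stroke
    elif event == cv2.EVENT_MOUSEMOVE and drawing:
        lines[-1].append((x, y))     # extend the current stroke
    elif event == cv2.EVENT_LBUTTONUP:
        drawing = False              # stroke finished

cv2.namedWindow('board')
cv2.setMouseCallback('board', on_mouse)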

Because OpenCV samples the brush position at a fixed rate, a line of a given length has more sample points when drawn slowly and fewer when drawn quickly. Later on, a large number of relative relationships between sample points and skeleton points must be computed (by measurement, this takes far longer than the model's inference and becomes the bottleneck for smooth operation), so we filter the sampled points of B. I use the most intuitive filtering method: when three consecutive points lie on the same straight line, the middle point is removed and only the two endpoints are kept. With this method, the point count of a simple doodle drops from thousands to dozens, and the project runs much more smoothly. We denote the filtered set of sample points as B'.

The method for filtering and simplifying the skin data:

def linesFilter():
    # Remove redundant points: if three consecutive points are collinear,
    # drop the middle one and keep only the two endpoints.
    global lines
    for line in lines:
        sindex = 0
        mindex = 1
        while mindex < len(line):
            eindex = mindex + 1
            if eindex >= len(line):
                break
            d1 = line[mindex][0] - line[sindex][0]
            d2 = line[mindex][1] - line[sindex][1]
            d3 = line[eindex][0] - line[sindex][0]
            d4 = line[eindex][1] - line[sindex][1]
            # Cross-product test: zero means the three points are collinear
            if abs(d1 * d4 - d2 * d3) <= 1e-6:
                line.pop(mindex)
            else:
                sindex += 1
                mindex += 1

def linesCompose():
    # To avoid over-thinning the strokes, interpolate one new point
    # between every two remaining points.
    global lines
    tlines = []
    for line in lines:
        tlines.append([line[0]])
        for i in range(1, len(line)):
            l_1 = tlines[-1][-1]
            tlines[-1].append(((l_1[0] + line[i][0]) / 2, (l_1[1] + line[i][1]) / 2))
            tlines[-1].append(line[i])
    lines = tlines
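One plausible way to apply the two passes after a stroke is finished (the actual call order in the project may differ):

linesFilter()   # drop collinear intermediate points
linesCompose()  # re-insert midpoints so strokes do not become too sparse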

Binding keypoints to the doodle

Number of anchor bindings: before the animation starts, there is a crucial step: binding the sampled and drawn skin B' to the expanded keypoint set K'. As described above, the skin B' is a set of points, so this process binds each skin point to skeleton keypoints. In more professional terms, we select anchor points for each skin point, drawn from the skeleton keypoints. In this project, each skin point is bound to at most four keypoints. This number is related to the density of K': when K' is dense enough, fewer anchors per skin point suffice.

Anchor selection criterion: the measure used here is distance, i.e. the m keypoints closest to a skin point n are selected. This method has drawbacks. For example, because we pick the nearest keypoints, some anchors of the whiskers we drew on the face turn out to be shoulder points rather than face points, which makes the whiskers follow the shoulders. For more precise anchor matching, you can manually intervene in this step and delete such unreasonable bindings.


Left: the doodle as drawn; right: the doodle after binding to the keypoints.

Binding the skin data to the skeleton data:

import math

def buildskin(lines, colors, cirRads, nodes):
    # skinItem, anchorItem, distance, judge and dist2weight are helpers
    # defined elsewhere in the project.
    if lines is None or nodes is None or len(lines) == 0 or len(nodes) == 0:
        return []
    skins = []
    print("doodle node length", len(nodes))
    # Wrap the skin points collected by OpenCV into a list of skinItem objects
    for lineindex in range(len(lines)):
        init = True
        line = lines[lineindex]
        color = colors[lineindex]
        cirRad = cirRads[lineindex]
        for p in line:
            if init:
                skins.append(skinItem(p[0], p[1], True, color, cirRad))
                init = False
            else:
                skins.append(skinItem(p[0], p[1], False, color, cirRad))
    # For each skinItem, find the four nearest skeleton points and wrap them as anchors
    for skin in skins:
        md = [float("inf"), float("inf"), float("inf"), float("inf")]
        mn = [None, None, None, None]
        mdlen = 0
        for key, node in nodes.items():
            d = distance(skin.getPos(), node.getPos())
            maxi = judge(md)  # index of the current farthest candidate
            if d < md[maxi]:
                md[maxi] = d
                mn[maxi] = node
                mdlen += 1

        if mdlen < 4:
            # Fewer than four skeleton points were available
            md = md[:mdlen]
            mn = mn[:mdlen]
        ws = dist2weight(md)
        # Assign each anchor its weight
        for j in range(len(mn)):
            th = math.atan2(skin.y - mn[j].y, skin.x - mn[j].x)
            r = distance(skin.getPos(), mn[j].getPos())
            w = ws[j]
            skin.appendAnchor(anchorItem(mn[j], th - mn[j].thabs, r, w))
    return skins
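The dist2weight helper is not shown above; a plausible implementation (an assumption, not necessarily the project's exact formula) is inverse-distance weighting, normalized so the weights sum to 1:

def dist2weight(dists, eps=1e-6):
    # Nearer anchors get larger weights (illustrative assumption)
    inv = [1.0 / (d + eps) for d in dists]
    s = sum(inv)
    return [v / s for v in inv]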

Updating the doodle

After the initialization above, we can compute a new position for every skin point in each subsequent frame. During binding, we also recorded extra information for each anchor: the distance and angle between the skin point and the anchor. After selecting the (up to) four anchors, we also computed an initial weight for each. So when the keypoint positions change, we can compute a weighted new position S'' for each skin point from the new anchor positions. Drawing all the skin points of S'' in order completes the whole project.


Computing the new skin points from the new skeleton each frame:

def calculateSkin(skins, scale):
    for skin in skins:
        xw = 0
        yw = 0
        # From each anchor's new position and angle, recover this skin point's
        # position relative to that anchor, then blend the results by weight
        for anchor in skin.getAnchor():
            x = anchor.node.x + math.cos(anchor.th + anchor.node.thabs) * anchor.r * scale
            y = anchor.node.y + math.sin(anchor.th + anchor.node.thabs) * anchor.r * scale
            xw += x * anchor.w
            yw += y * anchor.w
        skin.x = xw
        skin.y = yw
    return skins
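Putting the pieces together, the per-frame loop could look roughly like this. updateNodes and drawSkin are hypothetical helpers standing in for the project's node-update and rendering code, and nodes/lines/colors/cirRads are assumed to have been prepared as described above; since each anchor holds a reference to its node object, updating the nodes in place is enough for calculateSkin to see the new pose:

import cv2

est = estUtil()
skins = buildskin(lines, colors, cirRads, nodes)
cap = cv2.VideoCapture(0)
while True:
    ok, frame = cap.read()
    if not ok:
        break
    res = est.do_est(frame)    # detect the 16 raw MPII keypoints
    updateNodes(nodes, res)    # hypothetical: expand to K' and refresh node positions/angles in place
    calculateSkin(skins, 1.0)  # recompute every skin point from the new pose
    drawSkin(frame, skins)     # hypothetical: draw each skin point with its stroke color
    cv2.imshow('Padoodle', frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
cap.release()
cv2.destroyAllWindows()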


Current problems and directions for improvement

1) Problems

A major drawback of the human_pose_estimation_resnet50_mpii model is that it does not output joint confidences, so there is no way to filter its results. Given an image of a partially visible person, the model still outputs all 16 keypoints, including ones that are not actually in the image, and these false detections cause the corresponding skin points to be drawn in the wrong places. Also, for better results, use input video with a plain background to reduce interference.

2) Directions for improvement

  • You can try other keypoint detection models in the PaddleDetection suite. Note that if the model you use was trained on the COCO dataset, you will need to adapt the doodle template file accordingly.

  • For a smoother experience, you can run the keypoint detection model in a separate thread, as in the sketch below.
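A minimal sketch of that threading pattern (illustrative, not the project's code): the main loop pushes the newest frame into a one-slot queue and keeps rendering with the most recent result, while a worker thread runs the model:

import queue
import threading

frame_q = queue.Queue(maxsize=1)   # holds only the newest frame
latest_res = {}                    # most recent keypoint result

def detect_worker():
    global latest_res
    est = estUtil()
    while True:
        frame = frame_q.get()
        latest_res = est.do_est(frame)  # may lag the display by a frame or two

threading.Thread(target=detect_worker, daemon=True).start()

# In the main loop, hand off the newest frame without blocking:
# try:
#     frame_q.put_nowait(frame)
# except queue.Full:
#     pass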

I also presented this project live in the PaddlePaddle Developer Says column; you are welcome to watch the recording on Bilibili.

Video link:

https://www.bilibili.com/video/BV1N34y1y72o?spm_id_from=333.999.0.0

Project link:

https://aistudio.baidu.com/aistudio/projectdetail/2498845

PaddleDetection:

https://github.com/PaddlePaddle/PaddleDetection

PaddleHub:

https://github.com/PaddlePaddle/PaddleHub
