AI Painting Technology Practice Issue 2|Using Tencent Cloud Intelligent Picture Fusion to Optimize the Effect of AI Painting

The previous article " AI Painting Technology Practice Issue 1 " mentioned how to use Tencent Cloud's intelligent capabilities to implement a simplified version of AI painting. After its release, it attracted a lot of attention from netizens, who are also thinking about whether they can achieve better results. Recently, I found that AI painting gameplay has also set off a wave of craze on short video platforms. Combined with seeing some excellent AI painting models on the Internet, I also want to try to create a better experience based on the previous article.

Next, I will completely share my practical process, and interested friends can also try it.

1. Implement ideas

A portrait image is generated through AI, and then Tencent Cloud's intelligent capabilities are used to perform face fusion, and finally a portrait image with better effect is generated.

1.1 Detailed process:

 

2. Preparation

2.1 Stable-Diffusion deployment

Stable Diffusion is an open source text-to-image model that can generate a semantically correct picture by inputting a piece of text. For details, please see the introduction on github:  GitHub - CompVis/stable-diffusion: A latent text-to-image diffusion model

Install according to the documentation. The installation process is similar and will not be repeated.

Generate images through scripts: 

from torch import autocast
from diffusers import StableDiffusionPipeline
import sys

# 指定模型
pipe = StableDiffusionPipeline.from_pretrained(
        # "CompVis/stable-diffusion-v1-4", 
        "runwayml/stable-diffusion-v1-5",
        # "hakurei/waifu-diffusion",
        use_auth_token=True
).to("cuda")

prompt = "a photo of an astronaut riding a horse on mars"
prompt = sys.argv[1]
with autocast("cuda"):
    image = pipe(prompt, num_inference_steps=100).images[0]  
    image.save(sys.argv[2] + ".png")

Specify keywords, call the output, and see the generated effect: 

python3 interface.py "*******" out

 

3. Mini program demo practice

The following is my process of implementing AI painting through the mini program.

3.1 AI painting server:

After the model is deployed, it can only be executed locally. We simply implement the following functions:

1. The user submits the task to cos, and the service executes the AI ​​drawing task by pulling the content of cos.

2. Execute the shell command and upload the generated image to cos.

COS Document:  Introduction to Object Storage_Object Storage Buying Guide_Object Storage Operation Guide-Tencent Cloud

AI painting model execution code:

type Request struct {
	SessionId string `json:"session_id"`
	JobId     string `json:"job_id"`
	Prompt    string `json:"prompt"`
	ModelUrl  string `json:"model_url"`
	ImageUrl  string `json:"image_url"`
}

type JobInfo struct {
	JobId string `json:"job_id"`
	Request
}
func run(req *JobInfo) {
	begin := time.Now()

	Log("got a job, %+v", req)
	jobId := req.JobId
	cmd := exec.Command("sh", "start.sh", req.Prompt, jobId)

	err := cmd.Run()
	if err != nil {
		fmt.Println("Execute Command failed:" + err.Error())
		return
	}

	result, err := os.ReadFile(fmt.Sprintf("output/%s.png", jobId))
	if err != nil {
		panic(err)
	}
	url, err := cos.PutObject(context.Background(), fmt.Sprintf("aidraw/%s.png", jobId), result)
	if err != nil {
		panic(err)
	}
	resp := &Response{
		SessionId: req.SessionId,
		JobId:     jobId,
		JobStatus: "FINISNED",
		CostTime:  time.Since(begin).Milliseconds(),
		ResultUrl: url,
	}
	Log("job finished, %+v", resp)
	data, _ := json.Marshal(resp)
	pushResult(jobId, string(data))
}

Task management is implemented through cos, which involves task pulling and result uploading. The following is the implementation code:

func pullJob() *JobInfo {
	res, _, err := cos.GetInstance().Bucket.Get(context.Background(), &cossdk.BucketGetOptions{
		Prefix:       JOB_QUEUE_PUSH,
		Delimiter:    "",
		EncodingType: "",
		Marker:       "",
		MaxKeys:      10000,
	})
	if err != nil {
		return nil
	}
	var jobId string
	for _, v := range res.Contents {
		if !objectExist(fmt.Sprintf("%s/%s", JOB_QUEUE_RESULT, getNameByPath(v.Key))) {
			jobId = v.Key
			break
		}
	}
	if len(jobId) == 0 {
		return nil
	}
	jobId = getNameByPath(jobId)
	Log("new job %s", jobId)
	resp, err := cos.GetInstance().Object.Get(context.Background(), fmt.Sprintf("%s/%s", JOB_QUEUE_PUSH, jobId), &cossdk.ObjectGetOptions{})
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()
	if resp.StatusCode != 200 {
		return nil
	}
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return nil
	}
	job := &JobInfo{
		JobId: jobId,
	}
	err = json.Unmarshal(body, &job)
	if err != nil {
		return nil
	}

	return job
}

func pullResult(jobId string) *Response {
	resp, err := cos.GetInstance().Object.Get(context.Background(), fmt.Sprintf("%s/%s", JOB_QUEUE_RESULT, jobId), &cossdk.ObjectGetOptions{})
	if err != nil {
		return nil
	}
	defer resp.Body.Close()
	if resp.StatusCode != 200 {
		return nil
	}
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return nil
	}
	rsp := &Response{}
	json.Unmarshal(body, &rsp)
	return rsp
}

func pushResult(jobId, result string) {
	_, err := cos.PutObject(context.Background(), fmt.Sprintf("%s/%s", JOB_QUEUE_RESULT, jobId), []byte(result))
	if err != nil {
		panic(err)
	}
}

3.2 Mini program server:

The applet needs to process messages asynchronously through the relay service. Let’s sort out the functions of the server:

1. Forward the request to AI Painting.

2. Query the results of AI painting. (transferred through cos)

The following is part of the code:

Agreement related:

type Request struct {
	SessionId string `json:"session_id"`
	JobId     string `json:"job_id"`
	Prompt    string `json:"prompt"`
	ModelUrl  string `json:"model_url"`
	ImageUrl  string `json:"image_url"`
}

type Response struct {
	SessionId string `json:"session_id"`
	JobId     string `json:"job_id"`
	JobStatus string `json:"job_status"`
	CostTime  int64  `json:"cost_time"`
	ResultUrl string `json:"result_url"`
	TotalCnt  int64  `json:"total_cnt"`
}

Submit task:

// submitJobHandler 提交任务
func submitJobHandler(writer http.ResponseWriter, request *http.Request) {
	body, err := io.ReadAll(request.Body)
	req := &Request{}
	err = json.Unmarshal(body, &req)
	if err != nil {
		panic(err)
	}
	Log("got a submit request, %+v", req)
	jobId := GenJobId()
	pushJob(jobId, string(body))
	resp := &Response{
		SessionId: req.SessionId,
		JobId:     jobId,
		TotalCnt:  sumJob(),
	}
	data, _ := json.Marshal(resp)
	writer.Write(data)
}

// describeJobHandler 查询任务
func describeJobHandler(writer http.ResponseWriter, request *http.Request) {
	body, err := io.ReadAll(request.Body)
	req := &Request{}
	err = json.Unmarshal(body, &req)
	if err != nil {
		panic(err)
	}
	Log("got a query request, %+v", req.JobId)
	var ret *Response
	ret = pullResult(req.JobId)
	if ret == nil {
		ret = &Response{
			SessionId: req.SessionId,
			JobId:     req.JobId,
			JobStatus: "RUNNING",
		}
	}
	data, _ := json.Marshal(ret)
	writer.Write(data)
}

3.3. Mini program to implement AI drawing: 

index.js

// index.js
// 获取应用实例
const app = getApp()

Page({
  data: {
    totalTask: 0,
    leftTime: 40,
    beginTime: 0,
    processTime: 0,
    taskStatus: "STOP",
    inputValue: "",
    tags: [],
    option: [],
    buttonStatus: false,
    index: 0,
    motto: 'Hello World',
    userInfo: {},
    hasUserInfo: false,
    canIUse: wx.canIUse('button.open-type.getUserInfo'),
    canIUseGetUserProfile: false,
    canIUseOpenData: wx.canIUse('open-data.type.userAvatarUrl') && wx.canIUse('open-data.type.userNickName') // 如需尝试获取用户信息可改为false
  },
  // 事件处理函数
  bindViewTap() {
    wx.navigateTo({
      url: '../logs/logs'
    })
  },
  onLoad() {
    if (wx.getUserProfile) {
      this.setData({
        canIUseGetUserProfile: true
      })
    }
    this.onTimeout();
  },
 
  getUserProfile(e) { 
    // 推荐使用wx.getUserProfile获取用户信息,开发者每次通过该接口获取用户个人信息均需用户确认,开发者妥善保管用户快速填写的头像昵称,避免重复弹窗
    wx.getUserProfile({
      desc: '展示用户信息', // 声明获取用户个人信息后的用途,后续会展示在弹窗中,请谨慎填写
      success: (res) => {
        console.log(res)
        this.setData({
          userInfo: res.userInfo,
          hasUserInfo: true
        })
      }
    })
  },
  getUserInfo(e) {
    // 不推荐使用getUserInfo获取用户信息,预计自2021年4月13日起,getUserInfo将不再弹出弹窗,并直接返回匿名的用户个人信息
    console.log(e)
    this.setData({
      userInfo: e.detail.userInfo,
      hasUserInfo: true
    })
  },

  enentloop() {
    var that = this
    if (!that.data.Resp || !that.data.Resp.job_id) {
      console.log("not found jobid")
      return
    }
    return new Promise(function(yes, no) {
      wx.request({
      url: 'http://127.0.0.1:8000/frontend/query',
      data: {
        "session_id": "123",
        "job_id": that.data.Resp.job_id
      },
      method: "POST",
      header: {
        'Content-Type': "application/json"
      },
      success (res) {
        yes("hello");
        if (res.data == null) {
          wx.showToast({
            icon: "error",
            title: '请求查询失败',
          })
          return
        }
        console.log(Date.parse(new Date()), res.data)
        that.setData({
          Job: res.data,
        })
        console.log("job_status: ", res.data.job_status)
        if (res.data.job_status === "FINISNED") {
          console.log("draw image: ", res.data.result_url)
          that.drawInputImage(res.data.result_url);
          that.setData({
            Resp: {},
            taskStatus: "STOP"
          })
        } else {
          that.setData({
            taskStatus: "PROCESSING",
            processTime: (Date.parse(new Date()) - that.data.beginTime)/ 1000
          })
        }
      },
      fail(res) {
        wx.showToast({
          icon: "error",
          title: '请求查询失败',
        })
        console.log(res)
      }
    })
  })
  },

  onTimeout:  function() {
    // 开启定时器
    var that = this;
    let ticker = setTimeout(async function() {
      console.log("begin")
      await that.enentloop();
      console.log("end")
      that.onTimeout();
    }, 3 * 1000); // 毫秒数
    // clearTimeout(ticker);
    that.setData({
      ticker: ticker
    });
  },

  imageDraw() {
    var that = this
    var opt = {}
    if (that.data.option && that.data.option.length > 0) {
      opt = {
        "tags": that.data.option
      }
    }
    console.log("option:", opt)
    wx.request({
      url: 'http://127.0.0.1:8000/frontend/create',
      data: {
        "prompt": that.data.inputValue
      },
      method: "POST",
      header: {
        'Content-Type': "application/json"
      },
      success (res) {
        if (res.data == null) {
          wx.showToast({
            icon: "error",
            title: '请求失败',
          })
          return
        }
        console.log(res.data)
        // let raw = JSON.parse(res.data)
        that.setData({
          Resp: res.data,
        })
        that.setData({
          totalTask: res.data.total_cnt,
          beginTime: Date.parse(new Date())
        })
      },
      fail(res) {
        wx.showToast({
          icon: "error",
          title: '请求失败',
        })
      }
    })
  },

  drawInputImage: function(url) {
    var that = this;
    console.log("result_url: ", url)

    let resUrl = url; // that.data.Job.result_url;
    
    wx.downloadFile({
      url: resUrl,
      success: function(res) {
        var imagePath = res.tempFilePath
        wx.getImageInfo({
          src: imagePath,
          success: function(res) {
            wx.createSelectorQuery()
            .select('#input_canvas') // 在 WXML 中填入的 id
            .fields({ node: true, size: true })
            .exec((r) => {
              // Canvas 对象
              const canvas = r[0].node
              // 渲染上下文
              const ctx = canvas.getContext('2d')
              // Canvas 画布的实际绘制宽高 
              const width = r[0].width
              const height = r[0].height
              // 初始化画布大小
              const dpr = wx.getWindowInfo().pixelRatio
              canvas.width = width * dpr
              canvas.height = height * dpr
              ctx.scale(dpr, dpr)
              ctx.clearRect(0, 0, width, height)

              let radio = height / res.height
              console.log("radio:", radio)
              const img = canvas.createImage()
              var x = width / 2 - (res.width * radio / 2)

              img.src = imagePath
              img.onload = function() {
                ctx.drawImage(img, x, 0, res.width * radio, res.height * radio)
              }
            })
          }
        })
      }
    })
  },

  handlerInput(e) {
    this.setData({
      inputValue: e.detail.value
    })
  },

  handlerSearch(e) {
    console.log("input: ", this.data.inputValue)

    if (this.data.inputValue.length == 0) {
      wx.showToast({
        icon: "error",
        title: '请输入你的创意 ',
      })
      return
    }
    this.imageDraw()
  },
  handlerInputPos(e) {
    console.log(e)
    this.setData({
      inputValue: e.detail.value
    })
  },
  handlerInputFusion(e) {
    console.log(e)
    this.setData({
      inputUrl: e.detail.value
    })
  },
  handlerInputImage(e) {
    console.log(e)
  },
  clickItem(e) {
    let $bean = e.currentTarget.dataset
    console.log(e)
    console.log("value: ", $bean.bean)
    this.setData({
      option: $bean.bean
    })
    this.imageDraw()
  }
})

index.wxml:

<view class="container" style="width: 750rpx; height: 1229rpx; display: flex; box-sizing: border-box">
  <div class="form-item" style="width: 673rpx; height: 70rpx; display: block; box-sizing: border-box">
    <input placeholder="写下你的创意" class="input" bindinput="handlerInput" />
    <input placeholder="待融合URL" class="input" bindinput="handlerInputFusion" />
    <button class="button" loading="{
   
   {buttonStatus}}" bindtap="handlerSearch" size="mini" style="width: 158rpx; height: 123rpx; display: block; box-sizing: border-box; left: 0rpx; top: -60rpx; position: relative"> 立即生成 </button>
  </div>
  <view class="text_box">
    <text class="text_line" style="position: relative; left: 18rpx; top: 0rpx">完成任务数:</text>
    <text class="text_line" style="position: relative; left: 8rpx; top: 0rpx">{
   
   {totalTask}},</text>
    <text class="text_line" style="position: relative; left: 38rpx; top: 0rpx">{
   
   {taskStatus}}</text>
    <text class="text_line" style="position: relative; left: 43rpx; top: 0rpx">{
   
   {processTime}}/{
   
   {leftTime}}s</text>
  </view>

  <view class="output_line" style="position: relative; left: 2rpx; top: 51rpx; width: 714rpx; height: 40rpx; display: flex; box-sizing: border-box">
    <text class="text_line" style="width: 199rpx; height: 0rpx; display: block; box-sizing: border-box; position: relative; left: 1rpx; top: -92rpx">作品图片</text>
    <view style="position: relative; left: -15rpx; top: 2rpx; width: 571rpx; height: 0rpx; display: block; box-sizing: border-box"></view>
  </view>
  <canvas type="2d" id="input_canvas" style="background: rgb(228, 228, 225); width: 673rpx; height: 715rpx; position: relative; left: 2rpx; top: -64rpx; display: block; box-sizing: border-box">
  </canvas>
  <view class="output_line" style="position: relative; left: 0rpx; top: 50rpx; width: 714rpx; height: 58rpx; display: flex; box-sizing: border-box">
  </view>
</view>

At this point, an AI drawing applet has been implemented. Next, take a look at the effect. By entering keywords, you can get the picture of the work: 

 

A new problem came. After testing, I found that the face part of the picture directly generated by the AI ​​​​model was not ideal, as shown in the following picture:

 

How to make portraits more natural? I researched the existing AI capabilities on the market and found that the face fusion of Tencent Cloud AI can realize the face-changing function. Let’s take a look at the detailed introduction below.

3.4. Face fusion

3.4.1 Introduction to face fusion

 

3.5.2 Fusion function demonstration:

​​​​​​​

 

3.4.3 Fusion console:

Used to create activities and materials. 

 

3.4.4 Material management:

 

Just add materials:

The material here refers to the picture we generated through AI. Let’s take a look at the effect.

3.4.5 Verify AI painting + fusion effect

We uploaded the above problematic pictures to the picture fusion DEMO page , we did a picture and face fusion, and found that the effect is quite amazing:

 

The following is the normal face-changing effect:

 

Based on the above results and combined with our usage scenarios, we can add Tencent Cloud image fusion capabilities to the existing AI painting.

3.5 Mini program adds fusion effect:

We add integration steps based on the original process. The following is the specific process: 

3.5.1 General idea: 

 

3.5.2 Detailed process: 

Added face fusion operation.

 

3.5.3 Add face fusion processing interface to the server:

Add integrated task processing on the mini program server: 

// facefusionHandler ...
func facefusionHandler(writer http.ResponseWriter, request *http.Request) {
	body, err := io.ReadAll(request.Body)
	req := &Request{}
	err = json.Unmarshal(body, &req)
	if err != nil {
		panic(err)
	}

	ret := &Response{
		SessionId: req.SessionId,
 		// 将AI画画的图上传至素材管理, 并和输入图做融合
		ResultUrl: rawCloud(req.ModelUrl, req.ImageUrl),
	}
	data, _ := json.Marshal(ret)
	writer.Write(data)
}

Uploading AI-drawn pictures to material management generally needs to be executed on the console. I call it directly through the API. It requires a handwritten V3 signature . The code will not be posted. If you are interested, you can take a look here .

3.5.4 The mini program adds post-fusion tasks:

After getting the image drawn by AI, the applet goes through the fusion operation as needed.

facefusion(modelUrl, imageUrl) {
    var that = this;
    that.setData({
      taskStatus: "融合中...",
      processTime: (Date.parse(new Date()) - that.data.beginTime)/ 1000
    })
    wx.request({
      url: 'http://127.0.0.1:8000/frontend/fusion',
      data: {
        "session_id": "123",
        "model_url": modelUrl,
        "image_url": imageUrl
      },
      method: "POST",
      header: {
        'Content-Type': "application/json"
      },
      success (res) {
        if (res.data == null) {
          wx.showToast({
            icon: "error",
            title: '请求融合失败',
          })
          return
        }
        
        if (res.data.result_url !== "") {
          console.log("draw image: ", res.data.result_url)
          that.drawInputImage(res.data.result_url);
          that.setData({
            Resp: {}
          })
          that.setData({
            taskStatus: "STOP"
          })
          // clearTimeout(that.data.ticker);
        } else {
          that.setData({
            taskStatus: "PROCESSING",
            processTime: (Date.parse(new Date()) - that.data.beginTime)/ 1000
          })
        }
        // a portrait of an old coal miner in 19th century, beautiful painting with highly detailed face by greg rutkowski and magali villanueve
      },
      fail(res) {
        wx.showToast({
          icon: "error",
          title: '请求融合失败',
        })
        console.log(res)
      }
    })
  },

When compilation is started, the status of "Converging" will be added to the task status:

 

Take a look at the before and after comparison. This is the picture generated by AI:

The fused picture:

The following interface has been optimized, take a look at the final version:: 

​​​​​​​

 

Summarize

At this point, a demo of AI drawing + portrait fusion has been implemented. Using the two together can generate better face effects, and you can also organize better prompts yourself to generate better portrait images. There are many models and keywords worth exploring on huggingface . We will introduce them here first in this issue.

Guess you like

Origin blog.csdn.net/tencentAI/article/details/128316153