The previous article, "AI Painting Technology Practice Issue 1", showed how to use Tencent Cloud's intelligent capabilities to implement a simplified version of AI painting. It attracted a lot of attention after release, and readers wondered whether better results were possible. Recently, AI painting has also become a craze on short video platforms. Having come across some excellent AI painting models online, I wanted to build a better experience on top of the previous article.
Below I share my complete process; interested readers can try it too.
1. Implementation idea
First generate a portrait image with an AI model, then use Tencent Cloud's face fusion capability to produce a portrait with a better result.
1.1 Detailed process:
2. Preparation
2.1 Stable-Diffusion deployment
Stable Diffusion is an open-source text-to-image model that generates an image matching the meaning of an input text prompt. For details, see the introduction on GitHub: CompVis/stable-diffusion, a latent text-to-image diffusion model.
Install it by following the documentation. The installation is routine, so it is not repeated here.
Generate images with a script:
from torch import autocast
from diffusers import StableDiffusionPipeline
import sys

# Specify the model (uncomment one of the alternatives to switch)
pipe = StableDiffusionPipeline.from_pretrained(
    # "CompVis/stable-diffusion-v1-4",
    "runwayml/stable-diffusion-v1-5",
    # "hakurei/waifu-diffusion",
    use_auth_token=True
).to("cuda")

# Usage: python3 interface.py "<prompt>" <output name>
# Example prompt: "a photo of an astronaut riding a horse on mars"
prompt = sys.argv[1]
with autocast("cuda"):
    image = pipe(prompt, num_inference_steps=100).images[0]
image.save(sys.argv[2] + ".png")
Pass a prompt and an output name, run the script, and check the generated result:
python3 interface.py "*******" out
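The script reads its two arguments straight from sys.argv, so a missing argument fails with an IndexError. As a minimal sketch of more defensive argument handling, assuming the same two positional arguments, the standard argparse module could be used. The `--out-dir` option and the helper names `parse_args` and `output_path` are hypothetical additions for illustration and are not part of the original script.

```python
import argparse
from pathlib import Path


def parse_args(argv):
    """Parse the two positional arguments the script expects."""
    parser = argparse.ArgumentParser(description="Generate one image from a prompt")
    parser.add_argument("prompt", help="text prompt passed to the pipeline")
    parser.add_argument("name", help="output file name without extension")
    # --out-dir is a hypothetical addition, not part of the original script.
    parser.add_argument("--out-dir", default=".", help="directory for the PNG")
    args = parser.parse_args(argv)
    if not args.prompt.strip():
        parser.error("prompt must not be empty")
    return args


def output_path(args):
    """Build the path the image would be saved to."""
    return Path(args.out_dir) / f"{args.name}.png"


# Same arguments as the command line shown in the article.
args = parse_args(["a photo of an astronaut riding a horse on mars", "out"])
print(output_path(args))
```

In the real script, `parse_args(sys.argv[1:])` would replace the direct sys.argv lookups, and the image would be saved to `output_path(args)`.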
3. Mini program demo practice
The following is my process of implementing AI painting through a mini program.
3.1 AI painting server:
Once deployed, the model can only be run locally, so we implement a simple service with the following functions:
1. The user submits a task to COS, and the service pulls the task from COS and executes the AI drawing job.
2. The service runs a shell command and uploads the generated image to COS.
COS documentation: see the Object Storage (COS) introduction and operation guide on Tencent Cloud.
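The two steps above use object storage as a simple job queue: a job is an object under a "push" prefix, and it counts as done once an object with the same name exists under a "result" prefix. The sketch below illustrates that idea in Python with an in-memory dictionary standing in for the COS bucket. The class `FakeBucket`, the prefix values, and the function names are assumptions for illustration only; the article's actual implementation is the Go code that follows.

```python
import json

PUSH_PREFIX = "aidraw/push"      # hypothetical stand-in for JOB_QUEUE_PUSH
RESULT_PREFIX = "aidraw/result"  # hypothetical stand-in for JOB_QUEUE_RESULT


class FakeBucket:
    """In-memory stand-in for a COS bucket: key -> bytes."""

    def __init__(self):
        self.objects = {}

    def put(self, key, data):
        self.objects[key] = data

    def get(self, key):
        return self.objects.get(key)

    def list(self, prefix):
        return sorted(k for k in self.objects if k.startswith(prefix + "/"))


def push_job(bucket, job_id, request):
    """Queue a job by writing its request under the push prefix."""
    bucket.put(f"{PUSH_PREFIX}/{job_id}", json.dumps(request).encode())


def pull_job(bucket):
    """Return (job_id, request) for the first job with no result, else None."""
    for key in bucket.list(PUSH_PREFIX):
        job_id = key.rsplit("/", 1)[-1]
        if bucket.get(f"{RESULT_PREFIX}/{job_id}") is None:
            return job_id, json.loads(bucket.get(key))
    return None


def push_result(bucket, job_id, result):
    """Mark a job as done by writing its result under the result prefix."""
    bucket.put(f"{RESULT_PREFIX}/{job_id}", json.dumps(result).encode())


bucket = FakeBucket()
push_job(bucket, "job-1", {"prompt": "a portrait of an old coal miner"})
job_id, request = pull_job(bucket)
push_result(bucket, job_id, {"job_id": job_id, "job_status": "FINISNED"})
print(pull_job(bucket))  # None: the only job now has a result
```

One consequence of this design is that the queue needs no database or message broker, at the cost of listing the bucket on every poll.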
AI painting model execution code:
// Request is the task payload submitted by the mini program.
type Request struct {
	SessionId string `json:"session_id"`
	JobId     string `json:"job_id"`
	Prompt    string `json:"prompt"`
	ModelUrl  string `json:"model_url"`
	ImageUrl  string `json:"image_url"`
}

// JobInfo is a queued task: the job ID plus the original request.
type JobInfo struct {
	JobId string `json:"job_id"`
	Request
}

// run executes one drawing job and publishes its result to COS.
func run(req *JobInfo) {
	begin := time.Now()

	Log("got a job, %+v", req)

	jobId := req.JobId
	// start.sh receives the prompt and job ID and is expected to write output/<jobId>.png.
	cmd := exec.Command("sh", "start.sh", req.Prompt, jobId)

	err := cmd.Run()
	if err != nil {
		fmt.Println("Execute Command failed:" + err.Error())
		return
	}

	// Read the generated image and upload it to COS.
	result, err := os.ReadFile(fmt.Sprintf("output/%s.png", jobId))
	if err != nil {
		panic(err)
	}
	url, err := cos.PutObject(context.Background(), fmt.Sprintf("aidraw/%s.png", jobId), result)
	if err != nil {
		panic(err)
	}

	resp := &Response{
		SessionId: req.SessionId,
		JobId:     jobId,
		JobStatus: "FINISNED", // spelling kept: the mini program compares against this exact string
		CostTime:  time.Since(begin).Milliseconds(),
		ResultUrl: url,
	}
	Log("job finished, %+v", resp)
	data, _ := json.Marshal(resp)
	pushResult(jobId, string(data))
}
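In run above, a failed command only prints an error and returns, so no result is ever published for that job and a client polling for it would keep seeing a running state. As a hedged sketch of one alternative, the Python snippet below runs a command with a timeout and always returns a result record. The `FAILED` status, the `error` field, and the function name `run_job` are hypothetical and do not appear in the article; a harmless Python one-liner stands in for `sh start.sh <prompt> <job_id>`.

```python
import subprocess
import sys
import time

STATUS_FINISHED = "FINISNED"  # spelling matches the string used in the article
STATUS_FAILED = "FAILED"      # hypothetical extra status, not in the article


def run_job(job_id, command, timeout_s=600):
    """Run one drawing command and describe the outcome as a dict."""
    begin = time.monotonic()
    try:
        proc = subprocess.run(command, capture_output=True, text=True, timeout=timeout_s)
        ok = proc.returncode == 0
        detail = proc.stderr.strip()
    except subprocess.TimeoutExpired:
        ok, detail = False, f"timed out after {timeout_s}s"
    return {
        "job_id": job_id,
        "job_status": STATUS_FINISHED if ok else STATUS_FAILED,
        "cost_time": int((time.monotonic() - begin) * 1000),  # milliseconds
        "error": "" if ok else detail,
    }


# Harmless stand-ins for the real drawing command.
good = run_job("job-1", [sys.executable, "-c", "print('drawn')"])
bad = run_job("job-2", [sys.executable, "-c", "import sys; sys.exit(3)"])
print(good["job_status"], bad["job_status"])
```

Publishing the failure record the same way as a success record would let the front end stop polling and show an error.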
Task management is implemented on top of COS and covers pulling tasks and uploading results. The implementation is as follows:
// pullJob returns the first queued job that has no result yet, or nil.
func pullJob() *JobInfo {
	res, _, err := cos.GetInstance().Bucket.Get(context.Background(), &cossdk.BucketGetOptions{
		Prefix:       JOB_QUEUE_PUSH,
		Delimiter:    "",
		EncodingType: "",
		Marker:       "",
		MaxKeys:      10000,
	})
	if err != nil {
		return nil
	}
	// A job is pending if no object with the same name exists under the result prefix.
	var jobId string
	for _, v := range res.Contents {
		if !objectExist(fmt.Sprintf("%s/%s", JOB_QUEUE_RESULT, getNameByPath(v.Key))) {
			jobId = v.Key
			break
		}
	}
	if len(jobId) == 0 {
		return nil
	}
	jobId = getNameByPath(jobId)
	Log("new job %s", jobId)
	resp, err := cos.GetInstance().Object.Get(context.Background(), fmt.Sprintf("%s/%s", JOB_QUEUE_PUSH, jobId), &cossdk.ObjectGetOptions{})
	if err != nil {
		// Return nil like the other error paths so the polling worker keeps running.
		Log("get job %s failed: %v", jobId, err)
		return nil
	}
	defer resp.Body.Close()
	if resp.StatusCode != 200 {
		return nil
	}
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return nil
	}
	job := &JobInfo{}
	if err = json.Unmarshal(body, job); err != nil {
		return nil
	}
	// Set the ID after decoding so a job_id field in the body cannot overwrite it.
	job.JobId = jobId
	return job
}

// pullResult returns the stored result for jobId, or nil if none exists yet.
func pullResult(jobId string) *Response {
	resp, err := cos.GetInstance().Object.Get(context.Background(), fmt.Sprintf("%s/%s", JOB_QUEUE_RESULT, jobId), &cossdk.ObjectGetOptions{})
	if err != nil {
		return nil
	}
	defer resp.Body.Close()
	if resp.StatusCode != 200 {
		return nil
	}
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return nil
	}
	rsp := &Response{}
	if err = json.Unmarshal(body, rsp); err != nil {
		return nil
	}
	return rsp
}

// pushResult stores the JSON result for jobId under the result prefix.
func pushResult(jobId, result string) {
	_, err := cos.PutObject(context.Background(), fmt.Sprintf("%s/%s", JOB_QUEUE_RESULT, jobId), []byte(result))
	if err != nil {
		panic(err)
	}
}
3.2 Mini program server:
The mini program needs a relay service to process messages asynchronously. The server has two functions:
1. Forward requests to the AI painting service.
2. Query the AI painting results (relayed through COS).
Part of the code follows:
Protocol definitions:
type Request struct {
	SessionId string `json:"session_id"`
	JobId     string `json:"job_id"`
	Prompt    string `json:"prompt"`
	ModelUrl  string `json:"model_url"`
	ImageUrl  string `json:"image_url"`
}

type Response struct {
	SessionId string `json:"session_id"`
	JobId     string `json:"job_id"`
	JobStatus string `json:"job_status"`
	CostTime  int64  `json:"cost_time"`
	ResultUrl string `json:"result_url"`
	TotalCnt  int64  `json:"total_cnt"`
}
Submit task:
// submitJobHandler submits a drawing task.
func submitJobHandler(writer http.ResponseWriter, request *http.Request) {
	body, err := io.ReadAll(request.Body)
	if err != nil {
		http.Error(writer, err.Error(), http.StatusBadRequest)
		return
	}
	req := &Request{}
	// Reject malformed JSON with a 400 instead of panicking.
	if err = json.Unmarshal(body, req); err != nil {
		http.Error(writer, err.Error(), http.StatusBadRequest)
		return
	}
	Log("got a submit request, %+v", req)
	jobId := GenJobId()
	pushJob(jobId, string(body))
	resp := &Response{
		SessionId: req.SessionId,
		JobId:     jobId,
		TotalCnt:  sumJob(),
	}
	data, _ := json.Marshal(resp)
	writer.Write(data)
}

// describeJobHandler queries the status of a task.
func describeJobHandler(writer http.ResponseWriter, request *http.Request) {
	body, err := io.ReadAll(request.Body)
	if err != nil {
		http.Error(writer, err.Error(), http.StatusBadRequest)
		return
	}
	req := &Request{}
	if err = json.Unmarshal(body, req); err != nil {
		http.Error(writer, err.Error(), http.StatusBadRequest)
		return
	}
	Log("got a query request, %+v", req.JobId)
	// No result object yet means the job is still running.
	ret := pullResult(req.JobId)
	if ret == nil {
		ret = &Response{
			SessionId: req.SessionId,
			JobId:     req.JobId,
			JobStatus: "RUNNING",
		}
	}
	data, _ := json.Marshal(ret)
	writer.Write(data)
}
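The contract between submitJobHandler and describeJobHandler can be sketched without any network or COS access: submit returns a new job ID, and describe reports RUNNING until a result for that ID has been written. The Python class below is a simplified illustration under that assumption. The `Relay` class, its in-memory dictionaries, and the use of uuid for job IDs are hypothetical and are not the article's implementation.

```python
import uuid


class Relay:
    """In-memory stand-in for the relay server's two handlers."""

    def __init__(self):
        self.jobs = {}     # job_id -> request dict
        self.results = {}  # job_id -> response dict written by the worker

    def submit(self, request):
        job_id = uuid.uuid4().hex  # assumption: any unique ID works here
        self.jobs[job_id] = request
        return {
            "session_id": request.get("session_id", ""),
            "job_id": job_id,
            "total_cnt": len(self.jobs),
        }

    def describe(self, request):
        job_id = request.get("job_id", "")
        if job_id in self.results:
            return self.results[job_id]
        # No result yet: report RUNNING, as describeJobHandler does.
        return {
            "session_id": request.get("session_id", ""),
            "job_id": job_id,
            "job_status": "RUNNING",
        }


relay = Relay()
submitted = relay.submit({"session_id": "123", "prompt": "a portrait"})
before = relay.describe({"job_id": submitted["job_id"]})
# The worker would write this after uploading the image.
relay.results[submitted["job_id"]] = {"job_id": submitted["job_id"], "job_status": "FINISNED"}
after = relay.describe({"job_id": submitted["job_id"]})
print(before["job_status"], after["job_status"])
```

Note that, like the Go handler, this sketch also reports RUNNING for a job ID that was never submitted; a stricter server could check the queue first.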
3.3 Implementing AI drawing in the mini program:
index.js:
// index.js
// Get the app instance
const app = getApp()

Page({
  data: {
    totalTask: 0,
    leftTime: 40,
    beginTime: 0,
    processTime: 0,
    taskStatus: "STOP",
    inputValue: "",
    tags: [],
    option: [],
    buttonStatus: false,
    index: 0,
    motto: 'Hello World',
    userInfo: {},
    hasUserInfo: false,
    canIUse: wx.canIUse('button.open-type.getUserInfo'),
    canIUseGetUserProfile: false,
    canIUseOpenData: wx.canIUse('open-data.type.userAvatarUrl') && wx.canIUse('open-data.type.userNickName') // Set to false to try fetching user info
  },
  // Event handler
  bindViewTap() {
    wx.navigateTo({
      url: '../logs/logs'
    })
  },
  onLoad() {
    if (wx.getUserProfile) {
      this.setData({
        canIUseGetUserProfile: true
      })
    }
    // Start the polling timer.
    this.onTimeout();
  },
  getUserProfile(e) {
    // wx.getUserProfile is the recommended way to get user info. Each call requires user confirmation, so store the avatar and nickname the user provides to avoid repeated pop-ups.
    wx.getUserProfile({
      desc: '展示用户信息', // Purpose of the request ("display user info"); shown in the pop-up, so fill it in carefully
      success: (res) => {
        console.log(res)
        this.setData({
          userInfo: res.userInfo,
          hasUserInfo: true
        })
      }
    })
  },
  getUserInfo(e) {
    // getUserInfo is not recommended: since April 13, 2021 it no longer shows a pop-up and directly returns anonymous user info
    console.log(e)
    this.setData({
      userInfo: e.detail.userInfo,
      hasUserInfo: true
    })
  },
  // Query the server once for the status of the current job.
  eventLoop() {
    var that = this
    if (!that.data.Resp || !that.data.Resp.job_id) {
      console.log("not found jobid")
      return
    }
    return new Promise(function(resolve) {
      wx.request({
        url: 'http://127.0.0.1:8000/frontend/query',
        data: {
          "session_id": "123",
          "job_id": that.data.Resp.job_id
        },
        method: "POST",
        header: {
          'Content-Type': "application/json"
        },
        success (res) {
          resolve();
          if (res.data == null) {
            wx.showToast({
              icon: "error",
              title: '请求查询失败', // "Query request failed"
            })
            return
          }
          console.log(Date.parse(new Date()), res.data)
          that.setData({
            Job: res.data,
          })
          console.log("job_status: ", res.data.job_status)
          if (res.data.job_status === "FINISNED") {
            // Job done: draw the image and clear the current job.
            console.log("draw image: ", res.data.result_url)
            that.drawInputImage(res.data.result_url);
            that.setData({
              Resp: {},
              taskStatus: "STOP"
            })
          } else {
            that.setData({
              taskStatus: "PROCESSING",
              processTime: (Date.parse(new Date()) - that.data.beginTime) / 1000
            })
          }
        },
        fail(res) {
          // Resolve here too; otherwise the awaiting timer would never be rescheduled after a failed request.
          resolve();
          wx.showToast({
            icon: "error",
            title: '请求查询失败', // "Query request failed"
          })
          console.log(res)
        }
      })
    })
  },
  onTimeout: function() {
    // Start the timer; each run polls once and then schedules the next run.
    var that = this;
    let ticker = setTimeout(async function() {
      console.log("begin")
      await that.eventLoop();
      console.log("end")
      that.onTimeout();
    }, 3 * 1000); // milliseconds
    // clearTimeout(ticker);
    that.setData({
      ticker: ticker
    });
  },
  // Submit a drawing task with the current prompt.
  imageDraw() {
    var that = this
    var opt = {}
    if (that.data.option && that.data.option.length > 0) {
      opt = {
        "tags": that.data.option
      }
    }
    // Note: opt is only logged; the request below sends just the prompt.
    console.log("option:", opt)
    wx.request({
      url: 'http://127.0.0.1:8000/frontend/create',
      data: {
        "prompt": that.data.inputValue
      },
      method: "POST",
      header: {
        'Content-Type': "application/json"
      },
      success (res) {
        if (res.data == null) {
          wx.showToast({
            icon: "error",
            title: '请求失败', // "Request failed"
          })
          return
        }
        console.log(res.data)
        // let raw = JSON.parse(res.data)
        // Remember the job so the polling loop can query it.
        that.setData({
          Resp: res.data,
        })
        that.setData({
          totalTask: res.data.total_cnt,
          beginTime: Date.parse(new Date())
        })
      },
      fail(res) {
        wx.showToast({
          icon: "error",
          title: '请求失败', // "Request failed"
        })
      }
    })
  },
  // Download the result image and draw it on the canvas, scaled to the canvas height.
  drawInputImage: function(url) {
    var that = this;
    console.log("result_url: ", url)
    let resUrl = url; // that.data.Job.result_url;
    wx.downloadFile({
      url: resUrl,
      success: function(res) {
        var imagePath = res.tempFilePath
        wx.getImageInfo({
          src: imagePath,
          success: function(res) {
            wx.createSelectorQuery()
              .select('#input_canvas') // the id set in the WXML
              .fields({ node: true, size: true })
              .exec((r) => {
                // Canvas object
                const canvas = r[0].node
                // Rendering context
                const ctx = canvas.getContext('2d')
                // Actual drawing width and height of the canvas
                const width = r[0].width
                const height = r[0].height
                // Initialize the canvas size for the device pixel ratio
                const dpr = wx.getWindowInfo().pixelRatio
                canvas.width = width * dpr
                canvas.height = height * dpr
                ctx.scale(dpr, dpr)
                ctx.clearRect(0, 0, width, height)
                // Scale factor that fits the image to the canvas height
                let ratio = height / res.height
                console.log("ratio:", ratio)
                const img = canvas.createImage()
                // Center the scaled image horizontally
                var x = width / 2 - (res.width * ratio / 2)
                img.src = imagePath
                img.onload = function() {
                  ctx.drawImage(img, x, 0, res.width * ratio, res.height * ratio)
                }
              })
          }
        })
      }
    })
  },
  handlerInput(e) {
    this.setData({
      inputValue: e.detail.value
    })
  },
  handlerSearch(e) {
    console.log("input: ", this.data.inputValue)
    if (this.data.inputValue.length == 0) {
      wx.showToast({
        icon: "error",
        title: '请输入你的创意 ', // "Please enter your idea"
      })
      return
    }
    this.imageDraw()
  },
  handlerInputPos(e) {
    console.log(e)
    this.setData({
      inputValue: e.detail.value
    })
  },
  handlerInputFusion(e) {
    console.log(e)
    this.setData({
      inputUrl: e.detail.value
    })
  },
  handlerInputImage(e) {
    console.log(e)
  },
  clickItem(e) {
    let $bean = e.currentTarget.dataset
    console.log(e)
    console.log("value: ", $bean.bean)
    this.setData({
      option: $bean.bean
    })
    this.imageDraw()
  }
})
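The onTimeout function in index.js polls the query endpoint every 3 seconds for as long as the page lives. As a language-neutral illustration of the same pattern with an upper bound on waiting, here is a minimal Python sketch. The function `poll_until_done`, its parameters, and the fake backend are hypothetical; the `sleep` parameter is injected only so the example runs instantly.

```python
import time


def poll_until_done(query, job_id, interval_s=3.0, max_wait_s=120.0, sleep=time.sleep):
    """Call query(job_id) every interval_s seconds until the job finishes.

    Returns the final response dict, or None if max_wait_s elapses first.
    """
    waited = 0.0
    while waited <= max_wait_s:
        response = query(job_id)
        if response and response.get("job_status") == "FINISNED":
            return response
        sleep(interval_s)
        waited += interval_s
    return None


# Fake backend: reports RUNNING twice, then the finished result.
replies = iter([
    {"job_status": "RUNNING"},
    {"job_status": "RUNNING"},
    {"job_status": "FINISNED", "result_url": "https://example.com/job-1.png"},
])
result = poll_until_done(lambda job_id: next(replies), "job-1", sleep=lambda s: None)
print(result["result_url"])
```

Bounding the wait lets the front end show a timeout message instead of polling forever when a job never produces a result.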
index.wxml:
<!-- UI text: 写下你的创意 = "Write down your idea"; 待融合URL = "URL of the image to fuse"; 立即生成 = "Generate now"; 完成任务数 = "Tasks completed"; 作品图片 = "Result image" -->
<view class="container" style="width: 750rpx; height: 1229rpx; display: flex; box-sizing: border-box">
  <div class="form-item" style="width: 673rpx; height: 70rpx; display: block; box-sizing: border-box">
    <input placeholder="写下你的创意" class="input" bindinput="handlerInput" />
    <input placeholder="待融合URL" class="input" bindinput="handlerInputFusion" />
    <button class="button" loading="{{buttonStatus}}" bindtap="handlerSearch" size="mini" style="width: 158rpx; height: 123rpx; display: block; box-sizing: border-box; left: 0rpx; top: -60rpx; position: relative"> 立即生成 </button>
  </div>
  <view class="text_box">
    <text class="text_line" style="position: relative; left: 18rpx; top: 0rpx">完成任务数:</text>
    <text class="text_line" style="position: relative; left: 8rpx; top: 0rpx">{{totalTask}},</text>
    <text class="text_line" style="position: relative; left: 38rpx; top: 0rpx">{{taskStatus}}</text>
    <text class="text_line" style="position: relative; left: 43rpx; top: 0rpx">{{processTime}}/{{leftTime}}s</text>
  </view>
  <view class="output_line" style="position: relative; left: 2rpx; top: 51rpx; width: 714rpx; height: 40rpx; display: flex; box-sizing: border-box">
    <text class="text_line" style="width: 199rpx; height: 0rpx; display: block; box-sizing: border-box; position: relative; left: 1rpx; top: -92rpx">作品图片</text>
    <view style="position: relative; left: -15rpx; top: 2rpx; width: 571rpx; height: 0rpx; display: block; box-sizing: border-box"></view>
  </view>
  <canvas type="2d" id="input_canvas" style="background: rgb(228, 228, 225); width: 673rpx; height: 715rpx; position: relative; left: 2rpx; top: -64rpx; display: block; box-sizing: border-box">
  </canvas>
  <view class="output_line" style="position: relative; left: 0rpx; top: 50rpx; width: 714rpx; height: 58rpx; display: flex; box-sizing: border-box">
  </view>
</view>
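The drawInputImage function in index.js scales the image to the canvas height and centers it horizontally on the canvas declared above. The arithmetic can be sketched in Python as follows. The canvas and image sizes are illustrative values, and `fit_contain` is a hypothetical alternative, not something the article uses, that scales by the smaller of the two ratios so the whole image stays visible.

```python
def fit_to_height(canvas_w, canvas_h, image_w, image_h):
    """Scale an image to the canvas height and center it horizontally.

    Returns (x, y, draw_w, draw_h), mirroring the arguments that
    drawInputImage passes to ctx.drawImage.
    """
    ratio = canvas_h / image_h
    draw_w = image_w * ratio
    draw_h = image_h * ratio
    x = canvas_w / 2 - draw_w / 2
    return x, 0, draw_w, draw_h


def fit_contain(canvas_w, canvas_h, image_w, image_h):
    """Scale so the whole image fits, centered on both axes (hypothetical variant)."""
    ratio = min(canvas_w / image_w, canvas_h / image_h)
    draw_w = image_w * ratio
    draw_h = image_h * ratio
    return (canvas_w - draw_w) / 2, (canvas_h - draw_h) / 2, draw_w, draw_h


# A 512x512 image on a 300x400 canvas (sizes chosen for illustration).
print(fit_to_height(300, 400, 512, 512))  # (-50.0, 0, 400.0, 400.0)
print(fit_contain(300, 400, 512, 512))    # (0.0, 50.0, 300.0, 300.0)
```

The negative x in the first result shows that fitting by height alone crops the left and right edges when the image is relatively wider than the canvas.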
At this point, an AI drawing mini program is complete. Let's look at the result: enter a prompt and you get the generated picture:
Then a new problem appeared. Testing showed that faces in images generated directly by the AI model were not ideal, as the following picture shows:
How can the portraits look more natural? I surveyed the AI capabilities currently available and found that Tencent Cloud AI's face fusion can swap faces. The details follow.
3.4 Face fusion
3.4.1 Introduction to face fusion
3.4.2 Fusion function demonstration:
3.4.3 Fusion console:
Used to create activities and materials.
3.4.4 Material management:
Simply add a material:
The material here is the picture we generated with AI. Let's look at the result.
3.4.5 Verify AI painting + fusion effect
We uploaded the problematic picture above to the face fusion demo page and fused it with a face photo. The result is quite impressive:
For comparison, here is a normal face-swap result:
Based on these results and our usage scenario, we can add Tencent Cloud's face fusion capability to the existing AI painting flow.
3.5 Adding the fusion effect to the mini program:
We add a fusion step to the original process. The specific process is as follows:
3.5.1 General idea:
3.5.2 Detailed process:
A face fusion step is added.
3.5.3 Add a face fusion interface to the server:
Add fusion task handling to the mini program server:
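// facefusionHandler fuses the AI-drawn image with the user's input image.
func facefusionHandler(writer http.ResponseWriter, request *http.Request) {
	body, err := io.ReadAll(request.Body)
	if err != nil {
		http.Error(writer, err.Error(), http.StatusBadRequest)
		return
	}
	req := &Request{}
	if err = json.Unmarshal(body, req); err != nil {
		http.Error(writer, err.Error(), http.StatusBadRequest)
		return
	}
	ret := &Response{
		SessionId: req.SessionId,
		// Upload the AI-drawn image to material management and fuse it with the input image.
		ResultUrl: rawCloud(req.ModelUrl, req.ImageUrl),
	}
	data, _ := json.Marshal(ret)
	writer.Write(data)
}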
Uploading AI-drawn pictures to material management is normally done in the console. I call the API directly instead, which requires a hand-written V3 signature. The code is not posted here; if you are interested, you can take a look here.
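For readers unfamiliar with the V3 signature, the following is a minimal Python sketch of the TC3-HMAC-SHA256 signing steps as I understand them from Tencent Cloud's public API documentation. It is not the author's code. The service name `facefusion`, the host, the credentials, and the empty JSON payload are assumptions for illustration, and the details should be checked against the official signature documentation before use.

```python
import hashlib
import hmac
from datetime import datetime, timezone


def _hmac_sha256(key, msg):
    return hmac.new(key, msg.encode("utf-8"), hashlib.sha256).digest()


def tc3_authorization(secret_id, secret_key, service, host, payload, timestamp):
    """Build the Authorization header value for a JSON POST request."""
    algorithm = "TC3-HMAC-SHA256"
    date = datetime.fromtimestamp(timestamp, tz=timezone.utc).strftime("%Y-%m-%d")

    # Step 1: canonical request (method, URI, query, headers, signed headers, payload hash).
    content_type = "application/json; charset=utf-8"
    canonical_headers = f"content-type:{content_type}\nhost:{host}\n"
    signed_headers = "content-type;host"
    hashed_payload = hashlib.sha256(payload.encode("utf-8")).hexdigest()
    canonical_request = "\n".join(
        ["POST", "/", "", canonical_headers, signed_headers, hashed_payload]
    )

    # Step 2: string to sign.
    credential_scope = f"{date}/{service}/tc3_request"
    hashed_request = hashlib.sha256(canonical_request.encode("utf-8")).hexdigest()
    string_to_sign = "\n".join(
        [algorithm, str(timestamp), credential_scope, hashed_request]
    )

    # Step 3: derive the signing key from the secret key, date, and service, then sign.
    secret_date = _hmac_sha256(("TC3" + secret_key).encode("utf-8"), date)
    secret_service = _hmac_sha256(secret_date, service)
    secret_signing = _hmac_sha256(secret_service, "tc3_request")
    signature = hmac.new(
        secret_signing, string_to_sign.encode("utf-8"), hashlib.sha256
    ).hexdigest()

    # Step 4: assemble the header value.
    return (
        f"{algorithm} Credential={secret_id}/{credential_scope}, "
        f"SignedHeaders={signed_headers}, Signature={signature}"
    )


# Placeholder credentials and an assumed service/host; replace with real values.
header = tc3_authorization(
    "AKID_EXAMPLE", "SECRET_EXAMPLE", "facefusion",
    "facefusion.tencentcloudapi.com", "{}", 1551113065,
)
print(header)
```

The request itself would then carry this value in the Authorization header together with the common headers such as X-TC-Action, X-TC-Version, and X-TC-Timestamp, with the same timestamp, content type, and payload that were signed.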
3.5.4 Add the fusion task to the mini program:
After getting the AI-drawn image, the mini program runs the fusion step as needed.
  // Fuse the AI-drawn image (modelUrl) with the user's image (imageUrl).
  facefusion(modelUrl, imageUrl) {
    var that = this;
    that.setData({
      taskStatus: "融合中...", // "Fusing..."
      processTime: (Date.parse(new Date()) - that.data.beginTime) / 1000
    })
    wx.request({
      url: 'http://127.0.0.1:8000/frontend/fusion',
      data: {
        "session_id": "123",
        "model_url": modelUrl,
        "image_url": imageUrl
      },
      method: "POST",
      header: {
        'Content-Type': "application/json"
      },
      success (res) {
        if (res.data == null) {
          wx.showToast({
            icon: "error",
            title: '请求融合失败', // "Fusion request failed"
          })
          return
        }
        if (res.data.result_url !== "") {
          // Fusion done: draw the fused image and reset the task state.
          console.log("draw image: ", res.data.result_url)
          that.drawInputImage(res.data.result_url);
          that.setData({
            Resp: {}
          })
          that.setData({
            taskStatus: "STOP"
          })
          // clearTimeout(that.data.ticker);
        } else {
          that.setData({
            taskStatus: "PROCESSING",
            processTime: (Date.parse(new Date()) - that.data.beginTime) / 1000
          })
        }
        // Example prompt: a portrait of an old coal miner in 19th century, beautiful painting with highly detailed face by greg rutkowski and magali villanueve
      },
      fail(res) {
        wx.showToast({
          icon: "error",
          title: '请求融合失败', // "Fusion request failed"
        })
        console.log(res)
      }
    })
  },
After compiling and running, a new "融合中..." (fusing) state appears in the task status:
Here is a before-and-after comparison. This is the picture generated by AI:
The fused picture:
The interface below has been optimized; here is the final version:
Summary
At this point, a demo of AI drawing plus portrait fusion is complete. Using the two together produces better faces, and you can also craft better prompts yourself to generate better portraits. There are many models and keywords worth exploring on Hugging Face. That is all for this issue.