Python and Deep Learning (16): CNN and Pokémon Model II

1. Description

This article tests the Pokémon model trained in the previous article. The steps are: reload the trained model, load a picture with OpenCV, feed the picture to the model, and display the result.

2. CNN model test of Pokémon model

2.1 Import related libraries

Import the required third-party libraries here, such as cv2 (OpenCV). Any library that is missing must be installed first; using a mirror of the package index usually makes the download faster.

import tensorflow as tf
from PIL import ImageFont, Image, ImageDraw
from tensorflow import keras
import cv2, os, sys
import numpy as np
# Class names: Bulbasaur, Charmander, Mewtwo, Pikachu, Squirtle
label = ['妙蛙种子', '小火龙', '超梦', '皮卡丘', '杰尼龟']

2.2 Load the model

Load the trained model that was saved in the previous article. No built-in dataset needs to be loaded here, because the data is self-made.

network = keras.models.load_model('my_bkm.h5')
network.summary()

2.3 Set the path to save the picture

The test uses a picture file, so that the result is easy to visualize. The test set was already split off earlier, so only the path of the picture needs to be set here.

path = os.path.join(sys.path[0], 'test.png')

The code above tests test.png, which must be in the same folder as the script (sys.path[0]). To test another picture, only the file name needs to change, for example to x.jpg.
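
A slightly more defensive way to build the path is sketched below. It assumes the picture sits next to the script, as in the line above; the helper name `resolve_image_path` and its `base_dir` parameter are hypothetical and not part of the original code.

```python
import os
import sys


def resolve_image_path(filename, base_dir=None):
    """Return the absolute path of `filename` inside `base_dir`.

    `base_dir` defaults to sys.path[0], the directory of the running script,
    which mirrors the original line. Hypothetical helper for illustration.
    """
    if base_dir is None:
        base_dir = sys.path[0]
    path = os.path.abspath(os.path.join(base_dir, filename))
    if not os.path.isfile(path):
        # Fail early with a clear message instead of letting cv2.imread
        # return None later on.
        raise FileNotFoundError(f"Test image not found: {path}")
    return path
```

Calling `resolve_image_path('test.png')` would then either return a usable path or stop the script before the model is involved.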

2.4 Load pictures

Use cv2 (OpenCV) to load the picture. cv2.imread reads the picture with three channels in BGR order, and the trained model also expects three-channel input, so the picture stays in colour and is only converted from BGR to RGB. This differs from the single-channel grayscale pictures used before.

image = cv2.imread(path)
img = image.copy()
img = cv2.resize(img, (96, 96))
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
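
To make the channel-order step concrete, here is a minimal NumPy sketch of what the BGR-to-RGB conversion does for an 8-bit, three-channel picture: it reverses the last axis. The function name `bgr_to_rgb` and the dummy picture are assumptions for illustration; in the real script `cv2.cvtColor` does this work.

```python
import numpy as np


def bgr_to_rgb(image_bgr):
    """Reverse the channel axis of an H x W x 3 array (BGR -> RGB)."""
    if image_bgr is None:
        # cv2.imread returns None when the file cannot be read.
        raise ValueError("image is None; check the path passed to cv2.imread")
    if image_bgr.ndim != 3 or image_bgr.shape[2] != 3:
        raise ValueError("expected an array of shape (height, width, 3)")
    return image_bgr[:, :, ::-1].copy()


# A 1 x 2 dummy picture with one pure-blue and one pure-red pixel in BGR order.
dummy_bgr = np.array([[[255, 0, 0], [0, 0, 255]]], dtype=np.uint8)
dummy_rgb = bgr_to_rgb(dummy_bgr)
print(dummy_rgb[0, 0])  # the blue pixel, now in RGB order
print(dummy_rgb[0, 1])  # the red pixel, now in RGB order
```

The `None` check is worth keeping in mind for the real script too, since a wrong path only shows up as an error on the line after `cv2.imread`.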

2.5 Data processing and normalization

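The picture fed to the network is given a batch dimension, converted to float32, and scaled to the range 0-1. The normalize function additionally standardizes each channel with a fixed mean and standard deviation, but its call is commented out in preprocess, so only the 0-1 scaling is applied here.
Normalizing the input speeds up gradient descent during training, that is, the network converges faster. At test time the picture must go through the same preprocessing that was used in training.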

def normalize(x):
    img_mean = tf.constant([0.485, 0.456, 0.406])
    img_std = tf.constant([0.229, 0.224, 0.225])
    x = (x - img_mean) / img_std
    return x


def preprocess(x):
    x = tf.expand_dims(x, axis=0)
    x = tf.cast(x, dtype=tf.float32) / 255.
    # x = normalize(x)
    return x
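
The effect of these two functions on shape and value range can be checked without a model. The sketch below re-implements them in NumPy on a random dummy picture; the `_np` function names and the dummy input are assumptions for illustration, while the mean and standard deviation values are the ones used in `normalize` above.

```python
import numpy as np

IMG_MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
IMG_STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)


def preprocess_np(x):
    """Add a batch axis and scale 8-bit pixel values to [0, 1]."""
    x = np.expand_dims(x, axis=0)
    return x.astype(np.float32) / 255.0


def normalize_np(x):
    """Standardize each RGB channel (broadcast over the last axis)."""
    return (x - IMG_MEAN) / IMG_STD


# Random stand-in for a resized 96 x 96 RGB picture.
rng = np.random.default_rng(0)
dummy = rng.integers(0, 256, size=(96, 96, 3), dtype=np.uint8)

batch = preprocess_np(dummy)
print(batch.shape, batch.min() >= 0.0, batch.max() <= 1.0)

standardized = normalize_np(batch)
print(standardized.shape)
```

The batch axis matters because the model expects input of shape (batch, height, width, channels), even for a single picture.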

2.6 Predicting pictures

Feed the picture to the trained model and make a prediction.
Because there are five classes, the prediction is a vector of five probabilities, which needs further processing. tf.argmax() returns the index of the largest probability, that is, the predicted class.

img = preprocess(img)  # add the batch axis and scale to 0-1 (see 2.5)
result = network(img)
result = tf.nn.softmax(result)
print(result)
index = tf.argmax(result, axis=-1)
print(label[int(index)])
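
The softmax and argmax steps can be illustrated on their own. The following is a minimal NumPy sketch, assuming the model returns one row of raw scores per picture; the `logits` values are made up for illustration and are not output of the trained model.

```python
import numpy as np

label = ['妙蛙种子', '小火龙', '超梦', '皮卡丘', '杰尼龟']


def softmax(logits):
    """Numerically stable softmax over the last axis."""
    shifted = logits - np.max(logits, axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / np.sum(exp, axis=-1, keepdims=True)


# Hypothetical raw scores for one picture (batch of size 1, five classes).
logits = np.array([[0.1, 0.2, 0.3, 5.0, 0.4]], dtype=np.float32)

probs = softmax(logits)
index = int(np.argmax(probs, axis=-1)[0])  # take the single batch element
print(probs.sum(axis=-1))  # each row of probabilities sums to 1
print(index, label[index])
```

Softmax preserves the order of the scores, so the predicted index is the same whether argmax is applied before or after it; the softmax step is only needed to read the scores as probabilities.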

2.7 Display pictures

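Display the tested picture.
The lines below create the window, set the window size, show the picture, wait for a key press, and close the window to release its resources. The commented-out cv2.putText line would draw the predicted label on the picture, but OpenCV's built-in Hershey fonts cannot render Chinese characters, so the complete code in section 3 draws the label with PIL instead.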

cv2.namedWindow('img', 0)
cv2.resizeWindow('img', 500, 500)   # set the window size as needed
#cv2.putText(image, label[int(index)], (166, 54), cv2.FONT_HERSHEY_SCRIPT_SIMPLEX, 1.2, (255, 0, 0), 2)
cv2.imshow('img', image)
cv2.waitKey()
cv2.destroyAllWindows()

3. Complete code and display results

Below is the complete code and a picture showing the result.

import tensorflow as tf
from PIL import ImageFont, Image, ImageDraw
from tensorflow import keras
import cv2, os, sys
import numpy as np
# Class names: Bulbasaur, Charmander, Mewtwo, Pikachu, Squirtle
label = ['妙蛙种子', '小火龙', '超梦', '皮卡丘', '杰尼龟']

network = keras.models.load_model('my_bkm.h5')
network.summary()
path = os.path.join(sys.path[0], 'test.png')
image = cv2.imread(path)
img = image.copy()
img = cv2.resize(img, (96, 96))
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)


def show_chinese(img, text, pos):
    # Draw the text with PIL, because cv2.putText cannot render Chinese characters
    img_pil = Image.fromarray(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))
    font = ImageFont.truetype(font='msyh.ttc', size=36)  # Microsoft YaHei font file
    draw = ImageDraw.Draw(img_pil)
    draw.text(pos, text, font=font, fill=(255, 0, 0))  # in PIL, RGB (255, 0, 0) is red
    img_cv = np.array(img_pil)                         # convert the PIL image to a NumPy array
    img = cv2.cvtColor(img_cv, cv2.COLOR_RGB2BGR)      # convert back to OpenCV's BGR format
    return img


def normalize(x):
    img_mean = tf.constant([0.485, 0.456, 0.406])
    img_std = tf.constant([0.229, 0.224, 0.225])
    x = (x - img_mean) / img_std
    return x


def preprocess(x):
    x = tf.expand_dims(x, axis=0)
    x = tf.cast(x, dtype=tf.float32) / 255.
    # x = normalize(x)
    return x


img = preprocess(img)

# img= tf.cast(img, dtype=tf.uint8)

result = network(img)
result = tf.nn.softmax(result)
print(result)
index = tf.argmax(result, axis=-1)
print(label[int(index)])

# Display
image = show_chinese(image, label[int(index)], (356, 54))
cv2.namedWindow('img', 0)
cv2.resizeWindow('img', 500, 500)   # set the window size as needed
#cv2.putText(image, label[int(index)], (166, 54), cv2.FONT_HERSHEY_SCRIPT_SIMPLEX, 1.2, (255, 0, 0), 2)
cv2.imshow('img', image)
cv2.waitKey()
cv2.destroyAllWindows()

tf.Tensor([[1.1600139e-09 2.5695030e-05 8.4645586e-15 9.9997413e-01 6.6168944e-08]], shape=(1, 5), dtype=float32)
皮卡丘


4. The complete code and results of testing with multiple pictures

To test more pictures, a loop is added so that several pictures can be tested in one run.

import tensorflow as tf
from PIL import ImageFont, Image, ImageDraw
from tensorflow import keras
import cv2, os, sys
import numpy as np
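# Class names: Bulbasaur, Charmander, Mewtwo, Pikachu, Squirtle
label = ['妙蛙种子', '小火龙', '超梦', '皮卡丘', '杰尼龟']


def show_chinese(img, text, pos):
    # Draw the text with PIL, because cv2.putText cannot render Chinese characters
    img_pil = Image.fromarray(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))
    font = ImageFont.truetype(font='msyh.ttc', size=36)  # Microsoft YaHei font file
    draw = ImageDraw.Draw(img_pil)
    draw.text(pos, text, font=font, fill=(255, 0, 0))  # in PIL, RGB (255, 0, 0) is red
    img_cv = np.array(img_pil)                         # convert the PIL image to a NumPy array
    img = cv2.cvtColor(img_cv, cv2.COLOR_RGB2BGR)      # convert back to OpenCV's BGR format
    return img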


def normalize(x):
    img_mean = tf.constant([0.485, 0.456, 0.406])
    img_std = tf.constant([0.229, 0.224, 0.225])
    x = (x - img_mean) / img_std
    return x


def preprocess(x):
    x = tf.expand_dims(x, axis=0)
    x = tf.cast(x, dtype=tf.float32) / 255.
    # x = normalize(x)
    return x


network = keras.models.load_model('my_bkm.h5')
network.summary()
prepicture = int(input("input the number of test picture :"))
for i in range(prepicture):
    path1 = input("input the test picture path:")
    path = os.path.join(sys.path[0], path1)
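    image = cv2.imread(path)
    if image is None:  # cv2.imread returns None when the file cannot be read
        print("cannot read picture:", path)
        continue
    img = image.copy()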
    img = cv2.resize(img, (96, 96))
    img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    img = preprocess(img)
    # img= tf.cast(img, dtype=tf.uint8)
    result = network(img)
    result = tf.nn.softmax(result)
    print(result)
    index = tf.argmax(result, axis=-1)
    print(label[int(index)])
    # Display
    image = show_chinese(image, label[int(index)], (356, 54))
    cv2.namedWindow('img', 0)
    cv2.resizeWindow('img', 500, 500)   # set the window size as needed
    #cv2.putText(image, label[int(index)], (166, 54), cv2.FONT_HERSHEY_SCRIPT_SIMPLEX, 1.2, (255, 0, 0), 2)
    cv2.imshow('img', image)
    cv2.waitKey()
    cv2.destroyAllWindows()

Running the script with two pictures gives the following output:

input the number of test picture :2
input the test picture path:1.png
tf.Tensor([[9.9998260e-01 1.2735860e-07 4.3719947e-06 3.5391193e-07 1.2507204e-05]], shape=(1, 5), dtype=float32)
妙蛙种子


input the test picture path:4.png
tf.Tensor([[1.6705857e-11 9.9999821e-01 2.3859246e-12 1.7547414e-06 3.2666370e-09]], shape=(1, 5), dtype=float32)
小火龙
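
As an alternative to predicting one picture per loop iteration, several preprocessed pictures can be stacked into one batch and classified in a single call. The sketch below shows only the batching and the per-row argmax in NumPy; `make_batch` is a hypothetical helper and `fake_network` is a hypothetical stand-in that returns made-up scores, not the trained model.

```python
import numpy as np

label = ['妙蛙种子', '小火龙', '超梦', '皮卡丘', '杰尼龟']


def make_batch(images):
    """Stack equally sized H x W x 3 uint8 pictures into a float32 batch in [0, 1]."""
    return np.stack(images).astype(np.float32) / 255.0


def fake_network(batch):
    """Hypothetical stand-in for the trained model: one score row per picture."""
    scores = np.zeros((batch.shape[0], len(label)), dtype=np.float32)
    # Pretend picture i belongs to class i (purely illustrative).
    for i in range(batch.shape[0]):
        scores[i, i % len(label)] = 1.0
    return scores


# Three blank stand-ins for pictures already resized to 96 x 96 and converted to RGB.
images = [np.zeros((96, 96, 3), dtype=np.uint8) for _ in range(3)]

batch = make_batch(images)
indices = np.argmax(fake_network(batch), axis=-1)  # one predicted index per row
names = [label[int(i)] for i in indices]
print(batch.shape, names)
```

With the real model, `fake_network(batch)` would be replaced by `network(batch)`; the rest of the indexing stays the same because argmax is taken along the last axis.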


Origin blog.csdn.net/qq_47598782/article/details/132111415