Table of contents
1. TensorFlow library installation
(1) Historical versions of TensorFlow and corresponding Python versions
(2) Python version query
(3) Find the matching version in the table above and download the corresponding TensorFlow
(4) The installation is successful
(5) TensorFlow successfully verified
2. Project introduction
(1) Project description
(2) Project purpose
3. Implementation process
(1) Library import
(2) Match image files
(3) Define the image size and the paths of the training and validation sets
(4) Model training parameter settings
(5) Dimension definition and image generator
(6) Settings for each level of convolutional neural network
(7) Training with a callback function
(8) Visualization of training results
1. TensorFlow library installation
(1) Historical versions of TensorFlow and corresponding Python versions
TensorFlow version | Python version | Release date |
tensorflow-2.4.0 | 3.6-3.8 | December 2020 |
tensorflow-2.3.0 | 3.5-3.8 | July 2020 |
tensorflow-2.2.0 | 3.5-3.8 | May 2020 |
tensorflow-2.1.0 | 3.5-3.7 | January 2020 |
tensorflow-2.0.0 | 3.5-3.7 | October 2019 |
tensorflow-1.15.0 | 3.5-3.7 | October 2019 |
tensorflow-1.14.0 | 3.5-3.7 | June 2019 |
tensorflow-1.13.0 | 3.5-3.7 | February 2019 |
tensorflow-1.12.0 | 3.5-3.6 | November 2018 |
tensorflow-1.11.0 | 3.5-3.6 | September 2018 |
tensorflow-1.10.0 | 3.5-3.6 | August 2018 |
tensorflow-1.9.0 | 3.5-3.6 | July 2018 |
tensorflow-1.8.0 | 3.5-3.6 | April 2018 |
tensorflow-1.7.0 | 3.5-3.6 | March 2018 |
tensorflow-1.6.0 | 3.5-3.6 | March 2018 |
tensorflow-1.5.0 | 3.5-3.6 | January 2018 |
tensorflow-1.4.0 | 3.5-3.6 | November 2017 |
tensorflow-1.3.0 | 3.5-3.6 | August 2017 |
tensorflow-1.2.0 | 3.5-3.6 | June 2017 |
tensorflow-1.1.0 | 3.5 | April 2017 |
tensorflow-1.0.0 | 3.5 | February 2017 |
(2) Python version query
If the Python environment variable is configured, you can query the version as follows:
Press Win+R (or type cmd in the search box) to open the Windows terminal, then enter python --version in the terminal.
If the Python environment variable is not configured, use the Anaconda Prompt terminal to query the version number and to run the subsequent steps.
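The same check can also be made from inside Python. The sketch below is only an illustration: the helper name python_supported and the small SUPPORTED dictionary are assumptions introduced here (the dictionary holds a few rows copied from the table above), while sys.version_info is the standard-library way to read the version of the running interpreter.

```python
import sys

# A few rows copied from the version table above: (lowest, highest) supported Python.
# The dictionary and the helper name are illustrative, not part of TensorFlow.
SUPPORTED = {
    "2.4.0": ((3, 6), (3, 8)),
    "1.14.0": ((3, 5), (3, 7)),
    "1.0.0": ((3, 5), (3, 5)),
}


def python_supported(tf_version, python_version=None):
    """Return True if the (major, minor) Python version fits the table row."""
    if python_version is None:
        python_version = tuple(sys.version_info[:2])  # e.g. (3, 7)
    lowest, highest = SUPPORTED[tf_version]
    return lowest <= python_version <= highest


print("Running Python", ".".join(str(n) for n in sys.version_info[:3]))
print("OK for tensorflow-1.14.0:", python_supported("1.14.0"))
```

If the second line prints False, pick a TensorFlow row from the table whose Python range contains your interpreter, or install a matching Python first.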
(3) Find the matching version in the table above and download the corresponding TensorFlow
At the time of installation, I didn't realize that the version would affect later use of TensorFlow, so I chose one at random. You can choose the same version if you want, in which case the following steps will match mine.
Here we install TensorFlow from the Tsinghua University PyPI mirror, with the version number given after ==:
pip install -i https://pypi.tuna.tsinghua.edu.cn/simple tensorflow==1.14
Note:
① Run this command directly in the Windows terminal; it requires a working Python environment.
② If the download fails with an error, copy the URL shown after "Downloading", download the installation file directly from that URL, then switch to the download directory and install the file with pip install. The downloaded file has the .whl suffix.
(4) The installation is successful
If output like the following appears, the installation is successful:
(5) TensorFlow successfully verified
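Import the TensorFlow library and do a simple calculation:
import tensorflow as tf
sess = tf.Session()  # tf.Session is the TensorFlow 1.x API, matching the 1.14 install above
a = tf.constant(10)
b = tf.constant(12)
sess.run(a + b)  # evaluates the sum of the two constants
If the following result (the sum, 22) is returned, the installation is working: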
Note:
The following errors may occur during the verification phase.
The first is the error shown in Jupyter, and the second is the error message shown in Anaconda.
This happens because the protobuf version installed on some machines does not match what TensorFlow expects. You only need to reinstall protobuf using the version given after "protobuf >=" in the error message.
The command is as follows:
pip install protobuf==3.19.0
2. Project introduction
(1) Project description
Image recognition is an important foundation of computer vision in artificial intelligence. Machine learning and deep learning algorithms can identify the main features of pictures efficiently and accurately, and so classify pictures by their content.
A classic dataset in image recognition research is Cat_vs_Dogs (a cat and dog recognition dataset); many computer vision algorithms are evaluated on it. The attachment contains 1500 photos of cats and 1500 photos of dogs, arranged in the following directory structure:
cats_vs_dogs:
    train:
        cats: [cat.0.jpg, cat.1.jpg, cat.2.jpg ...]
        dogs: [dog.0.jpg, dog.1.jpg, dog.2.jpg ...]
    validation:
        cats: [cat.2000.jpg, cat.2001.jpg, cat.2002.jpg ...]
        dogs: [dog.2000.jpg, dog.2001.jpg, dog.2002.jpg ...]
Note: train is the training set, with 1000 photos each of cats and dogs; validation is the validation set, with 500 photos each of cats and dogs. Example pictures are shown below:
(2) Project purpose
The purpose of the project is to build a model that recognizes cats and dogs, and then to check the model's accuracy on the validation set. This article mainly describes how the model is built; model design, parameter tuning and analysis of the results will be covered in subsequent articles.
3. Implementation process
(1) Library import
from tensorflow.keras.preprocessing.image import ImageDataGenerator, load_img, img_to_array, array_to_img
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D
from tensorflow.keras.layers import Activation, Dropout, Flatten, Dense
from tensorflow.keras import backend as K
from tensorflow.keras.callbacks import TensorBoard
from tensorflow.keras.applications import VGG16,InceptionV3,ResNet50,MobileNet
import numpy as np
import matplotlib.pyplot as plt
import glob, os, random
The role of the os module
It provides functions for working with files and directories, tasks we would otherwise do by hand. This module is especially important if you want your program to be platform-independent.
The role of the glob module
It finds directories and files whose names match a given pattern and returns the matches as a list.
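To make the two modules concrete, here is a small sketch that builds a throwaway directory tree and matches the files in it. The layout and file names are made up for the illustration (they only mimic the train/cats style used in this project); os.path.join, os.makedirs and glob.glob are standard-library calls.

```python
import glob
import os
import tempfile

# Build a tiny, made-up tree: <tmp>/train/cats/cat.0.jpg and so on.
root = tempfile.mkdtemp()
for split, label, name in [("train", "cats", "cat.0.jpg"),
                           ("train", "dogs", "dog.0.jpg"),
                           ("validation", "cats", "cat.2000.jpg")]:
    folder = os.path.join(root, split, label)  # joined with the platform's separator
    os.makedirs(folder, exist_ok=True)
    open(os.path.join(folder, name), "w").close()  # empty placeholder file

# '*' matches any name within one path level, so this pattern means:
# any split folder / any class folder / any file with an extension.
pattern = os.path.join(root, "*", "*", "*.*")
matches = sorted(glob.glob(pattern))

print(len(matches))
for path in matches:
    print(os.path.relpath(path, root))
```

glob.glob does not guarantee any particular order, which is why the sketch sorts the result before printing it.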
(2) Match image files
path = 'data'
os.path.join(path, '*/*/*.*')  # the search pattern: data/<set>/<class>/<file>
# Use the glob module to match images in bulk; * matches anything
img_list = glob.glob(os.path.join(path, '*/*/*.*'))
print('>>> Number of images:', len(img_list))
print(img_list[:5])
for i, img_path in enumerate(img_list[:6]):
    img_plot = load_img(img_path)  # load the image
    arr = img_to_array(img_plot)  # convert the image to an array
    print(arr.shape)  # image shape
    plt.subplot(2, 3, i + 1)
    plt.imshow(img_plot)
os.path.join(path, name): joins a directory with a file name or another directory.
glob.glob(): returns the paths of all files that match the pattern.
Note: The ipynb file and the data folder must be in the same directory, as shown in the figure below.
(3) Define the image size and the paths of the training and validation sets
# Define one width and height (in pixels) for all images
img_width, img_height = 100, 100
# Define the image paths of the training and validation sets (folder paths are enough)
train_data_dir = 'data/train/'
validation_data_dir = 'data/validation/'
Note: The image width and height defined here are tunable. You can experiment with them yourself without waiting for the follow-up analysis article, so I won't explain further here.
(4) Model training parameter settings
# Model training parameter settings
nb_train_samples = 30
nb_validation_samples = 10
epochs = 20  # number of iterations (epochs)
batch_size = 32  # number of observations per batch
Note: The training parameters affect the results of model training, in particular the number of epochs and the number of observations per batch. The first two parameters are passed later to model.fit_generator() as steps_per_epoch and validation_steps, that is, the number of batches drawn per epoch for training and for validation. fit_generator() takes a generator as its data source and feeds the data to the model batch by batch, which keeps the memory used at any one time small.
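As a worked example of how these numbers relate: 30 steps at a batch size of 32 draw about 30 * 32 = 960 images per epoch, fewer than the 2000 training photos described earlier. A common convention (an assumption here, not something the original code does) is to choose the step count so that each epoch covers the whole set once. The helper name steps_for_full_pass is made up for this sketch.

```python
import math


def steps_for_full_pass(num_images, batch_size):
    """Number of batches needed to see every image once (the last batch may be partial)."""
    return math.ceil(num_images / batch_size)


batch_size = 32
train_images = 2000       # 1000 cats + 1000 dogs, from the project description
validation_images = 1000  # 500 cats + 500 dogs

print(30 * batch_size)                                     # full batches drawn in 30 steps: 960 images
print(steps_for_full_pass(train_images, batch_size))       # 63
print(steps_for_full_pass(validation_images, batch_size))  # 32
```

Whether to use the smaller values from the article or a full pass is a tuning choice; fewer steps make each epoch faster but show the model less data per epoch.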
(5) Dimension definition and image generator
# Set the image input dimensions
if K.image_data_format() == 'channels_first':
    input_shape = (3, img_width, img_height)
else:
    input_shape = (img_width, img_height, 3)
# Define the image generator
train_datagen = ImageDataGenerator(rescale=1. / 255,      # rescaling factor
                                   shear_range=0.2,       # shear intensity (counter-clockwise shear angle in degrees)
                                   zoom_range=0.2,        # random zoom range
                                   horizontal_flip=True,  # random horizontal flip
                                   rotation_range=360     # random rotation within 360 degrees
                                   )
# Use the image generator to read images from train_data_dir and build the training set (X_train: image data, y_train: name of the directory the image is in)
train_generator = train_datagen.flow_from_directory(train_data_dir,  # folder path of the training data
                                                    target_size=(img_width, img_height),  # resize all images to one size
                                                    batch_size=batch_size,  # number of observations per batch
                                                    class_mode='categorical'  # classification mode: one-hot labels over the two classes
                                                    )
test_datagen = ImageDataGenerator(rescale=1. / 255,
                                  shear_range=0.2,       # shear intensity (counter-clockwise shear angle in degrees)
                                  zoom_range=0.2,        # random zoom range
                                  horizontal_flip=True)  # random horizontal flip
validation_generator = test_datagen.flow_from_directory(validation_data_dir,  # folder path of the validation set
                                                        target_size=(img_width, img_height),
                                                        batch_size=batch_size,
                                                        class_mode='categorical'  # one-hot labels over the two classes
                                                        )
Note: K.image_data_format() in the image dimension setting returns the ordering of the image dimensions ("channels_first" or "channels_last"). A color image is generally described by its width, height and number of channels.
(6) Settings for each level of convolutional neural network
model = Sequential()
# -----------------------------------------------------
# Input layer: the first layer
# Add the first convolution layer / max pooling layer (required)
model.add(Conv2D(filters=32,  # 32 filters
                 kernel_size=(3, 3),  # convolution kernel size 3 x 3
                 input_shape=input_shape,  # image input dimensions
                 activation='relu'))  # 'relu' activation function
model.add(MaxPooling2D(pool_size=(2, 2)))  # pooling window size 2 x 2
# ----------------------------------------------------
# Hidden layers: between the first layer and the last layer
# Add the second convolution layer / max pooling layer (optional)
model.add(Conv2D(filters=32, kernel_size=(3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
# Add the third convolution layer / max pooling layer (optional)
model.add(Conv2D(filters=64, kernel_size=(3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
# Add the fourth convolution layer / max pooling layer (optional)
model.add(Conv2D(filters=64, kernel_size=(3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
# The convolution layers work in 2D space, so the data must be flattened to 1D for the dense layers
model.add(Flatten())  # add the flatten layer (required)
model.add(Dense(units=64, activation='relu'))  # add a fully connected layer (required), 64 neurons
model.add(Dropout(0.5))  # add a dropout layer to prevent overfitting
# ---------------------------------------------------
# Output layer: the last layer; its neurons set the output dimension, and it specifies the classification activation function
model.add(Dense(units=2, activation='sigmoid'))  # classification activation function
model.summary()
model.compile(loss='binary_crossentropy',  # loss function
              optimizer='rmsprop',  # optimizer
              metrics=['accuracy'])  # evaluation metric
After running, model.summary() prints the following model structure:
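The shapes and parameter counts in that summary can be checked by hand. The sketch below redoes the arithmetic in plain Python, assuming the 100 x 100 input defined earlier, channels_last ordering, and the Keras defaults for these layers (no padding and stride 1 for Conv2D, stride equal to the window for MaxPooling2D). The helper names conv_out and pool_out are made up for this illustration.

```python
def conv_out(size, kernel=3):
    """Output width/height of an unpadded convolution with stride 1."""
    return size - kernel + 1


def pool_out(size, pool=2):
    """Output width/height of max pooling with a pool x pool window (rounded down)."""
    return size // pool


size, channels = 100, 3  # input image: 100 x 100 x 3
total_params = 0
for filters in (32, 32, 64, 64):  # the four Conv2D layers of the model
    params = 3 * 3 * channels * filters + filters  # kernel weights + one bias per filter
    size = pool_out(conv_out(size))
    total_params += params
    print("conv+pool ->", (size, size, filters), "conv params:", params)
    channels = filters

flat = size * size * channels  # Flatten
dense1 = flat * 64 + 64        # Dense(64): weights + biases
dense2 = 64 * 2 + 2            # Dense(2): weights + biases
total_params += dense1 + dense2
print("flatten ->", flat, "| dense params:", dense1, dense2, "| total:", total_params)
```

Each convolution trims two pixels from the width and height and each pooling layer halves them, so the image shrinks from 100 to 49, 23, 10 and finally 4 pixels per side before it is flattened. Pooling and dropout layers have no trainable parameters, which is why only the convolution and dense layers are counted.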
(7) Training with a callback function
# TensorBoard callback
logs = os.path.join("logs")
if not os.path.exists(logs):
    os.mkdir(logs)
train_callbacks = [
    TensorBoard(
        log_dir=r'./logs',
        histogram_freq=1,
    )
]
tensorboard_dir = os.path.join(r'.\logs\plugins\profile')
history = model.fit_generator(train_generator,
                              steps_per_epoch=nb_train_samples,
                              epochs=epochs,
                              validation_data=validation_generator,
                              validation_steps=nb_validation_samples,
                              callbacks=train_callbacks
                              )
model.fit_generator() takes a generator as its data source and feeds the data to the model in batches, which keeps the memory used at any one time small.
Note: The logs\plugins\profile folder has to be created manually, because the code in this experiment does not create it. Once the logs folder exists, create a plugins folder inside it and a profile folder inside plugins. Otherwise, the following error will be reported:
After running, the following training progress output will appear:
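The folders can also be created from Python instead of by hand. This is a minimal sketch rather than part of the original experiment code: the helper name ensure_profile_dir is made up here, and it relies only on the standard-library os.makedirs, which creates all missing parent folders in one call.

```python
import os
import tempfile


def ensure_profile_dir(base_dir="."):
    """Create <base_dir>/logs/plugins/profile if it is missing and return its path."""
    profile_dir = os.path.join(base_dir, "logs", "plugins", "profile")
    # makedirs creates every missing parent; exist_ok avoids an error on reruns
    os.makedirs(profile_dir, exist_ok=True)
    return profile_dir


# Demonstrated in a temporary directory so nothing is written into the project here.
demo_base = tempfile.mkdtemp()
created = ensure_profile_dir(demo_base)
print(os.path.isdir(created))
```

In the notebook, calling ensure_profile_dir() with no argument before training would create the folders next to the ipynb file.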
(8) Visualization of training results
# Now visualize the results after training.
acc = history.history['acc']
val_acc = history.history['val_acc']
loss = history.history['loss']
val_loss = history.history['val_loss']
epochs_range = range(epochs)
plt.figure(figsize=(20, 10))
plt.subplot(1, 2, 1)
plt.plot(epochs_range, acc, label='Training Accuracy')
plt.plot(epochs_range, val_acc, label='Validation Accuracy')
plt.legend(loc='lower right')
plt.title('Training and Validation Accuracy')
plt.subplot(1, 2, 2)
plt.plot(epochs_range, loss, label='Training Loss')
plt.plot(epochs_range, val_loss, label='Validation Loss')
plt.legend(loc='upper right')
plt.title('Training and Validation Loss')
plt.show()
The resulting plot after training:
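One caveat when reusing the plotting code: the history keys depend on the TensorFlow version. With the 1.14 install used here the accuracy is stored under 'acc' and 'val_acc', while TensorFlow 2.x stores it under 'accuracy' and 'val_accuracy' when the metric is given as 'accuracy'. The helper below picks whichever key is present; its name is made up for this sketch, and the sample dictionary holds dummy values, not training results.

```python
def pick_metric(history_dict, *candidates):
    """Return the list stored under the first candidate key that exists."""
    for key in candidates:
        if key in history_dict:
            return history_dict[key]
    raise KeyError("none of %r found in %r" % (candidates, sorted(history_dict)))


# Dummy values standing in for history.history; not real training results.
sample_history = {"loss": [0.7, 0.6], "accuracy": [0.5, 0.6],
                  "val_loss": [0.7, 0.7], "val_accuracy": [0.5, 0.5]}

acc = pick_metric(sample_history, "acc", "accuracy")
val_acc = pick_metric(sample_history, "val_acc", "val_accuracy")
print(acc, val_acc)
```

In the notebook this would be called as pick_metric(history.history, 'acc', 'accuracy'), so the same plotting cell works on either version.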