图像多分类——卷积神经网络

例子参考:https://www.jiqizhixin.com/articles/2019-05-15-2

数据集:https://www.cs.ccu.edu.tw/~wtchu/projects/MoviePoster/index.html


将获取到原始数据集,其中,有三个文件,   Movie Poster Dataset是1980-2015年部分影片海报图片, Movie Poster Metadata是1980-2015年部分影片的数据详情,example:

                          

Readme则是对 Movie Poster Metadata文件里边的字段解释,在训练过程中只用到IMPId和 Genre(影片类型)。

                                       

步骤:

  • 数据处理

获取到影片的类型对影片类型实现one-hot编码,如果是属于哪个类型,用1表示,其他为0,得到如下文件,

                         

考虑到特征的相关性,删除影片比较少的类型列(将数量小于50的类型列进行删除),最终留下22个电影类型,如下:

                                            

将电影类型作为最终的结果值,然后加载图片:

for i in tqdm(range(train.shape[0])):  

    img = image.load_img('D:/aayu/实例/图像多分类/data/Images/'+train['ID'][i]+'.jpg',target_size=(400,400,3))  
    img = image.img_to_array(img)  
    img = img/255  
    train_image.append(img)      
X = np.array(train_image)
  • 模型构建

模型是由4层卷积和3层全连接层构成,具体参数如下:

                                     

训练结果为:

  • 模型预测

新增一个复仇者联盟的海报对数据进行预测(此处可更换为任意海报数据),加载数据:

img = image.load_img('F:/aayu/图像/data/GOT.jpg',target_size=(400,400,3))  
img = image.img_to_array(img)  
img = img/255 

预测结果:

 

完整代码:


import keras  
from keras.models import Sequential  
from keras.layers import Dense, Dropout, Flatten  
from keras.layers import Conv2D, MaxPooling2D  
from keras.utils import to_categorical  
from keras.preprocessing import image  
import numpy as np  
import pandas as pd  
import matplotlib.pyplot as plt  
from sklearn.model_selection import train_test_split  
from tqdm import tqdm  
#%matplotlib inline  

train = pd.read_csv('F:/aayu/图像/data/multi-data.csv')

print(train.head())


train_image = []  

for i in tqdm(range(train.shape[0])):  

    img = image.load_img('F:/aayu/图像/data/Images/'+train['ID'][i]+'.jpg',target_size=(400,400,3))  
    img = image.img_to_array(img)  
    img = img/255  
    train_image.append(img)  
    
X = np.array(train_image)  

y = np.array(train.drop(['ID', 'Genre','News','Reality-TV','Italian','Polish','Adult','Talk-Show',
                  'Spanish','Russian','Cantonese','R','PG','German','English','Japanese',
                  'Filipino','French','G','Game-Show','Hungarian'],axis=1)) 


X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42, test_size=0.1)  

#model
model = Sequential()  
model.add(Conv2D(filters=16, kernel_size=(5, 5), activation="relu", input_shape=(400,400,3)))  
model.add(MaxPooling2D(pool_size=(2, 2)))  
model.add(Dropout(0.25))  
model.add(Conv2D(filters=32, kernel_size=(5, 5), activation='relu'))  
model.add(MaxPooling2D(pool_size=(2, 2)))  
model.add(Dropout(0.25))  
model.add(Conv2D(filters=64, kernel_size=(5, 5), activation="relu"))  
model.add(MaxPooling2D(pool_size=(2, 2)))  
model.add(Dropout(0.25))  
model.add(Conv2D(filters=64, kernel_size=(5, 5), activation='relu'))  
model.add(MaxPooling2D(pool_size=(2, 2)))  
model.add(Dropout(0.25))  
model.add(Flatten())  
model.add(Dense(128, activation='relu'))  
model.add(Dropout(0.5))  
model.add(Dense(64, activation='relu'))  
model.add(Dropout(0.5))  
model.add(Dense(22, activation='sigmoid'))  


model.summary()  
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])  
model.fit(X_train, y_train, epochs=10, validation_data=(X_test, y_test), batch_size=64)  



#precise  
#加入新数据,进行测试
img = image.load_img('F:/aayu/图像/data/GOT.jpg',target_size=(400,400,3))  
img = image.img_to_array(img)  
img = img/255  


classes = np.array(train.columns[:22])  
proba = model.predict(img.reshape(1,400,400,3))  
top_3 = np.argsort(proba[0])[:-4:-1]  
for i in range(3):  
    print("{}".format(classes[top_3[i]])+" ({:.3})".format(proba[0][top_3[i]]))  
plt.imshow(img)

总结:与minist数据集相比,该数据集的分类中存在一张图片多个类的情况,而minist数据集当中一张图片代表一个数字,也就是一个分类,所以图像分类和图像多分类在本质上的区别在于数据集,算法实现基本都是一样的。

(数据集正在处理中,github网址为:https://github.com/YUXUEPENG/ImageMulti-Classification.git)

 

猜你喜欢

转载自blog.csdn.net/qq_28409193/article/details/103583344