Multi-label image classification with a convolutional neural network

Example reference: https://www.jiqizhixin.com/articles/2019-05-15-2

Data set: https://www.cs.ccu.edu.tw/~wtchu/projects/MoviePoster/index.html


The original data set contains three files. Movie Poster Dataset holds poster images for a selection of movies released between 1980 and 2015, and Movie Poster Metadata holds the detailed data for a selection of movies from the same period.

                          

Readme explains the fields in the Movie Poster Metadata file. Only IMPID and Genre (the movie's genres) are used in the training process.

 

                                       

 

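Steps:

  • Data processing

Extract each movie's genres and encode them as one 0/1 column per genre: a column is 1 if the movie belongs to that genre and 0 otherwise. Because a movie can have several genres, a row can contain several 1s. The result is saved as a new file (the complete code below reads it as multi-data.csv).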
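
The encoding step could be sketched with pandas as follows. This is a minimal illustration, not the author's preprocessing script: the sample rows are made up, and the `|` separator between genres is an assumption about how the Genre field is stored.

```python
import pandas as pd

# Hypothetical metadata rows; the real file has more fields, and the
# '|' separator is an assumption about how Genre is stored.
meta = pd.DataFrame({
    "ID": ["tt0000001", "tt0000002", "tt0000003"],
    "Genre": ["Action|Drama", "Comedy", "Drama|Romance"],
})

# One 0/1 column per genre; a movie with several genres gets several 1s.
genre_flags = meta["Genre"].str.get_dummies(sep="|")

# Keep ID and the original Genre string next to the indicator columns.
encoded = pd.concat([meta[["ID", "Genre"]], genre_flags], axis=1)
print(encoded)
```

Keeping the ID column alongside the indicator columns makes it possible to look up the matching poster image for each row later.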

                         

Taking feature relevance into account, genre columns with too few movies are removed: every genre column with fewer than 50 movies is deleted, which leaves 22 movie genres.
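
A filter of this kind might look like the sketch below. The helper name `drop_rare_genres` and the tiny example table are hypothetical; only the rule itself (remove genres with fewer than 50 movies) comes from the text.

```python
import pandas as pd

def drop_rare_genres(flags: pd.DataFrame, min_count: int = 50) -> pd.DataFrame:
    """Keep only genre columns whose number of 1s is at least min_count."""
    counts = flags.sum(axis=0)              # movies per genre
    keep = counts[counts >= min_count].index
    return flags[keep]

# Tiny hypothetical table, filtered with a lower threshold so the effect is visible.
flags = pd.DataFrame({
    "Drama":     [1, 1, 1, 0],
    "Comedy":    [0, 1, 1, 0],
    "Game-Show": [0, 0, 0, 1],
})
filtered = drop_rare_genres(flags, min_count=2)
print(list(filtered.columns))
```

On the real data the call would use the default `min_count=50`.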

                                            

The genre columns serve as the prediction target. Next, load the images:

train_image = []
for i in tqdm(range(train.shape[0])):
    # target_size is (height, width); img_to_array returns a (400, 400, 3) array
    img = image.load_img('D:/aayu/实例/图像多分类/data/Images/'+train['ID'][i]+'.jpg',target_size=(400,400))
    img = image.img_to_array(img)
    img = img/255  # scale pixel values to [0, 1]
    train_image.append(img)
X = np.array(train_image)

  • Model building

The model consists of four convolutional layers followed by three fully connected layers; the exact parameters appear in the complete code below.
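
To see how large the network becomes, the feature-map sizes can be worked out by hand. The sketch below assumes the Keras defaults used in the complete code (stride-1 convolutions without padding, non-overlapping 2x2 pooling) and only does the arithmetic; it does not build the model.

```python
def conv_valid(size: int, kernel: int) -> int:
    """Output side length of a stride-1 convolution with no padding."""
    return size - kernel + 1

def pool(size: int, window: int) -> int:
    """Output side length of non-overlapping max pooling (floor division)."""
    return size // window

side = 400                      # input images are 400 x 400
sides = []
for _ in range(4):              # four Conv2D(5x5) + MaxPooling2D(2x2) stages
    side = pool(conv_valid(side, 5), 2)
    sides.append(side)

flattened = side * side * 64    # the last convolution has 64 filters
dense1_params = flattened * 128 + 128   # weights + biases of Dense(128)
print(sides, flattened, dense1_params)
```

The side length shrinks from 400 to 198, 97, 46 and finally 21, so Flatten produces 21 x 21 x 64 = 28,224 values, and the first fully connected layer alone holds about 3.6 million parameters.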

                                     

The training results are:

  • Model prediction

Load a new poster for prediction (the file used here is GOT.jpg; any poster image can be substituted):

img = image.load_img('F:/aayu/图像/data/GOT.jpg',target_size=(400,400))
img = image.img_to_array(img)
img = img/255

Prediction result:

 

Complete code:


import keras  
from keras.models import Sequential  
from keras.layers import Dense, Dropout, Flatten  
from keras.layers import Conv2D, MaxPooling2D  
from keras.utils import to_categorical  
from keras.preprocessing import image  
import numpy as np  
import pandas as pd  
import matplotlib.pyplot as plt  
from sklearn.model_selection import train_test_split  
from tqdm import tqdm  
#%matplotlib inline  

train = pd.read_csv('F:/aayu/图像/data/multi-data.csv')

print(train.head())


train_image = []  

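for i in tqdm(range(train.shape[0])):
    # target_size is (height, width); img_to_array returns a (400, 400, 3) array
    img = image.load_img('F:/aayu/图像/data/Images/'+train['ID'][i]+'.jpg',target_size=(400,400))
    img = image.img_to_array(img)
    img = img/255  # scale pixel values to [0, 1]
    train_image.append(img)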
    
X = np.array(train_image)  

# Label matrix: drop the ID, the raw Genre string and the columns excluded
# from training, which leaves the 22 genre indicator columns
labels = train.drop(['ID', 'Genre','News','Reality-TV','Italian','Polish','Adult','Talk-Show',
                  'Spanish','Russian','Cantonese','R','PG','German','English','Japanese',
                  'Filipino','French','G','Game-Show','Hungarian'],axis=1)
y = np.array(labels)


X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42, test_size=0.1)

# Model: 4 convolution blocks followed by 3 fully connected layers
model = Sequential()
model.add(Conv2D(filters=16, kernel_size=(5, 5), activation="relu", input_shape=(400,400,3)))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Conv2D(filters=32, kernel_size=(5, 5), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Conv2D(filters=64, kernel_size=(5, 5), activation="relu"))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Conv2D(filters=64, kernel_size=(5, 5), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(64, activation='relu'))
model.add(Dropout(0.5))
# One sigmoid unit per genre, so several genres can be predicted at once
model.add(Dense(22, activation='sigmoid'))


model.summary()
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
model.fit(X_train, y_train, epochs=10, validation_data=(X_test, y_test), batch_size=64)



# Prediction
# Load a new image for testing
img = image.load_img('F:/aayu/图像/data/GOT.jpg',target_size=(400,400))
img = image.img_to_array(img)
img = img/255


# Class names must come from the label columns so that they line up with y
classes = np.array(labels.columns)
proba = model.predict(img.reshape(1,400,400,3))
top_3 = np.argsort(proba[0])[:-4:-1]  # indices of the three highest probabilities
for i in range(3):
    print("{} ({:.3})".format(classes[top_3[i]], proba[0][top_3[i]]))
plt.imshow(img)
plt.show()
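
The prediction code above always reports the three highest-scoring genres, whatever their probabilities are. Because each sigmoid output is an independent probability, an alternative is to keep every genre that clears a threshold. The sketch below is one way to do this; the helper name `decode_predictions`, the 0.5 threshold and the sample scores are assumptions for illustration, not part of the original code.

```python
import numpy as np

def decode_predictions(proba, classes, threshold=0.5):
    """Return (genre, probability) pairs whose probability meets the threshold,
    sorted from most to least likely."""
    proba = np.asarray(proba)
    order = np.argsort(proba)[::-1]          # indices from highest to lowest score
    return [(classes[i], float(proba[i])) for i in order if proba[i] >= threshold]

# Hypothetical scores for four genres, standing in for model.predict(...)[0].
classes = np.array(["Action", "Comedy", "Drama", "Romance"])
proba = np.array([0.81, 0.07, 0.64, 0.30])
print(decode_predictions(proba, classes))
```

With a threshold the number of predicted genres varies per poster, and the list can be empty when the model is not confident about any genre.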

Summary: Unlike the MNIST data set, where each image shows a single digit and therefore belongs to exactly one class, an image in this data set can belong to several classes at once. The essential difference between single-label and multi-label image classification therefore lies in the data set; the algorithm implementation is basically the same.
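
One place where the two setups usually do differ is the output layer: a single-label classifier such as a typical MNIST model ends in softmax, whereas the model here ends in sigmoid units trained with binary cross-entropy. The following sketch, using made-up scores, shows why: softmax probabilities compete and sum to 1, while sigmoid probabilities are independent, so more than one class can exceed 0.5.

```python
import math

def softmax(logits):
    """Probabilities that compete with each other and sum to 1."""
    m = max(logits)                              # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sigmoid(logits):
    """One independent probability per class."""
    return [1.0 / (1.0 + math.exp(-z)) for z in logits]

logits = [2.0, 1.5, -3.0]    # hypothetical raw scores for three classes
print(softmax(logits))       # only one class can exceed 0.5
print(sigmoid(logits))       # the first two classes both exceed 0.5
```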

(The data set is still being processed. The GitHub repository is: https://github.com/YUXUEPENG/ImageMulti-Classification.git)

 

 

 

 


Origin blog.csdn.net/qq_28409193/article/details/103583344