Building a DNN with Keras to solve a multi-class classification problem

Introduction to Keras

Keras is an open-source, high-level neural network API written in pure Python. It can run on top of TensorFlow, Theano, MXNet, or CNTK as its backend. Keras was created to support fast experimentation, so that you can quickly turn an idea into a result. Supported Python versions: Python 2.7-3.6.
The name Keras comes from the Greek word for "horn". It was first released in March 2015 and runs on Windows, Linux, Mac, and other systems. Given that we already have TensorFlow (or Theano, MXNet, CNTK), why do we need Keras? Although TensorFlow can be used to build deep neural networks and other systems, it works at a relatively low level of abstraction, so writing TensorFlow code directly can be challenging. Keras adds an easier-to-use abstraction layer on top of TensorFlow, which makes it simpler and more efficient to use.
When is Keras a suitable choice? Choose Keras if you have the following requirements:

- Easy and rapid prototyping (Keras is highly modular, minimalist, and extensible)
- Support for CNNs and RNNs, or a combination of the two
- Seamless switching between CPU and GPU

If you want to use Keras on your computer, you need the following tools:

- Python
- TensorFlow
- Keras

Here, we choose TensorFlow as the Keras backend. The following Python code prints the version numbers of Python, TensorFlow, and Keras:

import sys
import keras as K
import tensorflow as tf

py_ver = sys.version
k_ver = K.__version__
tf_ver = tf.__version__

print("Using Python version " + str(py_ver))
print("Using Keras version " + str(k_ver))
print("Using TensorFlow version " + str(tf_ver))

On the author's computer, the output is as follows:

Using TensorFlow backend.
Using Python version 3.5.1 (v3.5.1:37a07cee5969, Dec  6 2015, 01:54:25) [MSC v.1900 64 bit (AMD64)]
Using Keras version 2.1.5
Using TensorFlow version 1.6.0

Below, I will use the IRIS data set (a classic machine learning data set, well suited to testing multi-class classification) and build a deep neural network (DNN) with Keras to classify it, as a first introductory Keras example.

Introduction to the IRIS data set

The IRIS data set is a classic machine learning data set that is well suited as test data for classification. It can be downloaded from: http://archive.ics.uci.edu/ml/machine-learning-databases/iris/ .
The IRIS data set is used to classify iris flowers. It contains 150 samples in total, each with four features: sepal length (cm), sepal width (cm), petal length (cm), and petal width (cm). The flowers are divided into three classes, Iris Setosa, Iris Versicolour, and Iris Virginica, with 50 samples each.
Part of the IRIS data set is shown below (the rows have been shuffled):

[Figure: preview of the IRIS data set]
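As a rough illustration of the layout, the sketch below parses three rows of the IRIS data, one per species, with Python's standard `csv` module. The feature column names are an assumption made for this example (the loading code in this article only requires a column named `class`), and the label spelling in your copy of the file may differ.

```python
import csv
import io

# Three rows from the IRIS data, one per species. The feature column names
# are assumed for illustration; only the 'class' column name is required by
# the loading code in this article.
SAMPLE_CSV = """sepal_length,sepal_width,petal_length,petal_width,class
5.1,3.5,1.4,0.2,setosa
7.0,3.2,4.7,1.4,versicolor
6.3,3.3,6.0,2.5,virginica
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
feature_names = [name for name in rows[0] if name != "class"]

for row in rows:
    features = [float(row[name]) for name in feature_names]
    print(features, "->", row["class"])
```

Each row carries four numeric features followed by the class label, which is the shape the rest of the article works with.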

Reading the data set

The author stores the IRIS data set in CSV format, reads it with Pandas, applies 0-1 encoding (one-hot encoding) to the target variable, and finally splits the data set into training and test sets at a ratio of 7:3. The complete Python code is as follows:

import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelBinarizer

# Read the CSV data set and split it into training and test sets
# Argument CSV_FILE_PATH: path to the CSV file
def load_data(CSV_FILE_PATH):
    IRIS = pd.read_csv(CSV_FILE_PATH)
    target_var = 'class'  # target variable
    # features of the data set
    features = list(IRIS.columns)
    features.remove(target_var)
    # classes of the target variable
    Class = IRIS[target_var].unique()
    # dictionary mapping each class to an integer
    Class_dict = dict(zip(Class, range(len(Class))))
    # add a 'target' column holding the integer-encoded target variable
    IRIS['target'] = IRIS[target_var].apply(lambda x: Class_dict[x])
    # apply 0-1 encoding (one-hot encoding) to the target variable
    lb = LabelBinarizer()
    lb.fit(list(Class_dict.values()))
    transformed_labels = lb.transform(IRIS['target'])
    y_bin_labels = []  # names of the 0-1 encoded columns
    for i in range(transformed_labels.shape[1]):
        y_bin_labels.append('y' + str(i))
        IRIS['y' + str(i)] = transformed_labels[:, i]
    # split the data set into a training set and a test set
    train_x, test_x, train_y, test_y = train_test_split(IRIS[features], IRIS[y_bin_labels],
                                                        train_size=0.7, test_size=0.3, random_state=0)
    return train_x, test_x, train_y, test_y, Class_dict
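To make the encoding step concrete, here is a minimal pure-Python sketch of what happens to the target column: each class name is mapped to an integer, and that integer is then expanded into a 0-1 vector. The helper `one_hot` and the sample labels are hypothetical and only serve the illustration; the article's code relies on scikit-learn's `LabelBinarizer` instead, and the integer codes depend on the order in which the classes first appear in the CSV file.

```python
def one_hot(index, num_classes):
    """Return a list with 1 at position `index` and 0 elsewhere."""
    return [1 if i == index else 0 for i in range(num_classes)]

labels = ["versicolor", "setosa", "virginica", "setosa"]  # sample target column

# Assign integer codes in order of first appearance, like Series.unique().
classes = list(dict.fromkeys(labels))
class_dict = {name: code for code, name in enumerate(classes)}

encoded = [one_hot(class_dict[name], len(classes)) for name in labels]

print(class_dict)  # {'versicolor': 0, 'setosa': 1, 'virginica': 2}
print(encoded)     # [[1, 0, 0], [0, 1, 0], [0, 0, 1], [0, 1, 0]]
```

The three resulting columns play the role of the `y0`, `y1`, `y2` columns that the loading function appends to the data frame.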

Building the DNN

Next, we show how to build a simple deep neural network (DNN) with Keras to solve this multi-class classification problem. The structure of the DNN we build is shown below:

[Figure: schematic of the DNN model structure]

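The DNN we build consists of an input layer, hidden layers, an output layer, and a softmax function. The input layer has 4 neurons, corresponding to the 4 features of the IRIS data set, which form the input vector. There are two hidden layers, with 5 and 6 neurons respectively. Next comes the output layer, with 3 neurons, corresponding to the number of classes of the target variable in the IRIS data set. Finally, a softmax function is applied to handle the multi-class classification.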
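As a quick sanity check on this architecture, the number of trainable parameters can be counted by hand: each fully connected layer has (inputs x units) weights plus one bias per unit. The sketch below is plain Python rather than Keras, and the variable names are illustrative only.

```python
layer_sizes = [4, 5, 6, 3]  # input, hidden 1, hidden 2, output

total = 0
for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
    params = n_in * n_out + n_out  # weights + biases
    print("%d -> %d: %d parameters" % (n_in, n_out, params))
    total += params

print("Total trainable parameters:", total)  # 82
```

With only 82 parameters, this is a very small network, which is one reason it trains quickly even with a batch size of 1.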
For the DNN structure described above, the Python code to build it with Keras is as follows:

import keras as K

# 2. Define the model
init = K.initializers.glorot_uniform(seed=1)
simple_adam = K.optimizers.Adam()
model = K.models.Sequential()
model.add(K.layers.Dense(units=5, input_dim=4, kernel_initializer=init, activation='relu'))
model.add(K.layers.Dense(units=6, kernel_initializer=init, activation='relu'))
model.add(K.layers.Dense(units=3, kernel_initializer=init, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer=simple_adam, metrics=['accuracy'])

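In this model, we use ReLU as the activation function of the neurons, cross entropy as the loss function, and Adam as the optimizer. The initial connection weights of each layer are generated randomly (Glorot uniform initialization), while the biases are left at the Keras default, which initializes them to zero. With that, the definition of the DNN model is complete. That simple? Yes, that's it!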
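For readers who want to see what the softmax output and the cross-entropy loss compute, here is a minimal pure-Python sketch for a single sample. The logits and the one-hot target are made-up illustrative values, and the function names are not Keras APIs.

```python
import math

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    shifted = [z - max(logits) for z in logits]  # subtract max for stability
    exps = [math.exp(z) for z in shifted]
    total = sum(exps)
    return [e / total for e in exps]

def categorical_cross_entropy(y_true, y_pred, eps=1e-7):
    """Loss for one sample: -sum(t * log(p)) over the classes."""
    return -sum(t * math.log(max(p, eps)) for t, p in zip(y_true, y_pred))

logits = [0.5, 2.0, -1.0]   # hypothetical outputs of the last layer
target = [0, 1, 0]          # one-hot label: the true class is index 1

probs = softmax(logits)
loss = categorical_cross_entropy(target, probs)
print(probs, loss)
```

Because the target is one-hot, the loss reduces to the negative log of the probability assigned to the true class, so it shrinks toward zero as that probability approaches 1.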

Training and prediction

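OK, now that the model is defined, we need to train it, evaluate it, and use it for prediction. For training, we use a batch size of 1 and iterate for 100 epochs. The code is as follows (continuing from the code above):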

# 3. Train the model
b_size = 1
max_epochs = 100
print("Starting training ")
h = model.fit(train_x, train_y, batch_size=b_size, epochs=max_epochs, shuffle=True, verbose=1)
print("Training finished \n")
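With a batch size of 1, every training sample triggers its own weight update. A small arithmetic sketch, assuming the 70% training split of the 150 samples described earlier (that is, 105 training samples):

```python
import math

num_train = 105      # 70% of 150 samples (assumed from the 7:3 split)
batch_size = 1
epochs = 100

steps_per_epoch = math.ceil(num_train / batch_size)
total_updates = steps_per_epoch * epochs

print(steps_per_epoch)  # 105
print(total_updates)    # 10500
```

The 105 steps per epoch correspond to the `105/105` counter in the training log shown in this article. A larger batch size would mean fewer updates per epoch, for example 7 steps with a batch size of 16.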

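To evaluate the model and get a sense of how it performs, we print the value of the DNN's loss function and its accuracy on the test set. The Python code is as follows (continuing from the code above):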

# 4. Evaluate the model
eval = model.evaluate(test_x, test_y, verbose=0)
print("Evaluation on test data: loss = %0.6f accuracy = %0.2f%% \n" \
      % (eval[0], eval[1] * 100) )

After 100 epochs of training, the output is as follows (the intermediate training output is omitted):

Starting training 
Epoch 1/100

  1/105 [..............................] - ETA: 17s - loss: 0.3679 - acc: 1.0000
 42/105 [===========>..................] - ETA: 0s - loss: 1.8081 - acc: 0.3095 
 89/105 [========================>.....] - ETA: 0s - loss: 1.5068 - acc: 0.4270
105/105 [==============================] - 0s 3ms/step - loss: 1.4164 - acc: 0.4667
Epoch 2/100

  1/105 [..............................] - ETA: 0s - loss: 0.4766 - acc: 1.0000
 45/105 [===========>..................] - ETA: 0s - loss: 1.0813 - acc: 0.4889
 93/105 [=========================>....] - ETA: 0s - loss: 1.0335 - acc: 0.4839
105/105 [==============================] - 0s 1ms/step - loss: 1.0144 - acc: 0.4857

......

Epoch 99/100

  1/105 [..............................] - ETA: 0s - loss: 0.0013 - acc: 1.0000
 43/105 [===========>..................] - ETA: 0s - loss: 0.0447 - acc: 0.9767
 84/105 [=======================>......] - ETA: 0s - loss: 0.0824 - acc: 0.9524
105/105 [==============================] - 0s 1ms/step - loss: 0.0711 - acc: 0.9619
Epoch 100/100

  1/105 [..............................] - ETA: 0s - loss: 2.3032 - acc: 0.0000e+00
 51/105 [=============>................] - ETA: 0s - loss: 0.1122 - acc: 0.9608    
 99/105 [===========================>..] - ETA: 0s - loss: 0.0755 - acc: 0.9798
105/105 [==============================] - 0s 1ms/step - loss: 0.0756 - acc: 0.9810
Training finished 

Evaluation on test data: loss = 0.094882 accuracy = 97.78%

As can be seen, after 100 epochs of training the accuracy on the test set reaches 97.78%, which is quite good.
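The reported accuracy is consistent with a single misclassified test sample. A quick check, assuming the test set holds 45 samples (30% of 150) and that exactly one of them is predicted incorrectly:

```python
num_test = 45        # 30% of 150 samples (assumed from the 7:3 split)
num_correct = 44     # hypothetical: one test sample misclassified

accuracy = num_correct / num_test * 100
print("accuracy = %0.2f%%" % accuracy)  # accuracy = 97.78%
```

On a test set this small, each sample is worth more than two percentage points, so small differences in accuracy between runs should not be over-interpreted.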
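Finally, we make a prediction on new data. Suppose an iris has the four features 6.1, 3.1, 5.1, 1.1, and we want to know which class the DNN model assigns it to. The Python code is as follows: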

import numpy as np

# 5. Use the model to make a prediction
np.set_printoptions(precision=4)
unknown = np.array([[6.1, 3.1, 5.1, 1.1]], dtype=np.float32)
predicted = model.predict(unknown)
print("Using model to predict species for features: ")
print(unknown)
print("\nPredicted softmax vector is: ")
print(predicted)
species_dict = {v:k for k,v in Class_dict.items()}
print("\nPredicted species is: ")
print(species_dict[np.argmax(predicted)])

The output is as follows:

Using model to predict species for features: 
[[ 6.1  3.1  5.1  1.1]]

Predicted softmax vector is: 
[[  2.0687e-07   9.7901e-01   2.0993e-02]]

Predicted species is: 
versicolor

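If we compare carefully against the IRIS data set, we find that this prediction is quite satisfactory: judged by a human eye, this iris sample should also be classified as versicolor.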
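The last step of the prediction code, turning the softmax vector into a species name, can be sketched without Keras. The softmax values are the ones printed above; the `class_dict` mapping is an assumption chosen to be consistent with that output, since the real mapping depends on the order of the classes in the CSV file.

```python
predicted = [2.0687e-07, 9.7901e-01, 2.0993e-02]  # softmax vector from the output above

# Assumed class-to-index mapping (hypothetical; depends on the CSV row order).
class_dict = {"setosa": 0, "versicolor": 1, "virginica": 2}
species_dict = {index: name for name, index in class_dict.items()}

best_index = max(range(len(predicted)), key=lambda i: predicted[i])  # argmax
print(species_dict[best_index], predicted[best_index])
```

The model puts about 98% of the probability mass on index 1, so the decision is not a close call for this sample.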

Summary

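At this point, the author has finished walking through this demo, which works well as a first step into Keras. To recap: we first used Pandas to read the IRIS data set and split it into training and test sets, then built a simple DNN model with Keras, trained and evaluated it, and finally looked at how it predicts on new data. From this, the reader can readily appreciate the advantages of Keras: building, training, evaluating, and predicting with the same DNN model directly in TensorFlow would undoubtedly take more Python code than it does with Keras.
Finally, here is the complete Python code for the DNN model: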

# iris_keras_dnn.py
# Python 3.5.1, TensorFlow 1.6.0, Keras 2.1.5
# ========================================================
# Import modules
import os
import numpy as np
import keras as K
import tensorflow as tf
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelBinarizer
os.environ['TF_CPP_MIN_LOG_LEVEL']='2'

# Read the CSV data set and split it into training and test sets
# Argument CSV_FILE_PATH: path to the CSV file
def load_data(CSV_FILE_PATH):
    IRIS = pd.read_csv(CSV_FILE_PATH)
    target_var = 'class'  # target variable
    # features of the data set
    features = list(IRIS.columns)
    features.remove(target_var)
    # classes of the target variable
    Class = IRIS[target_var].unique()
    # dictionary mapping each class to an integer
    Class_dict = dict(zip(Class, range(len(Class))))
    # add a 'target' column holding the integer-encoded target variable
    IRIS['target'] = IRIS[target_var].apply(lambda x: Class_dict[x])
    # apply 0-1 encoding (one-hot encoding) to the target variable
    lb = LabelBinarizer()
    lb.fit(list(Class_dict.values()))
    transformed_labels = lb.transform(IRIS['target'])
    y_bin_labels = []  # names of the 0-1 encoded columns
    for i in range(transformed_labels.shape[1]):
        y_bin_labels.append('y' + str(i))
        IRIS['y' + str(i)] = transformed_labels[:, i]
    # split the data set into a training set and a test set
    train_x, test_x, train_y, test_y = train_test_split(IRIS[features], IRIS[y_bin_labels],
                                                        train_size=0.7, test_size=0.3, random_state=0)
    return train_x, test_x, train_y, test_y, Class_dict

def main():

    # 0. Start
    print("\nIris dataset using Keras/TensorFlow ")
    np.random.seed(4)
    tf.set_random_seed(13)

    # 1. Read the CSV data set
    print("Loading Iris data into memory")
    CSV_FILE_PATH = 'E://iris.csv'
    train_x, test_x, train_y, test_y, Class_dict = load_data(CSV_FILE_PATH)

    # 2. Define the model
    init = K.initializers.glorot_uniform(seed=1)
    simple_adam = K.optimizers.Adam()
    model = K.models.Sequential()
    model.add(K.layers.Dense(units=5, input_dim=4, kernel_initializer=init, activation='relu'))
    model.add(K.layers.Dense(units=6, kernel_initializer=init, activation='relu'))
    model.add(K.layers.Dense(units=3, kernel_initializer=init, activation='softmax'))
    model.compile(loss='categorical_crossentropy', optimizer=simple_adam, metrics=['accuracy'])

    # 3. Train the model
    b_size = 1
    max_epochs = 100
    print("Starting training ")
    h = model.fit(train_x, train_y, batch_size=b_size, epochs=max_epochs, shuffle=True, verbose=1)
    print("Training finished \n")

    # 4. Evaluate the model
    eval = model.evaluate(test_x, test_y, verbose=0)
    print("Evaluation on test data: loss = %0.6f accuracy = %0.2f%% \n" \
          % (eval[0], eval[1] * 100) )

    # 5. Use the model to make a prediction
    np.set_printoptions(precision=4)
    unknown = np.array([[6.1, 3.1, 5.1, 1.1]], dtype=np.float32)
    predicted = model.predict(unknown)
    print("Using model to predict species for features: ")
    print(unknown)
    print("\nPredicted softmax vector is: ")
    print(predicted)
    species_dict = {v:k for k,v in Class_dict.items()}
    print("\nPredicted species is: ")
    print(species_dict[np.argmax(predicted)])

if __name__ == '__main__':
    main()

References

Keras Chinese documentation: https://keras-cn.readthedocs.io/en/latest/
Keras Succinctly: http://ebooks.syncfusion.com/downloads/keras-succinctly/keras-succinctly.pdf?AWSAccessKeyId=AKIAJ5W3G2Z6F2ZHAREQ&Expires=1539315050&Signature=r6qJ%2BP7KUEU442WMObSLd2%2Flkqw%3D
IRIS data set: http://archive.ics.uci.edu/ml/machine-learning-databases/iris/