Building a neural network with Keras to classify the CIFAR-10 dataset

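0. Environment

Ubuntu 18.04, 64-bit, i3-6100, 8 GB RAM

Python 3.6 + tensorflow + keras

To inspect parameter values on Ubuntu, I installed IDLE. It turned out to support only Python 3, so I reinstalled all of the packages with pip3.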

1. Code

# import the necessary packages
from sklearn.preprocessing import LabelBinarizer
from sklearn.metrics import classification_report
from keras.models import Sequential
from keras.layers import Dense
from keras.optimizers import SGD
from keras.datasets import cifar10
import matplotlib.pyplot as plt
import numpy as np
import argparse

# construct the argument parse and parse the arguments
ap = argparse.ArgumentParser()
ap.add_argument("-o", "--output", required=True,
    help="path to the output loss/accuracy plot")
args = vars(ap.parse_args())

# load the training and testing data, scale it into the range [0, 1],
# then reshape the design matrix
print("[INFO] loading CIFAR-10 data...")
((trainX, trainY), (testX, testY)) = cifar10.load_data()
trainX = trainX.astype("float") / 255.0
testX = testX.astype("float") / 255.0
trainX = trainX.reshape((trainX.shape[0], 3072))
testX = testX.reshape((testX.shape[0], 3072))

# convert the labels from integers to vectors
lb = LabelBinarizer()
trainY = lb.fit_transform(trainY)
testY = lb.transform(testY)
# initialize the label names for the CIFAR-10 dataset
labelNames = ["airplane", "automobile", "bird", "cat", "deer",
    "dog", "frog", "horse", "ship", "truck"]

# define the 3072-1024-512-10 architecture using Keras
model = Sequential()
model.add(Dense(1024, input_shape=(3072,), activation="relu"))
model.add(Dense(512, activation="relu"))
model.add(Dense(10, activation="softmax"))

# train the model using SGD
print("[INFO] training network...")
sgd = SGD(0.01)
model.compile(loss="categorical_crossentropy", optimizer=sgd,
    metrics=["accuracy"])
H = model.fit(trainX, trainY, validation_data=(testX, testY),
    epochs=100, batch_size=32)

# evaluate the network
print("[INFO] evaluating network...")
predictions = model.predict(testX, batch_size=32)
print(classification_report(testY.argmax(axis=1),
    predictions.argmax(axis=1), target_names=labelNames))

# plot the training loss and accuracy
plt.style.use("ggplot")
plt.figure()
plt.plot(np.arange(0, 100), H.history["loss"], label="train_loss")
plt.plot(np.arange(0, 100), H.history["val_loss"], label="val_loss")
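# newer Keras releases name this history key "accuracy" instead of "acc"
acc_key = "accuracy" if "accuracy" in H.history else "acc"
plt.plot(np.arange(0, 100), H.history[acc_key], label="train_acc")
plt.plot(np.arange(0, 100), H.history["val_" + acc_key], label="val_acc")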
plt.title("Training Loss and Accuracy")
plt.xlabel("Epoch #")
plt.ylabel("Loss/Accuracy")
plt.legend()
plt.savefig(args["output"])
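
As a quick sanity check on the 3072-1024-512-10 architecture defined in the script, the number of trainable parameters can be worked out by hand. The sketch below is plain Python and does not call Keras; the helper name `dense_param_count` is made up for illustration. It assumes each Dense layer has a weight matrix plus one bias per output unit, which is the Keras default (`use_bias=True`).

```python
def dense_param_count(layer_sizes):
    """Count weights and biases of a stack of fully connected layers.

    layer_sizes lists the units per layer, starting with the input size.
    Hypothetical helper for illustration; not part of Keras.
    """
    total = 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        total += n_in * n_out + n_out  # weight matrix + bias vector
    return total


# 32 * 32 * 3 = 3072 inputs, then the two hidden layers and the 10-way output
sizes = [32 * 32 * 3, 1024, 512, 10]
print(dense_param_count(sizes))  # 3676682
```

If Keras is installed, calling `model.summary()` after building the model should report the same total. Almost all of the roughly 3.7 million parameters sit in the first layer, because every one of the 3072 input values is connected to each of the 1024 hidden units.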

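2. Dataset

The dataset contains 60K samples in total: 50,000 for training and 10,000 for testing. While the code runs, Keras on Ubuntu automatically downloads the dataset to ~/.keras/datasets/. Once the download finishes, you can check the size of the archive with ls -hl; it is a bit over 100 MB.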
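
To get a feel for the data volume once the archive is loaded, the following back-of-the-envelope sketch computes the size of the flattened arrays the script works with. It assumes the standard CIFAR-10 layout (50,000 training and 10,000 test images of 32x32 pixels with 3 channels) and that `astype("float")` yields 8-byte float64 values; the function name `flattened_bytes` is illustrative only.

```python
def flattened_bytes(num_images, height=32, width=32, channels=3,
                    bytes_per_value=8):
    """Return (features_per_image, total_bytes) for a flattened image array."""
    features = height * width * channels
    return features, num_images * features * bytes_per_value


train_features, train_bytes = flattened_bytes(50_000)
test_features, test_bytes = flattened_bytes(10_000)

print(train_features)                  # 3072, the input size of the network
print(round(train_bytes / 2**30, 2))   # training matrix size in GiB
print(round(test_bytes / 2**30, 2))    # test matrix size in GiB
```

Under these assumptions the training matrix takes a little over 1 GiB, which fits comfortably in the 8 GB of the machine described in the environment section. Casting to float32 instead would halve that figure.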

3. Results

[INFO] evaluating network...
             precision    recall  f1-score   support

   airplane       0.65      0.63      0.64      1000
 automobile       0.69      0.65      0.67      1000
       bird       0.45      0.47      0.46      1000
        cat       0.38      0.39      0.39      1000
       deer       0.54      0.46      0.49      1000
        dog       0.44      0.55      0.49      1000
       frog       0.66      0.60      0.63      1000
      horse       0.67      0.61      0.64      1000
       ship       0.67      0.71      0.69      1000
      truck       0.60      0.63      0.61      1000

avg / total       0.58      0.57      0.57     10000

[Figure: training loss and accuracy curves]

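Training used 50,000 samples. Each epoch took nearly a minute, and the network was trained for 100 epochs. The average precision is only 58% (with an average recall of 57%), so this is a fairly difficult dataset for a plain fully connected network.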
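
The figures above can be cross-checked with a few lines of arithmetic. This sketch copies the per-class recall values from the classification report and, because every class has the same support (1000), averages them without weights. It also derives the number of weight updates implied by 50,000 samples, `batch_size=32` and 100 epochs. The variable names are illustrative.

```python
import math

# Per-class recall copied from the classification report above
recall = {
    "airplane": 0.63, "automobile": 0.65, "bird": 0.47, "cat": 0.39,
    "deer": 0.46, "dog": 0.55, "frog": 0.60, "horse": 0.61,
    "ship": 0.71, "truck": 0.63,
}

# With equal support per class, the plain mean equals the weighted average
mean_recall = sum(recall.values()) / len(recall)

# Mini-batches per epoch: the last batch holds fewer than 32 samples
batches_per_epoch = math.ceil(50_000 / 32)
total_updates = batches_per_epoch * 100

print(f"mean recall: {mean_recall:.2f}")
print(f"batches per epoch: {batches_per_epoch}, total updates: {total_updates}")
```

Because recall weighted by support is the same quantity as overall accuracy, this mean is also the test accuracy of the network, about 57%.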

The code comes from Deep Learning for Computer Vision with Python (Starter Bundle).


Reposted from blog.csdn.net/qq_27158179/article/details/82919337