1. Competition Description
Rice (Oryza sativa) is one of the staple foods worldwide. Paddy is the raw grain before husking, grown mainly in tropical climates in Asian countries. Paddy cultivation requires continuous supervision because several diseases and pests can affect the crop, causing up to 70% yield loss. Expert supervision is usually needed to mitigate these diseases and prevent crop loss. Given the limited availability of crop-protection experts, manual disease diagnosis is tedious and expensive. It is therefore increasingly important to automate the disease-identification process using computer-vision techniques, which have shown promising results in many domains.
The main objective of the competition is to develop a machine-learning or deep-learning model that accurately classifies a given paddy leaf image. A training dataset of 10,407 (75%) labeled images is provided, covering 10 classes (9 disease classes plus normal leaves), along with additional metadata for each image, such as the paddy variety and age. The task is to classify each of the 3,469 (25%) images in the given test set into one of the nine disease classes or normal leaf.
Competition page:
2. Dataset Overview
train.csv - the training set, with the following columns:
- image_id - a unique image identifier corresponding to the image file name (.jpg) in the train_images directory.
- label - the type of paddy disease, which is also the target class. There are ten classes, including normal leaves.
- variety - the name of the paddy variety.
- age - the age of the paddy in days.
sample_submission.csv - a sample submission file.
train_images - this directory contains the 10,407 training images, stored under sub-directories corresponding to the 10 target classes. File names correspond to the image_id column of train.csv.
test_images - this directory contains the 3,469 test-set images.
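The training metadata is easy to inspect with pandas. A minimal sketch (the helper name and the sample rows below are my own illustrations; in practice the frame would come from `pd.read_csv("train.csv")`):

```python
import pandas as pd

def summarize_train(df: pd.DataFrame) -> pd.Series:
    """Return the number of images per disease label, largest first."""
    return df["label"].value_counts()

# A tiny hypothetical frame with the documented columns
# (in practice: df = pd.read_csv("train.csv")).
df = pd.DataFrame({
    "image_id": ["a.jpg", "b.jpg", "c.jpg"],
    "label":    ["bacterial_leaf_blight", "normal", "normal"],
    "variety":  ["ADT45", "ADT45", "KarnatakaPonni"],
    "age":      [45, 47, 68],
})
print(summarize_train(df))
```

Checking the class distribution this way is useful before training, since class imbalance affects both the validation split and the choice of loss weighting.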
3. Training with Different Models
The dataset used in this competition is quite representative and shows a fair amount of intra-class variation, so applying off-the-shelf models may not yield particularly good results. Submissions already on the Kaggle leaderboard have reached accuracies as high as 98%; interested readers are welcome to join in.
3.1 Training an FCN
The network architecture is shown in the table below. The training and validation sets were split 9:1 and data augmentation was applied. After several epochs of training, performance on the validation set was poor, with accuracy at around 38%.
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_1 (InputLayer) [(None, None, None, 3)] 0
_________________________________________________________________
conv2d (Conv2D) (None, None, None, 32) 896
_________________________________________________________________
dropout (Dropout) (None, None, None, 32) 0
_________________________________________________________________
batch_normalization (BatchNo (None, None, None, 32) 128
_________________________________________________________________
activation (Activation) (None, None, None, 32) 0
_________________________________________________________________
conv2d_1 (Conv2D) (None, None, None, 64) 18496
_________________________________________________________________
dropout_1 (Dropout) (None, None, None, 64) 0
_________________________________________________________________
batch_normalization_1 (Batch (None, None, None, 64) 256
_________________________________________________________________
activation_1 (Activation) (None, None, None, 64) 0
_________________________________________________________________
conv2d_2 (Conv2D) (None, None, None, 128) 73856
_________________________________________________________________
dropout_2 (Dropout) (None, None, None, 128) 0
_________________________________________________________________
batch_normalization_2 (Batch (None, None, None, 128) 512
_________________________________________________________________
activation_2 (Activation) (None, None, None, 128) 0
_________________________________________________________________
conv2d_3 (Conv2D) (None, None, None, 256) 295168
_________________________________________________________________
dropout_3 (Dropout) (None, None, None, 256) 0
_________________________________________________________________
batch_normalization_3 (Batch (None, None, None, 256) 1024
_________________________________________________________________
activation_3 (Activation) (None, None, None, 256) 0
_________________________________________________________________
conv2d_4 (Conv2D) (None, None, None, 512) 1180160
_________________________________________________________________
dropout_4 (Dropout) (None, None, None, 512) 0
_________________________________________________________________
batch_normalization_4 (Batch (None, None, None, 512) 2048
_________________________________________________________________
activation_4 (Activation) (None, None, None, 512) 0
_________________________________________________________________
conv2d_5 (Conv2D) (None, None, None, 64) 32832
_________________________________________________________________
dropout_5 (Dropout) (None, None, None, 64) 0
_________________________________________________________________
batch_normalization_5 (Batch (None, None, None, 64) 256
_________________________________________________________________
activation_5 (Activation) (None, None, None, 64) 0
_________________________________________________________________
conv2d_6 (Conv2D) (None, None, None, 10) 650
_________________________________________________________________
dropout_6 (Dropout) (None, None, None, 10) 0
_________________________________________________________________
batch_normalization_6 (Batch (None, None, None, 10) 40
_________________________________________________________________
global_max_pooling2d (Global (None, 10) 0
_________________________________________________________________
activation_6 (Activation) (None, 10) 0
=================================================================
Total params: 1,606,322
Trainable params: 1,604,190
Non-trainable params: 2,132
_________________________________________________________________
Total number of layers: 30
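The summary above can be reconstructed in Keras roughly as follows. This is a sketch inferred from the layer names and parameter counts (3×3 'same' convolutions in Conv-Dropout-BatchNorm-ReLU blocks, two 1×1 convolutions at the end, then global max pooling); the dropout rate and activation choices are assumptions, since the summary does not show them:

```python
import tensorflow as tf
from tensorflow.keras import layers

def conv_block(x, filters, kernel_size):
    # Conv -> Dropout -> BatchNorm -> ReLU, matching the layer order in the summary.
    x = layers.Conv2D(filters, kernel_size, padding="same")(x)
    x = layers.Dropout(0.2)(x)  # rate is an assumption
    x = layers.BatchNormalization()(x)
    return layers.Activation("relu")(x)

inputs = tf.keras.Input(shape=(None, None, 3))  # fully convolutional: any image size
x = inputs
for filters in (32, 64, 128, 256, 512):
    x = conv_block(x, filters, 3)     # 3x3 'same' convolutions
x = conv_block(x, 64, 1)              # 1x1 conv, 32,832 params
x = layers.Conv2D(10, 1)(x)           # 1x1 conv to 10 class maps, 650 params
x = layers.Dropout(0.2)(x)
x = layers.BatchNormalization()(x)
x = layers.GlobalMaxPooling2D()(x)    # one score per class map
outputs = layers.Activation("softmax")(x)
model = tf.keras.Model(inputs, outputs)
# Parameter count matches the summary: 1,606,322 over 30 layers.
```

The global max pooling is what makes the model size-agnostic: each of the 10 channels is reduced to a single score regardless of the spatial resolution of the input.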
3.2 Training a Larger Model
Using 448×448 images, this model reached 90% validation accuracy after 35 epochs and scored 0.81814 when submitted to Kaggle.
Epoch 31/100
1172/1172 [==============================] - 58s 50ms/step - loss: 0.1692 - accuracy: 0.9433 - val_loss: 0.3538 - val_accuracy: 0.9170 - lr: 3.0000e-04
Epoch 32/100
1172/1172 [==============================] - 59s 50ms/step - loss: 0.1677 - accuracy: 0.9438 - val_loss: 0.4083 - val_accuracy: 0.9006 - lr: 3.0000e-04
Epoch 33/100
1172/1172 [==============================] - 58s 50ms/step - loss: 0.1746 - accuracy: 0.9432 - val_loss: 0.3254 - val_accuracy: 0.8996 - lr: 3.0000e-04
Epoch 34/100
1172/1172 [==============================] - 59s 51ms/step - loss: 0.1511 - accuracy: 0.9500 - val_loss: 0.3413 - val_accuracy: 0.9237 - lr: 3.0000e-04
Epoch 35/100
1172/1172 [==============================] - 59s 50ms/step - loss: 0.1636 - accuracy: 0.9465 - val_loss: 0.3900 - val_accuracy: 0.9015 - lr: 3.0000e-04
The loss curves are shown below.
The network architecture is shown in the table below.
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d (Conv2D)                (None, 126, 126, 16)      448
max_pooling2d (MaxPooling2D)   (None, 63, 63, 16)        0
conv2d_1 (Conv2D)              (None, 61, 61, 32)        4640
max_pooling2d_1 (MaxPooling2D) (None, 30, 30, 32)        0
conv2d_2 (Conv2D)              (None, 28, 28, 64)        18496
max_pooling2d_2 (MaxPooling2D) (None, 14, 14, 64)        0
conv2d_3 (Conv2D)              (None, 12, 12, 128)       73856
max_pooling2d_3 (MaxPooling2D) (None, 6, 6, 128)         0
flatten (Flatten)              (None, 4608)              0
dense (Dense)                  (None, 2048)              9439232
dense_1 (Dense)                (None, 1024)              2098176
dense_2 (Dense)                (None, 128)               131200
dense_3 (Dense)                (None, 10)                1290
=================================================================
Total params: 11,767,338
Trainable params: 11,767,338
Non-trainable params: 0
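Note that the output shapes in this summary (e.g. 126 = 128 − 2 after a 3×3 'valid' convolution) correspond to a 128×128 network input, so the 448×448 images were presumably resized before entering the network. A Keras sketch of the architecture, following the summary (the ReLU activations are assumptions, since a summary does not show them):

```python
import tensorflow as tf
from tensorflow.keras import layers

# Sketch of the architecture in the summary above: four 3x3 'valid'
# conv + 2x2 max-pool stages, then a large fully connected head.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(128, 128, 3)),
    layers.Conv2D(16, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Conv2D(32, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Conv2D(64, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Conv2D(128, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Flatten(),                          # 6 * 6 * 128 = 4608
    layers.Dense(2048, activation="relu"),
    layers.Dense(1024, activation="relu"),
    layers.Dense(128, activation="relu"),
    layers.Dense(10, activation="softmax"),
])
# Parameter count matches the summary: 11,767,338.
```

Most of the parameters (over 9 million) sit in the first dense layer, which is typical of flatten-plus-dense heads and part of why such models overfit more readily than fully convolutional ones.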
3.3 Training with a Pretrained VGG16
Using a VGG16 model pretrained on ImageNet, with the backbone layers frozen for transfer learning, validation accuracy plateaued at about 77% after roughly 110 epochs and stopped improving.
The accuracy was too low to be worth submitting, so on to the next model.
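The frozen-backbone setup can be sketched as follows. The classification head, input size, and optimizer here are assumptions (the original code is not shown); the sketch uses `weights=None` so it runs offline, whereas `weights="imagenet"` loads the actual pretrained backbone:

```python
import tensorflow as tf
from tensorflow.keras import layers

# Use weights="imagenet" in practice; None here keeps the sketch runnable offline.
base = tf.keras.applications.VGG16(include_top=False, weights=None,
                                   input_shape=(224, 224, 3))
base.trainable = False  # freeze the backbone layers for transfer learning

# A hypothetical classification head on top of the frozen backbone.
model = tf.keras.Sequential([
    base,
    layers.GlobalAveragePooling2D(),
    layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
# Only the head trains: 512 * 10 + 10 = 5,130 trainable parameters.
```

With the backbone frozen, only the small head is updated, which trains quickly but caps how far the ImageNet features can adapt to paddy leaves; unfreezing the top convolutional blocks for fine-tuning is the usual next step when accuracy plateaus like this.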
3.4 Training with a Pretrained ResNet50
Much the same as VGG16 above: after roughly 110 epochs, validation accuracy plateaued at 75% and stopped improving. This is a little odd; for now, on to other models.
3.5 Training with a Pretrained Xception
To be written up.
......