Introduction to the MNIST dataset and how to download it

# 1. MNIST dataset overview

1. MNIST is a collection of handwritten digit images and one of the classic datasets in machine learning.

2. The dataset consists of images of the digits 0 through 9. There are 60,000 training images and 10,000 test images. These images are used for learning (training) and inference.

3. Each MNIST image is a 28 × 28 pixel grayscale image, and each pixel takes a value between 0 and 255. Each image is labeled with the digit it shows ("0", "1", "2", "3", "4", "5", and so on).

4. Typical use of the MNIST dataset: first train a model on the training images, then measure how accurately the trained model classifies the test images.
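The "measure on the test images" step in item 4 comes down to comparing predicted digits with the true labels. Below is a minimal sketch of that comparison; the `accuracy` helper and the sample arrays are hypothetical illustrations, not code from the book.

```python
import numpy as np

def accuracy(predictions, labels):
    """Return the fraction of samples whose predicted digit equals the true label."""
    predictions = np.asarray(predictions)
    labels = np.asarray(labels)
    return float(np.mean(predictions == labels))

# Hypothetical true labels for five test images, and a model's guesses for them.
t_test_sample = np.array([7, 2, 1, 0, 4])
y_pred_sample = np.array([7, 2, 1, 0, 9])   # the last guess is wrong

# Four of the five guesses match, so the accuracy is 4/5.
print(accuracy(y_pred_sample, t_test_sample))
```

With a real model, `y_pred_sample` would be replaced by the model's predictions for all 10,000 test images and `t_test_sample` by the test labels.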

# 2. Downloading and using the MNIST dataset

The book "Introduction to Deep Learning: Theory and Implementation with Python" (by Koki Saito, translated by Lu Yujie) provides its source code for download at http://www.ituring.com.cn/book/1921. The code that downloads the MNIST dataset is in the dataset/mnist.py file. The following example (ch03/mnist_show in the source code) shows a simple way to load and inspect the MNIST dataset.

# coding: utf-8
# Demonstrates the load_mnist() function from the helper script dataset/mnist.py

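import sys, os              # os provides operating-system-level operations; sys provides interpreter-related operations
sys.path.append(os.pardir)  # add the parent directory to the module search path; os.pardir is the parent of the current directory
import numpy as np
from dataset.mnist import load_mnist            # works because the parent directory, which contains dataset/, is now on the search path
from PIL import Image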

# Helper function that displays an image
def img_show(img):
    pil_img = Image.fromarray(np.uint8(img))    # convert the array into a PIL image
    pil_img.show()                              # display the image

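(x_train, t_train), (x_test, t_test) = load_mnist(flatten=True, normalize=False)
# load_mnist reads in the MNIST data and returns it in the form
# (training images, training labels), (test images, test labels).
# It accepts three parameters: normalize, flatten, one_hot_label.
# normalize=True scales the pixel values to the range 0.0 to 1.0; otherwise they keep their original values from 0 to 255.
# flatten=True stores each image as a one-dimensional array of 784 elements; otherwise each image is stored as a 1*28*28 three-dimensional array.
# one_hot_label=True stores each label in one-hot form (an array where only the correct class is 1 and the rest are 0, e.g. [0,0,1,0,0,0,0,0,0,0]); otherwise labels are stored as plain digits (e.g. 1, 2, 3...).
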
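img = x_train[0]   # store the first training image in img
label = t_train[0] # store the first training label in label
print(label)       # print that training label

print(img.shape)   # print the stored shape of the image; with flatten=True this is (784,)
img = img.reshape(28, 28)  # reshape the pixel array back to the original image size
print(img.shape)   # (28, 28)

img_show(img)      # display the image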
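To make the three `load_mnist` options described in the comments more concrete, here is a rough sketch in plain NumPy of the transformations they correspond to. It uses a random array as a stand-in for a real image, and `to_one_hot` is a hypothetical helper written for this illustration; it is not the code inside dataset/mnist.py.

```python
import numpy as np

# Stand-in for one unflattened MNIST image: shape (1, 28, 28), values 0..255.
rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(1, 28, 28)).astype(np.uint8)

# What flatten=True corresponds to: a one-dimensional array of 784 elements.
flat = img.reshape(-1)

# What normalize=True corresponds to: pixel values scaled into 0.0..1.0.
normalized = flat.astype(np.float32) / 255.0

def to_one_hot(labels, num_classes=10):
    """Hypothetical helper: turn digit labels into one-hot rows."""
    labels = np.asarray(labels)
    one_hot = np.zeros((labels.size, num_classes), dtype=np.int64)
    one_hot[np.arange(labels.size), labels] = 1
    return one_hot

print(flat.shape)                           # shape after flattening
print(normalized.min(), normalized.max())   # both lie within 0.0..1.0
print(to_one_hot([2])[0])                   # one-hot row for the label 2
```

Reshaping `flat` with `flat.reshape(28, 28)` recovers the two-dimensional layout needed for display, which is what the example script does before calling `img_show`.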

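Running ch03/mnist_show prints the label of the first training image, followed by the image's shape before and after reshaping, (784,) and (28, 28), and then opens the image in an image viewer.
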

# This post is based on the book "Introduction to Deep Learning: Theory and Implementation with Python" (by Koki Saito, translated by Lu Yujie), which is hereby acknowledged.

Origin blog.csdn.net/weixin_44711653/article/details/104098225