Making your own MNIST-style dataset for training a TensorFlow model (with code)

Exploration process

(P.S. This is my first post, so please forgive the rough writing!)
MNIST is a training set of handwritten-digit images that has been used to death, but in practice we usually work with our own data. That means either converting our own pictures into MNIST-style data or feeding them to the network some other way. (I used Keras before, a third-party library. It is handy and removes a lot of steps, but, probably because of my limited ability, the accuracy and loss of the models I trained were never ideal. If anyone knows why, please teach me; I am only a beginner (^ △ ^).)

There are two main ways to feed data to a neural network: import it from local files, or load it directly.

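The downloaded MNIST set consists of four compressed packages: train-images-idx3-ubyte.gz, train-labels-idx1-ubyte.gz, t10k-images-idx3-ubyte.gz and t10k-labels-idx1-ubyte.gz.
After extraction, an image package becomes an IDX3-UBYTE file. It stores its data as raw binary, so we cannot simply open it the way we would open a text file or a picture.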
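The IDX3-UBYTE layout can still be read by hand: a 16-byte header of four big-endian 32-bit integers (magic number 2051, image count, rows, columns), followed by one unsigned byte per pixel. Below is a minimal sketch of decoding it; the function name parse_idx3_images and the tiny synthetic file are assumptions made for illustration, not part of the original post.

```python
import struct

import numpy as np


def parse_idx3_images(raw):
    """Decode the bytes of an IDX3-UBYTE file into an array of shape (count, rows, cols)."""
    # Header: four big-endian unsigned 32-bit integers
    magic, count, rows, cols = struct.unpack(">IIII", raw[:16])
    if magic != 2051:  # 0x00000803: unsigned bytes, 3 dimensions
        raise ValueError("not an IDX3-UBYTE image file")
    # Everything after the header is one byte per pixel, image after image
    pixels = np.frombuffer(raw, dtype=np.uint8, count=count * rows * cols, offset=16)
    return pixels.reshape(count, rows, cols)


# Tiny synthetic file (2 images of 2x3 pixels) so the sketch runs without any download
fake_file = struct.pack(">IIII", 2051, 2, 2, 3) + bytes(range(12))
demo_images = parse_idx3_images(fake_file)
print(demo_images.shape)
```

For a real file, the bytes would come from something like open(path, 'rb').read(), where path points to wherever the package was extracted.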
So how can we see what is stored inside? As follows:

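# Requires TensorFlow 1.x: the tensorflow.examples.tutorials module is not shipped with TensorFlow 2
from tensorflow.examples.tutorials.mnist import input_data

# Reads the four files from mnist_data/ (downloading them first if they are missing)
mnist = input_data.read_data_sets('mnist_data', one_hot=True)

# Take the first 3000 test images and their labels
test_x = mnist.test.images[:3000]
test_y = mnist.test.labels[:3000]
print(len(test_x), " ", len(test_x[0]))
print(test_x)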

This shows some of the data stored in the test image file (t10k-images-idx3-ubyte), since the code reads mnist.test:

3000   784
[[0. 0. 0. ... 0. 0. 0.]
 [0. 0. 0. ... 0. 0. 0.]
 [0. 0. 0. ... 0. 0. 0.]
 ...
 [0. 0. 0. ... 0. 0. 0.]
 [0. 0. 0. ... 0. 0. 0.]
 [0. 0. 0. ... 0. 0. 0.]]

As you can see, it is a list of 3000 entries, each of which is one picture flattened into a vector of 784 = 28 * 28 values.
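To make the 784 = 28 * 28 relationship concrete, here is a small sketch that turns one flattened vector back into a 28 x 28 picture. The array flat is a made-up stand-in for a single row of test_x, so the snippet runs without TensorFlow.

```python
import numpy as np

# Stand-in for one row of test_x: 784 values between 0 and 1
flat = np.zeros(784, dtype=np.float32)
flat[5 * 28 + 7] = 1.0  # the pixel at row 5, column 7 of the picture

# Rows are stored one after another, so a plain reshape restores the picture
picture = flat.reshape(28, 28)
print(picture.shape)
print(picture[5, 7])
```

With real data, test_x[0].reshape(28, 28) should give the first test digit in the same way.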
Looking at test_y in the same way:

3000   10
[[0. 0. 0. ... 1. 0. 0.]
 [0. 0. 1. ... 0. 0. 0.]
 [0. 1. 0. ... 0. 0. 0.]
 ...
 [0. 1. 0. ... 0. 0. 0.]
 [0. 0. 0. ... 0. 0. 0.]
 [1. 0. 0. ... 0. 0. 0.]]

3000 means the same as above, and 10 is the number of classes. In each label, the entry at the index of the picture's class is 1 and all other entries are 0.
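This encoding can be converted to plain class numbers and back with NumPy. The rows below are made up in the same form as test_y; they are not copied from the real file.

```python
import numpy as np

# Three hand-written labels in the same one-hot form as test_y
one_hot_rows = np.array([
    [0., 0., 0., 0., 0., 0., 0., 1., 0., 0.],  # class 7
    [0., 0., 1., 0., 0., 0., 0., 0., 0., 0.],  # class 2
    [0., 1., 0., 0., 0., 0., 0., 0., 0., 0.],  # class 1
])

# The position of the 1 in each row is the class number
digits = one_hot_rows.argmax(axis=1)
print(digits)

# Going the other way: pick rows of a 10 x 10 identity matrix
rebuilt = np.eye(10)[digits]
```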
Now let's make our own MNIST-style set and save it as CSV (any other data format would also work).

Code (python)

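# coding:utf-8
import os
import random

import cv2
import pandas as pd


def progress_bar(i):
    """Print one line of a ten-step text progress bar (i goes from 0 to 10)."""
    i = int(i)
    a = i * 10
    b = '=' * i
    c = '->'
    d = '·' * (10 - i)
    if i == 0:
        print("****** Start ******")
    print(a, '\t', "%", "[", b + c + d, "]")
    if i == 10:
        print("****** Done ******")


def mnist_change(path, img_width, img_height):
    """Turn path/<class folder>/<image> into flattened images and one-hot labels."""
    images = []
    labels = []
    # Every sub-folder of path is one class; sorting keeps the class order stable
    tags = sorted(os.listdir(path))

    # Count all images first; n is only used for the progress bar
    n = 0
    for tag in tags:
        n += len(os.listdir(os.path.join(path, tag)))
    step = max(n // 10, 1)  # guards against division by zero when n < 10

    key = 0
    # i is the class index of the folder, i.e. the position of the 1 in the label
    for i, tag in enumerate(tags):
        for image in os.listdir(os.path.join(path, tag)):
            img_path = os.path.join(path, tag, image)
            img = cv2.imread(img_path)
            if img is None:  # skip files OpenCV cannot read
                continue
            img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
            img = cv2.resize(img, (img_width, img_height), interpolation=cv2.INTER_CUBIC)
            # One element of images: the pixels flattened row by row
            img_data = img.flatten().tolist()
            # One element of labels: a one-hot vector
            zero_ = [0 for _ in range(len(tags))]
            zero_[i] = 1

            if not images:
                images.append(img_data)
                labels.append(zero_)
            else:
                # Shuffle while loading: put the new sample into a random slot
                # and move the sample that was there to the end
                rand_i = random.randint(0, len(labels) - 1)
                temp1, temp2 = images[rand_i], labels[rand_i]
                images[rand_i], labels[rand_i] = img_data, zero_
                images.append(temp1)
                labels.append(temp2)
            if key % step == 0 and key // step <= 10:
                progress_bar(key // step)
            key += 1
    return images, labels


def text_save(file, data):
    """Write a list of rows to a CSV file without an index column."""
    data = pd.DataFrame(data)
    data.to_csv(file, index=False)
    print("File saved successfully")


if __name__ == '__main__':
    # User-defined parameters
    # Expected image folder layout:
    # data
    #     n0
    #         4546asd.jpg
    #         asdw4145.jpg
    #     n1
    #         asd4.jpg
    path = 'data/'
    width, height = 200, 200
    images, labels = mnist_change(path, width, height)
    text_save('images.csv', images)
    text_save('labels.csv', labels)

Result:

****** Start ******
0 	 % [ ->·········· ]
10 	 % [ =->········· ]
20 	 % [ ==->········ ]
30 	 % [ ===->······· ]
40 	 % [ ====->······ ]
50 	 % [ =====->····· ]
60 	 % [ ======->···· ]
70 	 % [ =======->··· ]
80 	 % [ ========->·· ]
90 	 % [ =========->· ]
100  % [ ==========-> ]
****** Done ******
File saved successfully
File saved successfully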
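Once the two CSV files exist, they can be read back into arrays shaped like mnist.test.images and mnist.test.labels. The following is a minimal sketch: the helper name load_csv_dataset is hypothetical, and dividing by 255 is an assumption made so that the pixel values fall between 0 and 1 like the MNIST arrays printed earlier (the script above stores raw 0 to 255 grey values). The demo writes two tiny CSV files to a temporary folder so that it runs on its own.

```python
import os
import tempfile

import numpy as np
import pandas as pd


def load_csv_dataset(images_csv, labels_csv):
    """Read the two CSV files back as float32 arrays (one row per picture)."""
    images = pd.read_csv(images_csv).to_numpy(dtype=np.float32)
    labels = pd.read_csv(labels_csv).to_numpy(dtype=np.float32)
    if len(images) != len(labels):
        raise ValueError("images and labels have different row counts")
    # Assumption: scale grey values from 0..255 down to 0..1
    return images / 255.0, labels


# Demo with two 2x2 "pictures" and two classes, written the same way as text_save
with tempfile.TemporaryDirectory() as tmp:
    img_file = os.path.join(tmp, 'images.csv')
    lab_file = os.path.join(tmp, 'labels.csv')
    pd.DataFrame([[0, 255, 128, 0], [255, 0, 0, 64]]).to_csv(img_file, index=False)
    pd.DataFrame([[1, 0], [0, 1]]).to_csv(lab_file, index=False)
    train_x, train_y = load_csv_dataset(img_file, lab_file)

print(train_x.shape, train_y.shape)
```

With width, height = 200, 200 as in the script, each row of images.csv would hold 200 * 200 = 40000 values, so a smaller size such as 28 x 28 may be more practical for a first experiment.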

Thoughts

This is the first blog post I have written, and neither the content nor the code may be of a very high standard; my apologies.
If you have any ideas, you are welcome to discuss them in the comments section.


Origin blog.csdn.net/qq_44851357/article/details/104054064