Convolution and pooling layers in TensorFlow (2): CIFAR-10 in practice

The two previous blogs, Convolution and Pooling Layers (1) and Various Convolution Types in tensorflow, explained the core layers of convolutional neural networks and introduced the currently popular Caffe and tf frameworks. This blog follows on from Convolution and Pooling Layers (1) and continues to introduce the use of convolutional neural networks (CNNs) within the tf framework.

Next, we introduce the cifar10\100 project, an entry-level CNN tutorial. The cifar-10 and cifar-100 datasets were collected by Alex Krizhevsky, Vinod Nair, and Geoffrey Hinton; both were drawn from a dataset of 80 million tiny images, so the two datasets themselves look very similar. The differences are:

  • cifar-10 consists of 60,000 color images of size 32*32 covering 10 object categories. As the name suggests, each category has exactly 6,000 images, so the data is balanced across classes; 5,000 images per class are used for training and 1,000 per class for testing and validation, giving 50,000 training images and 10,000 test images in total. The 10 categories are: airplane, automobile, bird, cat, deer, dog, frog, horse, ship, truck.

  • cifar-100 consists of 60,000 color images of size 32*32 covering 100 object categories. As the name suggests, each category has exactly 600 images, so the data is balanced across classes; 500 images per class are used for training and 100 per class for testing and validation, giving 50,000 training images and 10,000 test images in total. The full list of categories can be viewed on the official website.

The official website is The CIFAR-10 and CIFAR-100 dataset, which provides the dataset in 3 different formats.

No matter which framework you learn, cifar-10\100 is a classic entry-level CNN project. It is called a project because it grows out of the dataset above and relies on various tools to solve real engineering problems; here it is a 10-class or 100-class classification problem, so it is well worth studying. This blog introduces the cifar-10\100 project as implemented with the tf tools.

The code structure of cifar-10 is shown in the following table. There are a total of 5 files, and their functions are as follows:

| File | Purpose |
| --- | --- |
| cifar10_input.py | Reads the native CIFAR-10 binary file format. |
| cifar10.py | Builds the CIFAR-10 model. |
| cifar10_train.py | Trains a CIFAR-10 model on a CPU or GPU. |
| cifar10_multi_gpu_train.py | Trains a CIFAR-10 model on multiple GPUs. |
| cifar10_eval.py | Evaluates the predictive performance of the CIFAR-10 model. |

Here is a Baidu Cloud link to the above 5 files: cifar-10 project

Since the above files contain detailed comments, only the parts that need further explanation are discussed below.

  • cifar10_input.py

This file reads the official cifar10\100 dataset in its binary format (one of the three available formats), that is, the series of data_batch_num.bin (num=1...5) files plus the test_batch.bin file. Each file is organized like this:

<1 x label><3072 x pixel>
...
<1 x label><3072 x pixel> 

That is to say, each row in each batch records one image: the first byte is the image's label, an integer in the range 0-9, and the other 3072 bytes are the pixel values of the 3 channels, i.e. 3*32*32, arranged in RGB order as 1024R + 1024G + 1024B.
In addition, there is another file, batches.meta.txt. As the name suggests, labels alone are not enough; you also need to know what each label represents. This file stores, one per line, the class name corresponding to each integer label, in one-to-one correspondence.
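To make this layout concrete, here is a minimal sketch (assuming the binary batches have already been downloaded and extracted; the local file path is hypothetical) that parses the first record of a batch with plain NumPy:

import numpy as np

record_bytes = 1 + 32 * 32 * 3            # 1 label byte + 3072 pixel bytes
# Hypothetical local path to one of the extracted binary batches.
with open('cifar-10-batches-bin/data_batch_1.bin', 'rb') as f:
    raw = np.frombuffer(f.read(record_bytes), dtype=np.uint8)

label = int(raw[0])                        # integer label in 0-9
pixels = raw[1:].reshape(3, 32, 32)        # stored channel-first: 1024R + 1024G + 1024B
image = pixels.transpose(1, 2, 0)          # -> [height, width, channel] for display
print(label, image.shape)                  # prints the label and (32, 32, 3)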
This file contains 4 functions: read_cifar10, _generate_image_and_label_batch, distorted_inputs, and inputs.
The first function, read_cifar10, reads one row of a data_batch_num.bin file, i.e. one image, and obtains the image's label together with its pixel values organized along the [height, width, channel] dimensions.
Four tf functions matter here: tf.FixedLengthRecordReader, tf.slice, tf.reshape, and tf.transpose. Among them,
tf.FixedLengthRecordReader is a reader specifically for binary files with fixed-length records; tf.slice is tf's slicing operation, with the following prototype:
tf.slice(input_, begin, size, name=None)
It extracts a slice of the given size from the input starting at position begin; name has a default value. For example:
tf.slice(record_bytes, [0], [label_bytes])
tf.slice(record_bytes, [label_bytes], [image_bytes])
tf.reshape is to reorganize the dimensions of a tensor. The function prototype is as follows:
tf.reshape(tensor,shape,name=None)

It reorganizes the original tensor into a new tensor according to the given shape. For example:

tf.reshape(tf.slice(record_bytes, [label_bytes], [image_bytes]), [result.depth, result.height, result.width])

In this way, the byte string obtained by the slice is reorganized into a tensor with shape [result.depth, result.height, result.width].

tf.transpose permutes the dimension order of a tensor. The function prototype is as follows:
tf.transpose(a, perm=None, name='transpose')

It permutes the dimensions of tensor a into the order given by perm, producing a new tensor. For example:

result.uint8image = tf.transpose(depth_major, [1, 2, 0])

This turns [result.depth, result.height, result.width] into [result.height, result.width, result.depth].
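Putting these four operations together, the core of read_cifar10 looks roughly like the following sketch (TF 1.x queue-based API; variable names follow the tutorial but this is written from memory, not copied verbatim):

import tensorflow as tf

label_bytes, height, width, depth = 1, 32, 32, 3
image_bytes = height * width * depth
record_bytes = label_bytes + image_bytes        # 3073 bytes per record

# filename_queue is created elsewhere, e.g. by tf.train.string_input_producer(...)
reader = tf.FixedLengthRecordReader(record_bytes=record_bytes)
key, value = reader.read(filename_queue)

# Decode the raw string into a vector of 3073 uint8 values.
record = tf.decode_raw(value, tf.uint8)

# First byte: the label; remaining 3072 bytes: pixels in [depth, height, width] order.
label = tf.cast(tf.slice(record, [0], [label_bytes]), tf.int32)
depth_major = tf.reshape(tf.slice(record, [label_bytes], [image_bytes]),
                         [depth, height, width])

# Re-order the dimensions to [height, width, depth] for the image ops downstream.
uint8image = tf.transpose(depth_major, [1, 2, 0])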

The second function, _generate_image_and_label_batch, constructs a batch of images and the corresponding labels. The key function here is tf.train.shuffle_batch, for example:

images, label_batch = tf.train.shuffle_batch(
      [image, label],
      batch_size=batch_size,
      num_threads=num_preprocess_threads,
      capacity=min_queue_examples + 3 * batch_size,
      min_after_dequeue=min_queue_examples)

Understand it this way: the function builds a queue of size capacity, randomly shuffles the images inside it, dequeues batch_size images at a time, and simultaneously enqueues new images; throughout the process the number of images in the queue is never allowed to drop below min_after_dequeue, which guarantees good shuffling between what goes in and what comes out. In this way, the function returns a batch of images and the corresponding labels.

The purpose of the third function distorted_inputs and the fourth function inputs is to perform data augmentation and preprocessing on the training and test datasets respectively, including cropping, flipping, random_brightness, random_contrast, whitening, and so on. They return the processed data as the actual input to the model. The functions used inside can be seen in the code.
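As a rough sketch of what distorted_inputs applies to each training image (the exact parameter values in the tutorial may differ; this only illustrates the chain of ops, with reshaped_image being the float image produced from read_cifar10):

height, width = 24, 24                            # crop size used for training

# Randomly crop a [height, width] region out of the 32*32 image.
distorted_image = tf.random_crop(reshaped_image, [height, width, 3])

# Randomly flip the image horizontally.
distorted_image = tf.image.random_flip_left_right(distorted_image)

# Randomly jitter brightness and contrast (the delta values here are illustrative).
distorted_image = tf.image.random_brightness(distorted_image, max_delta=63)
distorted_image = tf.image.random_contrast(distorted_image, lower=0.2, upper=1.8)

# Whitening: subtract the mean and divide by the standard deviation of the pixels.
float_image = tf.image.per_image_standardization(distorted_image)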

  • cifar10.py

This file defines the cifar10\100 model and the loss function used for training. A total of 10 functions are defined in this file: _activation_summary, _variable_on_cpu, _variable_with_weight_decay, distorted_inputs, inputs, inference, loss, _add_loss_summaries, train, and maybe_download_and_extract. It also contains some hyperparameter settings, such as batch_size = 128, an initial learning rate of 0.1, a learning-rate decay factor of 0.1, a decay of the learning rate every 350 epochs, a moving-average decay of 0.9999, and so on.
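With those hyperparameters, the learning-rate schedule built inside train() is essentially a staircase exponential decay. A minimal sketch (constant names mirror the tutorial; treat it as an illustration rather than the exact code):

NUM_EXAMPLES_PER_EPOCH_FOR_TRAIN = 50000
NUM_EPOCHS_PER_DECAY = 350.0          # decay the learning rate every 350 epochs
INITIAL_LEARNING_RATE = 0.1
LEARNING_RATE_DECAY_FACTOR = 0.1
MOVING_AVERAGE_DECAY = 0.9999         # used for the exponential moving average of the weights

num_batches_per_epoch = NUM_EXAMPLES_PER_EPOCH_FOR_TRAIN / 128   # batch_size = 128
decay_steps = int(num_batches_per_epoch * NUM_EPOCHS_PER_DECAY)

# global_step is the training step counter maintained by the training loop.
lr = tf.train.exponential_decay(INITIAL_LEARNING_RATE,
                                global_step,
                                decay_steps,
                                LEARNING_RATE_DECAY_FACTOR,
                                staircase=True)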

The purpose of each function is as follows:

The _activation_summary function adds summaries to the activations, making it easy to visualize the data flowing through the relevant nodes in TensorBoard, mainly via tf.histogram_summary and tf.scalar_summary, which come up again below.

The purpose of the _variable_on_cpu function is to create a variable on the CPU.

The purpose of the _variable_with_weight_decay function is to initialize a variable from a Gaussian distribution and, if required, attach a weight decay term weight_decay.
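The pattern behind _variable_with_weight_decay is roughly the following (a sketch from memory, not a verbatim copy): create a truncated-normal variable and, when a decay coefficient wd is given, push an L2 penalty on it into a 'losses' collection, which the loss function later sums together with the cross-entropy.

def _variable_with_weight_decay(name, shape, stddev, wd):
    # Variable initialized from a truncated normal distribution, placed on the CPU.
    var = _variable_on_cpu(name, shape,
                           tf.truncated_normal_initializer(stddev=stddev))
    if wd is not None:
        # L2 weight decay term, collected so the total loss can include it.
        weight_decay = tf.multiply(tf.nn.l2_loss(var), wd, name='weight_loss')
        tf.add_to_collection('losses', weight_decay)
    return var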

The purpose of the distorted_inputs function is to construct the training dataset based on cifar10_input.py and obtain the model's input data after data augmentation and preprocessing.

The purpose of the inputs function is to build the test dataset based on cifar10_input.py and obtain the input data for model testing after preprocessing.

The purpose of the inference function is to build a CNN network.

The purpose of the loss function is to define the loss of the model.

 The purpose of the _add_loss_summaries function is to add a summary to the loss for easy visualization.

The purpose of the train function is to create the operations that train the cifar10 model (learning-rate schedule, optimizer step, and so on).

The purpose of the maybe_download_and_extract function is to download the cifar10 dataset from the specified website and decompress it.

Running the training code cifar10_train.py is not all smooth sailing. The errors encountered and the corresponding fixes are as follows:

1. AttributeError: module 'tensorflow.python.ops.image_ops' has no attribute 'random_crop'.

This error comes from distorted_image = tf.image.random_crop(reshaped_image, [height, width]) in cifar10_input.py; change this line to:

distorted_image = tf.random_crop(reshaped_image, [height, width, 3])

2. AttributeError: module 'tensorflow.python.ops.image_ops' has no attribute 'per_image_whitening'.

This error comes from float_image = tf.image.per_image_whitening(distorted_image) in cifar10_input.py; change this line to:

float_image = tf.image.per_image_standardization(distorted_image)

3. AttributeError: module 'tensorflow' has no attribute 'image_summary'.

This error comes from tf.image_summary('images', images) in cifar10_input.py; change this line to:

tf.summary.image('images', images)
tf.summary.scalar('learning_rate', lr)  # the analogous change in cifar10.py

Note: make the same modification in similar places throughout the project.

4. AttributeError: module 'tensorflow' has no attribute 'histogram_summary'.

This error comes from tf.histogram_summary(tensor_name + '/activations', x) (and a similar call on var) in cifar10.py; change these lines to:

tf.summary.histogram(tensor_name + '/activations', x)
tf.summary.histogram(var.op.name, var)

Note: make the same modification in similar places throughout the project.

5. AttributeError: module 'tensorflow' has no attribute 'scalar_summary'.

This error comes from tf.scalar_summary(tensor_name + '/sparsity', tf.nn.zero_fraction(x)) in cifar10.py; change this line to:

tf.summary.scalar(tensor_name + '/sparsity', tf.nn.zero_fraction(x))

6. AttributeError: module 'tensorflow' has no attribute 'mul'.

This error has come up before; simply change tf.mul to tf.multiply.

7. ValueError: Tried to convert 'tensor' to a tensor and failed. Error: Argument must be a dense tensor: range(0, 128) - got shape [128], but wanted [].

This error is located on this line of code in cifar10.py:

indices = tf.reshape(range(FLAGS.batch_size), [FLAGS.batch_size, 1])

Change this line to:

indices = tf.reshape(list(range(FLAGS.batch_size)), [FLAGS.batch_size, 1])

8. ValueError: Shapes (2, 128, 1) and () are incompatible.

This error is triggered by concated = tf.concat(1, [indices, sparse_labels]) in cifar10.py; change this line to:

concated = tf.concat([indices, sparse_labels], 1)

9. ValueError: Only call `softmax_cross_entropy_with_logits` with named arguments (labels=..., logits=..., ...).

This error comes from the softmax_cross_entropy_with_logits function; newer tf versions changed this function, whose prototype is now:

tf.nn.softmax_cross_entropy_with_logits(_sentinel=None, labels=None, logits=None, dim=-1, name=None)

Modify it to:

cross_entropy = tf.nn.softmax_cross_entropy_with_logits(
      logits=logits, labels=dense_labels, name='cross_entropy_per_example')

10. TypeError: Using a `tf.Tensor` as a Python `bool` is not allowed. Use `if t is not None:` instead of `if t:` to test if a tensor is defined, and use TensorFlow ops such as tf.cond to execute subgraphs conditioned on the value of a tensor.

As the message suggests, change if grad: to if grad is not None:.

11. AttributeError: module 'tensorflow' has no attribute 'merge_all_summaries'.

This error comes from summary_op = tf.merge_all_summaries() in cifar10_train.py; change it to:

summary_op = tf.summary.merge_all()

12. AttributeError: module 'tensorflow.python.training.training' has no attribute 'SummaryWriter'.

This error comes from the following lines in cifar10_train.py:

summary_writer = tf.train.SummaryWriter(FLAGS.train_dir,
                                            graph_def=sess.graph_def)

Modify it to:

summary_writer = tf.summary.FileWriter(FLAGS.train_dir,
                                            graph_def=sess.graph_def)

Only after these changes can the example run successfully. The above errors are all caused by incompatibility between older and newer versions of tf; there is no problem with the code itself, but the APIs were changed in newer tf releases. The tf version on this machine is 1.7, and with these modifications it runs fine.

The following is the training log:

2018-05-05 13:13:40.782676: step 0, loss = 4.68 (0.1 examples/sec; 918.057 sec/batch)
2018-05-05 13:14:37.995829: step 10, loss = 4.66 (19.7 examples/sec; 6.510 sec/batch)
2018-05-05 13:15:38.571923: step 20, loss = 4.64 (19.3 examples/sec; 6.618 sec/batch)
2018-05-05 13:16:37.660061: step 30, loss = 4.62 (20.4 examples/sec; 6.260 sec/batch)
2018-05-05 13:17:35.194066: step 40, loss = 4.60 (22.7 examples/sec; 5.639 sec/batch)
2018-05-05 13:18:36.177244: step 50, loss = 4.58 (22.5 examples/sec; 5.699 sec/batch)
2018-05-05 13:19:37.775057: step 60, loss = 4.57 (20.9 examples/sec; 6.122 sec/batch)
2018-05-05 13:20:38.255898: step 70, loss = 4.55 (21.0 examples/sec; 6.081 sec/batch)
2018-05-05 13:21:39.074639: step 80, loss = 4.53 (18.5 examples/sec; 6.929 sec/batch)
2018-05-05 13:22:42.469230: step 90, loss = 4.51 (21.9 examples/sec; 5.858 sec/batch)
2018-05-05 13:23:43.102476: step 100, loss = 4.50 (20.5 examples/sec; 6.236 sec/batch)
2018-05-05 13:24:53.920811: step 110, loss = 4.48 (19.1 examples/sec; 6.708 sec/batch)
2018-05-05 13:25:55.722164: step 120, loss = 4.46 (21.0 examples/sec; 6.097 sec/batch)
2018-05-05 13:26:58.607399: step 130, loss = 4.44 (20.8 examples/sec; 6.153 sec/batch)
2018-05-05 13:27:56.598621: step 140, loss = 4.42 (19.4 examples/sec; 6.589 sec/batch)
2018-05-05 13:28:57.043367: step 150, loss = 4.41 (20.9 examples/sec; 6.117 sec/batch)
2018-05-05 13:30:00.026865: step 160, loss = 4.39 (19.3 examples/sec; 6.640 sec/batch)
2018-05-05 13:30:57.701242: step 170, loss = 4.38 (19.2 examples/sec; 6.677 sec/batch)
2018-05-05 13:31:54.940464: step 180, loss = 4.36 (24.6 examples/sec; 5.210 sec/batch)
2018-05-05 13:32:54.969103: step 190, loss = 4.34 (20.9 examples/sec; 6.119 sec/batch)
2018-05-05 13:33:57.856344: step 200, loss = 4.32 (20.2 examples/sec; 6.340 sec/batch)
2018-05-05 13:35:03.489890: step 210, loss = 4.31 (21.5 examples/sec; 5.966 sec/batch)

The two numbers in parentheses above indicate how many images are processed per second and how many seconds one batch takes; their product is approximately the batch size of 128. This can be seen from the following code:

if step % 10 == 0:
        num_examples_per_step = FLAGS.batch_size
        examples_per_sec = num_examples_per_step / duration
        sec_per_batch = float(duration)

        format_str = ('%s: step %d, loss = %.2f (%.1f examples/sec; %.3f '
                      'sec/batch)')
        print (format_str % (datetime.now(), step, loss_value,
                             examples_per_sec, sec_per_batch))

Because this machine's hardware is weak, training runs very slowly, and the full 20,000 steps would take a very long time, so the final result is not posted here.

 
