"Tensorflow Machine Learning Practice Guide" study notes and corresponding version code

TensorFlow Cookbook for version 1.12

Code for "Tensorflow Machine Learning Practice Guide" for TellnsorFlow-1.12

When I was studying the book "TensorFlow Machine Learning Practice Guide", I used TensorFlow 1.12. Although a lot of the code runs fine, there are still cases where a function was renamed and the original code fails, so I have uploaded the TensorFlow 1.12 version of the code here, hoping it will be of some help to beginners like me. If in doubt, refer to the author's source code [1] and the official TensorFlow website. You are also welcome to leave a message so we can learn from each other.

Some sporadic notes are also recorded below

Chapter3 Linear regression based on TensorFlow

3.2

After seeing these two examples, I finally understood that although the equation is written as Ax = b, what we solve for is x, and this x is the actual coefficient matrix.
For the Cholesky matrix decomposition, the main thing to understand in the code is what the function tf.matrix_solve does.
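Below is my own minimal sketch of that route on made-up toy data (not the book's exact script): form the normal equations A^T A x = A^T b, take the Cholesky factor of A^T A, and solve the two triangular systems with tf.matrix_solve.

import numpy as np
import tensorflow as tf

# Toy data (made up): fit y = slope * x + intercept
x_vals = np.linspace(0, 10, 100)
y_vals = x_vals + np.random.normal(0, 1, 100)
A = np.column_stack((x_vals, np.ones(100)))    # design matrix A
b = y_vals.reshape(-1, 1)

A_tensor = tf.constant(A)
b_tensor = tf.constant(b)
tA_A = tf.matmul(tf.transpose(A_tensor), A_tensor)   # A^T A
tA_b = tf.matmul(tf.transpose(A_tensor), b_tensor)   # A^T b

L = tf.cholesky(tA_A)                           # A^T A = L L^T
z = tf.matrix_solve(L, tA_b)                    # solve L z = A^T b
solution = tf.matrix_solve(tf.transpose(L), z)  # solve L^T x = z

with tf.Session() as sess:
    slope, intercept = sess.run(solution)
    print(slope, intercept)                     # x holds the fitted coefficients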

3.5

for i in range(iterations):
    rand_index = np.random.choice(len(x_vals), size=batch_size)
    rand_x = np.transpose([x_vals[rand_index]])
    rand_y = np.transpose([y_vals[rand_index]])

Now I know why the data is transposed during training: the data format here requires it, so the sampled values are transposed into column vectors of shape (batch_size, 1).

3.9

I used the birth weight data, but the original source link could not be opened, so I took the data from the GitHub source code. However, that data has already been preprocessed, and it took me a long time to realize this: the code below was wrong at first, and the training results were completely off.

# The data here is different, which makes the final result differ from the book. The book uses x[0] and x[2:9], but indexing starts at 0, which is a trap.
# The book uses the raw data, while I use the preprocessed data. The book also says the actual birth weight and ID columns were removed, presumably the first and last columns of the raw data.
y_vals = np.array([x[0] for x in birth_data])
x_vals = np.array([x[1:8] for x in birth_data])

Chapter4 TensorFlow-based support vector machine

A support vector machine looks for the hyperplane that maximizes the margin. How is it found? Lagrange multipliers are used, and for each candidate hyperplane only the points on the margin boundaries (i.e., the support vectors) affect the extremum; points farther away do not take part in the calculation. Therefore, among all hyperplanes, the one whose support vectors give the largest margin on the training data is the optimal hyperplane.
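In my own notation (not taken from the book), the hard-margin problem can be written as

$$\min_{w,b}\ \tfrac{1}{2}\lVert w\rVert^2 \quad \text{s.t.}\quad y_i(w^\top x_i + b) \ge 1,\ i = 1,\dots,n$$

where the margin equals $2/\lVert w\rVert$; only points with $y_i(w^\top x_i + b) = 1$ (the support vectors) get nonzero Lagrange multipliers in the dual.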

The data cannot be separated perfectly on every run; the parameters still have to be tuned. In addition, after increasing the number of training iterations to 1000, almost every run gives good results, with accuracy on both the training set and the test set close to 100%. The earlier failures were mainly because the number of training iterations was not enough.

4.4 Use of kernel functions on TensorFlow

  • tf.reduce_sum sums over the
    specified dimensions; each reduced dimension is removed after execution
x = tf.constant([[1, 1, 1], [1, 1, 1]])
tf.reduce_sum(x) # 6
# sum along dimension 0: [1, 1, 1] + [1, 1, 1]
tf.reduce_sum(x, 0) # [2, 2, 2]
# sum along dimension 1: [1+1+1], [1+1+1]
tf.reduce_sum(x, 1) # [3, 3]
tf.reduce_sum(x, 1, keep_dims=True) # [[3], [3]]
tf.reduce_sum(x, [0, 1]) # 6
  • Gaussian kernel function
    was difficult to understand at first: $k(x_i, p_j) = e^{-\gamma \lVert x_i - p_j \rVert^2}$.
    I finally realized that the code expands the square, $(a - b)^2 = a^2 - 2ab + b^2$, where a and b run over the rows of x_data, because dist already holds the squared norms.
# Gaussian (RBF) kernel
gamma = tf.constant(-50.0)
dist = tf.reduce_sum(tf.square(x_data), 1)
dist = tf.reshape(dist, [-1, 1])
# ||a - b||^2 = a^2 - 2ab + b^2, computed for every pair of rows of x_data
# (tf.sub/tf.mul were renamed to tf.subtract/tf.multiply in TensorFlow 1.x)
sq_dists = tf.add(tf.subtract(dist, tf.multiply(2., tf.matmul(x_data, tf.transpose(x_data)))), tf.transpose(dist))
my_kernel = tf.exp(tf.multiply(gamma, tf.abs(sq_dists)))

Chapter5 Nearest Neighbor Method

5.1 Introduction to nearest neighbor method

  • numpy.ptp
    computes the range of values, i.e., maximum minus minimum
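For example (a made-up toy array):

import numpy as np

x = np.array([[3.0, 7.0, 1.0],
              [2.0, 9.0, 5.0]])
print(np.ptp(x))          # 8.0: max - min over the whole array
print(np.ptp(x, axis=0))  # [1. 2. 4.]: the range of each column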

5.3 How to measure text distance

  • The meaning of tf.subtract across different dimensions: pairwise (broadcast) subtraction; see the shape check after the code below
# 1. Here x_data_train minus the expanded x_data_test is done so that every row of x_data_train can be subtracted from every row of x_data_test. Only then can we compute the distance from any x_data_test row to any x_data_train row.
# 2. Dimension reduction: the sum over the differences is used as the distance between two rows; the distance is the sum over the cols_used columns.
distance = tf.reduce_sum(tf.abs(tf.subtract(x_data_train, tf.expand_dims(x_data_test,1))), reduction_indices=2)
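A small shape check of this broadcasted subtraction (the sizes below are made up, just to see the shapes):

import numpy as np
import tensorflow as tf

# Made-up shapes: 4 training rows, 2 test rows, 3 feature columns
x_train = tf.constant(np.arange(12, dtype=np.float32).reshape(4, 3))
x_test = tf.constant(np.ones((2, 3), dtype=np.float32))

# expand_dims makes x_test (2, 1, 3); broadcasting against (4, 3) gives (2, 4, 3),
# i.e. every test row minus every training row
diff = tf.subtract(x_train, tf.expand_dims(x_test, 1))
# summing |diff| over the feature axis gives the (2, 4) L1 distance matrix
distance = tf.reduce_sum(tf.abs(diff), reduction_indices=2)

with tf.Session() as sess:
    print(sess.run(distance).shape)  # (2, 4)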

5.4 Using TensorFlow to implement mixed distance calculation

  • numpy.std
    computes the standard deviation along a given dimension. For x_vals.shape = (506, 10), the result is a one-dimensional array of shape (10,)

  • tf.diag
    builds a diagonal matrix from a vector; the other entries are filled with 0

  • tf.tile
    *This operation creates a new tensor by replicating input multiples times. *

  • The second parameter, perm, of tf.transpose
    specifies how the dimensions are permuted

  • tf.argmax
    works like numpy's np.argmax: it gives the index of the maximum value of a tensor along a given dimension, and is often used when computing metrics (such as accuracy). A short sketch of several of these ops follows this list.
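A quick sketch of a few of these ops on made-up toy tensors, assuming the TF 1.x API:

import tensorflow as tf

a = tf.constant([[1, 2], [3, 4]])
tiled = tf.tile(a, [2, 3])            # shape (4, 6): repeated 2x along rows, 3x along columns
diag = tf.diag([1.0, 2.0, 3.0])       # 3x3 diagonal matrix, off-diagonal entries are 0

b = tf.constant([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])  # shape (2, 2, 2)
swapped = tf.transpose(b, perm=[1, 0, 2])              # swaps the first two dimensions

idx = tf.argmax([[0.1, 0.9], [0.8, 0.2]], axis=1)      # index of the max per row

with tf.Session() as sess:
    print(sess.run(tiled).shape)  # (4, 6)
    print(sess.run(idx))          # [1 0]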

Chapter6 Neural Network Algorithm

6.4 Using TensorFlow to implement a single-layer neural network

  • Gate function
    At first I did not know what the earlier "gate function" meant; in fact it is just the expression computed at each node of the neural network, for example the single multiplication gate sketched below.
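A minimal sketch of such a gate, in my own words rather than the book's exact code: one node computing f(x) = a * x, whose parameter a is trained toward a target output (the numbers are made up).

import tensorflow as tf

x_val = 5.0
target = 50.0
a = tf.Variable(4.0)                     # the gate's trainable parameter
x_data = tf.placeholder(dtype=tf.float32)
output = tf.multiply(a, x_data)          # the gate: f(x) = a * x

loss = tf.square(tf.subtract(output, target))
train_step = tf.train.GradientDescentOptimizer(0.01).minimize(loss)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for _ in range(10):
        sess.run(train_step, feed_dict={x_data: x_val})
    print(sess.run(a))  # moves toward 10.0, since 10 * 5 = 50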

6.5 Using TensorFlow to implement common layers of neural networks

  • How the layer functions view the data

The input data is viewed as [batch_size, height, width, channels]. batch_size is the number of images processed in one batch, just like the batch of 3 images in the lesson from the day before. height and width are the image height and width, which is easy to understand. The last dimension, channels, is the number of color channels: 1 for black-and-white images (such as the MNIST data) and 3 for RGB.
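A minimal shape example (the sizes are made up, MNIST-like grayscale images):

import tensorflow as tf

# A batch of 3 grayscale 28x28 images -> [batch, height, width, channels]
images = tf.placeholder(tf.float32, shape=[3, 28, 28, 1])
conv_filter = tf.Variable(tf.random_normal([4, 4, 1, 8]))  # 4x4 kernel, 1 in-channel, 8 out-channels
conv = tf.nn.conv2d(images, conv_filter, strides=[1, 2, 2, 1], padding='SAME')
print(conv.get_shape())  # (3, 14, 14, 8)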

6.8 Using TensorFlow to implement tic-tac-toe based on neural network

To train our model, we will have a list of board positions followed by the best optimal response for a number of different boards. We can reduce the amount of boards to train on by considering only board positions that are different with respect to symmetries. The non-identity transformations of a Tic Tac Toe board are a rotation (either direction) by 90 degrees, 180 degrees, 270 degrees, a horizontal reflection, and a vertical reflection. Given this idea, we will use a shortlist of boards with the optimal move, apply two random transformations, and feed that into our neural network to learn. In other words, thanks to symmetry, many different training samples can be generated by rotation and mirroring, without having to read them all from the file.
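A minimal sketch of such a symmetry transformation, assuming the board is a length-9 vector read row by row and the response is an index 0-8 (the helper below is my own, not the book's):

import numpy as np

def random_symmetry(board, response):
    """Apply one random board symmetry to a length-9 board and the index of its best move."""
    board = np.array(board).reshape(3, 3)
    move = np.zeros(9)
    move[response] = 1
    move = move.reshape(3, 3)
    op = np.random.choice(['rot90', 'rot180', 'rot270', 'flip_h', 'flip_v'])
    if op.startswith('rot'):
        k = {'rot90': 1, 'rot180': 2, 'rot270': 3}[op]
        board, move = np.rot90(board, k), np.rot90(move, k)
    elif op == 'flip_h':
        board, move = np.fliplr(board), np.fliplr(move)
    else:
        board, move = np.flipud(board), np.flipud(move)
    return list(board.reshape(9)), int(np.argmax(move.reshape(9)))

# X = 1, O = -1, empty = 0; the best response is an index 0-8
new_board, new_response = random_symmetry([1, 0, 0, 0, -1, 0, 0, 0, 0], 2)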


Chapter8 Convolutional Neural Network

8.4 Retrain existing CNN models

python data/build_image_data.py --train_directory="temp/train_dir/" --validation_directory="temp/validation_dir" --output_directory="temp/" --labels_file="temp/cifar10_labels.txt

python data/build_image_data.py --train_directory="…/…/…/…/…/'TensorFlow Machine Learning Cookbook/dataset'/train_dir" --validation_directory="…/…/…/…/…/'TensorFlow Machine Learning Cookbook'/dataset/validation_dir" --output_directory="…/…/…/…/…/'TensorFlow Machine Learning Cookbook'/dataset/" --labels_file=""…/…/…/…/…/'TensorFlow Machine Learning Cookbook'/dataset/cifar10_labels.txt"

Error:

(ENV) E:\study\machinelearning\code\models\research\inception\inception>python data/build_image_data.py --train_directory="temp/train_dir/" --validation_directory="temp/validation_dir" --output_directory="temp/" --labels_file="temp/cifar10_labels.txt
Saving results to temp/
Determining list of input files and labels from temp/validation_dir.
WARNING:tensorflow:From data/build_image_data.py:369: FastGFile.__init__ (from tensorflow.python.platform.gfile) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.gfile.GFile.
Traceback (most recent call last):
  File "data/build_image_data.py", line 437, in <module>
    tf.app.run()
  File "E:\study\machinelearning\ENV\lib\site-packages\tensorflow\python\platform\app.py", line 125, in run
    _sys.exit(main(argv))
  File "data/build_image_data.py", line 431, in main
    FLAGS.validation_shards, FLAGS.labels_file)
  File "data/build_image_data.py", line 417, in _process_dataset
    filenames, texts, labels = _find_image_files(directory, labels_file)
  File "data/build_image_data.py", line 369, in _find_image_files
    labels_file, 'r').readlines()]
  File "E:\study\machinelearning\ENV\lib\site-packages\tensorflow\python\lib\io\file_io.py", line 188, in readlines
    self._preread_check()
  File "E:\study\machinelearning\ENV\lib\site-packages\tensorflow\python\lib\io\file_io.py", line 85, in _preread_check
    compat.as_bytes(self.__name), 1024 * 512, status)
  File "E:\study\machinelearning\ENV\lib\site-packages\tensorflow\python\framework\errors_impl.py", line 528, in __exit__
    c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.NotFoundError: NewRandomAccessFile failed to Create/Open: temp/cifar10_labels.txt : The system cannot find the path specified.
; No such process

The solution is to change to full paths; see reference [1]:

python data/build_image_data.py --train_directory="E:/temp/train_dir/" --validation_directory="E:/temp/validation_dir" --output_directory="E:/temp/" --labels_file="E:/temp/cifar10_labels.txt"

8.5 Using TensorFlow to imitate master painting

The GitHub code has been updated, which is really great; refer to [1].

def vgg_network(network_weights, init_image):
    network = {}
    image = init_image
    for i, layer in enumerate(vgg_layers):
        if layer[0] == 'c':
            # convolution layer: load the pretrained weights and bias
            weights, bias = network_weights[i][0][0][0][0]
            weights = np.transpose(weights, (1, 0, 2, 3))
            bias = bias.reshape(-1)
            conv_layer = tf.nn.conv2d(image, tf.constant(weights), (1, 1, 1, 1), 'SAME')
            image = tf.nn.bias_add(conv_layer, bias)
        elif layer[0] == 'r':
            # relu layer
            image = tf.nn.relu(image)
        else:
            # pooling layer
            image = tf.nn.max_pool(image, (1, 2, 2, 1), (1, 2, 2, 1), 'SAME')
        network[layer] = image
    return network

The original code used layer[1]; just change it to layer[0]. This problem was also found via reference [1].

Chapter9 Recurrent Neural Network

9.2 Using TensorFlow to implement RNN model for spam prediction

  • re regular expressions
Python's re module provides re.sub for replacing matches in a string.

Syntax:

re.sub(pattern, repl, string, count=0, flags=0)

Parameters:

pattern : the regex pattern string.
repl : the replacement string; it can also be a function.
string : the original string to search and replace in.
count : the maximum number of replacements after matching; the default 0 means replace all matches.

Source: https://www.runoob.com/python/python-reg-expressions.htm

text_string = re.sub(r'([^\s\w]|_|[0-9])+', '', text_string)
This line replaces the matched characters with the empty string '', i.e., removes them.
(1) (...) matches the expression inside the parentheses and also marks a group
(2) [...] denotes a set of characters, listed one by one
(3) [^...] means characters not inside the brackets; [^\s\w] is anything that is not whitespace or a word character (letter, digit, underscore)
(4) \s matches any whitespace character, equivalent to [ \t\n\r\f]
(5) \w matches letters, digits, and underscore
(6) | means "or"
(7) + matches one or more occurrences of the preceding expression
So the pattern matches non-letter characters and replaces them all; the |_|[0-9] part also counts underscores and digits in. At first I was not sure whether whitespace characters were removed as well; clearly they are excluded.
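A quick check of this cleaning step on a made-up sample sentence:

import re

text_string = "Free entry in 2 a wkly comp!! Text WIN to 87121_now"
# remove anything that is not a letter or whitespace: punctuation, underscores, digits
text_string = re.sub(r'([^\s\w]|_|[0-9])+', '', text_string)
print(text_string)  # Free entry in  a wkly comp Text WIN to now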

Chapter11 Advanced Application of TensorFlow

11.2 TensorFlow visualization: Tensorboard

See the log for the TensorBoard address: http://localhost:6006

It could not be opened in Chrome, so I simply changed to the following command:

tensorboard --logdir=tensorboard/ --host localhost --port 8088
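For completeness, a minimal example (my own, not from the book) of producing logs that the command above can display; the directory name "tensorboard/" matches the --logdir flag:

import tensorflow as tf

x = tf.constant(3.0)
loss = tf.square(x - 1.0)
tf.summary.scalar('loss', loss)
merged = tf.summary.merge_all()

with tf.Session() as sess:
    writer = tf.summary.FileWriter('tensorboard/', sess.graph)
    writer.add_summary(sess.run(merged), global_step=0)
    writer.close()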

Code

The code is on GitHub; see reference [2].

References

[1] Tensorflow cookbook
[2] github
