This is just a simple example, focused on understanding the machine learning workflow and its difficulties, such as:
- Data (number of samples, sample quality)
- Model (architecture, algorithm)
- Learning method (weight initialization, learning rate)
The premise of machine learning is that a large number of training samples are required, but obtaining sample data at scale and labeling it piece by piece is not so easy. The general process is as follows:
1 - Use a crawler to fetch images for the specified keywords (e.g. from Baidu or Google)
2 - Post-process the crawled images as required (e.g. detect and crop out faces with OpenCV)
3 - Check and organize the images (screening, resizing, etc.)
4 - Organize the label files
5 - Write the model
6 - Train
7 - Test and confirm
Versions: TensorFlow 1.2 + OpenCV 2.5

Building a character recognition system with TensorFlow

(1) File structure

/usr/local/tensorflow/sample/tf-she-image
├ ckpt [checkpoint file]
├ data [learning result]
├ eval_images [test image]
├ face [face extracted by OpenCV]
│ ├ ella
│ └ selina
├ original [raw image captured from Baidu pictures]
│ ├ ella
│ └ selina
├ test [image for testing model accuracy after learning]
│ ├ data.txt [image path and mark]
│ ├ ella
│ └ selina
└ train [image for training and learning]
├ data.txt [image path and mark]
├ ella
└ selina
(2) Grab images

Grab pictures from Baidu image search results by keyword. There are many Python examples for scraping Baidu images available online, and they are all fairly simple. Since the images will be fed to the machine as training samples, capture as many high-quality images with clear facial features as possible.
/usr/local/tensorflow/sample/tf-she-image/original/ella
/usr/local/tensorflow/sample/tf-she-image/original/selina
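The crawler itself is not shown in this article. As a rough sketch (my own code, not the author's), once image URLs have been scraped from a search results page, they can be saved under the sequential 0.jpg, 1.jpg, ... names that face_detect.py below iterates over:

```python
import os
import urllib.request

def download_images(urls, out_dir):
    """Save each URL's bytes as <index>.jpg under out_dir; skip failed fetches."""
    os.makedirs(out_dir, exist_ok=True)
    saved = []
    for i, url in enumerate(urls):
        try:
            data = urllib.request.urlopen(url, timeout=10).read()
        except Exception as e:
            print('url %d: skipped (%s)' % (i, e))
            continue
        path = os.path.join(out_dir, '%d.jpg' % i)
        with open(path, 'wb') as fh:
            fh.write(data)
        saved.append(path)
    return saved
```

Failures are skipped rather than retried, so the numbering may have gaps; face_detect.py tolerates that because it checks os.path.isfile before reading each index.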
(3) Extract faces

As samples for face recognition, only the face region of each image is needed, so the captured images require special processing: detect the face with OpenCV, then crop it out and save it.
/usr/local/tensorflow/sample/tf-she-image/face/ella
/usr/local/tensorflow/sample/tf-she-image/face/selina
face_detect.py
import cv2
import numpy as np
import os.path

input_data_path = '/usr/local/tensorflow/sample/tf-she-image/original/ella/'
save_path = '/usr/local/tensorflow/sample/tf-she-image/face/ella/'
cascade_path = '/usr/share/OpenCV/haarcascades/haarcascade_frontalface_default.xml'
faceCascade = cv2.CascadeClassifier(cascade_path)

image_count = 16000
face_detect_count = 0

for i in range(image_count):
    if os.path.isfile(input_data_path + str(i) + '.jpg'):
        try:
            img = cv2.imread(input_data_path + str(i) + '.jpg', cv2.IMREAD_COLOR)
            gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
            face = faceCascade.detectMultiScale(gray, 1.1, 3)
            if len(face) > 0:
                for rect in face:
                    # Crop the detected face rectangle and save it
                    x = rect[0]
                    y = rect[1]
                    w = rect[2]
                    h = rect[3]
                    cv2.imwrite(save_path + 'face-' + str(face_detect_count) + '.jpg',
                                img[y:y+h, x:x+w])
                    face_detect_count += 1
            else:
                print('image' + str(i) + ': No Face')
        except Exception as e:
            print('image' + str(i) + ': Exception - ' + str(e))
    else:
        print('image' + str(i) + ': No File')
(4) Sort out the images

Because of the quality of the captured images and OpenCV's detection rate, the images must be re-screened, keeping only those with genuine facial features.
/usr/local/tensorflow/sample/tf-she-image/train/ella
/usr/local/tensorflow/sample/tf-she-image/train/selina
This step is very time-consuming! But the higher the quality of the training samples, the more accurate the recognition. In the end, 380 images of ella and 350 of selina were kept. A special thank-you to everyone who has open-sourced datasets!
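How the screened crops were renamed into the ella-00001.jpg style seen in data.txt is not stated; one plausible helper (an assumption, not the article's code) renumbers the surviving .jpg files in place:

```python
import os

def renumber_images(src_dir, prefix):
    """Rename every .jpg in src_dir to <prefix>-NNNNN.jpg, numbered from 1.

    Assumes the existing names (e.g. face-0.jpg) don't already match the
    target pattern, so the in-place renames cannot collide.
    """
    files = sorted(f for f in os.listdir(src_dir) if f.endswith('.jpg'))
    renamed = []
    for i, fname in enumerate(files, start=1):
        new_name = '%s-%05d.jpg' % (prefix, i)
        os.rename(os.path.join(src_dir, fname), os.path.join(src_dir, new_name))
        renamed.append(new_name)
    return renamed
```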
Organize the image label file, data.txt:
/usr/local/tensorflow/sample/tf-she-image/train/ella/ella-00001.jpg 0
/usr/local/tensorflow/sample/tf-she-image/train/ella/ella-00002.jpg 0
/usr/local/tensorflow/sample/tf-she-image/train/ella/ella-00003.jpg 0
/usr/local/tensorflow/sample/tf-she-image/train/ella/ella-00004.jpg 0
/usr/local/tensorflow/sample/tf-she-image/train/ella/ella-00005.jpg 0
/usr/local/tensorflow/sample/tf-she-image/train/ella/ella-00006.jpg 0
/usr/local/tensorflow/sample/tf-she-image/train/ella/ella-00007.jpg 0
/usr/local/tensorflow/sample/tf-she-image/train/ella/ella-00008.jpg 0
...
/usr/local/tensorflow/sample/tf-she-image/train/selina/selina-00344.jpg 1
/usr/local/tensorflow/sample/tf-she-image/train/selina/selina-00345.jpg 1
/usr/local/tensorflow/sample/tf-she-image/train/selina/selina-00346.jpg 1
/usr/local/tensorflow/sample/tf-she-image/train/selina/selina-00347.jpg 1
/usr/local/tensorflow/sample/tf-she-image/train/selina/selina-00348.jpg 1
/usr/local/tensorflow/sample/tf-she-image/train/selina/selina-00349.jpg 1
/usr/local/tensorflow/sample/tf-she-image/train/selina/selina-00350.jpg 1
*The images used to test model accuracy can be taken from the same set as the training images (note that evaluating on training images will overstate real-world accuracy).
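data.txt does not have to be written by hand; a small sketch (assuming the train/<name> layout and the 0 = ella, 1 = selina labels shown above) generates it from the directory contents:

```python
import os

# Label mapping taken from the data.txt listing above
LABELS = {'ella': 0, 'selina': 1}

def write_labels(train_dir, out_path):
    """Write '<image path> <label>' lines for every .jpg under train_dir."""
    lines = []
    for name, label in sorted(LABELS.items(), key=lambda kv: kv[1]):
        person_dir = os.path.join(train_dir, name)
        for fname in sorted(os.listdir(person_dir)):
            if fname.endswith('.jpg'):
                lines.append('%s %d' % (os.path.join(person_dir, fname), label))
    with open(out_path, 'w') as fh:
        fh.write('\n'.join(lines) + '\n')
    return lines
```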
(5) Write the model
train.py
import sys
import cv2
import random
import numpy as np
import tensorflow as tf
import tensorflow.python.platform

NUM_CLASSES = 2
IMAGE_SIZE = 28
IMAGE_PIXELS = IMAGE_SIZE * IMAGE_SIZE * 3

flags = tf.app.flags
FLAGS = flags.FLAGS
flags.DEFINE_string('train', '/usr/local/tensorflow/sample/tf-she-image/train/data.txt', 'File name of train data')
flags.DEFINE_string('test', '/usr/local/tensorflow/sample/tf-she-image/test/data.txt', 'File name of test data')
flags.DEFINE_string('train_dir', '/usr/local/tensorflow/sample/tf-she-image/data/', 'Directory to put the training data')
flags.DEFINE_integer('max_steps', 100, 'Number of steps to run trainer.')
flags.DEFINE_integer('batch_size', 20, 'Batch size. Must divide evenly into the dataset sizes.')
flags.DEFINE_float('learning_rate', 1e-4, 'Initial learning rate.')


def inference(images_placeholder, keep_prob):
    def weight_variable(shape):
        initial = tf.truncated_normal(shape, stddev=0.1)
        return tf.Variable(initial)

    def bias_variable(shape):
        initial = tf.constant(0.1, shape=shape)
        return tf.Variable(initial)

    def conv2d(x, W):
        return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='SAME')

    def max_pool_2x2(x):
        return tf.nn.max_pool(x, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')

    x_image = tf.reshape(images_placeholder, [-1, IMAGE_SIZE, IMAGE_SIZE, 3])

    with tf.name_scope('conv1') as scope:
        W_conv1 = weight_variable([5, 5, 3, 32])
        b_conv1 = bias_variable([32])
        h_conv1 = tf.nn.relu(conv2d(x_image, W_conv1) + b_conv1)

    with tf.name_scope('pool1') as scope:
        h_pool1 = max_pool_2x2(h_conv1)

    with tf.name_scope('conv2') as scope:
        W_conv2 = weight_variable([5, 5, 32, 64])
        b_conv2 = bias_variable([64])
        h_conv2 = tf.nn.relu(conv2d(h_pool1, W_conv2) + b_conv2)

    with tf.name_scope('pool2') as scope:
        h_pool2 = max_pool_2x2(h_conv2)

    with tf.name_scope('fc1') as scope:
        W_fc1 = weight_variable([7 * 7 * 64, 1024])
        b_fc1 = bias_variable([1024])
        h_pool2_flat = tf.reshape(h_pool2, [-1, 7 * 7 * 64])
        h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, W_fc1) + b_fc1)
        h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob)

    with tf.name_scope('fc2') as scope:
        W_fc2 = weight_variable([1024, NUM_CLASSES])
        b_fc2 = bias_variable([NUM_CLASSES])

    with tf.name_scope('softmax') as scope:
        y_conv = tf.nn.softmax(tf.matmul(h_fc1_drop, W_fc2) + b_fc2)

    return y_conv


def loss(logits, labels):
    cross_entropy = -tf.reduce_sum(labels * tf.log(logits))
    tf.summary.scalar("cross_entropy", cross_entropy)
    return cross_entropy


def training(loss, learning_rate):
    train_step = tf.train.AdamOptimizer(learning_rate).minimize(loss)
    return train_step


def accuracy(logits, labels):
    correct_prediction = tf.equal(tf.argmax(logits, 1), tf.argmax(labels, 1))
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))
    tf.summary.scalar("accuracy", accuracy)
    return accuracy


if __name__ == '__main__':
    # Load the training images listed in data.txt and one-hot encode the labels
    f = open(FLAGS.train, 'r')
    train_image = []
    train_label = []
    for line in f:
        line = line.rstrip()
        l = line.split()
        img = cv2.imread(l[0])
        img = cv2.resize(img, (IMAGE_SIZE, IMAGE_SIZE))
        train_image.append(img.flatten().astype(np.float32) / 255.0)
        tmp = np.zeros(NUM_CLASSES)
        tmp[int(l[1])] = 1
        train_label.append(tmp)
    train_image = np.asarray(train_image)
    train_label = np.asarray(train_label)
    f.close()

    # Load the test images the same way
    f = open(FLAGS.test, 'r')
    test_image = []
    test_label = []
    for line in f:
        line = line.rstrip()
        l = line.split()
        img = cv2.imread(l[0])
        img = cv2.resize(img, (IMAGE_SIZE, IMAGE_SIZE))
        test_image.append(img.flatten().astype(np.float32) / 255.0)
        tmp = np.zeros(NUM_CLASSES)
        tmp[int(l[1])] = 1
        test_label.append(tmp)
    test_image = np.asarray(test_image)
    test_label = np.asarray(test_label)
    f.close()

    with tf.Graph().as_default():
        images_placeholder = tf.placeholder("float", shape=(None, IMAGE_PIXELS))
        labels_placeholder = tf.placeholder("float", shape=(None, NUM_CLASSES))
        keep_prob = tf.placeholder("float")

        logits = inference(images_placeholder, keep_prob)
        loss_value = loss(logits, labels_placeholder)
        train_op = training(loss_value, FLAGS.learning_rate)
        acc = accuracy(logits, labels_placeholder)

        saver = tf.train.Saver()
        sess = tf.Session()
        sess.run(tf.global_variables_initializer())
        summary_op = tf.summary.merge_all()
        summary_writer = tf.summary.FileWriter(FLAGS.train_dir, sess.graph)

        for step in range(FLAGS.max_steps):
            for i in range(int(len(train_image) / FLAGS.batch_size)):
                batch = FLAGS.batch_size * i
                sess.run(train_op, feed_dict={
                    images_placeholder: train_image[batch:batch + FLAGS.batch_size],
                    labels_placeholder: train_label[batch:batch + FLAGS.batch_size],
                    keep_prob: 0.5})
            train_accuracy = sess.run(acc, feed_dict={
                images_placeholder: train_image,
                labels_placeholder: train_label,
                keep_prob: 1.0})
            print("step %d, training accuracy %g" % (step, train_accuracy))
            summary_str = sess.run(summary_op, feed_dict={
                images_placeholder: train_image,
                labels_placeholder: train_label,
                keep_prob: 1.0})
            summary_writer.add_summary(summary_str, step)

        print("test accuracy %g" % sess.run(acc, feed_dict={
            images_placeholder: test_image,
            labels_placeholder: test_label,
            keep_prob: 1.0}))

        save_path = saver.save(sess, '/usr/local/tensorflow/sample/tf-she-image/ckpt/model.ckpt')
(6) Train

Run train.py; the closer the training accuracy gets to 1, the better the model fits the training data.
(tensorflow) [root@localhost tf-she-image]# python train.py
step 0, training accuracy 0.479452
step 1, training accuracy 0.479452
step 2, training accuracy 0.480822
step 3, training accuracy 0.505479
step 4, training accuracy 0.531507
step 5, training accuracy 0.609589
step 6, training accuracy 0.630137
step 7, training accuracy 0.639726
step 8, training accuracy 0.732877
step 9, training accuracy 0.713699
...
step 89, training accuracy 0.994521
step 90, training accuracy 0.994521
step 91, training accuracy 0.994521
step 92, training accuracy 0.994521
step 93, training accuracy 0.994521
step 94, training accuracy 0.994521
step 95, training accuracy 0.994521
step 96, training accuracy 0.994521
step 97, training accuracy 0.994521
step 98, training accuracy 0.994521
step 99, training accuracy 0.994521
test accuracy 0.994521
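The training accuracy printed each step is simply the fraction of samples whose predicted class (argmax of the softmax output) matches the one-hot label, which is what the accuracy() function in train.py computes. A plain-Python equivalent, for illustration only:

```python
# Plain-Python mirror of train.py's accuracy() op: compare argmax of the
# predicted probabilities against argmax of the one-hot labels.
def batch_accuracy(probs, one_hot_labels):
    correct = 0
    for p, y in zip(probs, one_hot_labels):
        if p.index(max(p)) == y.index(max(y)):
            correct += 1
    return correct / len(probs)
```

For example, batch_accuracy([[0.9, 0.1], [0.2, 0.8]], [[1, 0], [0, 1]]) returns 1.0.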
After the execution is complete, the following files will be generated in /usr/local/tensorflow/sample/tf-she-image/ckpt/:
model.ckpt.index
model.ckpt.meta
model.ckpt.data-00000-of-00001
checkpoint
(7) View the training results
(tensorflow) [root@localhost tf-she-image]# tensorboard --logdir=/usr/local/tensorflow/sample/tf-she-image/data
(8) Test confirmation
Prepare four images to test whether they can be recognized correctly:
test-ella-01.jpg
test-ella-02.jpg
test-selina-01.jpg
test-selina-02.jpg
Confirm the result:
(tensorflow) [root@localhost tf-she-image]# python eval.py
/usr/local/tensorflow/sample/tf-she-image/eval_images/test-ella-01.jpg
[{'name': 'ella', 'rate': 85.299999999999997, 'label': 0}, {'name': 'selina', 'rate': 14.699999999999999, 'label': 1}]
/usr/local/tensorflow/sample/tf-she-image/eval_images/test-ella-02.jpg
[{'name': 'ella', 'rate': 99.799999999999997, 'label': 0}, {'name': 'selina', 'rate': 0.20000000000000001, 'label': 1}]
/usr/local/tensorflow/sample/tf-she-image/eval_images/test-selina-01.jpg
[{'name': 'selina', 'rate': 100.0, 'label': 1}, {'name': 'ella', 'rate': 0.0, 'label': 0}]
/usr/local/tensorflow/sample/tf-she-image/eval_images/test-selina-02.jpg
[{'name': 'selina', 'rate': 99.900000000000006, 'label': 1}, {'name': 'ella', 'rate': 0.10000000000000001, 'label': 0}]
As can be seen, the recognition rates are 85.3%, 99.8%, 100%, and 99.9%. Not bad at all!
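The rate values above are just the softmax probabilities scaled to percentages and sorted, as eval.py below does; a standalone sketch of that ranking step:

```python
# Rebuild eval.py's ranking step: scale the softmax outputs to percentages,
# attach the class names, and sort descending by rate.
HUMAN_NAMES = {0: 'ella', 1: 'selina'}

def rank_softmax(result):
    humans = [{'label': i, 'name': HUMAN_NAMES[i], 'rate': round(r * 100.0, 1)}
              for i, r in enumerate(result)]
    return sorted(humans, key=lambda h: h['rate'], reverse=True)
```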
eval.py
import sys
import os
import random
import numpy as np
import cv2
import tensorflow as tf
import train

cascade_path = '/usr/share/OpenCV/haarcascades/haarcascade_frontalface_default.xml'
faceCascade = cv2.CascadeClassifier(cascade_path)

HUMAN_NAMES = {
    0: u"ella",
    1: u"selina"
}


def evaluation(img_path, ckpt_path):
    tf.reset_default_graph()
    img = cv2.imread(img_path, cv2.IMREAD_COLOR)
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    face = faceCascade.detectMultiScale(gray, 1.1, 3)
    if len(face) > 0:
        for rect in face:
            # Draw the detection rectangle, then save the cropped face
            random_str = str(random.random())
            cv2.rectangle(img, tuple(rect[0:2]), tuple(rect[0:2] + rect[2:4]), (0, 0, 255), thickness=2)
            face_detect_img_path = '/usr/local/tensorflow/sample/tf-she-image/eval_images/' + random_str + '.jpg'
            cv2.imwrite(face_detect_img_path, img)
            x = rect[0]
            y = rect[1]
            w = rect[2]
            h = rect[3]
            cv2.imwrite('/usr/local/tensorflow/sample/tf-she-image/eval_images/' + random_str + '.jpg',
                        img[y:y+h, x:x+w])
            target_image_path = '/usr/local/tensorflow/sample/tf-she-image/eval_images/' + random_str + '.jpg'
    else:
        print('image:No Face')
        return

    image = []
    img = cv2.imread(target_image_path)
    img = cv2.resize(img, (28, 28))
    image.append(img.flatten().astype(np.float32) / 255.0)
    image = np.asarray(image)

    logits = train.inference(image, 1.0)
    sess = tf.InteractiveSession()
    saver = tf.train.Saver()
    sess.run(tf.global_variables_initializer())
    if ckpt_path:
        saver.restore(sess, ckpt_path)

    softmax = logits.eval()
    result = softmax[0]
    rates = [round(n * 100.0, 1) for n in result]
    humans = []
    for index, rate in enumerate(rates):
        name = HUMAN_NAMES[index]
        humans.append({
            'label': index,
            'name': name,
            'rate': rate
        })
    rank = sorted(humans, key=lambda x: x['rate'], reverse=True)
    print(img_path)
    print(rank)
    return [rank, os.path.basename(img_path), random_str + '.jpg']


if __name__ == '__main__':
    TEST_IMAGE_PATHS = ['test-ella-01.jpg', 'test-ella-02.jpg', 'test-selina-01.jpg', 'test-selina-02.jpg']
    for image_path in TEST_IMAGE_PATHS:
        evaluation('/usr/local/tensorflow/sample/tf-she-image/eval_images/' + image_path,
                   '/usr/local/tensorflow/sample/tf-she-image/ckpt/model.ckpt')
Reference:
http://qiita.com/neriai/items/bd7bc36ec42c8ef65b2e