Basic Anaconda usage
Create a new environment
conda create -n rcnn python=3.6
Delete the environment
conda remove -n rcnn --all
Rename an environment
In fact, conda (at least the version used here) has no rename command; renaming is achieved by cloning, in two steps:
- Clone the environment under the new name
- Delete the environment with the old name
For example, to rename the environment rcnn to tf
Step 1
conda create -n tf --clone rcnn
Source: /anaconda3/envs/rcnn
Destination: /anaconda3/envs/tf
Packages: 37
Files: 8463
Step 2
conda remove -n rcnn --all
Result
conda info -e
# conda environments:
#
crawl /anaconda3/envs/crawl
flask /anaconda3/envs/flask
tf /anaconda3/envs/tf
root * /anaconda3
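The two rename steps above can be wrapped in a small Python helper. This is only a sketch: rename_env_commands is a made-up name, and it just builds the same two conda commands shown above instead of running them.

```python
from typing import List


def rename_env_commands(old: str, new: str) -> List[List[str]]:
    """Return the two conda commands that together 'rename' an environment."""
    if old == new:
        raise ValueError("old and new environment names must differ")
    return [
        # Step 1: clone the old environment under the new name.
        ["conda", "create", "-n", new, "--clone", old],
        # Step 2: delete the environment with the old name.
        ["conda", "remove", "-n", old, "--all"],
    ]


if __name__ == "__main__":
    for cmd in rename_env_commands("rcnn", "tf"):
        print(" ".join(cmd))
        # To really execute a step: subprocess.run(cmd, check=True)
```

Returning the commands as lists keeps the helper easy to inspect before anything gets deleted; running step 2 only after step 1 succeeds is left to the caller.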
TensorFlow GPU installation
First, the graphics card must be supported
To my surprise, the GTX 1050 Ti, GTX 1070 Ti and other mainstream graphics cards are not actually on the support list
(Fortunately, I bought a GTX 1050)
(I am definitely not hinting that I want a Tesla)
See NVIDIA's CUDA support list
Second, the version numbers have to match: each TensorFlow version requires a specific CUDA version
That is still not enough: you also need cuDNN for everything to run properly, and the cuDNN version has to match the CUDA version as well
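One way to make the version matching concrete is a small lookup table. The sketch below is an assumption-heavy illustration: the two entries are what I recall from TensorFlow's "tested build configurations" table for the official pip packages, and OFFICIAL_BUILDS and check_stack are made-up names. Check the official table before relying on any of it.

```python
# Assumed values, recalled from TensorFlow's tested build configurations
# for the official tensorflow-gpu pip packages. Verify before use.
OFFICIAL_BUILDS = {
    "1.12.0": {"cuda": "9.0", "cudnn": "7"},
    "1.13.1": {"cuda": "10.0", "cudnn": "7.4"},
}


def check_stack(tf_version, cuda_version, cudnn_version):
    """Return a list of mismatch messages; an empty list means the versions line up."""
    expected = OFFICIAL_BUILDS.get(tf_version)
    if expected is None:
        return ["no entry for tensorflow-gpu %s" % tf_version]
    problems = []
    if cuda_version != expected["cuda"]:
        problems.append("CUDA %s installed, %s expected" % (cuda_version, expected["cuda"]))
    # cuDNN is matched on the listed prefix, so 7.3.1 satisfies "7".
    wanted = expected["cudnn"]
    if not (cudnn_version == wanted or cudnn_version.startswith(wanted + ".")):
        problems.append("cuDNN %s installed, %s.x expected" % (cudnn_version, wanted))
    return problems


if __name__ == "__main__":
    print(check_stack("1.12.0", "10.0", "7.3.1"))
```

A community-built wheel can target a different CUDA version than the official package, so a table like this only describes the builds it was filled in from.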
But downloading cuDNN requires an NVIDIA account, so I clicked Join and registered one
At first I registered with my QQ mailbox, which by rights should have been fine
But it died at the email verification step
You want me to verify my email, fine, so where is the verification email?????
After a lot of searching on Baidu, it turned out that QQ Mail cannot be used
Infuriatingly, more than three hours later the email did arrive, yes, in that QQ mailbox. . .
But by then I had already set up a 163 mailbox. . . . .
So I registered an account with the 163 mailbox
And could finally download it
Once the download finished I was completely baffled
The unpacked archive looks like this:
How to use this thing was completely beyond me. What now?
So back to Baidu: it turns out the files just go into the CUDA installation directory. . . .
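Placing the files in the CUDA installation directory can also be scripted. The sketch below assumes the cuDNN archive unpacks to a folder containing bin, include and lib subfolders that mirror the CUDA directory layout; copy_cudnn and the paths in the demo are made up for illustration.

```python
import shutil
import tempfile
from pathlib import Path


def copy_cudnn(extracted_root, cuda_root):
    """Copy every file under extracted_root to the same relative path under cuda_root."""
    extracted_root, cuda_root = Path(extracted_root), Path(cuda_root)
    copied = []
    for src in sorted(extracted_root.rglob("*")):
        if src.is_file():
            dst = cuda_root / src.relative_to(extracted_root)
            dst.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(str(src), str(dst))
            copied.append(dst)
    return copied


if __name__ == "__main__":
    # Demo on a throwaway directory with a fake file, not a real CUDA install.
    with tempfile.TemporaryDirectory() as tmp:
        fake_cudnn = Path(tmp) / "cudnn" / "cuda"
        (fake_cudnn / "bin").mkdir(parents=True)
        (fake_cudnn / "bin" / "cudnn64_7.dll").write_bytes(b"")
        for path in copy_cudnn(fake_cudnn, Path(tmp) / "CUDA"):
            print(path)
```

On a real machine the destination is usually a protected directory, so the script would have to run with administrator rights.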
Once it is installed, Baidu says you can check the setup with bandwidthTest.exe and deviceQuery.exe, which live under demo_suite in installation directory \ extras
The check seemed to find no problems
(The screenshot had to be captured by pausing the output)
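Since the output flashes by so quickly, the check can also be driven from Python. This is a sketch under two assumptions: the tools sit under extras\demo_suite as described above, and they print a "Result = PASS" line on success. demo_suite_tool, reported_pass and run_check are made-up names, and the C:\CUDA root in the usage note is hypothetical.

```python
import subprocess
from pathlib import PureWindowsPath


def demo_suite_tool(cuda_root, tool):
    """Build the Windows path of a demo_suite tool below the CUDA installation directory."""
    return str(PureWindowsPath(cuda_root) / "extras" / "demo_suite" / tool)


def reported_pass(output):
    """Assumption: the tool prints 'Result = PASS' when the check succeeds."""
    return "Result = PASS" in output


def run_check(cuda_root, tool):
    """Run one tool, capture its output, and return (passed, output)."""
    completed = subprocess.run(
        [demo_suite_tool(cuda_root, tool)],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    return reported_pass(completed.stdout), completed.stdout
```

Calling run_check(r"C:\CUDA", "deviceQuery.exe") on the actual machine returns the captured text along with the verdict, so nothing needs to be paused to read it.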
With the environment set up, on to the much-anticipated installation step
pip install tensorflow-gpu
Of course, you need to uninstall any previously installed tensorflow first
At a blazing 20 KB/s, it took who knows how long
Anyway, it finally installed
Something like this
It looked pretty good
But then I ran it
. . . . . . .
The errors that followed were unbearable, a wall of red. . . . .
(The picture is too bloody and has been censored)
So I went back to the almighty Baidu for help
Finally I found this post
Win10 +VS2017+ python3.66 + CUDA10 + cuDNNv7.3.1 + tensorflow-gpu 1.12.0
If CUDA 10.0 is not supported, say so earlier! I wasted so much effort for nothing
So I looked at the install package the author had built and attached to the post
tensorflow_gpu-1.12.0-cp36-cp36m-win_amd64.whl
CUDA 10.0 + cuDNN 7.3.1
Then I went off to reinstall cuDNN 7.3.1
And then cd into the directory containing the install package
pip install tensorflow_gpu-1.12.0-cp36-cp36m-win_amd64.whl
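The wheel's file name itself says which Python and platform it targets, which is why it goes together with a python=3.6 environment on 64-bit Windows. The sketch below splits a name according to the standard wheel naming scheme (name, version, optional build tag, Python tag, ABI tag, platform tag); parse_wheel_name is a made-up helper, not part of pip.

```python
def parse_wheel_name(filename):
    """Split a wheel file name into its dash-separated tags."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel file: %s" % filename)
    parts = filename[: -len(".whl")].split("-")
    if len(parts) == 5:
        name, version, python_tag, abi_tag, platform_tag = parts
        build = None
    elif len(parts) == 6:
        name, version, build, python_tag, abi_tag, platform_tag = parts
    else:
        raise ValueError("unexpected wheel file name: %s" % filename)
    return {
        "name": name,
        "version": version,
        "build": build,
        "python": python_tag,      # cp36 means CPython 3.6
        "abi": abi_tag,
        "platform": platform_tag,  # win_amd64 means 64-bit Windows
    }


if __name__ == "__main__":
    print(parse_wheel_name("tensorflow_gpu-1.12.0-cp36-cp36m-win_amd64.whl"))
```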
After all that trouble, the installation finally completed
Figure
(I was beyond excited when I saw this)
At this point the installation is complete
Now that it is installed I should test it myself; not testing it would feel like slacking off
With Baidu's help I found code an expert had written for a five-layer convolutional neural network
TensorFlow: comparing the running efficiency of AlexNet on CPU and GPU
For simplicity, I took that author's code and hacked it up directly
from datetime import datetime
import math
import time
import tensorflow as tf
import os
#os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"
#os.environ["CUDA_VISIBLE_DEVICES"] = "-1"
batch_size = 32
num_batches = 100
# Show the structure of each layer by printing its tensor's shape

def print_activations(t):
    print(t.op.name, ' ', t.get_shape().as_list())

# `with tf.name_scope('conv1') as scope` automatically names variables created inside the scope conv1/xxx, which makes components easy to tell apart

def inference(images):
    parameters = []
    # First convolutional layer
    with tf.name_scope('conv1') as scope:
        # Convolution kernel, initialized from a truncated normal distribution
        kernel = tf.Variable(tf.truncated_normal([11, 11, 3, 64],
                                                 dtype=tf.float32, stddev=1e-1), name='weights')
        conv = tf.nn.conv2d(images, kernel, [1, 4, 4, 1], padding='SAME')
        # Trainable
        biases = tf.Variable(tf.constant(0.0, shape=[64], dtype=tf.float32), trainable=True, name='biases')
        bias = tf.nn.bias_add(conv, biases)
        conv1 = tf.nn.relu(bias, name=scope)
        print_activations(conv1)
        parameters += [kernel, biases]
    # Add LRN and a max-pooling layer; apart from AlexNet, LRN has largely been abandoned, reportedly because its benefit is small and it slows things down
    lrn1 = tf.nn.lrn(conv1, 4, bias=1.0, alpha=0.001 / 9, beta=0.75, name='lrn1')
    pool1 = tf.nn.max_pool(lrn1, ksize=[1, 3, 3, 1], strides=[1, 2, 2, 1], padding='VALID', name='pool1')
    print_activations(pool1)
    # Second convolutional layer; only some parameters differ
    with tf.name_scope('conv2') as scope:
        kernel = tf.Variable(tf.truncated_normal([5, 5, 64, 192], dtype=tf.float32, stddev=1e-1), name='weights')
        conv = tf.nn.conv2d(pool1, kernel, [1, 1, 1, 1], padding='SAME')
        biases = tf.Variable(tf.constant(0.0, shape=[192], dtype=tf.float32), trainable=True, name='biases')
        bias = tf.nn.bias_add(conv, biases)
        conv2 = tf.nn.relu(bias, name=scope)
        parameters += [kernel, biases]
        print_activations(conv2)
    # Same post-processing: LRN followed by max pooling
    lrn2 = tf.nn.lrn(conv2, 4, bias=1.0, alpha=0.001 / 9, beta=0.75, name='lrn2')
    pool2 = tf.nn.max_pool(lrn2, ksize=[1, 3, 3, 1], strides=[1, 2, 2, 1], padding='VALID', name='pool2')
    print_activations(pool2)
    # Third convolutional layer
    with tf.name_scope('conv3') as scope:
        kernel = tf.Variable(tf.truncated_normal([3, 3, 192, 384], dtype=tf.float32, stddev=1e-1), name='weights')
        conv = tf.nn.conv2d(pool2, kernel, [1, 1, 1, 1], padding='SAME')
        biases = tf.Variable(tf.constant(0.0, shape=[384], dtype=tf.float32), trainable=True, name='biases')
        bias = tf.nn.bias_add(conv, biases)
        conv3 = tf.nn.relu(bias, name=scope)
        parameters += [kernel, biases]
        print_activations(conv3)
    # Fourth convolutional layer
    with tf.name_scope('conv4') as scope:
        kernel = tf.Variable(tf.truncated_normal([3, 3, 384, 256], dtype=tf.float32, stddev=1e-1), name='weights')
        conv = tf.nn.conv2d(conv3, kernel, [1, 1, 1, 1], padding='SAME')
        biases = tf.Variable(tf.constant(0.0, shape=[256], dtype=tf.float32), trainable=True, name='biases')
        bias = tf.nn.bias_add(conv, biases)
        conv4 = tf.nn.relu(bias, name=scope)
        parameters += [kernel, biases]
        print_activations(conv4)
    # Fifth convolutional layer
    with tf.name_scope('conv5') as scope:
        kernel = tf.Variable(tf.truncated_normal([3, 3, 256, 256], dtype=tf.float32, stddev=1e-1), name='weights')
        conv = tf.nn.conv2d(conv4, kernel, [1, 1, 1, 1], padding='SAME')
        biases = tf.Variable(tf.constant(0.0, shape=[256], dtype=tf.float32), trainable=True, name='biases')
        bias = tf.nn.bias_add(conv, biases)
        conv5 = tf.nn.relu(bias, name=scope)
        parameters += [kernel, biases]
        print_activations(conv5)
    # Followed by one more max-pooling layer
    pool5 = tf.nn.max_pool(conv5, ksize=[1, 3, 3, 1], strides=[1, 2, 2, 1], padding='VALID', name='pool5')
    print_activations(pool5)
    return pool5, parameters
    # Fully connected layers (not included here)

# Time each iteration: the first argument is the TensorFlow Session, the second is the op to run, the third is the name of the test
# The first few iterations are affected by GPU memory loading, cache warm-up and so on, so only iterations after the 10th are counted
def time_tensorflow_run(session, target, info_string):
    num_steps_burn_in = 10
    total_duration = 0.0
    total_duration_squared = 0.0
    # Run num_batches + num_steps_burn_in iterations
    # Record the time with time.time(); start reporting once the warm-up is over
    for i in range(num_batches + num_steps_burn_in):
        start_time = time.time()
        _ = session.run(target)
        duration = time.time() - start_time
        if i >= num_steps_burn_in:
            if not i % 10:
                print('%s:step %d, duration = %.3f' % (datetime.now(), i - num_steps_burn_in, duration))
            total_duration += duration
            total_duration_squared += duration * duration
    # Compute the mean time per iteration and the standard deviation sd
    mn = total_duration / num_batches
    vr = max(total_duration_squared / num_batches - mn * mn, 0.0)  # guard against tiny negative rounding errors
    sd = math.sqrt(vr)
    print('%s: %s across %d steps, %.3f +/- %.3f sec / batch' % (datetime.now(), info_string, num_batches, mn, sd))

def run_benchmark():
    # First define the default Graph
    with tf.Graph().as_default():
        # ImageNet is not used for training; random data is used only to measure the time
        image_size = 224
        images = tf.Variable(tf.random_normal([batch_size, image_size, image_size, 3], dtype=tf.float32, stddev=1e-1))
        pool5, parameters = inference(images)
        init = tf.global_variables_initializer()
        sess = tf.Session(config=tf.ConfigProto(allow_soft_placement=True, log_device_placement=False))
        sess.run(init)
        # Time pool5 directly (there are no fully connected layers)
        # This is only for show, not a real training computation
        time_tensorflow_run(sess, pool5, "Forward")
        # A made-up objective, just so there is something to take gradients of
        objective = tf.nn.l2_loss(pool5)
        grad = tf.gradients(objective, parameters)
        time_tensorflow_run(sess, grad, "Forward-backward")

if __name__ == "__main__":
    run_benchmark()
If you use tensorflow-gpu, the code runs on the GPU by default
GPU run results:
GPU usage:
CPU usage:
As can be seen, it is mostly memory that gets occupied
Uncommenting lines 6-7 of the code above (the two os.environ lines) makes it run on the CPU instead
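The same switch can be written as a helper so nothing has to be commented in and out. A minimal sketch, assuming the variables are set before tensorflow is imported; force_cpu and use_gpu are made-up names.

```python
import os


def force_cpu():
    """Hide every CUDA device so TensorFlow falls back to the CPU."""
    os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"
    os.environ["CUDA_VISIBLE_DEVICES"] = "-1"


def use_gpu(index=0):
    """Expose only the GPU with the given index."""
    os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"
    os.environ["CUDA_VISIBLE_DEVICES"] = str(index)


if __name__ == "__main__":
    force_cpu()
    # Import tensorflow only after this point so the setting takes effect.
    print(os.environ["CUDA_VISIBLE_DEVICES"])
```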
CPU run results:
CPU utilization:
My 2.8 GHz CPU went all the way up to 3.4 GHz
So my CPU really did give it everything
Test results:
Forward pass: the GPU is 8.42 times as efficient as the CPU
Backward pass: the GPU is 12.50 times as efficient as the CPU
In GPU mode, GPU utilization was only about 65% and CPU utilization only about 45%
In CPU mode, CPU utilization reached 100%, and it was still inefficient
Clearly the GPU simply crushes the CPU
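For reference, here is how per-batch numbers and a ratio like the ones above can be computed. The sketch uses the same sum and sum-of-squares formula as the benchmark script; mean_and_std and speedup are made-up names, and the durations in the demo are invented values, not my measurements.

```python
import math


def mean_and_std(durations):
    """Mean and standard deviation from the running sum and sum of squares."""
    n = len(durations)
    total = sum(durations)
    total_squared = sum(d * d for d in durations)
    mean = total / n
    # Clamp at zero: rounding can push the variance slightly below it.
    variance = max(total_squared / n - mean * mean, 0.0)
    return mean, math.sqrt(variance)


def speedup(cpu_seconds_per_batch, gpu_seconds_per_batch):
    """How many times faster the GPU run is than the CPU run."""
    return cpu_seconds_per_batch / gpu_seconds_per_batch


if __name__ == "__main__":
    # Hypothetical per-batch durations in seconds, for illustration only.
    cpu_mean, _ = mean_and_std([0.80, 0.82, 0.78])
    gpu_mean, _ = mean_and_std([0.10, 0.11, 0.09])
    print("speedup: %.2fx" % speedup(cpu_mean, gpu_mean))
```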
Notes:
1. This test only runs a convolutional neural network, so it does not mean the GPU has the advantage in every case;
2. The CPU is the bottleneck here, so its efficiency may not be representative; a higher-end CPU might produce significantly better results;