Original Address:
https://blog.csdn.net/c20081052/article/details/82345454
---------------------------------------------------------------------------------------------------
When doing training on a server with multiple GPUs, I only wanted to train on one of them, yet whenever the deep-learning code ran I found that the memory of several GPUs was almost completely filled. This happens because TensorFlow by default occupies the memory of all GPUs during training.
Check whether your source file contains a code snippet like the following:
with tf.Graph().as_default():
    gpu_options = tf.GPUOptions(per_process_gpu_memory_fraction=args.gpu_memory_fraction)
    sess = tf.Session(config=tf.ConfigProto(gpu_options=gpu_options, log_device_placement=False))
    with sess.as_default():
This fragment is mainly about the configuration parameters passed when the session is created.
The parameters of tf.ConfigProto() are as follows:
log_device_placement=True: whether to print a log of which device each operation is placed on
allow_soft_placement=True: if the specified device does not exist, allow TF to assign a device automatically
tf.ConfigProto(log_device_placement=True, allow_soft_placement=True)
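As a quick illustration of these two flags together, here is a minimal TensorFlow 1.x sketch (the constants are only placeholders for a real graph, not code from the original post):

import tensorflow as tf  # TensorFlow 1.x API

# A trivial graph; its device assignments are printed because log_device_placement=True
a = tf.constant([1.0, 2.0], name="a")
b = tf.constant([3.0, 4.0], name="b")
c = a + b

config = tf.ConfigProto(log_device_placement=True, allow_soft_placement=True)
with tf.Session(config=config) as sess:
    print(sess.run(c))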
When configuring tf.Session, you can also pass tf.GPUOptions as part of the optional config parameters to explicitly specify the fraction of GPU memory to allocate.
per_process_gpu_memory_fraction limits how much memory the process may use on each GPU, but it is applied uniformly to all GPUs; you cannot set a different limit for different GPUs.
Sample code is as follows:
# allow_growth
config = tf.ConfigProto()
config.gpu_options.allow_growth = True
session = tf.Session(config=config, ...)
# With the allow_growth option, only a small amount of GPU memory is allocated at first
# and more is allocated gradually as needed; because memory is never released,
# this can lead to fragmentation
# per_process_gpu_memory_fraction
gpu_options = tf.GPUOptions(per_process_gpu_memory_fraction=0.7)
config = tf.ConfigProto(gpu_options=gpu_options)
session = tf.Session(config=config, ...)
# Sets how much memory the process may use on each GPU, e.g. 0.4 means 40%
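The two options can also be combined in a single config; a minimal sketch with illustrative values (not taken from the original post):

import tensorflow as tf

config = tf.ConfigProto()
config.gpu_options.allow_growth = True                     # grow the allocation on demand
config.gpu_options.per_process_gpu_memory_fraction = 0.5   # but never use more than 50% of each GPU

session = tf.Session(config=config)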
Specifying the number of GPUs and the device IDs
Method 1: specify the GPU ID in the terminal. If the computer has multiple GPUs, TensorFlow uses all of them by default. If you want to use only some of the GPUs, set CUDA_VISIBLE_DEVICES. When launching the python program you can use:
CUDA_VISIBLE_DEVICES=1 python your_script.py   # specify the GPU device ID before running the script
# Common settings:
CUDA_VISIBLE_DEVICES=1          only device 1 will be seen
CUDA_VISIBLE_DEVICES=0,1        devices 0 and 1 will be visible
CUDA_VISIBLE_DEVICES="0,1"      same as above, quotation marks are optional
CUDA_VISIBLE_DEVICES=0,2,3      devices 0, 2, 3 will be visible; device 1 is masked
CUDA_VISIBLE_DEVICES=""         no GPU will be visible
# You can also use: export CUDA_VISIBLE_DEVICES=2   # specify the device ID
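Note that the visible devices are renumbered from 0 inside the process: with CUDA_VISIBLE_DEVICES=2, the single visible GPU appears to TensorFlow as /gpu:0. A minimal sketch of what that looks like in code:

import tensorflow as tf

# Launched as: CUDA_VISIBLE_DEVICES=2 python your_script.py
with tf.device("/gpu:0"):   # physical GPU 2, renumbered to 0 inside this process
    x = tf.constant([1.0, 2.0])
    y = x * 2

with tf.Session(config=tf.ConfigProto(allow_soft_placement=True)) as sess:
    print(sess.run(y))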
Method 2: modify the python source file itself by adding the following at the beginning of the file:
import os
os.environ["CUDA_VISIBLE_DEVICES"] = "2"     # the ID of the GPU to use
# If there are multiple GPUs:
os.environ["CUDA_VISIBLE_DEVICES"] = "1,2"   # the IDs of the two GPUs; single and double quotes both work here
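Putting Method 2 together with the memory options above, a complete minimal script might look like this (GPU ID 2 is just an example; the environment variable should be set before TensorFlow initializes the GPUs):

import os
os.environ["CUDA_VISIBLE_DEVICES"] = "2"   # set before importing / initializing TensorFlow

import tensorflow as tf

config = tf.ConfigProto(allow_soft_placement=True)
config.gpu_options.allow_growth = True

with tf.Session(config=config) as sess:
    print(sess.run(tf.constant("only GPU 2 is visible")))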
Practical tips:
If you run a deep-learning python script in a Linux terminal and find that it occupies several GPUs and their memory, first check which process the resources belong to:
$ ps -f <PID>
After confirming that the process can indeed be terminated, kill it:
$ kill -9 <PID>
Note that ctrl+z only suspends the current foreground job; it does not make the process exit, so if you interrupted a problematic run with ctrl+z you still need to kill the process.
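To find the offending PID in the first place, nvidia-smi is usually the quickest route; a sketch of the typical workflow (the PID is a placeholder):

$ nvidia-smi       # the "Processes" table at the bottom lists, per GPU, the PIDs holding memory
$ ps -f <PID>      # inspect the process as above before killing it
$ kill -9 <PID>    # force-kill it if it really is the stray training job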
---------------------------------------------------------------------------------------------------
Disclaimer: This article is an original work of the CSDN blogger "ciky odd" and follows the CC 4.0 BY-SA copyright agreement. When reproducing it, please attach the original source link and this statement.
Original link: https://blog.csdn.net/c20081052/article/details/82345454
-------------------------------------------------------