Deep learning programs cannot use the GPU on Windows 10 (solved)

Running deep learning on Win10 is a pain that is hard to put into words, but I am used to Windows. I have used Ubuntu before and never got comfortable writing documents on it, so my own experimental projects mostly run on Win10. My setup: pycharm + anaconda + win10.

I use keras 2.3.1. After I changed some code in the program, the model was interrupted every time I ran it.

I am recording the fix here so that I do not forget it the next time something similar happens.

This is what Task Manager looked like whenever I opened it while the program was running.

The GPU was doing nothing at all. Come on, you are a 980Ti at least. If you are not going to contribute anything, what good are you?

Type python in the pycharm terminal to start the Python interpreter

Then enter the following code

from tensorflow.python.client import device_lib
print(device_lib.list_local_devices())

The terminal output showed the problem: the gpu was not detected.

Only one device was listed, the cpu (-1).
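The raw device list is long and easy to misread, so here is a minimal sketch that filters it down to GPUs only. The helper names gpu_device_names and report_gpus are my own and not part of tensorflow; the only tensorflow call used is the same device_lib.list_local_devices() as above.

```python
def gpu_device_names(devices):
    """Return the names of all devices whose device_type is 'GPU'."""
    return [d.name for d in devices if d.device_type == "GPU"]


def report_gpus():
    """List local devices with tensorflow and print whether any GPU is visible."""
    try:
        from tensorflow.python.client import device_lib
    except ImportError:
        # Also reached when tensorflow lives in a different environment.
        print("tensorflow cannot be imported in this interpreter")
        return []
    gpus = gpu_device_names(device_lib.list_local_devices())
    print("GPUs found:", gpus if gpus else "none (CPU only)")
    return gpus


if __name__ == "__main__":
    report_gpus()
```

An empty result here corresponds to the situation described above, where only the cpu shows up.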

After a lot of searching and analysis, I found there are many possible causes (a history of blood and tears)

1. The tensorflow version does not match the graphics card setup (or the gpu version of tensorflow is not installed)

2. The cudnn version does not match the cuda version, the python version, or the tensorflow/keras version

Fixing 1 and 2 is simple: find the matching versions and reinstall them. There are many tutorials online. Some install everything with commands in the terminal, and some download the installers from the NVIDIA official website. Either approach works.

You can also try reinstalling or updating the graphics card driver. That was not my problem, so I won't go into detail, but it is still worth trying.
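Before reinstalling anything for causes 1 and 2, it helps to know which versions are actually installed in the current interpreter. The following is a rough sketch. The function names installed_version and version_report are hypothetical helpers of mine, and the sketch only reads each module's __version__ attribute. It does not report the cuda or cudnn versions, which still have to be checked by hand against the tested build configurations table in the tensorflow install documentation.

```python
import importlib
import sys


def installed_version(module_name):
    """Return module.__version__, or a short note if the module cannot be imported."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return "not installed"
    return getattr(module, "__version__", "unknown")


def version_report(module_names=("tensorflow", "keras")):
    """Collect the python version and the versions of the given modules."""
    report = {"python": sys.version.split()[0]}
    for name in module_names:
        report[name] = installed_version(name)
    return report


if __name__ == "__main__":
    for name, version in version_report().items():
        print(name, version)
```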

3. The code does not call the GPU

Try adding the following code at the very top of the program, before tensorflow or keras is imported

import os
os.environ["CUDA_VISIBLE_DEVICES"] = "0"	# index of the gpu to use; 0 selects the first gpu

But my gpu was not being detected in the first place, so the problem was not here.
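The ordering matters for this setting, so here is a small sketch of how I would wrap it. select_gpu is a hypothetical helper name; the only real mechanism is the CUDA_VISIBLE_DEVICES environment variable, and I am assuming the usual convention that the value -1 hides every gpu.

```python
import os


def select_gpu(index="0"):
    """Set CUDA_VISIBLE_DEVICES; call this before tensorflow or keras is imported."""
    os.environ["CUDA_VISIBLE_DEVICES"] = str(index)
    return os.environ["CUDA_VISIBLE_DEVICES"]


select_gpu(0)      # expose only the first gpu
# select_gpu(-1)   # assumed convention: hides all gpus, forcing cpu

# import tensorflow / keras only after this point
```

If the variable is set after tensorflow has already initialised its devices, it may have no effect, which is why it belongs at the head of the program.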

4. The anaconda environment is not actually entered in pycharm

This was my problem

To use an environment created with conda, open File > Settings in pycharm

Select the interpreter under Project

In the Python Interpreter panel on the right, select the conda environment you need

But, but:

On my computer, pycharm did not actually enter this virtual environment. I still do not know why, because the same settings work on other computers.

For example, enter this in the pycharm terminal

conda info --envs

The active environment is the one marked with *, and you will find that

my program is still running in the base environment in pycharm, not in the virtual environment I created.

https://zhuanlan.zhihu.com/p/441469719

After applying the method in the link above,

the environment counts as entered (in my personal opinion).
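To confirm from inside Python which environment is really in use, rather than trusting the pycharm settings page, something like the following sketch can be run in the same terminal or at the top of the program. interpreter_info is a hypothetical helper name. CONDA_DEFAULT_ENV is the variable conda sets when an environment is activated, and it may be missing if the environment was never activated.

```python
import os
import sys


def interpreter_info():
    """Describe which python interpreter and conda environment are active."""
    return {
        "executable": sys.executable,                      # path of the running python
        "prefix": sys.prefix,                              # root folder of the environment
        "conda_env": os.environ.get("CONDA_DEFAULT_ENV"),  # None if not set
    }


if __name__ == "__main__":
    for key, value in interpreter_info().items():
        print(key, "=", value)
```

If the executable path points into the base anaconda folder instead of the folder of the environment you created, the program is still running in base.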

5. Necessary CUDA components are missing (I personally think this may be a quirk of running programs on win10; it was also the core problem, which I had not found up to this point)

In the past there had always been occasional messages saying that some cudnn component was missing, but I did not pay attention to them because I was focused on fixing the environment. After the environment problems were solved, those messages no longer appeared.

Then, just when I thought I had finally fixed it, I ran the check again

from tensorflow.python.client import device_lib
print(device_lib.list_local_devices())

Result:

Still not detected. (Please ignore the silly typos I made several times while entering the code...)

Judging from the notes I wrote at the time, shown below, I was not in a good mood.

However, hard work pays off for those who persist, and I finally found the problem:

could not locate zlibwapi.dll    Please make sure it is in your library path.

Download the required component from the official website and add it to the path

Official website address:

https://docs.nvidia.com/deeplearning/cudnn/install-guide/index.html#install-zlib-windows

Add it to the path, or place it in the cuda directory

Right-click This PC > Properties > Advanced system settings

Environment Variables > Path > add a new entry

Unzip zlib123.dll.zip and add the folder that contains zlibwapi.dll to Path
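To check that the new Path entry really took effect, a small sketch like this one can search every folder on Path for the file. find_on_path is a hypothetical helper of mine, and it only looks at the PATH variable of the current process, not at the cuda directory.

```python
import os


def find_on_path(filename, path=None):
    """Return every folder listed in PATH that contains the given file."""
    if path is None:
        path = os.environ.get("PATH", "")
    hits = []
    for folder in path.split(os.pathsep):
        if folder and os.path.isfile(os.path.join(folder, filename)):
            hits.append(folder)
    return hits


if __name__ == "__main__":
    folders = find_on_path("zlibwapi.dll")
    print(folders if folders else "zlibwapi.dll is not on PATH")
```

Environment variable changes only reach programs started afterwards, so restart pycharm (or open a new terminal) before running the check.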

Then try again

Now two devices show up, -1 and 0

Both the cpu and the gpu are detected

Solved. Scatter flowers to celebrate!

To sum up: if you can, in addition to reading Chinese materials, search Google and consult English forums and official documentation. Searching in another language gives you more ideas to draw on.

 

 

 

 

Origin blog.csdn.net/qq_44104483/article/details/125521779