The most detailed PyCharm remote debugging configuration guide [for GPU clusters]

Because all of our school's deep learning jobs are uploaded to and run on the school's GPU cluster, visualizing and debugging a program is very troublesome. After many twists and turns, I finally found a solution: use PyCharm to connect to the intranet server through an SSH tunnel via the jump host (the "springboard machine"). This post summarizes the setup.
Our school's cluster currently has a three-level structure (local PC, jump host, GPU node). If yours is the same, the following method will probably work for you.


1. Configure login

1.1 Modify the local hosts file

You already know the address of your jump host (springboard machine). Write it into the hosts file so that you do not have to type the IP every time. On Windows, the hosts file is located at:

C:\Windows\System32\drivers\etc\hosts
Format of the hosts file: IP  name
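To make the "IP  name" format concrete, here is a minimal Python sketch that reads hosts-style text and maps each name to its IP. The address 203.0.113.10 is a documentation-only placeholder, not a real jump host address, and `parse_hosts` is a hypothetical helper name.

```python
def parse_hosts(text):
    """Map each host name to its IP for lines of the form 'IP  name [alias ...]'."""
    table = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and surrounding spaces
        if not line:
            continue
        ip, *names = line.split()
        for name in names:
            table[name] = ip
    return table


# Hypothetical entry: replace 203.0.113.10 with the real IP of your jump host.
sample = """
# jump host
203.0.113.10    jiqun
"""
print(parse_hosts(sample))
```

After such an entry is saved, the name jiqun can be used wherever the IP would otherwise be typed.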


1.2 Configure Xshell

Create a new session:
Name: Springboard test
Host name: jiqun
[Because you have just added the entry to the hosts file, you can enter jiqun directly.]
Next, click the "User Authentication" button and enter your user name and password. Once they are saved, you do not need to enter them every time.
After the configuration is complete, run a login test. If you can log in successfully, the configuration so far is correct.

1.3 Get the address of the GPU node you want to connect to

After logging in, run the following command and copy the address of the node you need into the hosts file on your local PC:

cat /etc/hosts


Add the entry to the hosts file on your local PC.
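After editing the hosts file, it can be worth checking that the new names actually resolve on your PC. The following is a small sketch, not part of the original setup; `unresolved` is a hypothetical helper, and the names jiqun and node06 are the ones used in this post.

```python
import socket


def unresolved(names, resolver=socket.gethostbyname):
    """Return the names that the local machine cannot resolve to an IP address."""
    missing = []
    for name in names:
        try:
            resolver(name)
        except OSError:  # socket.gaierror is a subclass of OSError
            missing.append(name)
    return missing


# Example call on your own PC; an empty list means both names resolve:
# print(unresolved(["jiqun", "node06"]))
```

If a name is reported as missing, recheck the spelling and spacing of the corresponding line in the hosts file.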

At this point, the login configuration is complete.

2. Configure the tunnel

Edit the "Springboard test" session you just configured in Xshell: right-click it and choose Properties.

Click the Tunnel button, then click Add. Pay special attention to the following settings:
the source host name is localhost and the port number is 6000;
the destination host name is node06 and the port number is 22.
Then click "User Authentication" and enter your account and password, as in the previous steps.
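To clarify what this tunnel does: it forwards localhost:6000 on your PC, through the jump host, to port 22 on node06. For comparison only, the sketch below builds the equivalent command for the OpenSSH client using its -L (local forwarding) option. It assumes OpenSSH rather than Xshell, and `your_user` is a hypothetical account name; the code only prints the command and does not open a connection.

```python
def tunnel_command(user, jump_host="jiqun", local_port=6000,
                   target_host="node06", target_port=22):
    """Build an OpenSSH command that forwards localhost:local_port to target_host:target_port."""
    forward = f"localhost:{local_port}:{target_host}:{target_port}"
    # -N: do not run a remote command, only forward; -L: local port forwarding
    return ["ssh", "-N", "-L", forward, f"{user}@{jump_host}"]


print(" ".join(tunnel_command("your_user")))
# prints: ssh -N -L localhost:6000:node06:22 your_user@jiqun
```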

After the configuration is complete, try to connect. [The first connection may take a while.]
If the following interface appears, the login was successful.
[Screenshot: successful login]
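While the Xshell session is open, the local end of the tunnel should accept connections on localhost:6000. A quick way to confirm this from Python is sketched below; `port_open` is a hypothetical helper, and the check only verifies that something is listening, not that authentication will succeed.

```python
import socket


def port_open(host="localhost", port=6000, timeout=3.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # covers refused connections and timeouts
        return False


print(port_open())  # True is expected only while the tunnel session is connected
```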

3. Configure PyCharm

3.1 Configure the Python interpreter for the project

In PyCharm, click File -> Settings and complete the five steps shown in the screenshot. Note that the Host is localhost, because PyCharm connects through the local end of the tunnel configured in step 2 (port 6000).
[Screenshot: the five interpreter configuration steps]
Then enter the password. [This step may hang. If an error occurs, try a few more times.]

Configure the interpreter path and the corresponding file locations. It is recommended to create a folder on the cluster with the same name as the local project folder and map the project files to it.
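The mapping idea can be illustrated with a short sketch: every file under the local project folder corresponds to the file with the same relative path under the remote folder. The folder names below are hypothetical examples, and `to_remote` is only an illustration of the rule, not something PyCharm requires you to write.

```python
from pathlib import PurePosixPath, PureWindowsPath


def to_remote(local_file, local_root, remote_root):
    """Translate a local Windows path into the matching path under the remote project folder."""
    relative = PureWindowsPath(local_file).relative_to(PureWindowsPath(local_root))
    return str(PurePosixPath(remote_root).joinpath(*relative.parts))


# Hypothetical folders that share the name "my_project" on both sides.
print(to_remote(r"D:\code\my_project\train.py",
                r"D:\code\my_project",
                "/home/your_user/my_project"))
# prints: /home/your_user/my_project/train.py
```

Keeping the same folder name on both sides makes it easier to see at a glance which remote directory belongs to which local project.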

At this point the interpreter configuration is complete. If you see the following content, the configuration was successful.
[Screenshot: successful interpreter configuration]
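One more way to confirm that the project really uses the remote interpreter is to run a tiny script with it and look at where it executes. This is a minimal sketch; it assumes that the GPU node has `nvidia-smi` on its PATH, which may not hold on every cluster.

```python
import platform
import shutil
import sys


def environment_report():
    """Collect facts that show where the interpreter is actually running."""
    return {
        "hostname": platform.node(),
        "python": sys.executable,
        "version": platform.python_version(),
        "nvidia_smi": shutil.which("nvidia-smi"),  # None if the tool is not on PATH
    }


for key, value in environment_report().items():
    print(f"{key}: {value}")
```

If the configuration works, the hostname printed should be that of the GPU node (for example node06) rather than your local PC.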

3.2 Configure file synchronization

[Screenshots: file synchronization settings]

At this point, the file synchronization configuration is completed.

4. Test remote debugging

If your script takes command-line parameters, you can add them as follows.
[Screenshot: adding parameters]
Then set a breakpoint and start debugging.
[Screenshot: debugging at a breakpoint]
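If you do not have a suitable script at hand, a toy one like the following can be used to try out parameters and breakpoints. It is only a sketch: the parameters `--epochs` and `--lr` and the fake "training" loop are hypothetical and stand in for your real code.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(description="Toy script for testing the remote debugger")
    parser.add_argument("--epochs", type=int, default=2)   # hypothetical parameter
    parser.add_argument("--lr", type=float, default=0.01)  # hypothetical parameter
    return parser


def train(epochs, lr):
    loss = 1.0
    history = []
    for epoch in range(epochs):
        loss = loss * (1 - lr)  # set a breakpoint on this line to inspect loss and epoch
        history.append(loss)
    return history


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(train(args.epochs, args.lr))
```

For example, entering `--epochs 3 --lr 0.5` in the parameters field of the run configuration and starting the debugger should stop at the marked line once per epoch, with the variables evaluated on the remote node.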

That is everything. If you have any questions, feel free to leave a comment; I will reply when I see it. Discussion is welcome.

Origin blog.csdn.net/qq_41563601/article/details/123875463