How to Remotely Debug MapReduce Tasks from Windows

1 Overview

Hadoop is usually deployed on a server, so a MapReduce task developed on Windows cannot be run directly on the local machine. Instead, the task has to be exported as a jar package, uploaded to the server, and run there with the following command:

$ hadoop jar [jar file] [main class] [input file] [output file]

Note: The main class must be given as its fully qualified class name.

However, when a MapReduce task is run this way, we cannot step through the source code with breakpoints. Remote debugging solves this problem.

2 Solutions

2.1 Starting a debug listener on the server

To debug with remote breakpoints, the MapReduce task on the remote server must pause at startup, open a listening service, and wait for the debugger client to connect. This is done by passing the following parameter to the JVM at startup:

-agentlib:jdwp=transport=dt_socket,server=y,suspend=y,address=8000

Note: This option is provided by the JDK. Run java -agentlib:jdwp=help to see a description of each sub-option. Here, address=8000 means that the JVM listens on port 8000.
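As a minimal sketch of how the option string is put together, the helper below builds it from a port and a suspend flag. The function name jdwp_opts is hypothetical and not part of the JDK or Hadoop. In the option, transport=dt_socket selects the socket transport, server=y makes the JVM listen for a debugger, suspend=y makes it wait before running the main class, and address is the listening port.

```shell
# Hypothetical helper (the name jdwp_opts is an assumption): print a JDWP agent option.
# $1 = listening port (defaults to 8000), $2 = suspend flag, y or n (defaults to y).
jdwp_opts() {
  printf '%s\n' "-agentlib:jdwp=transport=dt_socket,server=y,suspend=${2:-y},address=${1:-8000}"
}

# With no arguments, this prints the option used in this article.
jdwp_opts
```

With suspend=n the JVM does not wait for the debugger, so breakpoints in the early part of a short job may be missed. For this use case suspend=y is the safer choice.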

2.2 Where should the parameters be configured?

A MapReduce task is started with the hadoop jar command, so the place to look is the hadoop launch script in the bin directory of the Hadoop installation. The script assembles the JVM options before it runs the program, and it picks up the HADOOP_CLIENT_OPTS environment variable when doing so. We can therefore put the debug parameter into HADOOP_CLIENT_OPTS, setting it temporarily in the current shell with the export command.
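One practical detail: if the export is run twice in the same shell, the agent option is appended twice, which the JVM is likely to reject. The sketch below shows one possible way to switch the option on and off safely. The function names and the JDWP_OPT variable are assumptions made for illustration, not part of Hadoop.

```shell
# Hypothetical helpers; names are assumptions, not Hadoop commands.
JDWP_OPT="-agentlib:jdwp=transport=dt_socket,server=y,suspend=y,address=8000"

# Append the agent option to HADOOP_CLIENT_OPTS only if it is not already there.
enable_remote_debug() {
  case " ${HADOOP_CLIENT_OPTS:-} " in
    *" ${JDWP_OPT} "*) ;;  # already present: leave the variable unchanged
    *) export HADOOP_CLIENT_OPTS="${HADOOP_CLIENT_OPTS:+$HADOOP_CLIENT_OPTS }${JDWP_OPT}" ;;
  esac
}

# Remove the agent option again and keep every other option.
# Assumes the individual options contain no embedded spaces.
disable_remote_debug() {
  kept=""
  for opt in ${HADOOP_CLIENT_OPTS:-}; do
    [ "$opt" = "$JDWP_OPT" ] || kept="${kept:+$kept }$opt"
  done
  export HADOOP_CLIENT_OPTS="$kept"
}
```

Calling disable_remote_debug after a debugging session matters because the variable stays set for the rest of the shell session, so later hadoop commands in that shell would otherwise also wait for a debugger.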

3 [Summary] Remote debugging configuration steps

From the analysis above, the server only needs one extra runtime parameter, after which the task can be debugged remotely from the IDEA development tool. The specific configuration steps are as follows:

3.1 Remote server configuration

(1) Add a temporary environment variable on the server side

$ export HADOOP_CLIENT_OPTS="$HADOOP_CLIENT_OPTS -agentlib:jdwp=transport=dt_socket,server=y,suspend=y,address=8000"

(2) Run the MapReduce task on the server, for example:

$ hadoop jar mapreduce-task.jar com.os.china.mapreduce.weather.JobRun /file/weather.txt /out

At this point the program pauses at startup and listens on port 8000 for a debugger to connect. The JVM typically prints a message such as "Listening for transport dt_socket at address: 8000".
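The two server-side steps can also be combined so that the debug option applies to a single command only. The following is a sketch under stated assumptions: debug_hadoop and DEBUG_PORT are hypothetical names introduced here, and the hadoop command is assumed to be on the PATH.

```shell
# Hypothetical wrapper (not part of Hadoop): run one hadoop command with the
# JDWP agent enabled, leaving the environment of the calling shell untouched.
debug_hadoop() {
  (
    # Everything set inside this subshell is discarded when it exits.
    port="${DEBUG_PORT:-8000}"  # DEBUG_PORT is an assumed variable name
    agent="-agentlib:jdwp=transport=dt_socket,server=y,suspend=y,address=${port}"
    export HADOOP_CLIENT_OPTS="${HADOOP_CLIENT_OPTS:+$HADOOP_CLIENT_OPTS }${agent}"
    hadoop "$@"
  )
}
```

For example, debug_hadoop jar mapreduce-task.jar com.os.china.mapreduce.weather.JobRun /file/weather.txt /out would start the task in debug mode, while a plain hadoop command in the same shell would run normally.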

3.2 Remote debugging configuration in Windows local IDEA

(1) Create a new remote debug configuration in IDEA

(2) Fill in the settings and click OK

Note: Host is the IP address of the remote server, and Port is the port the remote server listens on (8000 in this example).

(3) Start this configuration in Debug mode to begin remote debugging of the MapReduce task execution

 
