Spark (50): Using JVisualVM to Monitor the Spark Executor JVM

Introduction

On Windows, JVisualVM is typically found in the JDK installation directory at ${JAVA_HOME}/bin/jvisualvm.exe. It can monitor both local and remote JVMs, and supports two ways of connecting to a remote JVM: jstatd and JMX.

jstatd (the Java Virtual Machine jstat daemon): lets monitoring tools view CPU, memory, thread, and other information for the JVMs on a remote server.

JMX (Java Management Extensions) is a framework for adding management functions to applications, devices, and systems. It works across heterogeneous operating systems, system architectures, and network transport protocols, which makes it possible to build system, network, and service management applications that integrate seamlessly.

Note: I could not get the jstatd approach to work, so it is not covered here to avoid misleading readers.

JMX monitoring

Typical configuration:

-Dcom.sun.management.jmxremote
-Dcom.sun.management.jmxremote.ssl=false
-Dcom.sun.management.jmxremote.authenticate=false
-Djava.rmi.server.hostname=<ip>
-Dcom.sun.management.jmxremote.port=<port>
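
As a rough illustration of how these flags reach a JVM, the sketch below assembles them in a small shell function. The function name jmx_opts, the address 192.0.2.10, the port 9010, and app.jar are placeholders invented for this example; they are not part of the original setup.

```shell
# Hypothetical helper: print the JMX flags listed above on one line.
# $1 = host name or IP that remote clients use to reach this JVM
# $2 = fixed JMX port
# Authentication and SSL are disabled, as in the article, so this is
# only appropriate on a trusted network.
jmx_opts() {
  printf '%s %s %s %s %s\n' \
    "-Dcom.sun.management.jmxremote" \
    "-Dcom.sun.management.jmxremote.ssl=false" \
    "-Dcom.sun.management.jmxremote.authenticate=false" \
    "-Djava.rmi.server.hostname=$1" \
    "-Dcom.sun.management.jmxremote.port=$2"
}

# Example use (placeholder values, left commented out):
#   java $(jmx_opts 192.0.2.10 9010) -jar app.jar
```

A fixed port like this suits a single standalone JVM; the Spark executor case below uses port 0 instead.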

Adding the JMX configuration:

To monitor Spark executors, the Spark application must be started with JMX enabled. There are three ways to configure this:

1) Set the parameters in spark-defaults.conf

2) In spark-env.sh, set the Java options for the master and workers

3) Pass them on the command line when submitting with spark-submit
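
For the first option, a minimal sketch follows. It assumes the same executor flags as the spark-submit example in this article, written as a single spark.executor.extraJavaOptions entry. The helper name add_executor_jmx_conf is made up for this example, and the path to spark-defaults.conf depends on your installation.

```shell
# Hypothetical helper: append the executor JMX options to a
# spark-defaults.conf file. $1 = path to the file.
add_executor_jmx_conf() {
  conf_file="$1"
  # Refuse to add a second entry if the key is already present.
  if [ -f "$conf_file" ] && grep -q '^spark\.executor\.extraJavaOptions' "$conf_file"; then
    echo "spark.executor.extraJavaOptions already set in $conf_file" >&2
    return 1
  fi
  # Key and value are separated by whitespace; port=0 lets each
  # executor JVM pick a free port.
  printf '%s %s %s %s %s\n' \
    "spark.executor.extraJavaOptions" \
    "-Dcom.sun.management.jmxremote" \
    "-Dcom.sun.management.jmxremote.port=0" \
    "-Dcom.sun.management.jmxremote.authenticate=false" \
    "-Dcom.sun.management.jmxremote.ssl=false" >> "$conf_file"
}

# Example use (the path is an assumption, left commented out):
#   add_executor_jmx_conf "$SPARK_HOME/conf/spark-defaults.conf"
```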

The configuration used here, passed at submit time with spark-submit:

spark-submit \
--class myTest.KafkaWordCount \
--master yarn \
--deploy-mode cluster \
--conf "spark.executor.extraJavaOptions=-Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.port=0 -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false" \
--verbose \
--executor-memory 1G \
--total-executor-cores 6 \
/hadoop/spark/app/spark/20151223/testSpark.jar *.*.*.*:* test3 wordcount 4 kafkawordcount3 checkpoint4

Notes:

1) Do not specify a fixed IP and port. When Spark runs, several container processes are likely to be assigned to the same node. They would all try to bind the same port, and the application submitted with spark-submit would fail.

2) Because no fixed port is specified (port=0), a port is assigned automatically when the application is submitted and the executors start.

3) The three configuration methods above may result in different monitoring scopes. For example, spark-submit applies only to the one application being submitted, whereas spark-env.sh may enable monitoring for all executors on a node (unverified; readers should check this themselves).

Finding the assigned JMX port

Use yarn applicationattempt -list <applicationId> to find the application attempt ID:

[root@cdh-143 bin]# yarn applicationattempt -list application_1559203334026_0015
19/06/01 17:57:18 INFO client.RMProxy: Connecting to ResourceManager at CDH-143/10.dx.dx.143:8032
Total number of application attempts :1
         ApplicationAttempt-Id                 State                        AM-Container-Id                            Tracking-URL
appattempt_1559203334026_0015_000001                 RUNNING    container_1559203334026_0015_01_000001  http://CDH-143:8088/proxy/application_1559203334026_0015/
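
If you want to script this step, one possible sketch is to filter the command output for lines that begin with appattempt_. The function name attempt_ids is invented for this example, and the filter assumes the output layout shown above.

```shell
# Hypothetical helper: read the output of
#   yarn applicationattempt -list <applicationId>
# on stdin and print the attempt IDs, one per line.
attempt_ids() {
  # Log lines and the column header do not start with "appattempt_",
  # so only the data rows are printed.
  awk '$1 ~ /^appattempt_/ { print $1 }'
}

# Example use (left commented out):
#   yarn applicationattempt -list application_1559203334026_0015 | attempt_ids
```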

Use yarn container -list <applicationAttemptId> to list the container IDs:

[root@cdh-143 bin]# yarn container -list appattempt_1559203334026_0015_000001
19/06/01 17:57:52 INFO client.RMProxy: Connecting to ResourceManager at CDH-143/10.dx.dx.143:8032
Total number of containers :16
                  Container-Id            Start Time             Finish Time                   State                    Host                                LOG-URL
container_1559203334026_0015_01_000012  Sat Jun 01 13:27:52 +0800 2019                   N/A                 RUNNING            CDH-146:8041    http://CDH-146:8042/node/containerlogs/container_1559203334026_0015_01_000012/dx
container_1559203334026_0015_01_000013  Sat Jun 01 13:27:52 +0800 2019                   N/A                 RUNNING            CDH-146:8041    http://CDH-146:8042/node/containerlogs/container_1559203334026_0015_01_000013/dx
container_1559203334026_0015_01_000010  Sat Jun 01 13:27:52 +0800 2019                   N/A                 RUNNING            CDH-146:8041    http://CDH-146:8042/node/containerlogs/container_1559203334026_0015_01_000010/dx
container_1559203334026_0015_01_000011  Sat Jun 01 13:27:52 +0800 2019                   N/A                 RUNNING            CDH-146:8041    http://CDH-146:8042/node/containerlogs/container_1559203334026_0015_01_000011/dx
container_1559203334026_0015_01_000016  Sat Jun 01 13:27:52 +0800 2019                   N/A                 RUNNING            CDH-146:8041    http://CDH-146:8042/node/containerlogs/container_1559203334026_0015_01_000016/dx
container_1559203334026_0015_01_000014  Sat Jun 01 13:27:52 +0800 2019                   N/A                 RUNNING            CDH-146:8041    http://CDH-146:8042/node/containerlogs/container_1559203334026_0015_01_000014/dx
container_1559203334026_0015_01_000015  Sat Jun 01 13:27:52 +0800 2019                   N/A                 RUNNING            CDH-146:8041    http://CDH-146:8042/node/containerlogs/container_1559203334026_0015_01_000015/dx
container_1559203334026_0015_01_000004  Sat Jun 01 13:27:52 +0800 2019                   N/A                 RUNNING            CDH-142:8041    http://CDH-142:8042/node/containerlogs/container_1559203334026_0015_01_000004/dx
container_1559203334026_0015_01_000005  Sat Jun 01 13:27:52 +0800 2019                   N/A                 RUNNING            CDH-142:8041    http://CDH-142:8042/node/containerlogs/container_1559203334026_0015_01_000005/dx
container_1559203334026_0015_01_000002  Sat Jun 01 13:27:52 +0800 2019                   N/A                 RUNNING            CDH-142:8041    http://CDH-142:8042/node/containerlogs/container_1559203334026_0015_01_000002/dx
container_1559203334026_0015_01_000003  Sat Jun 01 13:27:52 +0800 2019                   N/A                 RUNNING            CDH-142:8041    http://CDH-142:8042/node/containerlogs/container_1559203334026_0015_01_000003/dx
container_1559203334026_0015_01_000008  Sat Jun 01 13:27:52 +0800 2019                   N/A                 RUNNING            CDH-142:8041    http://CDH-142:8042/node/containerlogs/container_1559203334026_0015_01_000008/dx
container_1559203334026_0015_01_000009  Sat Jun 01 13:27:52 +0800 2019                   N/A                 RUNNING            CDH-142:8041    http://CDH-142:8042/node/containerlogs/container_1559203334026_0015_01_000009/dx
container_1559203334026_0015_01_000006  Sat Jun 01 13:27:52 +0800 2019                   N/A                 RUNNING            CDH-142:8041    http://CDH-142:8042/node/containerlogs/container_1559203334026_0015_01_000006/dx
container_1559203334026_0015_01_000007  Sat Jun 01 13:27:52 +0800 2019                   N/A                 RUNNING            CDH-142:8041    http://CDH-142:8042/node/containerlogs/container_1559203334026_0015_01_000007/dx
container_1559203334026_0015_01_000001  Sat Jun 01 13:27:38 +0800 2019                   N/A                 RUNNING            CDH-142:8041    http://CDH-142:8042/node/containerlogs/container_1559203334026_0015_01_000001/dx
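
To see at a glance which node each container runs on, the listing can be reduced to container ID and host name. The sketch below assumes the column layout shown above, where the Host column is second to last; the function name container_hosts is made up for this example.

```shell
# Hypothetical helper: read the output of
#   yarn container -list <applicationAttemptId>
# on stdin and print "<container id> <host>" per container.
container_hosts() {
  awk '$1 ~ /^container_/ {
    # The Host column (second to last) looks like CDH-146:8041;
    # keep only the part before the colon.
    split($(NF - 1), hostport, ":")
    print $1, hostport[1]
  }'
}

# Example use (left commented out):
#   yarn container -list appattempt_1559203334026_0015_000001 | container_hosts
```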

Log in to the node that hosts the executor you are interested in, and use the following command to find the running process and its pid:

[root@cdh-146 ~]# ps -axu | grep container_1559203334026_0015_01_000013
yarn      8844  0.0  0.0 113144  1496 ?        S    13:27   0:00 bash /data6/yarn/nm/usercache/dx/appcache/application_1559203334026_0015/container_1559203334026_0015_01_000013/default_container_executor.sh
yarn      8857  0.0  0.0 113280  1520 ?        Ss   13:27   0:00 /bin/bash -c /usr/java/jdk1.8.0_171-amd64/bin/java -server -Xmx6144m '-Dcom.sun.management.jmxremote' '-Dcom.sun.management.jmxremote.port=0' '-Dcom.sun.management.jmxremote.authenticate=false' '-Dcom.sun.management.jmxremote.ssl=false' -Djava.io.tmpdir=/data6/yarn/nm/usercache/dx/appcache/application_1559203334026_0015/container_1559203334026_0015_01_000013/tmp '-Dspark.network.timeout=10000000' '-Dspark.driver.port=47564' '-Dspark.port.maxRetries=32' -Dspark.yarn.app.container.log.dir=/data6/yarn/container-logs/application_1559203334026_0015/container_1559203334026_0015_01_000013 -XX:OnOutOfMemoryError='kill %p' org.apache.spark.executor.CoarseGrainedExecutorBackend --driver-url spark://CoarseGrainedScheduler@CDH-143:47564 --executor-id 12 --hostname CDH-146 --cores 2 --app-id application_1559203334026_0015 --user-class-path file:/data6/yarn/nm/usercache/dx/appcache/application_1559203334026_0015/container_1559203334026_0015_01_000013/__app__.jar --user-class-path file:/data6/yarn/nm/usercache/dx/appcache/application_1559203334026_0015/container_1559203334026_0015_01_000013/streaming-dx-perf-3.0.0.jar --user-class-path file:/data6/yarn/nm/usercache/dx/appcache/application_1559203334026_0015/container_1559203334026_0015_01_000013/dx-common-3.0.0.jar --user-class-path file:/data6/yarn/nm/usercache/dx/appcache/application_1559203334026_0015/container_1559203334026_0015_01_000013/spark-sql-kafka-0-10_2.11-2.4.0.jar --user-class-path file:/data6/yarn/nm/usercache/dx/appcache/application_1559203334026_0015/container_1559203334026_0015_01_000013/spark-avro_2.11-3.2.0.jar --user-class-path file:/data6/yarn/nm/usercache/dx/appcache/application_1559203334026_0015/container_1559203334026_0015_01_000013/shc-core-1.1.2-2.2-s_2.11-SNAPSHOT.jar --user-class-path file:/data6/yarn/nm/usercache/dx/appcache/application_1559203334026_0015/container_1559203334026_0015_01_000013/rocksdbjni-5.17.2.jar --user-class-path 
file:/data6/yarn/nm/usercache/dx/appcache/application_1559203334026_0015/container_1559203334026_0015_01_000013/kafka-clients-0.10.0.1.jar --user-class-path file:/data6/yarn/nm/usercache/dx/appcache/application_1559203334026_0015/container_1559203334026_0015_01_000013/elasticsearch-spark-20_2.11-6.4.1.jar --user-class-path file:/data6/yarn/nm/usercache/dx/appcache/application_1559203334026_0015/container_1559203334026_0015_01_000013/dx_Spark_State_Store_Plugin-1.0-SNAPSHOT.jar --user-class-path file:/data6/yarn/nm/usercache/dx/appcache/application_1559203334026_0015/container_1559203334026_0015_01_000013/bijection-core_2.11-0.9.5.jar --user-class-path file:/data6/yarn/nm/usercache/dx/appcache/application_1559203334026_0015/container_1559203334026_0015_01_000013/bijection-avro_2.11-0.9.5.jar 1>/data6/yarn/container-logs/application_1559203334026_0015/container_1559203334026_0015_01_000013/stdout 2>/data6/yarn/container-logs/application_1559203334026_0015/container_1559203334026_0015_01_000013/stderr
yarn      9000  143  3.3 8736712 4379648 ?     Sl   13:27  24:35 /usr/java/jdk1.8.0_171-amd64/bin/java -server -Xmx6144m -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.port=0 -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -Djava.io.tmpdir=/data6/yarn/nm/usercache/dx/appcache/application_1559203334026_0015/container_1559203334026_0015_01_000013/tmp -Dspark.network.timeout=10000000 -Dspark.driver.port=47564 -Dspark.port.maxRetries=32 -Dspark.yarn.app.container.log.dir=/data6/yarn/container-logs/application_1559203334026_0015/container_1559203334026_0015_01_000013 -XX:OnOutOfMemoryError=kill %p org.apache.spark.executor.CoarseGrainedExecutorBackend --driver-url spark://CoarseGrainedScheduler@CDH-143:47564 --executor-id 12 --hostname CDH-146 --cores 2 --app-id application_1559203334026_0015 --user-class-path file:/data6/yarn/nm/usercache/dx/appcache/application_1559203334026_0015/container_1559203334026_0015_01_000013/__app__.jar --user-class-path file:/data6/yarn/nm/usercache/dx/appcache/application_1559203334026_0015/container_1559203334026_0015_01_000013/dx-domain-perf-3.0.0.jar --user-class-path file:/data6/yarn/nm/usercache/dx/appcache/application_1559203334026_0015/container_1559203334026_0015_01_000013/dx-common-3.0.0.jar --user-class-path file:/data6/yarn/nm/usercache/dx/appcache/application_1559203334026_0015/container_1559203334026_0015_01_000013/spark-sql-kafka-0-10_2.11-2.4.0.jar --user-class-path file:/data6/yarn/nm/usercache/dx/appcache/application_1559203334026_0015/container_1559203334026_0015_01_000013/spark-avro_2.11-3.2.0.jar --user-class-path file:/data6/yarn/nm/usercache/dx/appcache/application_1559203334026_0015/container_1559203334026_0015_01_000013/shc-core-1.1.2-2.2-s_2.11-SNAPSHOT.jar --user-class-path file:/data6/yarn/nm/usercache/dx/appcache/application_1559203334026_0015/container_1559203334026_0015_01_000013/rocksdbjni-5.17.2.jar --user-class-path 
file:/data6/yarn/nm/usercache/dx/appcache/application_1559203334026_0015/container_1559203334026_0015_01_000013/kafka-clients-0.10.0.1.jar --user-class-path file:/data6/yarn/nm/usercache/dx/appcache/application_1559203334026_0015/container_1559203334026_0015_01_000013/elasticsearch-spark-20_2.11-6.4.1.jar --user-class-path file:/data6/yarn/nm/usercache/dx/appcache/application_1559203334026_0015/container_1559203334026_0015_01_000013/dx_Spark_State_Store_Plugin-1.0-SNAPSHOT.jar --user-class-path file:/data6/yarn/nm/usercache/dx/appcache/application_1559203334026_0015/container_1559203334026_0015_01_000013/bijection-core_2.11-0.9.5.jar --user-class-path file:/data6/yarn/nm/usercache/dx/appcache/application_1559203334026_0015/container_1559203334026_0015_01_000013/bijection-avro_2.11-0.9.5.jar
root     25939  0.0  0.0 112780   956 pts/1    S+   13:45   0:00 grep --color=auto container_1559203334026_0015_01_000013
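
The grep above returns four lines: two bash wrappers, the executor JVM itself (pid 9000), and the grep process. A possible way to pick out only the JVM is to require that the command itself is a java binary. This is a sketch under that assumption; the function name executor_pid is invented here, and it expects input in the format produced by ps -e -ww -o pid,args.

```shell
# Hypothetical helper: print the pid of the executor JVM for a container.
# $1 = container ID; stdin = output of `ps -e -ww -o pid,args`.
executor_pid() {
  # Keep only processes whose command (field 2) ends in "/java" and
  # whose arguments mention the container ID. This skips the bash
  # wrapper scripts and any grep process.
  awk -v cid="$1" '$2 ~ /\/java$/ && index($0, cid) > 0 { print $1 }'
}

# Example use (left commented out):
#   ps -e -ww -o pid,args | executor_pid container_1559203334026_0015_01_000013
```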

Then use the pid to find the corresponding JMX port:

[root@cdh-146 ~]# sudo netstat -antp | grep 9000
tcp        0      0 10.dx.dx.146:9000      0.0.0.0:*               LISTEN      2642/python2.7      
tcp6       0      0 :::48169                :::*                    LISTEN      9000/java           
tcp6       0      0 :::37692                :::*                    LISTEN      9000/java           
tcp6       0      0 10.dx.dx.146:52710     :::*                    LISTEN      9000/java           
tcp6       0      0 10.dx.dx.146:55535     10.dx.dx.142:38397     ESTABLISHED 9000/java                
tcp6   64088      0 10.dx.dx.146:45410     10.206.186.35:9092      ESTABLISHED 9000/java           
tcp6       0      0 10.dx.dx.146:60259     10.dx.dx.143:47564     ESTABLISHED 9000/java           

From this output, the JMX port is probably 48169 or 37692, the two ports on which pid 9000 listens on all interfaces. Try each one in turn to connect to the corresponding Spark executor.
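
Note that grep 9000 also matched an unrelated python process listening on port 9000. A slightly stricter filter, sketched below, matches the pid in the PID/Program column only and keeps just the LISTEN rows. The function name listen_ports_for_pid is made up for this example, and it assumes the netstat -antp column layout shown above.

```shell
# Hypothetical helper: print the ports a given pid is listening on.
# $1 = pid; stdin = output of `netstat -antp`.
listen_ports_for_pid() {
  awk -v pid="$1" '$6 == "LISTEN" && $7 ~ ("^" pid "/") {
    # The local address is ":::48169" or "10.0.0.1:52710";
    # the port is whatever follows the last colon.
    n = split($4, parts, ":")
    print parts[n]
  }'
}

# Example use (left commented out):
#   sudo netstat -antp | listen_ports_for_pid 9000
```

This narrows the list but does not say which port is the JMX one, so the candidates still have to be tried.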

Adding the connection in JVisualVM

On the local Windows machine, go to the JDK directory, find ${JAVA_HOME}/bin/jvisualvm.exe, and run it. After it starts, right-click "Remote" and add a JMX connection.

Enter the IP of the node where the executor is running, together with the candidate JMX port.
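
The JMX connection field takes either host:port or a full JMX service URL. As a small convenience, the sketch below prints the standard RMI-based service URL for a host and port. The function name jmx_service_url is invented for this example; the host and ports in the usage comment are the candidates found earlier in this article.

```shell
# Hypothetical helper: build the RMI-based JMX service URL.
# $1 = executor host name or IP, $2 = candidate JMX port.
jmx_service_url() {
  printf 'service:jmx:rmi:///jndi/rmi://%s:%s/jmxrmi\n' "$1" "$2"
}

# Example use: print one URL per candidate port (left commented out):
#   for port in 48169 37692; do jmx_service_url CDH-146 "$port"; done
```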

Once the connection is established, you can start monitoring the executor JVM.

 


Origin www.cnblogs.com/yy3b2007com/p/10960588.html