Spark startup process (Standalone) - Analysis

 

1. The start-all.sh script ultimately executes java -cp ... Master and java -cp ... Worker, i.e. it launches the Master and Worker JVM processes.

 

2. When the Master starts, it first creates an RpcEnv object, which is responsible for managing all communication logic.

 

3. Through this RpcEnv object the Master creates an Endpoint; the Master itself is an Endpoint, and Workers can communicate with it.

 

4. When a Worker starts, it likewise creates an RpcEnv object.

 

5. The Worker creates its own Endpoint object through that RpcEnv.

 

6. Through its RpcEnv object the Worker establishes a connection to the Master and obtains an RpcEndpointRef object, through which it can send messages to the Master.

 

7. The Worker registers with the Master; the registration information includes hostname, port, number of CPU cores, and amount of memory.

 

8. The Master receives the Worker's registration and keeps the registration information in an in-memory table, which also holds an RpcEndpointRef pointing back to that Worker.

 

9. The Master replies to the Worker, acknowledging that the registration has been received and that the Worker is now registered.

 

10. After receiving the registration-success reply, the Worker begins sending periodic heartbeats to the Master. (The whole exchange is sketched in code right after this list.)
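
Steps 2 through 10 amount to a small registration protocol, and the sketch below models it end-to-end. To be clear about what is real and what is invented: RpcEnv, RpcEndpoint and RpcEndpointRef are real (internal, private[spark]) classes in org.apache.spark.rpc, and RegisterWorker, RegisteredWorker and Heartbeat are real message types in org.apache.spark.deploy.DeployMessages (with more fields than shown); everything else here (RpcEnvModel, Endpoint, EndpointRef, MasterEndpoint, WorkerEndpoint, StartupDemo, and the field choices) is a simplified stand-in invented for illustration. In real Spark the Master and each Worker create separate RpcEnvs and connect over the network (the Worker calls rpcEnv.setupEndpointRef to obtain the Master's ref, step 6), whereas this toy keeps both endpoints in a single in-process RpcEnv.

import scala.collection.mutable

// An endpoint is a named message handler; the Master and each Worker are endpoints.
trait Endpoint {
  def receive(msg: Any): Unit
}

// A reference through which messages can be sent to a named endpoint (step 6).
class EndpointRef(env: RpcEnvModel, val name: String) {
  def send(msg: Any): Unit = env.deliver(name, msg)
}

// The RpcEnv owns every endpoint and routes all messages (steps 2 and 4).
class RpcEnvModel {
  private val endpoints = mutable.Map.empty[String, Endpoint]

  // Register an endpoint under a name and hand back a reference to it (steps 3 and 5).
  def setupEndpoint(name: String, endpoint: Endpoint): EndpointRef = {
    endpoints(name) = endpoint
    new EndpointRef(this, name)
  }

  def deliver(to: String, msg: Any): Unit = endpoints(to).receive(msg)
}

// Simplified registration messages (step 7, 9, 10).
case class RegisterWorker(host: String, port: Int, cores: Int,
                          memoryMb: Int, workerRef: EndpointRef)
case object RegisteredWorker
case class Heartbeat(host: String, port: Int)

// The Master keeps each registration, including the worker's ref, in an
// in-memory table (step 8) and acknowledges it (step 9).
class MasterEndpoint extends Endpoint {
  private val workers = mutable.Map.empty[(String, Int), RegisterWorker]

  def receive(msg: Any): Unit = msg match {
    case r: RegisterWorker =>
      workers((r.host, r.port)) = r
      r.workerRef.send(RegisteredWorker)
    case Heartbeat(host, port) =>
      println(s"Master: heartbeat from $host:$port")
  }
}

// The Worker registers itself and, once acknowledged, starts heartbeating (step 10).
class WorkerEndpoint(host: String, port: Int, master: EndpointRef) extends Endpoint {
  var selfRef: EndpointRef = _

  def start(): Unit = master.send(RegisterWorker(host, port, 4, 4096, selfRef))

  def receive(msg: Any): Unit = msg match {
    case RegisteredWorker =>
      // Real Spark schedules this on a timer; a single beat is enough here.
      master.send(Heartbeat(host, port))
  }
}

object StartupDemo extends App {
  val env = new RpcEnvModel
  val masterRef = env.setupEndpoint("Master", new MasterEndpoint)
  val worker = new WorkerEndpoint("hadoop202", 34567, masterRef)
  worker.selfRef = env.setupEndpoint("Worker", worker)
  worker.start() // register -> acknowledged -> heartbeat
}

Running StartupDemo traces the same register, acknowledge, heartbeat sequence as steps 7 to 10 and prints the Master's heartbeat line.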

 

1. Analysis of start-master.sh, the Master startup script

start-master.sh

# Resolve SPARK_HOME if it is not already set
if [ -z "${SPARK_HOME}" ]; then
  export SPARK_HOME="$(cd "`dirname "$0"`"/..; pwd)"
fi

# NOTE: This exact class name is matched downstream by SparkSubmit.
# Any changes need to be reflected there.
CLASS="org.apache.spark.deploy.master.Master"

if [[ "$@" = *--help ]] || [[ "$@" = *-h ]]; then
  echo "Usage: ./sbin/start-master.sh [options]"
  pattern="Usage:"
  pattern+="\|Using Spark's default log4j profile:"
  pattern+="\|Registered signal handlers for"

  "${SPARK_HOME}"/bin/spark-class $CLASS --help 2>&1 | grep -v "$pattern" 1>&2
  exit 1
fi

ORIGINAL_ARGS="$@"

. "${SPARK_HOME}/sbin/spark-config.sh"

. "${SPARK_HOME}/bin/load-spark-env.sh"

# Default the Master RPC port to 7077
if [ "$SPARK_MASTER_PORT" = "" ]; then
  SPARK_MASTER_PORT=7077
fi

# Default the Master host to this machine's fully qualified hostname
if [ "$SPARK_MASTER_HOST" = "" ]; then
  case `uname` in
      (SunOS)
      SPARK_MASTER_HOST="`/usr/sbin/check-hostname | awk '{print $NF}'`"
      ;;
      (*)
      SPARK_MASTER_HOST="`hostname -f`"
      ;;
  esac
fi

# Default the web UI port to 8080
if [ "$SPARK_MASTER_WEBUI_PORT" = "" ]; then
  SPARK_MASTER_WEBUI_PORT=8080
fi

# Hand off to spark-daemon.sh, which actually starts the Master
"${SPARK_HOME}/sbin"/spark-daemon.sh start $CLASS 1 \
  --host $SPARK_MASTER_HOST --port $SPARK_MASTER_PORT --webui-port $SPARK_MASTER_WEBUI_PORT \
  $ORIGINAL_ARGS

spark-daemon.sh

...

execute_command() {
  if [ -z ${SPARK_NO_DAEMONIZE+set} ]; then
      # Finally launch the Master as a background daemon process
      nohup -- "$@" >> $log 2>&1 < /dev/null &
      newpid="$!"

      echo "$newpid" > "$pid"

      # Poll for up to 5 seconds for the java process to start
      for i in {1..10}
      do
        if [[ $(ps -p "$newpid" -o comm=) =~ "java" ]]; then
           break
        fi
        sleep 0.5
      done

      sleep 2
      # Check if the process has died; in that case we'll tail the log so the user can see
      if [[ ! $(ps -p "$newpid" -o comm=) =~ "java" ]]; then
        echo "failed to launch: $@"
        tail -2 "$log" | sed 's/^/  /'
        echo "full log in $log"
      fi
  else
      # With SPARK_NO_DAEMONIZE set, run the command in the foreground instead
      "$@"
  fi
}

...

spark-daemon.sh ultimately invokes bin/spark-class with the Master class:

/opt/module/spark-standalone/bin/spark-class org.apache.spark.deploy.master.Master \
    --host hadoop201 \
    --port 7077 \
    --webui-port 8080

and bin/spark-class assembles the final java startup command:

/opt/module/jdk1.8.0_172/bin/java \
    -cp /opt/module/spark-standalone/conf/:/opt/module/spark-standalone/jars/* \
    -Xmx1g org.apache.spark.deploy.master.Master \
    --host hadoop201 \
    --port 7077 \
    --webui-port 8080
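
At this point the analysis leaves shell and enters Scala: the java command above simply runs Master.main. For orientation, here is that entry point, abridged and paraphrased from Spark 2.x source (org.apache.spark.deploy.master.Master; exact code varies by version, and these classes are private[spark], so this is for reading rather than compiling against). In the Master companion object SYSTEM_NAME is "sparkMaster" and ENDPOINT_NAME is "Master". It performs exactly steps 2 and 3 from the list at the top:

def main(argStrings: Array[String]): Unit = {
  val conf = new SparkConf
  val args = new MasterArguments(argStrings, conf) // parses --host / --port / --webui-port
  // Create the RpcEnv and the Master endpoint (steps 2-3), then block forever
  val (rpcEnv, _, _) = startRpcEnvAndEndpoint(args.host, args.port, args.webUiPort, conf)
  rpcEnv.awaitTermination()
}

def startRpcEnvAndEndpoint(
    host: String,
    port: Int,
    webUiPort: Int,
    conf: SparkConf): (RpcEnv, Int, Option[Int]) = {
  val securityMgr = new SecurityManager(conf)
  // The RpcEnv that manages all of the Master's communication (step 2)
  val rpcEnv = RpcEnv.create(SYSTEM_NAME, host, port, conf, securityMgr)
  // The Master registers itself as an endpoint that Workers can talk to (step 3)
  val masterEndpoint = rpcEnv.setupEndpoint(ENDPOINT_NAME,
    new Master(rpcEnv, rpcEnv.address, webUiPort, securityMgr, conf))
  // Ask the new endpoint which ports it actually bound
  val portsResponse = masterEndpoint.askSync[BoundPortsResponse](BoundPortsRequest)
  (rpcEnv, portsResponse.webUIPort, portsResponse.restPort)
}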

 
