Compiling Spark and Remote Debugging

One: Compilation

  For example: Spark 2.4 with Hadoop 2.8.4

1. Modify the pom file in the Spark project root directory

Add the following profile to the pom file:

<profile>
  <id>hadoop-2.8</id>
  <properties>
    <hadoop.version>2.8.4</hadoop.version>
  </properties>
</profile>

2. Run the following in the Spark home directory:

mvn  -T 4 -Pyarn -Phadoop-2.8 -Dhadoop.version=2.8.4 -DskipTests clean package
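Before invoking Maven, the Spark build documentation recommends raising Maven's memory limits, otherwise the build can fail on the larger modules. A minimal sketch (the values below are the commonly recommended ones and may need adjusting for your machine):

```shell
# Raise Maven's heap and JIT code-cache limits before compiling Spark;
# without this the build can hit OutOfMemoryError on large modules.
export MAVEN_OPTS="-Xmx2g -XX:ReservedCodeCacheSize=512m"
echo "$MAVEN_OPTS"
```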

To speed up the build, modify dev/make-distribution.sh so that it no longer evaluates the version properties through Maven (which is slow), and hard-code them instead:

#VERSION=$("$MVN" help:evaluate -Dexpression=project.version $@ 2>/dev/null\
#    | grep -v "INFO"\
#    | grep -v "WARNING"\
#    | tail -n 1)
#SCALA_VERSION=$("$MVN" help:evaluate -Dexpression=scala.binary.version $@ 2>/dev/null\
#    | grep -v "INFO"\
#    | grep -v "WARNING"\
#    | tail -n 1)
#SPARK_HADOOP_VERSION=$("$MVN" help:evaluate -Dexpression=hadoop.version $@ 2>/dev/null\
#    | grep -v "INFO"\
#    | grep -v "WARNING"\
#    | tail -n 1)
SPARK_HIVE=$("$MVN" help:evaluate -Dexpression=project.activeProfiles -pl sql/hive $@ 2>/dev/null\
    | grep -v "INFO"\
    | grep -v "WARNING"\
    | fgrep --count "<id>hive</id>";\
    # Reset exit status to 0, otherwise the script stops here if the last grep finds nothing\
    # because we use "set -o pipefail"
    echo -n)
VERSION=2.4.0
SCALA_VERSION=2.11.8
SPARK_HADOOP_VERSION=2.8.4
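With the three variables hard-coded, make-distribution.sh skips three slow `mvn help:evaluate` invocations. The values still have to match the actual build, because they end up in the name of the output tarball; a sketch of how the name is derived (NAME comes from the `--name` flag used in step 3):

```shell
# make-distribution.sh composes the tarball name from the hard-coded
# VERSION variable plus the --name argument.
VERSION=2.4.0
NAME=hadoop2.8
TARBALL="spark-$VERSION-bin-$NAME.tgz"
echo "$TARBALL"
```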

 

3. Run the complete Maven build and package the distribution

  Run the following in the Spark root directory:

./dev/make-distribution.sh --name hadoop2.8 --tgz -PR -Phadoop-2.8 -Phive -Phive-thriftserver -Pyarn

As noted above, the modification to dev/make-distribution.sh speeds up compilation. The result is a finished distribution tarball in the Spark root directory (SPARK_HOME).

Two: Remote Debugging

1. Edit the configuration file of the deployed Spark on the remote machine:

spark-2.4.0-bin-hadoop2.8/conf/spark-defaults.conf

Add the following line to debug driver-side code:

 

spark.driver.extraJavaOptions -agentlib:jdwp=transport=dt_socket,server=n,address=<your-local-machine-ip>:5007,suspend=y

 

Executors can be debugged the same way; just add the corresponding options to spark.executor.extraJavaOptions.
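For example, a spark-defaults.conf that debugs both sides might look like this (the IP is a placeholder; using a different port per JVM avoids collisions when the driver and an executor land on the same host):

```
spark.driver.extraJavaOptions   -agentlib:jdwp=transport=dt_socket,server=n,address=<your-local-machine-ip>:5007,suspend=y
spark.executor.extraJavaOptions -agentlib:jdwp=transport=dt_socket,server=n,address=<your-local-machine-ip>:5008,suspend=y
```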

2. Import the Spark source into IDEA and configure a remote debug run configuration.

 

Listen mode is used here because the local machine sits behind a network barrier (e.g. NAT or a firewall) relative to the remote cluster, so the remote JVM has to connect out to the local debugger rather than the other way around.
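For comparison, the two JDWP modes differ only in the `server` flag and in where the address points (a sketch; `<local-ip>` is a placeholder):

```
# attach mode (server=y): the remote JVM listens, the IDE connects to it
-agentlib:jdwp=transport=dt_socket,server=y,address=5007,suspend=y
# listen mode (server=n): the remote JVM dials out to the IDE's listener
-agentlib:jdwp=transport=dt_socket,server=n,address=<local-ip>:5007,suspend=y
```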

Start the remote debug run configuration in the local IDEA first, then launch the Spark job on the remote side.

(Figure: IDEA remote debug configuration)

Enjoy!


Origin www.cnblogs.com/songchaolin/p/12028356.html