Spark: small getting-started examples

Copyright notice: this article originates from www.54manong.com. Please cite the original address when reposting. Thanks! https://blog.csdn.net/xiaohuanglv/article/details/88388474

Original link: http://www.54manong.com/?id=1221

1,

pyspark

2,

spark-shell

Spark web UI (served on port 4040 by default while an application is running):

http://127.0.0.1:4040/jobs/

3,

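Setting the log output level:

The conf directory contains a file named log4j.properties.template. Copy this logging template to conf/log4j.properties to use it as the logging configuration file, then find the following line: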

log4j.rootCategory=INFO, console

Then lower the log level with the setting below, so that only warnings and more severe messages are shown:

log4j.rootCategory=WARN, console
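The same edit can be scripted instead of done by hand. Below is a minimal sketch in plain Python, assuming the template contains a `log4j.rootCategory=...` line as shown above; the function name `lower_log_level` is hypothetical, and reading the template and writing conf/log4j.properties is left out.

```python
import re


def lower_log_level(properties_text: str, level: str = "WARN") -> str:
    """Return the properties text with the log4j.rootCategory level replaced."""
    # Matches e.g. "log4j.rootCategory=INFO, console" and keeps the appender list.
    pattern = re.compile(r"^(log4j\.rootCategory\s*=\s*)\w+(.*)$", re.MULTILINE)
    return pattern.sub(lambda m: m.group(1) + level + m.group(2), properties_text)


template = (
    "# Set everything to be logged to the console\n"
    "log4j.rootCategory=INFO, console\n"
)
print(lower_log_level(template))
```

Only the level on the rootCategory line changes; comments and other settings in the file pass through untouched.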

4,

Changing where Spark stores its temporary files:

In conf/spark-defaults.conf, add the following line:

spark.local.dir /diskb/sparktmp,/diskc/sparktmp,/diskd/sparktmp,/diske/sparktmp,/diskf/sparktmp,/diskg/sparktmp

Note: multiple directories can be configured, separated by ",".

Alternatively, add the following to spark-env.sh:

export SPARK_LOCAL_DIRS=/diskb/sparktmp,/diskc/sparktmp,/diskd/sparktmp,/diske/sparktmp,/diskf/sparktmp,/diskg/sparktmp

If both spark-env.sh and spark-defaults.conf are configured, SPARK_LOCAL_DIRS overrides spark.local.dir.
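The precedence rule can be written down as a small sketch. This is plain Python that only models the rule described above; it is not Spark's own code. The function name `effective_local_dirs` is hypothetical, and the `/tmp` fallback is my assumption about the default when neither setting is present.

```python
def effective_local_dirs(conf: dict, env: dict) -> list:
    """Resolve the scratch directories from config and environment."""
    # SPARK_LOCAL_DIRS (environment) wins over spark.local.dir (config file).
    raw = env.get("SPARK_LOCAL_DIRS") or conf.get("spark.local.dir") or "/tmp"
    # Both settings accept a comma-separated list of directories.
    return [d.strip() for d in raw.split(",") if d.strip()]


conf = {"spark.local.dir": "/diskb/sparktmp,/diskc/sparktmp"}
env = {"SPARK_LOCAL_DIRS": "/diskd/sparktmp,/diske/sparktmp"}
print(effective_local_dirs(conf, env))
print(effective_local_dirs(conf, {}))
```

With both set, only the directories from the environment variable are used; with an empty environment, the value from spark-defaults.conf applies.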

5,

Several ways to run Spark programs:

spark-submit:

spark-submit \
  --class [main class] \
  --master [spark://207.184.161.138:7077] \
  --deploy-mode [client | cluster] \
  --conf [key=value] \
  --executor-memory [memory] \
  --total-executor-cores [cores] \
  [application jar] \
  [application args]
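When these commands are launched from a script, it can help to assemble the argument list in code rather than by string concatenation. The following is a rough sketch: `build_spark_submit` is a hypothetical helper, it only builds the list and does not run anything, and it covers just the options listed above.

```python
import shlex


def build_spark_submit(main_class, master, app_jar, app_args=(),
                       deploy_mode=None, conf=None,
                       executor_memory=None, total_executor_cores=None):
    """Build a spark-submit argument list in the order shown above."""
    cmd = ["spark-submit", "--class", main_class, "--master", master]
    if deploy_mode:
        cmd += ["--deploy-mode", deploy_mode]
    # Each configuration entry becomes its own --conf key=value pair.
    for key, value in (conf or {}).items():
        cmd += ["--conf", f"{key}={value}"]
    if executor_memory:
        cmd += ["--executor-memory", executor_memory]
    if total_executor_cores is not None:
        cmd += ["--total-executor-cores", str(total_executor_cores)]
    # The application jar comes after all options, followed by its own arguments.
    cmd.append(app_jar)
    cmd += list(app_args)
    return cmd


cmd = build_spark_submit(
    "org.apache.spark.examples.JavaSparkPi", "local[4]", "spark-example.jar"
)
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd)` as-is; `shlex.join` (Python 3.8+) is used here only to print a shell-safe form.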

PI:

spark-submit --class org.apache.spark.examples.JavaSparkPi --master local[4] spark-example.jar
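For context, the Pi example estimates π by random sampling: it draws points in the square [-1, 1] × [-1, 1] and counts how many fall inside the unit circle. Here is a single-process sketch of that idea in plain Python. It is not the bundled example itself; the function name and the fixed seed are my own choices, and Spark would spread the sampling loop across partitions.

```python
import random


def estimate_pi(num_samples: int, seed: int = 0) -> float:
    """Monte Carlo estimate of pi from points sampled in [-1, 1] x [-1, 1]."""
    rng = random.Random(seed)
    inside = 0
    for _ in range(num_samples):
        x = rng.uniform(-1.0, 1.0)
        y = rng.uniform(-1.0, 1.0)
        # The fraction of points inside the unit circle approaches pi / 4.
        if x * x + y * y <= 1.0:
            inside += 1
    return 4.0 * inside / num_samples


print("Pi is roughly", estimate_pi(200_000))
```

More samples give a tighter estimate, which is why the work parallelizes well: each partition can sample independently and only the counts need to be summed.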

wordcount:

spark-submit --class org.apache.spark.examples.JavaWordCount --master local[4] spark-example.jar hdfs://localhost:9000/user/lenovo/wordcount/README.md
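The word count job follows the usual split, pair, and sum-by-key pattern. The sketch below reproduces that logic on an in-memory list of lines in plain Python, without Spark. Splitting on any whitespace is a simplification on my part, and `word_count` is a hypothetical name.

```python
from collections import Counter


def word_count(lines):
    """Count occurrences of each whitespace-separated word."""
    # Step 1 (like flatMap): break every line into words.
    words = [word for line in lines for word in line.split()]
    # Step 2 (like map): pair each word with a count of 1.
    pairs = [(word, 1) for word in words]
    # Step 3 (like reduceByKey): add up the counts per word.
    counts = Counter()
    for word, one in pairs:
        counts[word] += one
    return dict(counts)


print(word_count(["a b a", "b c"]))
```

In the Spark version each step runs on partitions of the input file, and only the per-key sums are shuffled between executors.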

sql:

spark-submit --class org.apache.spark.examples.sql.JavaSparkSQLExample --master local[4] examples/jars/spark-examples_2.11-2.3.0.jar

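Structured Streaming:

Download netcat (https://eternallybored.org/misc/netcat/netcat-win32-1.12.zip).

Unzip it and copy nc.exe to C:\Windows. Then start a listener on port 9999 in one terminal, and submit the example from another: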

nc -l -p 9999

spark-submit --class org.apache.spark.examples.sql.streaming.StructuredNetworkWordCount --master local[4] examples/jars/spark-examples_2.11-2.3.0.jar localhost 9999

Spark Streaming (this example also reads from the nc listener on port 9999):

spark-submit --class org.apache.spark.examples.streaming.JavaNetworkWordCount --master local[4] examples/jars/spark-examples_2.11-2.3.0.jar localhost 9999
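The two network word counts report different things. The Spark Streaming (DStream) example counts the words within each batch interval, while the Structured Streaming example keeps a running count over everything received so far. The plain-Python sketch below simulates both on a list of batches; the function names are hypothetical, and a "batch" here is simply a list of lines typed into nc.

```python
from collections import Counter


def per_batch_counts(batches):
    """Count words independently for each batch (DStream-style output)."""
    return [
        dict(Counter(word for line in batch for word in line.split()))
        for batch in batches
    ]


def running_counts(batches):
    """Accumulate counts across batches, one snapshot per batch."""
    total = Counter()
    snapshots = []
    for batch in batches:
        total.update(word for line in batch for word in line.split())
        snapshots.append(dict(total))
    return snapshots


batches = [["spark spark"], ["spark streaming"]]
print(per_batch_counts(batches))
print(running_counts(batches))
```

In the second batch the per-batch view shows "spark" once, whereas the running view shows it three times because earlier input is still counted.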

Random forest:

spark-submit --class org.apache.spark.examples.ml.JavaRandomForestClassifierExample --master local[4] examples/jars/spark-examples_2.11-2.3.0.jar

pipeline:

spark-submit --class org.apache.spark.examples.ml.JavaEstimatorTransformerParamExample --master local[4] examples/jars/spark-examples_2.11-2.3.0.jar

spark-submit --class org.apache.spark.examples.ml.PipelineExample --master local[4] examples/jars/spark-examples_2.11-2.3.0.jar
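As background for the two pipeline examples: in spark.ml, a pipeline is a sequence of stages, where a transformer maps data to data and an estimator is fitted on data to produce a transformer (a model). The toy sketch below illustrates that contract in plain Python. Every class here is hypothetical and works on lists of strings rather than DataFrames, so treat it as an illustration of the idea and not as the spark.ml API.

```python
class Lowercase:
    """Transformer: maps rows to rows, nothing to learn."""

    def transform(self, rows):
        return [row.lower() for row in rows]


class VocabularyEstimator:
    """Estimator: fit() learns a vocabulary and returns a model."""

    def fit(self, rows):
        vocab = sorted({word for row in rows for word in row.split()})
        return VocabularyModel(vocab)


class VocabularyModel:
    """Fitted transformer: replaces known words with vocabulary indices."""

    def __init__(self, vocab):
        self.index = {word: i for i, word in enumerate(vocab)}

    def transform(self, rows):
        return [[self.index[w] for w in row.split() if w in self.index]
                for row in rows]


class PipelineModel:
    """Fitted pipeline: applies each fitted stage in order."""

    def __init__(self, stages):
        self.stages = stages

    def transform(self, rows):
        for stage in self.stages:
            rows = stage.transform(rows)
        return rows


class Pipeline:
    """Unfitted pipeline: fits estimators in order on the transformed data."""

    def __init__(self, stages):
        self.stages = stages

    def fit(self, rows):
        fitted = []
        for stage in self.stages:
            if hasattr(stage, "fit"):
                stage = stage.fit(rows)  # estimator becomes a model
            rows = stage.transform(rows)  # feed the output to the next stage
            fitted.append(stage)
        return PipelineModel(fitted)


model = Pipeline([Lowercase(), VocabularyEstimator()]).fit(["Spark ML", "ml pipeline"])
print(model.transform(["Spark Pipeline"]))
```

Fitting once and reusing the returned model on new rows mirrors how a fitted pipeline is applied to test data.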
