Learning Spark: packaging a first program and submitting it to the cluster

Passwordless SSH login setup

ssh-keygen
cd ~/.ssh
touch authorized_keys
cat id_rsa.pub >> authorized_keys
chmod 600 authorized_keys
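
A quick way to double-check the permissions set above is sketched below. It assumes the commonly recommended modes (700 for the .ssh directory, 600 for authorized_keys), since sshd may ignore the key file when permissions are too open. The names has_mode and check_ssh_perms are hypothetical helpers written for this note, not OpenSSH tools.

```shell
# has_mode PATH MODE: succeeds when PATH has exactly the octal permission MODE.
has_mode() {
  [ -n "$(find "$1" -maxdepth 0 -perm "$2" 2>/dev/null)" ]
}

# check_ssh_perms DIR: checks DIR (default ~/.ssh) and DIR/authorized_keys.
# Assumption: 700 and 600 are the modes we want, as in the steps above.
check_ssh_perms() {
  local dir="${1:-$HOME/.ssh}"
  has_mode "$dir" 700 || { echo "$dir should be mode 700"; return 1; }
  has_mode "$dir/authorized_keys" 600 || {
    echo "$dir/authorized_keys should be mode 600"
    return 1
  }
  echo "permissions look fine"
}
```

Calling check_ssh_perms with no argument inspects ~/.ssh. After that, running ssh localhost should log in without asking for a password.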

Environment and tools

Environment

System: Ubuntu

JDK 1.7.0_79

Scala 2.10.4

Hadoop 2.6.0

Spark 1.6.2

Packaging tool

IDEA + sbt

Packaging

Install the plugin

Install the Scala plugin in advance: click File -> Settings -> Plugins, type scala in the search box, and click Install. Restart the IDE after installation.

Create the project

File -> New Project -> Scala -> SBT, select the corresponding versions, then click Finish.

Write the code

Add the Spark dependency to build.sbt:

name := "demoPro"

version := "1.0"

scalaVersion := "2.10.4"

libraryDependencies += "org.apache.spark" % "spark-core_2.10" % "1.6.2"

Create WordCount.scala with the following code:

import org.apache.spark.{SparkConf, SparkContext}

/**
 * Created by Administrator on 2018/2/20.
 */
object WordCount {

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("wordcount")
    val sc = new SparkContext(conf)
    // Read the input file as an RDD of lines
    val input = sc.textFile("/home/dell/helloSpark.txt")
    // Split each line on spaces to get an RDD of words
    val words = input.flatMap(line => line.split(" "))
    // Pair each word with 1, then sum the counts per word
    val counts = words.map(word => (word, 1)).reduceByKey(_ + _)
    // Write the (word, count) pairs; the output directory must not already exist
    counts.saveAsTextFile("/home/dell/helloSparkRes")
    sc.stop()
  }
}
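
For small inputs, the flatMap / map / reduceByKey steps can be cross-checked locally without Spark. The sketch below is not part of the original program: wordcount_local is a hypothetical helper that splits on spaces, counts each word, and prints pairs in the same (word,count) form that saveAsTextFile writes for tuples. One assumed difference: empty tokens are dropped here, whereas split(" ") in the Scala code can produce empty strings (for example on blank lines or repeated spaces).

```shell
# wordcount_local FILE: local word count for cross-checking the Spark job.
# Hypothetical helper; output lines look like (word,count), sorted by word.
wordcount_local() {
  tr ' ' '\n' < "$1" |
    awk 'NF { n[$0]++ } END { for (w in n) printf "(%s,%d)\n", w, n[w] }' |
    LC_ALL=C sort
}
```

Running wordcount_local on a copy of the input file and comparing it with the sorted job output is usually enough to spot a mistake in the program.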

Build the jar

File -> Project Structure -> Artifacts -> click the + sign -> JAR -> the second option (From modules with dependencies) -> specify the Module and Main Class -> for "JAR files from libraries" select the second option (copy to the output directory and link via manifest) -> click OK.

In the menu bar, click Build -> Build Artifacts -> Build.

The jar is generated in the project's out directory, which means packaging succeeded.

Submit the job to the cluster

Start Hadoop

# Go to the sbin directory
cd $HADOOP_HOME/sbin
# Start the Hadoop cluster
./start-all.sh

Upload the test file to HDFS

hadoop fs -mkdir -p /test
hadoop fs -put test.txt /test/test.txt

Upload the program jar

Use FileZilla, sftp, or the rz -y command to upload the program jar.

Submit to Spark

Start the master

sudo ./sbin/start-master.sh
# Visit localhost:8080 to get the master URL (spark://xxx:7077)

Start a worker

sudo ./bin/spark-class org.apache.spark.deploy.worker.Worker spark://dell:7077

Submit the job

sudo ./bin/spark-submit --master spark://dell:7077 --class WordCount /home/dell/demopro.jar

Verify the result

Check that the output folder was generated, then open the files in it to confirm that the word counts are correct.
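
The check can be sketched as a small shell helper. It assumes the output landed on the local filesystem and that the job wrote a _SUCCESS marker next to its part-* files, which is what saveAsTextFile normally does on success. show_result is a hypothetical name used only for this note. If the output ended up in HDFS instead, hadoop fs -cat on the same paths is the equivalent.

```shell
# show_result DIR: prints the job output if the _SUCCESS marker is present.
# Hypothetical helper; DIR is the output directory passed to saveAsTextFile.
show_result() {
  local dir="$1"
  if [ ! -f "$dir/_SUCCESS" ]; then
    echo "no _SUCCESS marker in $dir; the job may not have finished" >&2
    return 1
  fi
  # Each part file holds a slice of the (word,count) pairs
  cat "$dir"/part-*
}
```

For the job above, the call would be show_result /home/dell/helloSparkRes.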

Origin blog.csdn.net/zzq060143/article/details/108343473