Getting Started with Spark WordCount

This post walks through developing a simple Spark application in Eclipse and running and testing it locally in local mode.
1. Download the latest Scala IDE for Eclipse (Windows 64-bit) from http://scala-ide.org/download/sdk.html


After downloading, extract it to the D: drive, launch Eclipse, and choose a workspace.
 

Then create a test project named ScalaDev. Right-click the project and choose Properties; in the dialog select Scala Compiler, check Use Project Settings in the right-hand pane, choose a Scala Installation, and click OK to save the configuration.



2. Add the Spark 1.6.0 jar dependency, spark-assembly-1.6.0-hadoop2.6.0.jar, to the project.
The jar is located under lib/ inside the spark-1.6.0-bin-hadoop2.6.tgz distribution.



Right-click the ScalaDev project and choose Build Path -> Configure Build Path.


Note: if you set the Scala Installation to Latest 2.11 bundle (dynamic), the project will report the error below: a red cross appears on the ScalaDev project, and the Problems view shows that the cause is a mismatch between the project's Scala compiler version and the Scala version bundled with Spark.
More than one scala library found in the build path (D:/eclipse/plugins/org.scala-lang.scala-library_2.11.7.v20150622-112736-1fbce4612c.jar, F:/IMF/Big_Data_Software/spark-1.6.0-bin-hadoop2.6/lib/spark-assembly-1.6.0-hadoop2.6.0.jar). At least one has an incompatible version. Please update the project build path so it contains only one compatible scala library.



 
Fix: right-click Scala Library Container -> Properties, select Latest 2.10 bundle (dynamic) in the dialog, and save.
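If you would rather manage the dependency with a build tool instead of adding the assembly jar by hand, a minimal sbt build definition might look like the sketch below. This is not part of the original Eclipse-based setup; it assumes you pull spark-core from Maven Central rather than using spark-assembly-1.6.0-hadoop2.6.0.jar, and it pins the project to Scala 2.10.x so the compiler matches Spark 1.6.0:

name := "ScalaDev"

version := "1.0"

// Spark 1.6.0 is compiled against Scala 2.10, so the project must use a 2.10.x compiler
scalaVersion := "2.10.6"

// roughly equivalent to putting spark-assembly-1.6.0-hadoop2.6.0.jar on the build path
libraryDependencies += "org.apache.spark" %% "spark-core" % "1.6.0"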

 

3. Create the Spark package under src and create the entry-point object.
Right-click the project and choose New -> Package to create the com.imf.spark package;

 

With the com.imf.spark package selected, create a Scala Object;

 

Before running the test program, copy the README.md file from the spark-1.6.0-bin-hadoop2.6 directory into D://testspark//. The code is as follows:
package com.imf.spark

import org.apache.spark.SparkConf
import org.apache.spark.SparkContext

/**
 * A Spark WordCount program developed in Scala for local testing.
 */
object WordCount {
  def main(args: Array[String]): Unit = {
    /**
     * 1. Create the SparkConf object, which holds the runtime configuration of the Spark program.
     *    For example, setMaster sets the URL of the Spark cluster master the program connects to;
     *    setting it to "local" runs the program locally, which is ideal when machine resources are limited.
     */
    // Create the SparkConf object
    val conf = new SparkConf()
    // Set the application name; it is shown in the monitoring UI while the program runs
    conf.setAppName("My First Spark App!")
    // Use local mode so the program runs locally without a Spark cluster
    conf.setMaster("local")

    /**
     * 2. Create the SparkContext object.
     *    SparkContext is the single entry point to all Spark functionality; every Spark program,
     *    whether written in Scala, Java, Python, or R, must have a SparkContext.
     *    Its core job is to initialize the components the application needs to run, including the
     *    DAGScheduler, TaskScheduler, and SchedulerBackend, and it also registers the program with
     *    the Master. It is the single most important object in the entire application.
     */
    // Create the SparkContext, passing in the SparkConf instance with the runtime parameters and configuration
    val sc = new SparkContext(conf)

    /**
     * 3. Use the SparkContext to create an RDD from a concrete data source (HDFS, HBase, local FS, DB, S3, etc.).
     *    RDDs can be created in three basic ways: from an external data source (such as HDFS),
     *    from a Scala collection, or by transforming another RDD.
     *    The data is split into a series of partitions; the data assigned to each partition is processed by one task.
     */
    // Read a local file with a single partition
    val lines = sc.textFile("D://testspark//README.md", 1)

    /**
     * 4. Apply transformations (higher-order functions such as map and filter) to the initial RDD to do the actual computation.
     * 4.1 Split each line into individual words.
     */
    // Split each line and flatten the per-line results into one large collection
    val words = lines.flatMap { line => line.split(" ") }

    /**
     * 4.2 On top of the split words, count each word occurrence as 1, i.e. word => (word, 1).
     */
    val pairs = words.map { word => (word, 1) }

    /**
     * 4.3 Sum the per-occurrence counts to get the total number of times each word appears in the file.
     */
    // Accumulate the values for identical keys (both locally and at the reducer level)
    val wordCounts = pairs.reduceByKey(_ + _)
    // Print the results
    wordCounts.foreach(pair => println(pair._1 + ":" + pair._2))
    sc.stop()
  }
}
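If you want to go one small step beyond the original program, you could, for example, print only the most frequent words or persist the full result instead of printing it. This is a hedged sketch, not part of the original example: the lines would go just before sc.stop(), the output path is hypothetical, and note that saveAsTextFile on Windows may itself require winutils.exe (see the note after the run output).

// print the 10 most frequent words
wordCounts.sortBy(_._2, ascending = false).take(10).foreach(println)
// or write the full result to a (hypothetical) output directory
wordCounts.saveAsTextFile("D://testspark//wordcount_output")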


Run output:
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
16/01/26 08:23:37 INFO SparkContext: Running Spark version 1.6.0
16/01/26 08:23:42 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
16/01/26 08:23:42 ERROR Shell: Failed to locate the winutils binary in the hadoop binary path
java.io.IOException: Could not locate executable null\bin\winutils.exe in the Hadoop binaries.
    at org.apache.hadoop.util.Shell.getQualifiedBinPath(Shell.java:355)
    at org.apache.hadoop.util.Shell.getWinUtilsPath(Shell.java:370)
    at org.apache.hadoop.util.Shell.<clinit>(Shell.java:363)
    at org.apache.hadoop.util.StringUtils.<clinit>(StringUtils.java:79)
    at org.apache.hadoop.security.Groups.parseStaticMapping(Groups.java:104)
    at org.apache.hadoop.security.Groups.<init>(Groups.java:86)
    at org.apache.hadoop.security.Groups.<init>(Groups.java:66)
    at org.apache.hadoop.security.Groups.getUserToGroupsMappingService(Groups.java:280)
    at org.apache.hadoop.security.UserGroupInformation.initialize(UserGroupInformation.java:271)
    at org.apache.hadoop.security.UserGroupInformation.ensureInitialized(UserGroupInformation.java:248)
    at org.apache.hadoop.security.UserGroupInformation.loginUserFromSubject(UserGroupInformation.java:763)
    at org.apache.hadoop.security.UserGroupInformation.getLoginUser(UserGroupInformation.java:748)
    at org.apache.hadoop.security.UserGroupInformation.getCurrentUser(UserGroupInformation.java:621)
    at org.apache.spark.util.Utils$$anonfun$getCurrentUserName$1.apply(Utils.scala:2136)
    at org.apache.spark.util.Utils$$anonfun$getCurrentUserName$1.apply(Utils.scala:2136)
    at scala.Option.getOrElse(Option.scala:120)
    at org.apache.spark.util.Utils$.getCurrentUserName(Utils.scala:2136)
    at org.apache.spark.SparkContext.<init>(SparkContext.scala:322)
    at com.dt.spark.WordCount$.main(WordCount.scala:29)
    at com.dt.spark.WordCount.main(WordCount.scala)
16/01/26 08:23:42 INFO SecurityManager: Changing view acls to: vivi
16/01/26 08:23:42 INFO SecurityManager: Changing modify acls to: vivi
16/01/26 08:23:42 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(vivi); users with modify permissions: Set(vivi)
16/01/26 08:23:43 INFO Utils: Successfully started service 'sparkDriver' on port 54663.
16/01/26 08:23:43 INFO Slf4jLogger: Slf4jLogger started
16/01/26 08:23:43 INFO Remoting: Starting remoting
16/01/26 08:23:43 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkDriverActorSystem@192.168.100.102:54676]
16/01/26 08:23:43 INFO Utils: Successfully started service 'sparkDriverActorSystem' on port 54676.
16/01/26 08:23:43 INFO SparkEnv: Registering MapOutputTracker
16/01/26 08:23:43 INFO SparkEnv: Registering BlockManagerMaster
16/01/26 08:23:43 INFO DiskBlockManager: Created local directory at C:\Users\vivi\AppData\Local\Temp\blockmgr-5f59f3c2-3b87-49c5-a1ae-e21847aac44b
16/01/26 08:23:43 INFO MemoryStore: MemoryStore started with capacity 1813.7 MB
16/01/26 08:23:43 INFO SparkEnv: Registering OutputCommitCoordinator
16/01/26 08:23:43 INFO Utils: Successfully started service 'SparkUI' on port 4040.
16/01/26 08:23:43 INFO SparkUI: Started SparkUI at http://192.168.100.102:4040
16/01/26 08:23:43 INFO Executor: Starting executor ID driver on host localhost
16/01/26 08:23:43 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 54683.
16/01/26 08:23:43 INFO NettyBlockTransferService: Server created on 54683
16/01/26 08:23:43 INFO BlockManagerMaster: Trying to register BlockManager
16/01/26 08:23:43 INFO BlockManagerMasterEndpoint: Registering block manager localhost:54683 with 1813.7 MB RAM, BlockManagerId(driver, localhost, 54683)
16/01/26 08:23:43 INFO BlockManagerMaster: Registered BlockManager
16/01/26 08:23:46 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 153.6 KB, free 153.6 KB)
16/01/26 08:23:46 INFO MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 13.9 KB, free 167.6 KB)
16/01/26 08:23:46 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on localhost:54683 (size: 13.9 KB, free: 1813.7 MB)
16/01/26 08:23:46 INFO SparkContext: Created broadcast 0 from textFile at WordCount.scala:37
16/01/26 08:23:47 WARN : Your hostname, vivi-PC resolves to a loopback/non-reachable address: fe80:0:0:0:5937:95c4:86da:2f43%30, but we couldn't find any external IP address!
16/01/26 08:23:48 INFO FileInputFormat: Total input paths to process : 1
16/01/26 08:23:48 INFO SparkContext: Starting job: foreach at WordCount.scala:56
16/01/26 08:23:48 INFO DAGScheduler: Registering RDD 3 (map at WordCount.scala:48)
16/01/26 08:23:48 INFO DAGScheduler: Got job 0 (foreach at WordCount.scala:56) with 1 output partitions
16/01/26 08:23:48 INFO DAGScheduler: Final stage: ResultStage 1 (foreach at WordCount.scala:56)
16/01/26 08:23:48 INFO DAGScheduler: Parents of final stage: List(ShuffleMapStage 0)
16/01/26 08:23:48 INFO DAGScheduler: Missing parents: List(ShuffleMapStage 0)
16/01/26 08:23:48 INFO DAGScheduler: Submitting ShuffleMapStage 0 (MapPartitionsRDD[3] at map at WordCount.scala:48), which has no missing parents
16/01/26 08:23:48 INFO MemoryStore: Block broadcast_1 stored as values in memory (estimated size 4.0 KB, free 171.6 KB)
16/01/26 08:23:48 INFO MemoryStore: Block broadcast_1_piece0 stored as bytes in memory (estimated size 2.3 KB, free 173.9 KB)
16/01/26 08:23:48 INFO BlockManagerInfo: Added broadcast_1_piece0 in memory on localhost:54683 (size: 2.3 KB, free: 1813.7 MB)
16/01/26 08:23:48 INFO SparkContext: Created broadcast 1 from broadcast at DAGScheduler.scala:1006
16/01/26 08:23:48 INFO DAGScheduler: Submitting 1 missing tasks from ShuffleMapStage 0 (MapPartitionsRDD[3] at map at WordCount.scala:48)
16/01/26 08:23:48 INFO TaskSchedulerImpl: Adding task set 0.0 with 1 tasks
16/01/26 08:23:48 INFO TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0, localhost, partition 0,PROCESS_LOCAL, 2119 bytes)
16/01/26 08:23:48 INFO Executor: Running task 0.0 in stage 0.0 (TID 0)
16/01/26 08:23:48 INFO HadoopRDD: Input split: file:/D:/testspark/README.md:0+3359
16/01/26 08:23:48 INFO deprecation: mapred.tip.id is deprecated. Instead, use mapreduce.task.id
16/01/26 08:23:48 INFO deprecation: mapred.task.id is deprecated. Instead, use mapreduce.task.attempt.id
16/01/26 08:23:48 INFO deprecation: mapred.task.is.map is deprecated. Instead, use mapreduce.task.ismap
16/01/26 08:23:48 INFO deprecation: mapred.task.partition is deprecated. Instead, use mapreduce.task.partition
16/01/26 08:23:48 INFO deprecation: mapred.job.id is deprecated. Instead, use mapreduce.job.id
16/01/26 08:23:48 INFO Executor: Finished task 0.0 in stage 0.0 (TID 0). 2253 bytes result sent to driver
16/01/26 08:23:48 INFO TaskSetManager: Finished task 0.0 in stage 0.0 (TID 0) in 177 ms on localhost (1/1)
16/01/26 08:23:48 INFO TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool
16/01/26 08:23:48 INFO DAGScheduler: ShuffleMapStage 0 (map at WordCount.scala:48) finished in 0.186 s
16/01/26 08:23:48 INFO DAGScheduler: looking for newly runnable stages
16/01/26 08:23:48 INFO DAGScheduler: running: Set()
16/01/26 08:23:48 INFO DAGScheduler: waiting: Set(ResultStage 1)
16/01/26 08:23:48 INFO DAGScheduler: failed: Set()
16/01/26 08:23:48 INFO DAGScheduler: Submitting ResultStage 1 (ShuffledRDD[4] at reduceByKey at WordCount.scala:54), which has no missing parents
16/01/26 08:23:48 INFO MemoryStore: Block broadcast_2 stored as values in memory (estimated size 2.5 KB, free 176.4 KB)
16/01/26 08:23:48 INFO MemoryStore: Block broadcast_2_piece0 stored as bytes in memory (estimated size 1581.0 B, free 177.9 KB)
16/01/26 08:23:48 INFO BlockManagerInfo: Added broadcast_2_piece0 in memory on localhost:54683 (size: 1581.0 B, free: 1813.7 MB)
16/01/26 08:23:48 INFO SparkContext: Created broadcast 2 from broadcast at DAGScheduler.scala:1006
16/01/26 08:23:48 INFO DAGScheduler: Submitting 1 missing tasks from ResultStage 1 (ShuffledRDD[4] at reduceByKey at WordCount.scala:54)
16/01/26 08:23:48 INFO TaskSchedulerImpl: Adding task set 1.0 with 1 tasks
16/01/26 08:23:48 INFO TaskSetManager: Starting task 0.0 in stage 1.0 (TID 1, localhost, partition 0,NODE_LOCAL, 1894 bytes)
16/01/26 08:23:48 INFO Executor: Running task 0.0 in stage 1.0 (TID 1)
16/01/26 08:23:48 INFO ShuffleBlockFetcherIterator: Getting 1 non-empty blocks out of 1 blocks
16/01/26 08:23:48 INFO ShuffleBlockFetcherIterator: Started 0 remote fetches in 2 ms
package:1
For:2
Programs:1
processing.:1
Because:1
The:1
cluster.:1
its:1
[run:1
APIs:1
have:1
Try:1
computation:1
through:1
several:1
This:2
graph:1
Hive:2
storage:1
["Specifying:1
To:2
page](http://spark.apache.org/documentation.html):1
Once:1
"yarn":1
prefer:1
SparkPi:2
engine:1
version:1
file:1
documentation,:1
processing,:1
the:21
are:1
systems.:1
params:1
not:1
different:1
refer:2
Interactive:2
R,:1
given.:1
if:4
build:3
when:1
be:2
Tests:1
Apache:1
./bin/run-example:2
programs,:1
including:3
Spark.:1
package.:1
1000).count():1
Versions:1
HDFS:1
Data.:1
>>>:1
programming:1
Testing:1
module,:1
Streaming:1
environment:1
run::1
clean:1
1000::2
rich:1
GraphX:1
Please:3
is:6
run:7
URL,:1
threads.:1
same:1
MASTER=spark://host:7077:1
on:5
built:1
against:1
[Apache:1
tests:2
examples:2
at:2
optimized:1
usage:1
using:2
graphs:1
talk:1
Shell:2
class:2
abbreviated:1
directory.:1
README:1
computing:1
overview:1
`examples`:2
example::1
##:8
N:1
set:2
use:3
Hadoop-supported:1
tests](https://cwiki.apache.org/confluence/display/SPARK/Useful+Developer+Tools).:1
running:1
find:1
contains:1
project:1
Pi:1
need:1
or:3
Big:1
Java,:1
high-level:1
uses:1
<class>:1
Hadoop,:2
available:1
requires:1
(You:1
see:1
Documentation:1
of:5
tools:1
using::1
cluster:2
must:1
supports:2
built,:1
system:1
build/mvn:1
Hadoop:3
this:1
Version"](http://spark.apache.org/docs/latest/building-spark.html#specifying-the-hadoop-version):1
particular:2
Python:2
Spark:13
general:2
YARN,:1
pre-built:1
[Configuration:1
locally:2
library:1
A:1
locally.:1
sc.parallelize(1:1
only:1
Configuration:1
following:2
basic:1
#:1
changed:1
More:1
which:2
learning,:1
first:1
./bin/pyspark:1
also:4
should:2
for:11
[params]`.:1
documentation:3
[project:2
mesos://:1
Maven](http://maven.apache.org/).:1
setup:1
<http://spark.apache.org/>:1
latest:1
your:1
MASTER:1
example:3
scala>:1
DataFrames,:1
provides:1
configure:1
distributions.:1
can:6
About:1
instructions.:1
do:2
easiest:1
no:1
how:2
`./bin/run-example:1
Note:1
individual:1
spark://:1
It:2
Scala:2
Alternatively,:1
an:3
variable:1
submit:1
machine:1
thread,:1
them,:1
detailed:2
stream:1
And:1
distribution:1
return:2
Thriftserver:1
./bin/spark-shell:1
"local":1
start:1
You:3
Spark](#building-spark).:1
one:2
help:1
with:3
print:1
Spark"](http://spark.apache.org/docs/latest/building-spark.html).:1
data:1
wiki](https://cwiki.apache.org/confluence/display/SPARK).:1
in:5
-DskipTests:1
downloaded:1
versions:1
online:1
Guide](http://spark.apache.org/docs/latest/configuration.html):1
comes:1
[building:1
Python,:2
Many:1
building:2
Running:1
from:1
way:1
Online:1
site,:1
other:1
Example:1
analysis.:1
sc.parallelize(range(1000)).count():1
you:4
runs.:1
Building:1
higher-level:1
protocols:1
guidance:2
a:8
guide,:1
name:1
fast:1
SQL:2
will:1
instance::1
to:14
core:1
:67
web:1
"local[N]":1
programs:2
package.):1
that:2
MLlib:1
["Building:1
shell::2
Scala,:1
and:10
command,:2
./dev/run-tests:1
sample:1
16/01/26 08:23:48 INFO Executor: Finished task 0.0 in stage 1.0 (TID 1). 1165 bytes result sent to driver
16/01/26 08:23:48 INFO TaskSetManager: Finished task 0.0 in stage 1.0 (TID 1) in 61 ms on localhost (1/1)
16/01/26 08:23:48 INFO TaskSchedulerImpl: Removed TaskSet 1.0, whose tasks have all completed, from pool
16/01/26 08:23:48 INFO DAGScheduler: ResultStage 1 (foreach at WordCount.scala:56) finished in 0.061 s
16/01/26 08:23:48 INFO DAGScheduler: Job 0 finished: foreach at WordCount.scala:56, took 0.328012 s
16/01/26 08:23:48 INFO SparkUI: Stopped Spark web UI at http://192.168.100.102:4040
16/01/26 08:23:48 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
16/01/26 08:23:48 INFO MemoryStore: MemoryStore cleared
16/01/26 08:23:48 INFO BlockManager: BlockManager stopped
16/01/26 08:23:48 INFO BlockManagerMaster: BlockManagerMaster stopped
16/01/26 08:23:48 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
16/01/26 08:23:48 INFO SparkContext: Successfully stopped SparkContext
16/01/26 08:23:48 INFO RemoteActorRefProvider$RemotingTerminator: Shutting down remote daemon.
16/01/26 08:23:48 INFO RemoteActorRefProvider$RemotingTerminator: Remote daemon shut down; proceeding with flushing remote transports.
16/01/26 08:23:48 INFO ShutdownHookManager: Shutdown hook called
16/01/26 08:23:48 INFO ShutdownHookManager: Deleting directory C:\Users\vivi\AppData\Local\Temp\spark-56f9ed0a-5671-449a-955a-041c63569ff2

Note: the ERROR in the output above is raised while loading the Hadoop configuration; because the program runs locally without a Hadoop installation, the winutils.exe binary cannot be found, but this does not affect the test.
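If you want to get rid of the winutils error entirely, a commonly used workaround (not covered in the original post) is to download a winutils.exe matching the bundled Hadoop version (2.6.0 here), place it in a bin directory, and point hadoop.home.dir at that directory's parent before the SparkContext is created. The path below is only an illustration:

// assumes winutils.exe has been placed in D:\hadoop\bin (hypothetical path);
// set this before calling new SparkContext(conf)
System.setProperty("hadoop.home.dir", "D:\\hadoop")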


Reposted from blog.csdn.net/u012879957/article/details/81018414