Hadoop + Spark installation notes, resources, and installation packages


To run Hadoop/Spark program code locally, you must simulate the Hadoop/Spark runtime environment, including certain resource packages. Therefore, to build a local test environment under Windows, you first need to configure a Hadoop/Spark environment there as well.
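As an example, a minimal local smoke test on Windows might look like the sketch below. It assumes pyspark is importable and that a Hadoop distribution containing bin\winutils.exe has been unpacked locally; the C:\hadoop-2.7.3 path is a placeholder for wherever you put it.

```python
import os
from pyspark import SparkConf, SparkContext

# Assumed location of the unpacked Hadoop package on Windows; Spark looks
# for bin\winutils.exe under HADOOP_HOME when running in local mode.
os.environ["HADOOP_HOME"] = r"C:\hadoop-2.7.3"

# Run Spark in local mode, using all available cores.
conf = SparkConf().setMaster("local[*]").setAppName("local-smoke-test")
sc = SparkContext(conf=conf)

# A tiny job to confirm the simulated environment works end to end.
print(sc.parallelize(range(10)).map(lambda x: x * x).sum())
sc.stop()
```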



[Resource packages]

[Download link]
Baidu cloud disk link: https://pan.baidu.com/s/1nvXdDsl (password: 5sw0)

[Packages to install on the Linux virtual machine]
Hadoop and Spark are generally installed on a Linux system. A dedicated Linux host is ideal; without one, a virtual machine will do. The four packages below are needed; a quick sanity-check sketch follows the list.

1. A pre-built Hadoop package
2. The JDK, the runtime environment for Java
3. The Spark package
4. Scala, the runtime environment Spark depends on
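Once the four packages above are installed on the VM, one quick check (a sketch; it assumes only the standard commands each package ships with) is to confirm every runtime is on the PATH:

```python
import subprocess

# Each tuple pairs a package from the list above with its version command.
checks = [
    ("jdk", ["java", "-version"]),
    ("scala", ["scala", "-version"]),
    ("hadoop", ["hadoop", "version"]),
    ("spark", ["spark-submit", "--version"]),
]

for name, cmd in checks:
    try:
        # Prints the version banner if the command is found on PATH.
        subprocess.call(cmd)
    except OSError:
        print("%s: command %r not found on PATH" % (name, cmd[0]))
```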

[Remote operation tools]
A Windows system needs a few tools to operate the Linux machine remotely.

1 A no-install SSH connection tool
2 A no-install remote editing tool: the remote-editing feature of Notepad++
3 A no-install upload/download tool: FileZilla

[Local environment debugging tools]
After building the Hadoop environment on Linux, Hadoop program code should first be debugged in the local environment; once it runs there, it will run on the real Hadoop environment without problems. The same goes for Spark.
Hadoop local environment debugging tools:

0 A configured Hadoop installation
1 No-install Eclipse
2 The Hadoop plugin for Eclipse

Related blog posts:
    Eclipse remote connection to Hadoop
    Hadoop configuration IDE development environment


Spark local Python environment debugging tools (a wiring sketch follows the list):

0 A configured Spark installation
1 The Python editor PyCharm
2 The Python runtime environment, Python 2.7
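The usual wiring, sketched below, is to point the PyCharm interpreter at the Spark installation before importing pyspark. The /opt/spark path is an assumption; substitute wherever the spark package was unpacked.

```python
import glob
import os
import sys

# Assumed Spark install location; change to match your unpacked spark package.
spark_home = os.environ.setdefault("SPARK_HOME", "/opt/spark")

# Make pyspark and its bundled py4j importable from PyCharm's interpreter.
sys.path.append(os.path.join(spark_home, "python"))
sys.path.extend(glob.glob(os.path.join(spark_home, "python", "lib", "py4j-*.zip")))

from pyspark import SparkContext

# Small local job to confirm PyCharm can drive Spark.
sc = SparkContext("local[2]", "pycharm-pyspark-test")
print(sc.parallelize(["a", "b", "a"]).countByValue())
sc.stop()
```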

Related blog posts:
    Configuring pyspark on PyCharm: setting up a Spark test environment
    Configuring pyspark on PyCharm
