Install and test Hadoop
1 • Local (Standalone) Mode (single node): Hadoop runs in a single JVM and stores data on the local Linux file system. Used as a development, testing, and debugging environment.
Run the example cases.
2 • Pseudo-Distributed Mode (all daemons actually run on a single machine)
Set this up on your own (covered earlier).
Get familiar with the HDFS shell:
hdfs dfs -help
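To get familiar with the HDFS shell, a few of the most common commands are sketched below (they assume a running cluster; `a.txt` and `/input` are example names, not from the notes):

```shell
hdfs dfs -help               # list all file system commands
hdfs dfs -ls /               # list the HDFS root directory
hdfs dfs -mkdir /input       # create a directory in HDFS
hdfs dfs -put a.txt /input   # upload a local file into HDFS
hdfs dfs -cat /input/a.txt   # print a file's contents
```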
3 • Fully-Distributed Mode (daemons distributed across multiple nodes, each node running its own)
Set the Hadoop runtime environment: hadoop-env.sh defines the environment Hadoop runs in; modify the JAVA_HOME entry inside it.
That completes the stand-alone installation.
Metadata is data that describes other data.
---------------------------------
Official fully-distributed (cluster) setup guide:
http://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/ClusterSetup.html
High reliability: HDFS automatically keeps a configured number of replicas of each block; when a replica is lost, the NameNode re-replicates it from the surviving copies (in effect, automatic backup).
Block reports: each DataNode periodically sends the NameNode a full report of the blocks it holds (every 6 hours by default); a DataNode that stops reporting is eventually considered dead, and if a report shows a block is under-replicated, the NameNode tells other DataNodes to copy that block until the replica count is back at the target.
Heartbeats: each DataNode also sends a lightweight heartbeat (every 3 seconds by default) to signal it is still alive; the NameNode only assigns work to DataNodes it knows to be alive.
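The heartbeat-based liveness check can be sketched with Hadoop's stock defaults (`dfs.heartbeat.interval` = 3 s, `dfs.namenode.heartbeat.recheck-interval` = 5 min); the NameNode marks a DataNode dead after `2 * recheck + 10 * heartbeat` seconds of silence:

```shell
# Sketch: when does the NameNode declare a DataNode dead? (default values assumed)
recheck_ms=300000   # dfs.namenode.heartbeat.recheck-interval, default 5 minutes
heartbeat_s=3       # dfs.heartbeat.interval, default 3 seconds
timeout_s=$(( 2 * recheck_ms / 1000 + 10 * heartbeat_s ))
echo "DataNode declared dead after ${timeout_s}s without a heartbeat"   # 630s = 10.5 min
```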
Configuring and testing an HDFS cluster
Hadoop cluster installation, prerequisites:
1. JDK
2. Passwordless SSH: hadoop0001 ---> hadoop0001, hadoop0002, hadoop0003
3. Static IPs
4. hostname set on each node
5. hosts file entries for all nodes
6. Firewall disabled
7. root user (an enterprise cluster would use an ordinary user such as hadoop)
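The passwordless-SSH prerequisite can be sketched as follows, run as the installing user on hadoop0001 (key type and paths are the usual defaults, an assumption here; `ssh-copy-id` prompts for each node's password once):

```shell
# Generate a key pair on hadoop0001 with no passphrase
ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
# Append the public key to every node, including hadoop0001 itself
for host in hadoop0001 hadoop0002 hadoop0003; do
  ssh-copy-id "$host"
done
```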
Installation steps:
1. Unpack and configure environment variables: tar -zxvf had -C /usr/local
2. Configure the Hadoop runtime environment:
hadoop-env.sh
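The usual change in hadoop-env.sh is to hard-code JAVA_HOME, since the inherited value is not always visible when daemons are started over ssh. The JDK path below is an assumption; use your own install location:

```shell
# In $HADOOP_HOME/etc/hadoop/hadoop-env.sh (path to the JDK is an example)
export JAVA_HOME=/usr/local/jdk
```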
3.1 Hadoop core configuration file
vi $HADOOP_HOME/etc/hadoop/core-site.xml
<!--File system schema (URI) used by Hadoop, i.e. the address of the HDFS master (NameNode)-->
<property>
<name>fs.defaultFS</name>
<value>hdfs://hadoop0001:9000</value>
</property>
<!--I/O buffer size-->
<property>
<name>io.file.buffer.size</name>
<value>4096</value>
</property>
<!--Base directory for files Hadoop generates at runtime-->
<property>
<name>hadoop.tmp.dir</name>
<value>/tmp/hadoop-${user.name}</value>
<description>A base for other temporary directories.</description>
</property>
3.2 HDFS configuration
vi $HADOOP_HOME/etc/hadoop/hdfs-site.xml
<!--Where the NameNode stores its metadata-->
<property>
<name>dfs.namenode.name.dir</name>
<value>file:///usr/local/hadoopdata/dfs/name</value>
</property>
<!--Where each DataNode stores its block data-->
<property>
<name>dfs.datanode.data.dir</name>
<value>file:///usr/local/hadoopdata/dfs/data</value>
</property>
<!--Host and port of the NameNode web UI-->
<property>
<name>dfs.http.address</name>
<value>hadoop0001:50070</value>
</property>
<!--Host and port of the secondary NameNode web UI-->
<property>
<name>dfs.secondary.http.address</name>
<value>hadoop0002:50090</value>
</property>
<!--Number of replicas per block-->
<property>
<name>dfs.replication</name>
<value>3</value>
</property>
<!--Block size-->
<property>
<name>dfs.blocksize</name>
<value>128m</value>
</property>
<!--Whether the WebHDFS REST API may operate on HDFS-->
<property>
<name>dfs.webhdfs.enabled</name>
<value>true</value>
</property>
<!--Whether file system permission checking is enabled-->
<property>
<name>dfs.permissions</name>
<value>false</value>
</property>
<!--Checkpoint directory for the file system image-->
<property>
<name>dfs.namenode.checkpoint.dir</name>
<value>file:///usr/local/hadoopdata/checkpoint/cname</value>
<description>Determines where on the local filesystem the DFS secondary
name node should store the temporary images to merge.
If this is a comma-delimited list of directories then the image is
replicated in all of the directories for redundancy.
</description>
</property>
<!--Checkpoint directory for the edits log-->
<property>
<name>dfs.namenode.checkpoint.edits.dir</name>
<value>file:///usr/local/hadoopdata/checkpoint/cname</value>
<description>Determines where on the local filesystem the DFS secondary
name node should store the temporary edits to merge.
If this is a comma-delimited list of directories then the edits is
replicated in all of the directories for redundancy.
Default value is same as dfs.namenode.checkpoint.dir
</description>
</property>
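The effect of dfs.blocksize=128m above can be sketched with a bit of arithmetic: a file is split into fixed-size blocks and the last block simply holds whatever is left (the 300 MB file size is an example, not from the notes):

```shell
# Sketch: how many blocks does a 300 MB file occupy with a 128 MB block size?
file_mb=300
block_mb=128
blocks=$(( (file_mb + block_mb - 1) / block_mb ))   # ceiling division
echo "$blocks"   # 3 blocks; the last block holds only 44 MB
```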
3.3 Configure the workers (slaves) file
vi $HADOOP_HOME/etc/hadoop/slaves
hadoop0001
hadoop0002
hadoop0003
4. Distribute the installation package:
scp -r
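The distribution step can be sketched as a loop over the other nodes; the install directory /usr/local/hadoop is an assumption (use whatever directory the tarball unpacked to). Shown as a dry run that only prints the commands:

```shell
# Dry-run sketch: print the scp commands that would copy the install
# from hadoop0001 to the other nodes (drop the echo to actually run them)
for host in hadoop0002 hadoop0003; do
  echo scp -r /usr/local/hadoop "$host":/usr/local/
done
```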
5. Initialize: format the NameNode
Do not create the name.dir path by hand; formatting creates it.
hdfs namenode -format
(older equivalent: hadoop namenode -format)
6. Start
Single daemon:
hadoop-daemon.sh start namenode
Start the whole cluster:
start-dfs.sh
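After start-dfs.sh, the cluster can be sanity-checked with the commands below (they assume the daemons are up; jps ships with the JDK):

```shell
jps                     # each node should list its daemons: NameNode,
                        # DataNode, SecondaryNameNode as configured
hdfs dfsadmin -report   # live DataNodes, capacity, and remaining space
```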