2.1 Introduction to Hadoop
Founder: Doug Cutting
1. Introduction:
Free and open source;
Simple to operate, which greatly reduces the complexity of use;
Hadoop is developed in Java;
Applications on Hadoop can be developed in multiple programming languages, not only Java;
Hadoop has two core components: HDFS + MapReduce
HDFS: massive data storage
MapReduce: massive data processing
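To make the MapReduce idea concrete, here is a minimal sketch in plain Python. It does not use Hadoop itself: the function names (map_phase, shuffle, reduce_phase, word_count) are hypothetical and only imitate, on a single machine, the map, group-by-key, and reduce steps that the framework distributes across a cluster.

```python
from collections import defaultdict


def map_phase(line):
    """Emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group values by key, imitating the step between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Sum the counts collected for one word."""
    return key, sum(values)


def word_count(lines):
    """Run the three phases in sequence on an in-memory list of lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["Hadoop stores data", "Hadoop processes data"]))
```

In a real cluster the input lines would come from files stored in HDFS, and many map and reduce tasks would run in parallel on different machines.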
2. Origin:
It grew out of a text search library (Lucene) and a search engine project modeled on Google's (Nutch);
It adopted two Google technologies: the distributed file system GFS and the distributed parallel programming framework MapReduce;
3. Rise to fame: record-setting results in data sorting benchmarks
4. Characteristics:
1. High reliability
2. High efficiency
3. High scalability
4. High fault tolerance
5. Low cost
6. Runs on the Linux platform
7. Supports a variety of programming languages
5. Application Status:
For example: Facebook
2.2 Hadoop project structure
HDFS: distributed file storage
MapReduce: data processing, disk-based
Spark: data processing, memory-based (performance roughly an order of magnitude higher than MapReduce)
Hive: data warehouse; decision-support analysis; supports SQL statements (SQL statements are converted into MapReduce jobs for execution);
Pig: data-flow processing, lightweight data analysis; provides the SQL-like query language Pig Latin;
Oozie: workflow scheduling system
ZooKeeper: distributed coordination service; distributed locks; cluster management;
HBase: column-family database, random access
Flume: log collection
Sqoop: imports and exports data between relational databases and HDFS, HBase, and Hive
Ambari: Rapid Deployment Tool
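The HBase entry above mentions column families and random access. As a rough illustration only, the toy model below stores each cell under a row key, a column family, and a column qualifier, so a single cell can be looked up directly. The class ToyColumnFamilyTable and its methods are hypothetical and are not the HBase client API; real HBase also keeps timestamps (versions) per cell, which this sketch leaves out.

```python
class ToyColumnFamilyTable:
    """In-memory toy model: row key -> column family -> qualifier -> value."""

    def __init__(self, families):
        # Column families are fixed when the table is created.
        self.families = set(families)
        self.rows = {}

    def put(self, row_key, family, qualifier, value):
        """Store one cell; qualifiers can be added freely inside a family."""
        if family not in self.families:
            raise KeyError(f"unknown column family: {family}")
        self.rows.setdefault(row_key, {}).setdefault(family, {})[qualifier] = value

    def get(self, row_key, family, qualifier, default=None):
        """Random access to one cell by its full address."""
        return self.rows.get(row_key, {}).get(family, {}).get(qualifier, default)


if __name__ == "__main__":
    table = ToyColumnFamilyTable(["info", "stats"])
    table.put("user1", "info", "name", "Alice")
    table.put("user1", "stats", "logins", 3)
    print(table.get("user1", "info", "name"))
```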
2.3 Installation and use of Hadoop
1. Linux options:
Linux distribution: Ubuntu
32-bit or 64-bit: depends on the computer's memory. With more than 4 GB of memory, choose the 64-bit version
2. Install in a virtual machine or as a dual-boot system:
Depends on the computer's configuration
On a relatively new computer, install a virtual machine
3. Linux basics:
1. Shell: command interpreter
2. sudo command: a privilege management mechanism; administrators can authorize ordinary users to perform certain operations that require root privileges
3. Entering a password: the password is not displayed as you type
4. Switching between Chinese and English input methods: "Shift" key
5. Paste shortcut in the Ubuntu terminal: Ctrl + Shift + V
4. Installation:
Standalone mode, pseudo-distributed mode, fully distributed mode
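One visible difference between the three modes is the default file system set in core-site.xml. The sketch below generates that file's content for each mode. The property name fs.defaultFS is Hadoop's; the helper build_core_site, the port 9000, and the host name namenode.example.com are assumptions chosen for illustration, and a real setup needs further settings (for example in hdfs-site.xml) that are not shown here.

```python
import xml.etree.ElementTree as ET

# Assumed example values; standalone mode keeps Hadoop's default local file system.
MODE_PROPERTIES = {
    "standalone": {},
    "pseudo-distributed": {"fs.defaultFS": "hdfs://localhost:9000"},
    "distributed": {"fs.defaultFS": "hdfs://namenode.example.com:9000"},
}


def build_core_site(mode):
    """Return core-site.xml content (as a string) for the given mode."""
    root = ET.Element("configuration")
    for name, value in MODE_PROPERTIES[mode].items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    for mode in MODE_PROPERTIES:
        print(mode, "->", build_core_site(mode))
```

In standalone mode everything runs in a single Java process on the local file system; in pseudo-distributed mode the daemons all run on one machine; in fully distributed mode they run on separate machines.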
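5. Create a virtual machine:
1. Materials and tools: virtual machine software and a system image file
2. Confirm the system version:
2.4 Deployment and use of a Hadoop cluster
Consider HDFS and MapReduce
(to be added later)
MOOC link: https://www.icourse163.org/learn/XMU-1002335004?tid=1003965001#/learn/content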