1. Hadoop environment setup
1. Introduction to the Hadoop ecosystem
2. Hadoop's position and role in cloud computing
3. Introduction to Hadoop application cases in China and abroad
4. Hadoop concepts, versions, and history
5. Introduction to Hadoop's core components and the HDFS and MapReduce architectures
6. Hadoop Standalone Mode Installation and Testing
7. Hadoop cluster structure
8. Detailed installation steps for Hadoop pseudo-distribution
9. Observe Hadoop from the command line and browser
10. Hadoop startup script analysis
11. Building a fully distributed environment for Hadoop
12. Introduction to Hadoop safe mode and the recycle bin (HDFS trash)
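As a reference for the pseudo-distributed installation topics above, the sketch below is roughly the minimal configuration used in the Apache Hadoop single-node setup guide. The hostname and port 9000 are the conventional Hadoop 2.x values and may differ for your version.

```xml
<!-- core-site.xml: point the default filesystem at a local NameNode -->
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>

<!-- hdfs-site.xml: a single node cannot hold the default 3 replicas, so use 1 -->
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>
```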
2. HDFS architecture, shell, and Java operations
1. The underlying working principles of HDFS
2. HDFS DataNode and NameNode in detail
3. Single point of failure (SPOF) and high availability (HA)
4. Accessing HDFS via the API
5. Introduction to and installation of common compression algorithms
6. Maven introduction and installation; using Maven in Eclipse; building a local Maven repository
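Alongside the Java FileSystem API covered above, HDFS can also be reached over its REST interface (WebHDFS). The sketch below builds WebHDFS v1 URLs from Python; the host, port, and user name are assumptions (the NameNode HTTP port is 50070 in Hadoop 1.x/2.x and 9870 in 3.x), and the actual request is only shown commented out since it needs a live cluster.

```python
# Minimal WebHDFS URL builder; ops like LISTSTATUS, OPEN, and MKDIRS
# are part of the WebHDFS REST API.
import urllib.request


def webhdfs_url(host, port, path, op, user="hadoop"):
    """Build a WebHDFS v1 request URL for the given HDFS path and operation."""
    return f"http://{host}:{port}/webhdfs/v1{path}?op={op}&user.name={user}"


# Example: list a (hypothetical) home directory on a local 2.x NameNode.
url = webhdfs_url("localhost", 50070, "/user/hadoop", "LISTSTATUS")
print(url)

# Against a running cluster one would then fetch the JSON response:
# with urllib.request.urlopen(url) as resp:
#     print(resp.read().decode())
```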
3. Learning MapReduce
1. Introduction to the four stages of MapReduce
2. Introduction to Jobs and Tasks
3. The default working mechanism
4. Developing an MR application to find the annual maximum temperature
5. Running MR jobs on Windows
6. Folders, Reduce
7. InputSplit and OutputSplit
8. Shuffle: Sort, Partitioner, Group, Combiner
9. Debugging programs with counters
10. Install Hadoop on Windows
11. Installing the Hadoop plugin in Eclipse to access Hadoop resources
12. Writing an Ant script in Eclipse
13. The YARN scheduling framework and its event-dispatch mechanism
14. Remote Debug Explorer
15. Analysis of Hadoop's underlying Google ProtoBuf protocol
16. Hadoop's underlying IPC and RPC principles
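To make the four stages and the shuffle topics above concrete, here is a pure-Python simulation of map → shuffle/sort → group → reduce, using the course's "annual maximum temperature" example. The record format is made up for illustration; a real job would run these stages as Hadoop mapper and reducer tasks.

```python
from collections import defaultdict

# Hypothetical input lines in "year temperature" format.
records = ["1949 111", "1949 78", "1950 0", "1950 22", "1950 -11"]

# Map stage: emit (year, temperature) key-value pairs.
mapped = []
for line in records:
    year, temp = line.split()
    mapped.append((year, int(temp)))

# Shuffle/sort stage: Hadoop partitions pairs by key and sorts each
# partition; here everything lives in a single partition.
mapped.sort(key=lambda kv: kv[0])

# Group stage: collect all values that share a key.
grouped = defaultdict(list)
for year, temp in mapped:
    grouped[year].append(temp)

# Reduce stage: one reduce call per key takes the maximum temperature.
result = {year: max(temps) for year, temps in grouped.items()}
print(result)  # {'1949': 111, '1950': 22}
```

A Combiner would apply the same `max` locally on each map task's output before the shuffle, cutting the data moved between map and reduce without changing the result.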
4. Hadoop high availability (HA)
1. Introduction to the Hadoop 2.x cluster structure
2. Building a Hadoop 2.x cluster
3. High Availability (HA) of NameNode
4. HDFS Federation
5. High Availability (HA) of ResourceManager
6. Hadoop cluster common problems and solutions
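NameNode HA (topic 3 above) is configured in hdfs-site.xml. The fragment below sketches the core properties; the nameservice name "mycluster", the nn1/nn2 identifiers, and the hostnames are placeholders, and a complete setup additionally needs shared edit-log storage (e.g. a JournalNode quorum), fencing, and automatic-failover settings.

```xml
<configuration>
  <property>
    <name>dfs.nameservices</name>
    <value>mycluster</value>
  </property>
  <property>
    <name>dfs.ha.namenodes.mycluster</name>
    <value>nn1,nn2</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.mycluster.nn1</name>
    <value>master1:8020</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.mycluster.nn2</name>
    <value>master2:8020</value>
  </property>
  <!-- Clients use this class to find the currently active NameNode -->
  <property>
    <name>dfs.client.failover.proxy.provider.mycluster</name>
    <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
  </property>
</configuration>
```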