Basic principles of Hadoop's MapReduce model

refer to:

https://www.zybuluo.com/frank-shaw/note/206604

 

1. MapReduce data processing flow

 

The figure above can be analyzed step by step as follows:

  1. The input data to be processed is first divided into input splits; the underlying data lives in HDFS, where each block is replicated several times. The figure assumes the input already resides in HDFS.
  2. For best performance, Hadoop runs the map tasks on the nodes that store the corresponding input splits (the data in HDFS). This is the data locality optimization, which saves network bandwidth.
  3. The output of a map task is not written back to HDFS but to the local disk, because the map output is only intermediate data: it is consumed by the reduce tasks to produce the final result, and once the job completes it can be deleted.
  4. The map output is partitioned and sorted locally in preparation for the reduce phase. The common partitioning method is to hash the key and take the result modulo the number of reducers, so that records with the same key land in the same partition and each partition is handled by a single reduce task (see the partitioner sketch after this list).
  5. If needed, a combine function can be set for the map stage. The combine function does the same kind of work as the reduce function, but it only operates on records with the same key within the output of a single map task, whereas the reduce function processes all records for a key across the entire input. In other words, it is a local, scaled-down reduce. Its purpose is to cut the amount of data transferred from the map stage to the reduce stage, which is another way to save bandwidth (after combining, the volume of transferred data can drop dramatically).
  6. Each reduce task copies its partition from every map output and merges the copies. The merge includes sorting, because all records with the same key must be processed together. In the reduce computation phase, the input key is the key itself and the input value is an iterator over all values associated with that key.
  7. The output of the reduce phase is the final result. It is stored in HDFS, with the first replica on the local node and the other replicas on nodes in other racks. These results can in turn serve as the input of another MapReduce job for further computation.
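
To make the hash-and-modulo partitioning in step 4 concrete, here is a minimal sketch of a custom partitioner written against Hadoop's org.apache.hadoop.mapreduce API; it mirrors what the built-in HashPartitioner does. The Text/IntWritable key and value types are just illustrative assumptions about what the map tasks emit.

```java
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Partitioner;

// Route each map output record to a reduce partition by hashing its key,
// mirroring the behaviour of Hadoop's default HashPartitioner.
public class KeyHashPartitioner extends Partitioner<Text, IntWritable> {

    @Override
    public int getPartition(Text key, IntWritable value, int numReduceTasks) {
        // Mask the sign bit so the modulo result is never negative,
        // then take the hash modulo the number of reduce tasks.
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }
}
```

Because equal keys produce equal hash values, all records sharing a key end up in the same partition and therefore at the same reduce task, which is exactly what the reduce phase relies on.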

 

2. The specific implementation of MapReduce on Hadoop

 

The mechanism described here is MapReduce1; in Hadoop 2.x the implementation became YARN. Understanding MapReduce1 is still very helpful for understanding Hadoop as a whole, and I will write a dedicated article on YARN later.

If you look closely, the MapReduce1 implementation diagram is essentially the same as the initial MapReduce workflow, with the addition of the JobTracker, TaskTracker, and client roles. The map and reduce tasks are distributed across many TaskTrackers for execution. These roles are important and worth understanding in detail.

About the client

Client: this is where most of the programmer's work happens. The job is to write the MapReduce program, configure the relevant settings, and submit the job. If something goes wrong, you track down the error and fix the program until it runs correctly.
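
As a concrete (and deliberately minimal) sketch of this client-side work, the driver below configures and submits a job using the org.apache.hadoop.mapreduce API. WordCountMapper and WordCountReducer are hypothetical classes standing in for whatever the job actually does; the point is the configure-and-submit sequence, including the optional combiner mentioned earlier.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCountDriver {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = new Job(conf, "word count");            // client-side handle for the job

        job.setJarByClass(WordCountDriver.class);
        job.setMapperClass(WordCountMapper.class);        // hypothetical mapper
        job.setCombinerClass(WordCountReducer.class);     // optional combiner: a local "mini reduce"
        job.setReducerClass(WordCountReducer.class);      // hypothetical reducer

        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);

        FileInputFormat.addInputPath(job, new Path(args[0]));    // input directory must exist
        FileOutputFormat.setOutputPath(job, new Path(args[1]));  // output directory must NOT exist

        // Submit the job and block until it finishes, printing progress along the way.
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```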

Introduction to JobTracker and TaskTracker

JobTracker and TaskTracker follow a master-slave structure. As the figure shows, the master node runs a single JobTracker, while the slave nodes run many TaskTrackers.

JobTracker is responsible for:

  • Receive computing jobs submitted by clients
  • Assign computing tasks to TaskTrackers for execution
  • Monitor the execution of the TaskTrackers

TaskTrackers are responsible for:

  • Complete the computing tasks assigned by the JobTracker

The relationship between the JobTracker and the TaskTrackers is like that between a project manager and developers: the project manager receives the user's requirements and then assigns the work to developers to complete.

 

3. The specific implementation mechanism

Having seen how the MapReduce workflow maps onto the JobTracker and the TaskTrackers, let's go through the implementation mechanism in detail (each numbered point below corresponds to the matching number in the figure):

 

Job submission

1. A finished MapReduce program constitutes a job. When you run it, a JobClient is created; it carries out a series of preparation steps and, once they are done, submits the job to the JobTracker.

2. The JobClient requests a new job ID from the JobTracker. At the same time, the JobClient performs the following checks:

  • Check the job's output directory: if it is not specified or already exists, the job is not submitted and an error is reported to the program;
  • Check whether the input directory exists, and throw an error if it does not; if it exists, the JobClient computes the input splits for the job, and throws an error if the splits cannot be computed.

3. The JobClient copies the resources needed to run the job (the job JAR file, configuration files, and the computed input splits) into a directory named after the job ID in the JobTracker's filesystem (that is, in HDFS). Note that the job JAR is stored with a higher replication factor (mapred.submit.replication, default 10); a configuration sketch follows at the end of this subsection.

4. Once the above preparation is done, the JobClient submits the job to the JobTracker (in effect telling it: we are ready to go whenever you are).
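
As a small illustration of point 3, the replication factor used for the submitted job resources can be adjusted through the job configuration. This is only a sketch; the property name and default are the ones quoted above for MapReduce1.

```java
import org.apache.hadoop.conf.Configuration;

public class SubmitReplicationExample {
    public static void main(String[] args) {
        Configuration conf = new Configuration();
        // The job JAR is copied with a high replication factor so that many
        // TaskTrackers can fetch it without overloading a handful of DataNodes.
        conf.setInt("mapred.submit.replication", 10);  // 10 is the MapReduce1 default
        System.out.println(conf.getInt("mapred.submit.replication", 10));
    }
}
```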

 

Job initialization

5. When the JobTracker receives the job submission, it places the job in an internal queue and hands it to the job scheduler for scheduling and initialization. Initialization creates an in-memory object representing the running job (encapsulating its tasks and bookkeeping information) so that the JobTracker can track the job's status and progress.

6. After initialization, the job scheduler retrieves the input split information and creates one map task per split. The number of splits (and therefore the number of map tasks) was covered earlier; the number of reduce tasks is specified by the user in the configuration (a small example follows below).

In addition to the map and reduce tasks, a job setup task and a job cleanup task are created; they are run on TaskTrackers, the setup task before any map starts and the cleanup task after all reduces finish. setupJob() creates the output directory and a temporary working directory for the tasks, and cleanupJob() deletes the temporary working directory.
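
Since the number of reduce tasks comes from the configuration rather than from the input, the client sets it explicitly. A minimal sketch (the count of 4 is an arbitrary example):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public class ReducerCountExample {
    public static void main(String[] args) throws Exception {
        Job job = new Job(new Configuration(), "reducer count example");
        // One map task is created per input split; the reduce count is the user's choice.
        job.setNumReduceTasks(4);  // equivalent to setting mapred.reduce.tasks=4
    }
}
```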

 

Task assignment

7. Each TaskTracker periodically sends a heartbeat to the JobTracker to report that it is alive; the heartbeat also indicates whether the TaskTracker is ready to accept a new task. The JobTracker uses the heartbeat's return value to communicate with the TaskTracker and to assign tasks: its scheduling algorithm first selects a job and then selects one of that job's tasks to hand to the TaskTracker.

Each TaskTracker has a fixed number of map and reduce task slots, determined by its number of cores and amount of memory. The JobTracker fills all of a TaskTracker's map slots before filling its reduce slots.
When assigning map tasks, the JobTracker picks the TaskTracker closest to the input split, which is again the data locality optimization. Data locality does not need to be considered when assigning reduce tasks.
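
The slot counts are per-TaskTracker settings, normally configured on each node (in mapred-site.xml) rather than per job. As a hedged illustration, the MapReduce1 property names can be read back from a Configuration like this; the defaults of 2 are the stock MapReduce1 values:

```java
import org.apache.hadoop.conf.Configuration;

public class SlotConfigExample {
    public static void main(String[] args) {
        Configuration conf = new Configuration();
        // MapReduce1 slot settings; both default to 2 slots per TaskTracker.
        int mapSlots = conf.getInt("mapred.tasktracker.map.tasks.maximum", 2);
        int reduceSlots = conf.getInt("mapred.tasktracker.reduce.tasks.maximum", 2);
        System.out.println("map slots: " + mapSlots + ", reduce slots: " + reduceSlots);
    }
}
```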

 

Task execution

8. After a TaskTracker is assigned a task, it first copies the job's JAR file and any other files the job needs (distributed via the DistributedCache) from HDFS to the TaskTracker's **local** disk. The TaskTracker then creates a new local working directory for the task and unpacks the contents of the JAR into it (this is where the setupJob() mentioned above comes in, creating the output directory and the task's temporary working directory).

9. The TaskTracker creates a new TaskRunner instance to run the task.

10. The TaskRunner launches a new JVM to run each task.
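
Because every task runs in its own child JVM, the JVM's options (most commonly the heap size) are controlled through the job configuration. A hedged sketch; the 512 MB heap is an arbitrary example, and the MapReduce1 default is -Xmx200m:

```java
import org.apache.hadoop.conf.Configuration;

public class ChildJvmExample {
    public static void main(String[] args) {
        Configuration conf = new Configuration();
        // Options passed to each child JVM launched by a TaskTracker
        // (MapReduce1 property; the default is -Xmx200m).
        conf.set("mapred.child.java.opts", "-Xmx512m");
    }
}
```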

At this point all of the steps shown in the diagram have been covered, but a few details are still worth noting. Please read on:

 

Progress and Status Updates

The child JVM has an independent thread that checks the task's progress flag every 3 seconds; if there is an update, it reports it to the TaskTracker.

The JobTracker aggregates these updates into a global view of the status of all running jobs and their tasks.
JobClient.monitorAndPrintJob() queries this information every second.
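
On the client side the same information can be polled directly. The fragment below is a sketch in the spirit of what JobClient.monitorAndPrintJob() does: it assumes a job configured as in the driver sketch earlier, submits it, and prints the map and reduce progress roughly once a second.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public class ProgressExample {
    public static void main(String[] args) throws Exception {
        Job job = new Job(new Configuration(), "progress example");
        // ... mapper, reducer, input and output configuration omitted for brevity ...

        job.submit();                       // asynchronous submission
        while (!job.isComplete()) {         // poll until the job finishes
            System.out.printf("map %.0f%%  reduce %.0f%%%n",
                    job.mapProgress() * 100, job.reduceProgress() * 100);
            Thread.sleep(1000);
        }
        System.out.println(job.isSuccessful() ? "job succeeded" : "job failed");
    }
}
```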

 

Job completion

When the JobTracker receives the completion report for the last task of a job (this will be the special job cleanup task), it sets the job status to successful. The Job then returns from waitForCompletion() once it learns that the job has finished.
Finally, the JobTracker cleans up its working state for the job and instructs the TaskTrackers to do the same (for example, deleting the intermediate map outputs).

 

Failure handling mechanism

Node failures are common in distributed computing, so a mature implementation needs a complete set of failure handling mechanisms.

There are three common failures in Hadoop's MapReduce1 architecture: task failure, TaskTracker failure, and JobTracker failure.

 

Task failure

  • Subtask failure. When user code in a map or reduce task throws an exception, the child JVM reports the error to its parent TaskTracker process before exiting, and the TaskTracker marks that task attempt as failed, freeing a slot to run another task.
  • JVM failure. When the child JVM exits suddenly (a JVM error), the TaskTracker notices that the process has exited and marks the task attempt as failed.

It is worth noting that:
1) Failed tasks are retried. The maximum number of attempts for a map task is controlled by the mapred.map.max.attempts property, and for a reduce task by mapred.reduce.max.attempts.
2) For some jobs it is acceptable for a small fraction of tasks to fail; this percentage is controlled by the mapred.max.map.failures.percent and mapred.max.reduce.failures.percent parameters.
3) A task attempt can also be killed, which is different from failing. A sketch of setting these properties follows below.
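
A hedged sketch of how these knobs might be set from the client (the values are arbitrary examples; the MapReduce1 defaults are 4 attempts and 0 percent):

```java
import org.apache.hadoop.conf.Configuration;

public class FailureConfigExample {
    public static void main(String[] args) {
        Configuration conf = new Configuration();
        // Maximum attempts per task before the whole job is declared failed (default 4).
        conf.setInt("mapred.map.max.attempts", 4);
        conf.setInt("mapred.reduce.max.attempts", 4);
        // Percentage of tasks allowed to fail without failing the job (default 0).
        conf.setInt("mapred.max.map.failures.percent", 5);
        conf.setInt("mapred.max.reduce.failures.percent", 5);
    }
}
```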

 

TaskTracker failure

While a job is running, each TaskTracker communicates continuously with the JobTracker through the heartbeat mechanism. If a TaskTracker runs very slowly, crashes, or otherwise fails, it stops sending heartbeats (or sends them only rarely). The JobTracker notices the missing heartbeats and removes that TaskTracker from the pool of TaskTrackers eligible for task scheduling.

Since a TaskTracker hosts some number of map and reduce tasks, what happens to those tasks when it fails?

1) Map tasks that completed successfully on the failed TaskTracker are rescheduled by the JobTracker, because their intermediate output is stored on that TaskTracker's local disk and is no longer reachable.
2) Reduce tasks that completed successfully do not need to be rerun, because reduce output is written to HDFS as soon as the task finishes.
3) Any tasks that were still in progress and belong to unfinished jobs are also rescheduled, since the reduce phase can no longer fetch the local map output files from the failed TaskTracker.

In addition, even if a TaskTracker has not failed, it can be blacklisted when the number of task failures on it is far above the cluster's average. A blacklisted TaskTracker can be removed from the JobTracker's blacklist by restarting it.

JobTracker failure

JobTracker failure is arguably the most serious kind of failure. Because the JobTracker is a single point of failure in Hadoop's MapReduce1 architecture, a JobTracker failure means the job ultimately fails, although this kind of failure is very unlikely.

 
