Big Data Series (5): Introduction to MapReduce and YARN

Introduction to MapReduce

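MapReduce is a programming model and a way of structuring computation: a large job is expressed as a Map step that processes pieces of the input independently and a Reduce step that combines the partial results.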

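YARN concepts

YARN (Yet Another Resource Negotiator) is the resource management layer of Hadoop. It has three core components: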
  • ResourceManager
  • ApplicationMaster
  • NodeManager
ResourceManager
  • Allocate and schedule resources
  • Start and monitor ApplicationMaster
  • Monitor NodeManager
ApplicationMaster
  • Request resources from the ResourceManager for a MapReduce job and assign them to the job's internal tasks
  • Split the input data
  • Monitor task execution and handle fault tolerance
NodeManager
  • Manage the resources of a single node
  • Process commands from ResourceManager
  • Process commands from ApplicationMaster
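
The division of responsibilities above can be illustrated with a toy model. This is only a sketch under simplifying assumptions, not YARN's actual API: the class names mirror the three components, but the methods (`reserve`, `allocate`, `run_tasks`), the memory-only resource model, and the first-fit placement policy are all hypothetical choices made for illustration.

```python
class NodeManager:
    """Toy stand-in: tracks the free memory of a single node."""

    def __init__(self, name, memory_mb):
        self.name = name
        self.free_mb = memory_mb

    def reserve(self, memory_mb):
        """Reserve memory for one container; return False if it does not fit."""
        if memory_mb > self.free_mb:
            return False
        self.free_mb -= memory_mb
        return True


class ResourceManager:
    """Toy stand-in: allocates containers across all nodes."""

    def __init__(self, nodes):
        self.nodes = nodes

    def allocate(self, memory_mb):
        """First-fit placement (an assumption, not YARN's real scheduler).

        Returns the name of the node that accepted the container, or None.
        """
        for node in self.nodes:
            if node.reserve(memory_mb):
                return node.name
        return None


class ApplicationMaster:
    """Toy stand-in: requests one container per task of a single job."""

    def __init__(self, resource_manager):
        self.rm = resource_manager

    def run_tasks(self, tasks, memory_mb):
        """Return a mapping of task name to node name (None if unplaced)."""
        return {task: self.rm.allocate(memory_mb) for task in tasks}


nodes = [NodeManager("node1", 2048), NodeManager("node2", 1024)]
am = ApplicationMaster(ResourceManager(nodes))
placement = am.run_tasks(["map-0", "map-1", "map-2", "map-3"], 1024)
print(placement)
```

In this sketch the fourth task is left unplaced because the cluster has run out of memory. A real ApplicationMaster would wait for resources to free up and ask again.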
MapReduce programming model
  • A large input file is split into multiple shards
  • Each shard is processed by a separate machine; this is the Map method
  • The results computed by each machine are aggregated into the final result; this is the Reduce method
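
The three steps above can be sketched as a single-process word count. This is a minimal illustration in plain Python, not Hadoop code: all function names are hypothetical, the shards are processed in a loop rather than on separate machines, and the `shuffle` helper (which groups values by key between Map and Reduce) is an extra step that the list above does not mention.

```python
from collections import defaultdict


def split_input(text, num_shards):
    """Split the input lines round-robin into num_shards shards."""
    shards = [[] for _ in range(num_shards)]
    for i, line in enumerate(text.splitlines()):
        shards[i % num_shards].append(line)
    return shards


def map_shard(lines):
    """Map: emit a (word, 1) pair for every word in one shard."""
    return [(word.lower(), 1) for line in lines for word in line.split()]


def shuffle(mapped_outputs):
    """Group the emitted values by key across all Map outputs."""
    groups = defaultdict(list)
    for pairs in mapped_outputs:
        for key, value in pairs:
            groups[key].append(value)
    return groups


def reduce_counts(groups):
    """Reduce: sum the values collected for each word."""
    return {word: sum(values) for word, values in groups.items()}


def word_count(text, num_shards=2):
    shards = split_input(text, num_shards)
    # In a real cluster each shard would be mapped on a different machine.
    mapped = [map_shard(shard) for shard in shards]
    return reduce_counts(shuffle(mapped))


counts = word_count("a b a\nb c\na")
print(counts)
```

Because each call to `map_shard` only looks at its own shard, the Map step can run in parallel without coordination. Only the grouping and Reduce steps need to see results from every shard.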

Origin blog.csdn.net/qq_43430261/article/details/105544369