Getting Started with Mesos

Mesos provides efficient resource isolation and sharing across distributed applications and frameworks, supporting Hadoop, MPI, Hypertable, Spark, and more.

Mesos is an open source project in the Apache Incubator. It uses ZooKeeper for fault-tolerant replication, uses Linux Containers to isolate tasks, and supports allocation of multiple resource types (memory and CPU). It provides Java, Python, and C++ APIs for developing new parallel applications, and a web-based user interface for viewing cluster status.

Mesos contains four main types of services (each is actually a socket server): Mesos Master, Mesos Slave, SchedulerProcess, and ExecutorProcess. They communicate with each other through Protocol Buffer messages. Each service registers handlers for several Protocol Buffer message types; when a message of a given type is received, the service calls the corresponding handler to process it. In addition to these four services, Mesos provides three externally programmable components: Allocator, Framework Scheduler, and Framework Executor. Writing one of these components means implementing several required interfaces, and these interfaces are called by the adjacent services in the architecture diagram.
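The handler-per-message-type pattern described above can be sketched in a few lines of Python. This is only an illustration of the dispatch idea: the `Service` class, its `install` and `receive` methods, and the two message classes are hypothetical names invented here, and plain dataclasses stand in for real Protocol Buffer messages.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


# Hypothetical message types standing in for Protocol Buffer messages.
@dataclass
class RegisterFramework:
    name: str


@dataclass
class StatusUpdate:
    task_id: str
    state: str


class Service:
    """A toy service that dispatches each incoming message by its type."""

    def __init__(self) -> None:
        self._handlers: Dict[type, Callable] = {}
        self.log: List[str] = []

    def install(self, message_type: type, handler: Callable) -> None:
        # Register one handler per message type.
        self._handlers[message_type] = handler

    def receive(self, message) -> bool:
        # Look up the handler for this message type; ignore unknown messages.
        handler = self._handlers.get(type(message))
        if handler is None:
            return False
        handler(message)
        return True


master = Service()
master.install(RegisterFramework, lambda m: master.log.append(f"registered {m.name}"))
master.install(StatusUpdate, lambda m: master.log.append(f"{m.task_id} is {m.state}"))

master.receive(RegisterFramework(name="hadoop-job-1"))
master.receive(StatusUpdate(task_id="map-0", state="RUNNING"))
print(master.log)
```

Keeping the lookup table keyed by message type means a new message only needs a new `install` call; the receive loop itself does not change.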

 

After seeing the Mesos architecture above, most people assume that a Framework must be a general-purpose framework such as MapReduce, Storm, or Spark: the Mesos Master allocates resources to each framework, and each framework's Scheduler further allocates those resources to the applications running inside it. This notion is wrong; it misreads the Mesos architecture.

In fact, a Framework can be not only a general-purpose framework but also a simple computing task such as a Hadoop Job or a YARN Application. In other words, a Framework does not need to be a "framework" or a long-running service (such as JobTracker); it can also be a Job or Application with a short life cycle. If the Framework corresponds to a Hadoop Job, the Framework Scheduler and Framework Executor can be designed as follows:

(1) Framework Scheduler function

The Framework Scheduler is responsible for decomposing a job's input data into several tasks, requesting resources for these tasks, monitoring their running status, and requesting resources again for any task that fails.

(2) Framework Executor function

The Framework Executor prepares the running environment for a Map Task or Reduce Task on a node. This includes preparing the required jar packages and binary files, setting the necessary environment variables, performing the necessary resource isolation, and starting Jetty Shuffle to provide remote data copy services for Reduce Tasks. It also receives commands from the Framework Scheduler (start a task, kill a task, etc.) and executes them.
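The scheduler side of this design can be sketched as a small state tracker. The following is a minimal illustration in Python, assuming one task per input split and a simple slot count in place of real resource offers; `JobScheduler`, `Task`, and all method names are hypothetical and are not the Mesos Scheduler interface.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Task:
    task_id: str
    split: str              # the slice of input data this task processes
    state: str = "PENDING"
    attempts: int = 0


@dataclass
class JobScheduler:
    """Hypothetical per-job scheduler; not the real Mesos Scheduler interface."""
    tasks: Dict[str, Task] = field(default_factory=dict)

    def decompose(self, input_splits: List[str]) -> None:
        # Create one map task per input split.
        for i, split in enumerate(input_splits):
            task = Task(task_id=f"map-{i}", split=split)
            self.tasks[task.task_id] = task

    def on_resources(self, free_slots: int) -> List[Task]:
        # Launch as many pending tasks as the granted resources allow.
        pending = [t for t in self.tasks.values() if t.state == "PENDING"]
        launched = pending[:free_slots]
        for task in launched:
            task.state = "RUNNING"
            task.attempts += 1
        return launched

    def on_status(self, task_id: str, state: str) -> None:
        # A failed task goes back to PENDING so that it is launched again.
        task = self.tasks[task_id]
        task.state = "PENDING" if state == "FAILED" else state

    def finished(self) -> bool:
        return all(t.state == "FINISHED" for t in self.tasks.values())


scheduler = JobScheduler()
scheduler.decompose(["part-0", "part-1", "part-2"])
first = scheduler.on_resources(free_slots=2)    # launches map-0 and map-1
scheduler.on_status("map-0", "FINISHED")
scheduler.on_status("map-1", "FAILED")          # map-1 must be retried
second = scheduler.on_resources(free_slots=2)   # launches map-1 again, plus map-2
for task in second:
    scheduler.on_status(task.task_id, "FINISHED")
print([t.task_id for t in second], scheduler.finished())
```

The executor side (environment setup, isolation, shuffle service) is deliberately left out, since it depends on the node and on Hadoop itself.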

From the above introduction, we can see that the Framework Scheduler is responsible for running only one Hadoop Job. If you are familiar with YARN, you will notice that it does exactly what the MapReduce ApplicationMaster in YARN does. In fact, the architectures of Mesos and YARN are so close that any YARN ApplicationMaster can easily be modified to run in Mesos as a Framework Scheduler.

Mesos recently added a mesos-submit tool ( https://github.com/apache/mesos/blob/trunk/docs/Using-the-mesos-submit-tool.md ; note that the tool is not yet complete). It allows a user's Framework Scheduler to run on any Mesos Slave, so that the client does not have to run too many Framework Schedulers itself. With this, the overall architecture and workflow of Mesos become almost the same as those of YARN.

In order to make it easier for everyone to understand the similarity in architecture between Mesos and YARN, the component correspondence table of Mesos and YARN is given below:

Components in Mesos | Components in YARN | Function
Mesos Master | ResourceManager | Resource management and scheduling for the entire cluster
Mesos Slave, Framework Executor | NodeManager | Resource management of a single node (resource isolation, reporting, etc.), task startup, etc.
Framework Scheduler | ApplicationMaster | Management of a single application and second-level scheduling of resources; basic operations include registration, requesting and acquiring resources, and allocating resources to internal tasks
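
The two levels of scheduling in the table above (cluster-wide allocation by the master, then per-application allocation by the Framework Scheduler or ApplicationMaster) can be illustrated with a toy example. This is a sketch under stated assumptions: the function names are hypothetical, CPUs are the only resource, and the round-robin policy in the first level is chosen purely for simplicity and is not claimed to be what Mesos or YARN actually does.

```python
from typing import Dict, List


def master_allocate(node_cpus: Dict[str, int], frameworks: List[str]) -> Dict[str, Dict[str, int]]:
    """First level: hand each node's free CPUs to frameworks in round-robin order.

    Round-robin is an assumption made for this sketch only.
    """
    offers: Dict[str, Dict[str, int]] = {name: {} for name in frameworks}
    for i, (node, cpus) in enumerate(sorted(node_cpus.items())):
        offers[frameworks[i % len(frameworks)]][node] = cpus
    return offers


def framework_assign(offer: Dict[str, int], task_cpus: Dict[str, int]) -> Dict[str, str]:
    """Second level: a framework places its own tasks on the nodes it was offered."""
    remaining = dict(offer)
    placement: Dict[str, str] = {}
    for task, need in task_cpus.items():
        # First-fit: take the first offered node with enough CPUs left.
        for node in sorted(remaining):
            if remaining[node] >= need:
                remaining[node] -= need
                placement[task] = node
                break
    return placement


cluster = {"node-a": 4, "node-b": 2, "node-c": 4}
offers = master_allocate(cluster, ["job-1", "job-2"])
placement = framework_assign(offers["job-1"], {"map-0": 2, "map-1": 2, "reduce-0": 4})
print(offers)
print(placement)
```

The master never sees individual tasks, and the framework never sees nodes it was not offered; that separation is what both systems share.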

Since Mesos and YARN are so similar, which one should we use? Or, which system is more promising?

From the current point of view, YARN has clear advantages in two respects: (1) Large investment of people. YARN has a dedicated company (Hortonworks) maintaining and developing it. (2) High visibility. YARN evolved from Hadoop 1.0, inherited Hadoop's popularity, and has a large number of companies and developers contributing patches. The biggest advantage of Mesos, on the other hand, is that it is simple in design and easy to use. In YARN, by contrast, a single resource allocation involves several state machines, each with more than a dozen states and more than a dozen events. In terms of stability, however, both systems are still in the R&D and testing stage, and both have a long way to go before they are stable and usable.

 

http://www.cnblogs.com/xinsheng/p/3803137.html
