Introduction to Ambari for Big Data

Managing and monitoring Hadoop clusters has always been a hot topic. One long-standing project for this scenario that has achieved good results is EasyHadoop, whose functions mainly include cluster installation, management, and monitoring. Internationally, Ambari is an Apache top-level project, now promoted by Hortonworks, a rising big data company. The software provides automated cluster installation, centralized management, cluster monitoring, and alerting. It shortens cluster installation time from days to hours and reduces the operations and maintenance staff from dozens to a handful, which greatly improves the efficiency of cluster management.



 

Ambari makes Hadoop management simpler by providing a consistent, secure platform for operational control. Ambari provides an intuitive Web UI as well as a robust REST API, which is particularly useful for automating cluster operations. With Ambari, Hadoop operators get the following core benefits:

Simplified Installation, Configuration and Management. Easily and efficiently create, manage and monitor clusters at scale. Takes the guesswork out of configuration with Smart Configs and Cluster Recommendations.  Enables repeatable, automated cluster creation with Ambari Blueprints.

Centralized Security Setup. Reduce the complexity to administer and configure cluster security across the entire platform. Helps automate the setup and configuration of advanced cluster security capabilities such as Kerberos and Apache Ranger.

Full Visibility into Cluster Health. Ensure your cluster is healthy and available with a holistic approach to monitoring. Configures predefined alerts — based on operational best practices — for cluster monitoring. Captures and visualizes critical operational metrics — using Grafana — for analysis and troubleshooting. Integrated with Hortonworks SmartSense for proactive issue prevention and resolution.

Highly Extensible and Customizable. Fit Hadoop seamlessly into your enterprise environment. Highly extensible with Ambari Stacks for bringing custom services under management, and with Ambari Views for customizing the Ambari Web UI.
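As a rough illustration of the REST API mentioned above, the following Python sketch builds and sends a request that lists the clusters known to an Ambari server. The host name, port 8080, and credentials are placeholders assumed for illustration, and the function names are hypothetical. The /api/v1/clusters path, the X-Requested-By header, and the response layout reflect common Ambari usage, but check them against the documentation for your version.

```python
import base64
import json
import urllib.request


def build_cluster_list_request(host, user, password, port=8080):
    """Return (url, headers) for a GET that lists clusters on an Ambari server."""
    # Assumed endpoint: the cluster collection under the v1 API root.
    url = f"http://{host}:{port}/api/v1/clusters"
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    headers = {
        "Authorization": f"Basic {token}",
        # Ambari expects this header on modifying requests; it is harmless on GET.
        "X-Requested-By": "ambari",
    }
    return url, headers


def list_clusters(host, user, password, port=8080):
    """Send the request and return the cluster names reported by the server."""
    url, headers = build_cluster_list_request(host, user, password, port)
    request = urllib.request.Request(url, headers=headers)
    with urllib.request.urlopen(request, timeout=10) as response:
        body = json.load(response)
    # Assumed response shape: {"items": [{"Clusters": {"cluster_name": ...}}, ...]}
    return [item["Clusters"]["cluster_name"] for item in body.get("items", [])]
```

Calling list_clusters("ambari.example.com", "admin", "admin") against a reachable server would return the cluster names. Keeping the request construction separate from the network call makes the sketch easy to inspect without a live cluster.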

 

Ambari System Architecture



 
Ambari-server internal architecture

Ambari is a configuration management tool for distributed Hadoop clusters. It began as an open source project led by Hortonworks, is now a project of the Apache Software Foundation, and has become an indispensable part of Hadoop operations and maintenance.

Ambari makes full use of existing open source software and combines it to provide service management, monitoring, and display capabilities for clusters in a distributed environment. This software includes:

(1) On the agent side, Puppet is used to manage nodes.

(2) On the web side, ember.js is used as the front-end MVC framework together with NodeJS-related tools, handlebars.js is used as the page rendering engine, and the Bootstrap framework is used for CSS/HTML.

(3) On the server side, Jetty, Spring, JAX-RS, etc. are used.

(4) In addition, Ganglia and Nagios are used for their distributed monitoring capabilities.

The Ambari framework adopts a server/client model and consists mainly of two parts: ambari-server and ambari-agent. Ambari relies on other mature tools: ambari-server depends on Python, while ambari-agent also depends on Ruby, Puppet, and Facter, and Nagios and Ganglia are used to monitor cluster status. Among these:

Puppet is a configuration management tool for distributed clusters that follows a typical server/client model. It centrally manages the installation, configuration, and deployment of a distributed cluster, and is written mainly in Ruby.

Facter is a node resource collection library written in Python that gathers system information about a node, such as OS information. Because ambari-agent is written mainly in Python, Facter fits in well for collecting node information.
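To give a feel for the kind of node information described above, here is a minimal sketch that collects a few basic facts with Python's standard library. It is not Ambari's or Facter's actual code; the function name and the dictionary keys are hypothetical and chosen only for illustration.

```python
import platform
import socket


def collect_node_facts():
    """Gather a few basic facts about the local node using only the standard library."""
    return {
        "hostname": socket.gethostname(),
        "os_family": platform.system(),       # for example "Linux"
        "os_release": platform.release(),     # kernel or OS release string
        "architecture": platform.machine(),   # for example "x86_64"
        "python_version": platform.python_version(),
    }
```

An agent could attach a dictionary like this to its registration or heartbeat message so that the server knows what kind of node it is talking to.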

 

 


 
Ambari-agent internal architecture

Ambari-agent is stateless, and its functionality is divided into two parts:

Collecting information about the node it runs on, summarizing it, and sending it to ambari-server in a heartbeat report.

Handling execution requests from ambari-server.

It therefore has two kinds of queues:

(1) The message queue (MessageQueue, also called ResultQueue). It holds node status information (including registration information) and execution results, which are summarized and sent to ambari-server with the heartbeat.

(2) The operation queue (ActionQueue). It receives the state operations sent by ambari-server and hands them to the executor, which calls modules such as Puppet or Python scripts to run the task.
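The interplay of the two queues can be sketched roughly as follows. This is a simplified, single-threaded illustration and not ambari-agent's real implementation: the class name AgentSketch, its methods, and the heartbeat dictionary layout are all assumptions made for the example.

```python
import queue


class AgentSketch:
    """Toy model of an agent with an action queue and a result queue."""

    def __init__(self, hostname):
        self.hostname = hostname
        self.action_queue = queue.Queue()  # commands received from the server
        self.result_queue = queue.Queue()  # results waiting for the next heartbeat

    def receive(self, commands):
        """Queue the commands that the server sent back with a heartbeat response."""
        for command in commands:
            self.action_queue.put(command)

    def run_pending(self, executor):
        """Pass each queued command to the executor and record its outcome."""
        while not self.action_queue.empty():
            command = self.action_queue.get()
            outcome = executor(command)
            self.result_queue.put({"command": command, "result": outcome})

    def build_heartbeat(self):
        """Drain the result queue into a single heartbeat report."""
        reports = []
        while not self.result_queue.empty():
            reports.append(self.result_queue.get())
        return {"hostname": self.hostname, "reports": reports}
```

In this sketch the executor is any callable that takes a command and returns a result, standing in for the Puppet or Python script invocation. Because the heartbeat drains the result queue, the agent keeps no long-lived state of its own, which is in the spirit of the stateless design described above.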



 
