Storm cluster architecture

1. What is Storm?

Storm is an open-source, distributed, fault-tolerant real-time computing system. It was originally developed at BackType, which was later acquired by Twitter, and it is hosted on GitHub under the Eclipse Public License 1.0. The latest version on GitHub is Storm 0.9.0.1, written mostly in Clojure.

A Twitter Storm cluster is superficially similar to a Hadoop cluster. Whereas Hadoop runs MapReduce jobs, Storm runs topologies, and the two differ in a fundamental way: a MapReduce job eventually finishes, while a Storm topology in principle processes data forever, unless you kill it.

There are two types of nodes in a Storm cluster: the master node and the worker nodes.

2. The corresponding roles are as follows:
1. A background program called Nimbus runs on the master node. It is responsible for distributing code across the Storm cluster, assigning tasks to worker machines, and monitoring the running state of the cluster. The role of Nimbus is similar to that of the JobTracker in Hadoop.

2. A daemon called Supervisor runs on each worker node. The Supervisor listens for tasks assigned to its machine by Nimbus and starts or stops worker processes accordingly. Each worker process executes a subset of a topology; a running topology consists of many worker processes distributed across different worker nodes.
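
For reference, each of these daemons is launched with Storm's command-line client (storm ui additionally serves a monitoring web UI):

storm nimbus
storm supervisor
storm ui

Because the daemons are fail-fast by design (see below), they are normally run under a process supervision tool that restarts them automatically on failure.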


 
All coordination between Nimbus and Supervisor nodes is done through the Zookeeper cluster. Additionally, both Nimbus and Supervisor processes are fail-fast and stateless; all state of a Storm cluster is either in the Zookeeper cluster or stored on local disk. This means you can kill the Nimbus and Supervisor processes with kill -9 and they will continue to work after a reboot. This design makes Storm clusters incredibly stable.

To do real-time computation on a Storm cluster, you create topologies. A topology is a graph of computation: each node in the graph contains processing logic, and the links between nodes describe how data is passed between them.
Running a topology is straightforward. First, you package all your code and dependencies into a single jar. Then, you run the following command:


storm jar all-my-code.jar backtype.storm.MyTopology arg1 arg2  
This runs the class backtype.storm.MyTopology with two arguments, arg1 and arg2. The class's main method defines the topology and submits it to Nimbus. The storm jar part takes care of connecting to Nimbus and uploading the jar to the cluster.
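
As a rough sketch of what such a main method looks like (Storm 0.9.x API; WordSpout and CountBolt are hypothetical user-defined components):

import backtype.storm.Config;
import backtype.storm.StormSubmitter;
import backtype.storm.topology.TopologyBuilder;
import backtype.storm.tuple.Fields;

public class MyTopology {
    public static void main(String[] args) throws Exception {
        TopologyBuilder builder = new TopologyBuilder();
        // WordSpout and CountBolt are hypothetical user-defined components.
        builder.setSpout("words", new WordSpout(), 2);
        builder.setBolt("count", new CountBolt(), 4)
               .fieldsGrouping("words", new Fields("word"));

        Config conf = new Config();
        conf.setNumWorkers(2); // worker processes to request from the cluster

        // arg1 from the command line could, for example, name the topology.
        StormSubmitter.submitTopology(args[0], conf, builder.createTopology());
    }
}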
 
Since a topology definition is just a Thrift structure, and Nimbus is a Thrift service, you can create and submit a topology in any programming language.
Storm is a real-time stream processing framework, so its core abstraction is, naturally, the stream. Storm can also be used for "continuous computation", running continuous queries over data streams and streaming the results out to users as they are computed. It can likewise be used for "distributed RPC", running expensive operations in parallel. Storm's lead engineer, Nathan Marz, put it this way:


Storm makes it easy to write and scale complex real-time computations on a cluster of computers. Storm is to real-time processing what Hadoop is to batch processing. Storm guarantees that every message will be processed, and it's fast: you can process millions of messages per second with a small cluster. Better yet, you can develop in any programming language.

3. The main features of Storm are as follows:

1) Simple programming model: Just as MapReduce lowers the complexity of parallel batch processing, Storm lowers the complexity of parallel real-time processing.

2) Support for multiple programming languages: You can use a variety of languages on top of Storm. Clojure, Java, Ruby, and Python are supported by default. Adding support for another language only requires implementing Storm's simple communication protocol.

3) Fault Tolerance: Storm manages the failure of worker processes and nodes.

4) Horizontal scaling: Computation is performed in parallel across multiple threads, processes and servers.

5) Reliable message processing: Storm guarantees that each message is fully processed at least once. When a task fails, Storm takes care of retrying the message from the message source.


6) Fast: The system is designed so that messages are processed quickly, with ZeroMQ as the underlying message transport.

7) Local mode: Storm has a "local mode" that fully simulates a Storm cluster in-process, which lets you develop and unit test quickly; a sketch follows this list.
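
As a hedged illustration (assuming the Storm 0.9.x API, with the hypothetical WordSpout and CountBolt from the earlier sketch standing in for real components), local mode replaces StormSubmitter with an in-process LocalCluster:

import backtype.storm.Config;
import backtype.storm.LocalCluster;
import backtype.storm.topology.TopologyBuilder;
import backtype.storm.utils.Utils;

public class LocalModeExample {
    public static void main(String[] args) {
        TopologyBuilder builder = new TopologyBuilder();
        builder.setSpout("words", new WordSpout()); // hypothetical spout
        builder.setBolt("count", new CountBolt())   // hypothetical bolt
               .shuffleGrouping("words");

        // Simulates Nimbus, Supervisors, and workers inside this one JVM.
        LocalCluster cluster = new LocalCluster();
        cluster.submitTopology("test", new Config(), builder.createTopology());

        Utils.sleep(10000);           // let the topology run for ten seconds
        cluster.killTopology("test");
        cluster.shutdown();
    }
}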


 
4. Storm cluster architecture

A Storm cluster adopts a master-slave architecture: the master node runs Nimbus, the slave nodes run Supervisors, and scheduling-related information is stored in a ZooKeeper cluster. The architecture is shown in the following figure:

[Figure: Storm cluster architecture - Nimbus (master) and Supervisors (slaves), coordinated through ZooKeeper]

The specific description is as follows:

Nimbus
The master node of the Storm cluster. It is responsible for distributing user code and assigning the Tasks of a Topology's components (Spouts/Bolts) to Worker processes on specific Supervisor nodes.

Supervisor
The slave node of the Storm cluster. It is responsible for starting and stopping each Worker process that runs on that Supervisor node. Through the supervisor.slots.ports entry in Storm's configuration file, you can specify the maximum number of slots allowed on a Supervisor. Each slot is identified by a unique port number, and each port number corresponds to one Worker process (if that Worker process is started); see the snippet below.
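
For illustration, a storm.yaml entry like the following would allow up to four Worker processes on that Supervisor; the four port numbers shown are the conventional defaults, but treat the exact values as an assumption:

supervisor.slots.ports:
    - 6700
    - 6701
    - 6702
    - 6703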

ZooKeeper
Used to coordinate Nimbus and the Supervisors. If a Supervisor can no longer run its part of the Topology because of a fault, Nimbus detects this and reassigns that work to other available Supervisors.

Stream Groupings
Arguably the most important abstraction in Storm is the stream grouping, which controls how the Tasks of a Spout/Bolt distribute Tuples, that is, to which Tasks of the destination Bolt each Tuple is delivered.

Currently, Storm stream groupings come in the following types (a declaration sketch follows the list):

Shuffle Grouping: Random grouping. Tuples are distributed randomly across the Bolt's Tasks so that each Task receives roughly the same number of Tuples, and no Tuple is delivered twice.

Fields Grouping: Grouping by the specified Fields. Tuples with the same value for those Fields are always emitted to the same Task.

Partial Key Grouping: Similar to Fields Grouping, but the Tuples for a given key can be balanced across downstream Bolt Tasks, which achieves better load balance and resource utilization, especially in scenarios with data skew.

All Grouping: Every Task of the Bolt receives every Tuple (that is, Tuples are replicated).

Global Grouping: The entire stream goes to a single Bolt Task (the one with the smallest Task ID).

None Grouping: You do not care how the stream is grouped; currently this is equivalent to Shuffle Grouping.

Direct Grouping: The producer of a Tuple decides which downstream Bolt Task receives it; this has to be controlled explicitly in the code of the Bolt you develop.

Local or Shuffle Grouping: If the target Bolt has one or more Tasks in the same Worker process (the same JVM), Tuples are shuffled only among those Tasks; otherwise this behaves like a normal Shuffle Grouping.
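
To make the declarations concrete, here is a hedged sketch of wiring several of these groupings with the Storm 0.9.x TopologyBuilder API (all component classes are hypothetical):

import backtype.storm.topology.TopologyBuilder;
import backtype.storm.tuple.Fields;

public class GroupingExamples {
    public static TopologyBuilder wire() {
        TopologyBuilder builder = new TopologyBuilder();
        builder.setSpout("sentences", new SentenceSpout());  // hypothetical spout
        builder.setBolt("split", new SplitBolt(), 4)         // hypothetical bolts
               .shuffleGrouping("sentences");                // Shuffle Grouping
        builder.setBolt("count", new CountBolt(), 4)
               .fieldsGrouping("split", new Fields("word")); // Fields Grouping
        builder.setBolt("metrics", new MetricsBolt())
               .allGrouping("split");                        // All Grouping
        builder.setBolt("report", new ReportBolt())
               .globalGrouping("count");                     // Global Grouping
        return builder;
    }
}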

In addition, Storm provides an interface for user-defined stream groupings. If none of the groupings above meets your actual business needs, you can implement your own: you only need to implement the backtype.storm.grouping.CustomStreamGrouping interface, which defines the following method:
 
List<Integer> chooseTasks(int taskId, List<Object> values)  
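
As a minimal sketch of implementing it (assuming the Storm 0.9.x API, where the interface additionally declares a prepare method that supplies the candidate target Task IDs), here is a custom grouping that routes each Tuple by the hash of its first field:

import java.util.Arrays;
import java.util.List;

import backtype.storm.generated.GlobalStreamId;
import backtype.storm.grouping.CustomStreamGrouping;
import backtype.storm.task.WorkerTopologyContext;

// Routes every Tuple to one target Task chosen by hashing its first field.
public class ModHashGrouping implements CustomStreamGrouping {
    private List<Integer> targetTasks;

    public void prepare(WorkerTopologyContext context, GlobalStreamId stream,
                        List<Integer> targetTasks) {
        // Storm supplies the candidate downstream Task IDs once, at startup.
        this.targetTasks = targetTasks;
    }

    public List<Integer> chooseTasks(int taskId, List<Object> values) {
        // Deterministically pick one Task from the first field's hash code.
        int i = Math.abs(values.get(0).hashCode()) % targetTasks.size();
        return Arrays.asList(targetTasks.get(i));
    }
}

Such a grouping would then be attached when wiring the Bolt, e.g. builder.setBolt(...).customGrouping("upstream", new ModHashGrouping()).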
 
Among the built-in stream groupings above, the most commonly used are Shuffle Grouping, Fields Grouping, and Direct Grouping; the others cover more specific application requirements.
 