Introduction to Storm, a big data framework

First, what is Storm?

  Storm is a distributed computing framework written mainly in Clojure and Java. It was originally created by Nathan Marz and his team at BackType, and it was open-sourced after BackType was acquired by Twitter. The initial release, version 0.5.0, came out on September 17, 2011.

  In September 2013, the Apache Software Foundation took over the Storm project and began incubating it. Storm was originally released under the Eclipse Public License; Apache Storm is distributed under the Apache License 2.0, so most businesses are free to use it. A year later, in September 2014, Storm became an Apache top-level project. At the time of writing, the latest version of Storm is 1.1.0.

  Storm is a free, open source, distributed real-time computation system. It makes it easy to reliably process unbounded streams of data, doing for real-time processing what Hadoop does for batch processing.
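To make the contrast between batch processing and unbounded streams concrete, here is a minimal sketch in plain Python. It does not use Storm or Hadoop at all; the function names and the endless source are assumptions made up for illustration. The batch function needs the whole, finite input before it returns anything, while the streaming function emits an updated result after every incoming line and can therefore run against a source that never ends.

```python
from collections import Counter
from itertools import cycle, islice


def batch_word_count(lines):
    """Batch style: the complete, finite data set is consumed before a result exists."""
    counts = Counter()
    for line in lines:
        counts.update(line.split())
    return counts


def stream_word_count(lines):
    """Stream style: yield a snapshot of the running counts after every line."""
    counts = Counter()
    for line in lines:
        counts.update(line.split())
        yield dict(counts)


def endless_source():
    """Hypothetical unbounded source: repeats two sentences forever."""
    return cycle(["storm is fast", "storm is reliable"])


# The batch job only works on a finite input.
batch_result = batch_word_count(["storm is fast", "storm is reliable"])

# The streaming job never finishes on its own, so we only look at
# the first three snapshots it produces.
first_three = list(islice(stream_word_count(endless_source()), 3))
```

Calling `batch_word_count(endless_source())` would never return, which is the practical reason an unbounded stream needs an incremental model.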

Second, the differences between Storm and Hadoop

1) Storm is used for real-time computation; Hadoop is used for offline computation.

2) The data Storm processes is held in memory and arrives as a continuous stream; the data Hadoop processes is stored in a file system and is handled one batch at a time.

3) Storm receives its data over the network as it arrives; Hadoop reads data that is already stored on disk.

4) The programming models of Storm and Hadoop are similar, as the following comparison shows:

 

 

                        Storm         Hadoop
Role                    Nimbus        JobTracker
                        Supervisor    TaskTracker
                        Worker        Child
Application name        Topology      Job
Programming interface   Spout/Bolt    Mapper/Reducer
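The Spout/Bolt programming interface in the comparison above can be pictured with a small toy model. The sketch below is plain Python and is not the Storm API: the class names, the `next_tuple` and `execute` methods, and the `run_topology` helper are all hypothetical, chosen only to mirror the roles. A spout is the source of the stream, bolts are the processing steps, and a topology is the wiring between them.

```python
from collections import Counter, deque


class SentenceSpout:
    """Plays the role of a Spout: the source that emits tuples into the stream."""

    def __init__(self, sentences):
        self._pending = deque(sentences)

    def next_tuple(self):
        # Return the next sentence, or None once this toy source is exhausted.
        return self._pending.popleft() if self._pending else None


class SplitBolt:
    """Plays the role of a Bolt: turns one sentence into several word tuples."""

    def execute(self, sentence):
        return sentence.split()


class CountBolt:
    """A second Bolt: keeps a running count for every word it receives."""

    def __init__(self):
        self.counts = Counter()

    def execute(self, word):
        self.counts[word] += 1


def run_topology(spout, split_bolt, count_bolt):
    """Wire spout -> split -> count and pump tuples until the spout is empty."""
    while (sentence := spout.next_tuple()) is not None:
        for word in split_bolt.execute(sentence):
            count_bolt.execute(word)
    return count_bolt.counts


counts = run_topology(
    SentenceSpout(["the cow jumped", "the moon"]), SplitBolt(), CountBolt()
)
print(dict(counts))
```

Where a Hadoop job ends once the Mapper and Reducer have consumed their input, a real topology keeps running; this toy version stops only because its spout holds a fixed list.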

 

Third, Storm's features

  1) Wide range of use cases: Storm can be used in scenarios such as processing messages in real time and updating databases.

  2) Highly scalable: Storm's scalability lets it process a very large number of messages per second. To scale out a real-time computation, all you need to do is add machines and increase the parallelism of the computation. Storm uses ZooKeeper to coordinate configuration across the machines in a cluster, which makes a Storm cluster easy to expand.

  3) No data loss: Storm guarantees that every message is processed.

  4) Robust: a Storm cluster is easy to manage, and nodes can be restarted one at a time without affecting running applications.

  5) Fault tolerant: if an exception occurs while a message is being processed, Storm retries it.
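The retry behaviour described in the list above can be sketched as a replay queue. This is a simplified illustration in plain Python, not Storm's actual mechanism: the `process_with_replay` function, the `max_attempts` limit, and the flaky handler are all assumptions made for the example. A message whose handler raises is put back on the queue and tried again later, so a transient failure does not lose data.

```python
from collections import deque


def process_with_replay(messages, handler, max_attempts=3):
    """Process every message; re-queue failed ones until max_attempts is used up."""
    queue = deque((msg, 1) for msg in messages)
    done, dropped = [], []
    while queue:
        msg, attempt = queue.popleft()
        try:
            handler(msg)
            done.append(msg)  # success: the message is acknowledged
        except Exception:
            if attempt < max_attempts:
                queue.append((msg, attempt + 1))  # failure: replay it later
            else:
                dropped.append(msg)  # retry budget spent (a limit assumed here)
    return done, dropped


# Hypothetical flaky handler: fails the first time it sees "b", then succeeds.
seen = set()


def flaky_handler(msg):
    if msg == "b" and msg not in seen:
        seen.add(msg)
        raise RuntimeError("transient failure")


done, dropped = process_with_replay(["a", "b", "c"], flaky_handler)
```

Because a replayed message may reach the handler more than once, handlers in this style are best written so that processing the same message twice is harmless.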

 


Origin www.cnblogs.com/wangxiaozhang/p/11025532.html