[1] Getting Started with Flume: Basic Components (2): Agent

How it works

Flume-og uses multiple Masters. To keep configuration data consistent, Flume-og introduced ZooKeeper to store the configuration; ZooKeeper itself guarantees the consistency and high availability of that data. In addition, when the configuration changes, ZooKeeper can notify the Flume Master nodes. The Flume Masters synchronize data among themselves using a gossip protocol.
The most obvious change in Flume-ng is that it drops the Master and the ZooKeeper-based centralized configuration management, and becomes a pure transport tool. Another major difference is that in Flume-ng, reading data in and writing data out are handled by different worker threads (called Runners). In Flume-og, the reading thread also did the writing (except for failure retries). If writes were slow (not a complete failure), this blocked Flume's ability to receive data. The asynchronous design in Flume-ng lets the reading thread work smoothly without having to care about problems downstream.
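
The decoupling of reading and writing can be sketched with two threads and a bounded queue. This is a minimal Python illustration, not Flume code: the names run_pipeline, write_delay, and capacity are hypothetical, and the queue only stands in for a channel.

```python
import queue
import threading
import time


def run_pipeline(events, write_delay=0.01, capacity=100):
    """Pass events from a reader thread to a writer thread via a bounded queue.

    Hypothetical helper for illustration; it does not use any Flume API.
    Returns (written_events, read_done_time, write_done_time).
    """
    channel = queue.Queue(maxsize=capacity)  # stands in for a channel
    stop = object()                          # sentinel marking end of input
    written = []
    timing = {}

    def reader():
        # The reader only hands events to the queue; it never waits on the
        # slow write itself, only on a full queue.
        for event in events:
            channel.put(event)
        timing["read_done"] = time.monotonic()
        channel.put(stop)

    def writer():
        while True:
            event = channel.get()
            if event is stop:
                break
            time.sleep(write_delay)          # simulate a slow downstream write
            written.append(event)

    threads = [threading.Thread(target=reader), threading.Thread(target=writer)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    timing["write_done"] = time.monotonic()
    return written, timing["read_done"], timing["write_done"]
```

With a slow writer and enough queue capacity, the reader typically finishes well before the writer does. If the queue fills up, put blocks and the reader is slowed again, which is why the buffer size matters.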
 

Advantages

2. When data is collected faster than it can be written, that is, when collection hits a peak and the volume exceeds what the destination system can write, Flume adjusts between the data producer and the data store so that data flows smoothly between the two.
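
A rough way to see this smoothing effect is to simulate a buffer that fills during a burst and drains at a fixed rate. The sketch below is a simplified model, not Flume's behavior in detail: simulate_channel and its parameters are hypothetical, and events that do not fit are simply counted as rejected, whereas a real source would normally retry.

```python
def simulate_channel(arrivals, drain_rate, capacity):
    """Track the depth of a bounded buffer tick by tick.

    arrivals:   number of events arriving in each tick
    drain_rate: maximum number of events the writer removes per tick
    capacity:   maximum number of events the buffer can hold
    Returns (depth_after_each_tick, total_rejected).
    """
    depth = 0
    rejected = 0
    depths = []
    for arriving in arrivals:
        accepted = min(arriving, capacity - depth)  # only what still fits
        rejected += arriving - accepted
        depth += accepted
        depth -= min(depth, drain_rate)             # writer drains at its own pace
        depths.append(depth)
    return depths, rejected
```

For example, with arrivals [5, 5, 50, 50, 5, 5, 0, 0], a drain rate of 20, and a capacity of 100, the depth rises to 60 during the burst and falls back to 0 afterwards with nothing rejected. With a capacity of only 40, the same burst leaves 40 events rejected.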
 

Structure

An agent consists mainly of three components: source, channel, and sink.
Source:
Receives data from a data generator and passes it, in the form of Flume events, to one or more channels. Flume provides a variety of ways to receive data, such as Avro, Thrift, and the Twitter 1% firehose.
Channel:
A channel is a temporary storage container. It caches the events received from a source until they are consumed by sinks, acting as a bridge between source and sink. A channel is fully transactional, which ensures the consistency of data as it is sent and received, and it can be linked to any number of sources and sinks. Supported types include the JDBC channel, File System channel, Memory channel, and so on.
Sink:
A sink stores data in a centralized store such as HBase or HDFS. It consumes events from channels and delivers them to the destination. The destination may be another agent, or HDFS or HBase.
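
The transactional hand-off between channel and sink can be illustrated with a toy model. This is a sketch under stated assumptions, not Flume's API: MemoryChannel, take_batch, rollback, and drain are hypothetical names, and events are plain strings rather than real Flume events.

```python
from collections import deque


class MemoryChannel:
    """Toy in-memory buffer with a take/rollback cycle (hypothetical)."""

    def __init__(self):
        self._events = deque()

    def put(self, event):
        # Source side: append an event to the tail of the buffer.
        self._events.append(event)

    def take_batch(self, n):
        # Sink side: remove up to n events from the head of the buffer.
        count = min(n, len(self._events))
        return [self._events.popleft() for _ in range(count)]

    def rollback(self, batch):
        # Put an undelivered batch back at the head, preserving order.
        self._events.extendleft(reversed(batch))

    def __len__(self):
        return len(self._events)


def drain(channel, deliver, batch_size=2):
    """Take one batch and deliver it; roll back if delivery fails.

    Returns True if the batch was delivered (committed), False otherwise.
    """
    batch = channel.take_batch(batch_size)
    try:
        deliver(batch)
    except Exception:
        channel.rollback(batch)  # events stay in the channel for a retry
        return False
    return True
```

In this model an event leaves the channel only after deliver returns without raising, so a failed write to the destination does not lose data; the same batch is offered again on the next call to drain.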


Origin www.cnblogs.com/Agnes1994/p/12217582.html