Big Data Course E1 - An Overview of Flume

Author's e-mail: [email protected] Address: Huizhou, Guangdong

 ▲ Purpose of this chapter

⚪ Understand the concept of Flume;

⚪ Understand the topology and execution process of Flume;

⚪ Master the installation and operation of Flume;

1. Introduction

1. Overview

1. Flume was originally developed by Cloudera and later contributed to Apache. It is a distributed, reliable service for collecting, aggregating, and moving log data.

2. In actual big data development, more than 70% of the data comes from logs; logs are the cornerstone of big data.

3. Flume provides a very simple and flexible streaming mechanism for logs.

4. Version:

a. Flume 0.X: also known as Flume-og. It depends on ZooKeeper, its configuration is relatively complex, and this version is no longer used in practice.

b. Flume 1.X: also known as Flume-ng. It does not depend on ZooKeeper, its configuration is relatively simple, and it is the version commonly used today.
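To illustrate Flume-ng's simpler configuration style, here is the single-node example from the official Flume 1.x user guide: a netcat source feeding a logger sink through a memory channel. The agent and component names (a1, r1, k1, c1) are arbitrary labels chosen for this example.

```
# example.conf: a single-node Flume-ng agent
a1.sources = r1
a1.sinks = k1
a1.channels = c1

# Source: listen for text lines on a TCP port
a1.sources.r1.type = netcat
a1.sources.r1.bind = localhost
a1.sources.r1.port = 44444

# Sink: log events to the console
a1.sinks.k1.type = logger

# Channel: buffer events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

# Bind source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
```

The agent is then started with: bin/flume-ng agent --conf conf --conf-file example.conf --name a1 -Dflume.root.logger=INFO,console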

2. Basic concepts

1. Event:

a. In Flume, each collected log entry is encapsulated into an Event object; in Flume, one Event corresponds to one log entry.

b. An Event is essentially a JSON string, which contains two fixed parts: headers and body.
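As a minimal sketch of this structure (plain Python, not the Flume API itself), an Event rendered as JSON has exactly the two fixed parts described above; the header keys and the log line used here are made up for illustration:

```python
import json

# A Flume Event has two fixed parts:
#   headers - key/value metadata about the log entry
#   body    - the raw log payload
# Sketched as a plain dict; the field values are illustrative only.
event = {
    "headers": {"timestamp": "1700000000000", "host": "node01"},
    "body": "INFO starting service on port 8080",
}

# Serialize to the JSON-string form described in the text
print(json.dumps(event))
```

Running this prints a JSON object with the headers map and the body string, mirroring how a single log entry travels through a Flume agent.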

Origin: blog.csdn.net/u013955758/article/details/132024521