Study Notes: Learning Big Data from Scratch - 15. Installing and Using Flume

In the previous section we verified that Spark programming, Spark SQL, and Spark Streaming all work. The next step is collecting the data at its source, which means installing and using Flume. It is actually quite simple, but for completeness here is a record of the steps.

1. Download

wget http://archive.cloudera.com/cdh5/cdh/5/flume-ng-1.6.0-cdh5.16.1.tar.gz

2. Extract
tar -zxvf flume-ng-1.6.0-cdh5.16.1.tar.gz

3. Configure
conf/flume-env.sh

export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.191.b12-0.el7_5.x86_64
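
A minimal way to set this up, assuming the standard layout of the CDH tarball (the env template ships in conf/, and bin/flume-ng version is just a sanity check):

cd apache-flume-1.6.0-cdh5.16.1-bin
cp conf/flume-env.sh.template conf/flume-env.sh
# append the JAVA_HOME export above to conf/flume-env.sh, then verify:
bin/flume-ng version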

4. Install telnet for one of the tests
yum install telnet-server
yum install telnet
systemctl enable telnet.socket
systemctl start telnet.socket

systemctl stop firewalld

telnet localhost   # check that the service is reachable

5. Test 1: Receiving data sent over telnet

(1) Create the agent configuration file
netcat-logger.conf

--------------
# Define the source, channel, and sink
a1.sources = r1
a1.sinks = k1
a1.channels = c1

a1.sources.r1.type = netcat
a1.sources.r1.bind = localhost
a1.sources.r1.port = 44444

a1.sinks.k1.type = logger

a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1

(2) Start the agent

[root@centos7 apache-flume-1.6.0-cdh5.16.1-bin]# bin/flume-ng agent --conf conf --conf-file conf/netcat-logger.conf --name a1 -Dflume.root.logger=INFO,console

(3) Open another shell and telnet in to send data (the first console is busy displaying the incoming events). Note that the source above binds to localhost, so to send from a different machine you would need to change a1.sources.r1.bind to 0.0.0.0 or the host's address.

(Screenshots of the sender and receiver sessions are not reproduced here.)
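
For illustration, the exchange might look like this (a sketch: telnet from a second shell on the same host, since the source binds to localhost; the logger sink prints each received line as an event, and its exact formatting may vary by version):

# sender (a second shell on the Flume host)
telnet localhost 44444
hello flume

# receiver (the Flume console), roughly:
# Event: { headers:{} body: 68 65 6C 6C 6F 20 66 6C 75 6D 65    hello flume }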

6. Test 2: Sending messages and storing them in HBase

(1) Configuration file conf/hbase_simple.conf

a1.sources = r1
a1.sinks = k1
a1.channels = c1

# Describe/configure the source
a1.sources.r1.type = syslogtcp
a1.sources.r1.port = 5140
a1.sources.r1.host = centos7

# Describe the sink
a1.sinks.k1.type = hbase
a1.sinks.k1.table = test_idoall_org
a1.sinks.k1.columnFamily = name
a1.sinks.k1.serializer = org.apache.flume.sink.hbase.RegexHbaseEventSerializer

# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000

a1.channels.c1.transactionCapacity = 100

# Bind the source and sink to the channel
a1.sinks.k1.channel = c1
a1.sources.r1.channels = c1
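
One detail worth noting (this describes the stock RegexHbaseEventSerializer defaults, not something configured above): with no extra settings it matches the whole event body against the regex (.*) and writes it into a single column named payload under the configured family. To split the body into named columns, serializer options along these lines could be added (the regex and column names here are hypothetical):

a1.sinks.k1.serializer.regex = ^(\\S+)\\s+(.*)$
a1.sinks.k1.serializer.colNames = firstWord,rest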
(2) Start HDFS, YARN, and HBase

start-dfs.sh

start-yarn.sh

start-hbase.sh
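
Before moving on, it is worth confirming that the daemons actually came up. On a single-node setup like the one used in this series, a quick jps should show something like the following (the exact process list depends on the configuration):

jps
# expect NameNode, DataNode, ResourceManager, NodeManager, HMaster and HRegionServer among the output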

(3) Create the HBase table

hbase shell

create 'test_idoall_org','name'

desc 'test_idoall_org'

(4) Start Flume (-n, -c, and -f are the short forms of the --name, --conf, and --conf-file options used in the earlier command)

bin/flume-ng agent -n a1 -c conf -f conf/hbase_simple.conf -Dflume.root.logger=INFO,console

(5) Open another Linux shell and send data

echo "hello how are you?" | nc centos7 5140
(6) Check the table in HBase

hbase shell

scan  'test_idoall_org'
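
The scan should show one row per event received, roughly like this (schematic: the row key is auto-generated by the serializer, and payload is its default column qualifier):

ROW                        COLUMN+CELL
 <generated row key>       column=name:payload, timestamp=..., value=hello how are you?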

The data we sent is now stored in HBase.

As a next step, the logs scattered across the various servers can be shipped to HBase on a schedule for later analysis, for example along the lines of the sketch below.
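
A minimal sketch of that idea (hypothetical agent name and log path; the exec source keeps the example short, though a spooldir source is more robust in production because exec does not track its read position across restarts):

# hypothetical agent a2: tail a local log file into the same HBase table
a2.sources = r2
a2.channels = c2
a2.sinks = k2

a2.sources.r2.type = exec
a2.sources.r2.command = tail -F /var/log/myapp/app.log
a2.sources.r2.channels = c2

a2.channels.c2.type = memory
a2.channels.c2.capacity = 1000
a2.channels.c2.transactionCapacity = 100

a2.sinks.k2.type = hbase
a2.sinks.k2.table = test_idoall_org
a2.sinks.k2.columnFamily = name
a2.sinks.k2.serializer = org.apache.flume.sink.hbase.RegexHbaseEventSerializer
a2.sinks.k2.channel = c2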

7. Reference articles:

https://www.cnblogs.com/netbloomy/p/6666683.html - Getting started with Flume 1.7.0: installation, deployment, and example cases
https://www.cnblogs.com/biehongli/p/8031403.html - The Flume log-collection framework and its installation and deployment (a distributed, reliable, highly available system for collecting, aggregating, and transporting massive log data)
https://blog.csdn.net/wugenqiang/article/details/81282998 - Big data visualization: Nginx log analysis and web chart display (HDFS + Flume + Spark + Nginx + Highcharts)
