Flume优化

Flume的自动容灾和负载均衡

说明

flume的自动容灾指的是当某一个channel或者sink挡掉后,由其他的sink来接收数据
flume的负载均衡指的是多个channel处理的event的数量尽可能的相同。

自动容灾

1)上游方案的编写

[root@qianfeng01 ~]# vim first-processor.conf

#list names
a1.sources = r1
a1.channels = c1
a1.sinks = k1 k2
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
a1.sinks.k2.channel = c1


# source
a1.sources.r1.type = syslogtcp
a1.sources.r1.host = qianfeng01
a1.sources.r1.port = 10086

# channel
a1.channels.c1.type = memory

# sink
a1.sinks.k1.type = avro
a1.sinks.k1.hostname = qianfeng02
a1.sinks.k1.port = 10087

a1.sinks.k2.type = avro
a1.sinks.k2.hostname = qianfeng03
a1.sinks.k2.port = 10088

#设置sink组
a1.sinkgroups = g1
a1.sinkgroups.g1.sinks = k1 k2
a1.sinkgroups.g1.processor.type = failover
a1.sinkgroups.g1.processor.priority.k1 = 10
a1.sinkgroups.g1.processor.priority.k2 = 5
a1.sinkgroups.g1.processor.maxpenalty = 10000

2)下游的qianfeng02上的方案

[root@qianfeng02 flumeconf]# vim second-processor.conf

#list names
a1.sources = r1
a1.channels = c1
a1.sinks = k1
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1


# source
a1.sources.r1.type = avro
a1.sources.r1.bind = qianfeng02
a1.sources.r1.port = 10087

# channel
a1.channels.c1.type = memory

# sink
a1.sinks.k1.type = logger

3)下游的qianfeng03上的方案

[root@qianfeng03 flumeconf]# vim third-processor.conf

#list names
a1.sources = r1
a1.channels = c1
a1.sinks = k1
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1


# source
a1.sources.r1.type = avro
a1.sources.r1.bind = qianfeng03
a1.sources.r1.port = 10088

# channel
a1.channels.c1.type = memory

# sink
a1.sinks.k1.type = logger

4)启动

先启动下游的两个方案
[root@qianfeng02 flumeconf]# flume-ng agent -c ../conf -f second-processor.properties -n a1 -Dflume.root.logger=INFO,console
[root@qianfeng03 flumeconf]# flume-ng agent -c ../conf -f third-processor.properties -n a1 -Dflume.root.logger=INFO,console
在启动上游的一个方案
[root@qianfeng03 flumeconf]# flume-ng agent -c ../conf -f first-processor.properties -n a1 -Dflume.root.logger=INFO,console

5)测试

[root@qianfeng02 ~]# echo "helloworld" | nc qianfeng01 10086

由于k1的优先级是最高的,因此会看到qianfeng02上有数据
模拟自动容灾,使用ctrl+c 杀死qianfeng02上的方案,就会看到qianfeng03上有数据了。

负载均衡

​ 负载均衡Sink 选择器提供了在多个sink上进行负载均衡流量的功能。 它维护一个活动sink列表的索引来实现负载的分配。 默认支持了轮询round_robin)和随机random)两种选择机制分配负载。 默认是轮询,可以通过配置来更改。

==注意==:

如果backoff设置为true则启用了退避机制,失败的sink会被放入黑名单,达到一定的超时时间后会自动从黑名单移除。 如从黑名单出来后sink仍然失败,则再次进入黑名单而且超时时间会翻倍,以避免在无响应的sink上浪费过长时间。 如果没有启用退避机制,在禁用此功能的情况下,发生sink传输失败后,会将本次负载传给下一个sink继续尝试,因此这种情况下是不均衡的。

1)上游方案的编写

[root@qianfeng01 ~]# vim first-processor.conf

#list names
a1.sources = r1
a1.channels = c1
a1.sinks = k1 k2
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
a1.sinks.k2.channel = c1


# source
a1.sources.r1.type = syslogtcp
a1.sources.r1.host = qianfeng01
a1.sources.r1.port = 10086

# channel
a1.channels.c1.type = memory

# sink
a1.sinks.k1.type = avro
a1.sinks.k1.hostname = qianfeng02
a1.sinks.k1.port = 10087

a1.sinks.k2.type = avro
a1.sinks.k2.hostname = qianfeng03
a1.sinks.k2.port = 10088

#设置sink组
a1.sinkgroups = g1
a1.sinkgroups.g1.sinks = k1 k2
a1.sinkgroups.g1.processor.type = load_balance
a1.sinkgroups.g1.processor.backoff = true
a1.sinkgroups.g1.processor.selector = random

2)下游的qianfeng02上的方案

[root@qianfeng02 flumeconf]# vim second-processor.conf

#list names
a1.sources = r1
a1.channels = c1
a1.sinks = k1
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1


# source
a1.sources.r1.type = avro
a1.sources.r1.bind = qianfeng02
a1.sources.r1.port = 10087

# channel
a1.channels.c1.type = memory

# sink
a1.sinks.k1.type = logger

3)下游的qianfeng03上的方案

[root@qianfeng03 flumeconf]# vim third-processor.conf

#list names
a1.sources = r1
a1.channels = c1
a1.sinks = k1
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1


# source
a1.sources.r1.type = avro
a1.sources.r1.bind = qianfeng03
a1.sources.r1.port = 10088

# channel
a1.channels.c1.type = memory

# sink
a1.sinks.k1.type = logger

4)启动

先启动下游的两个方案
[root@qianfeng02 flumeconf]# flume-ng agent -c ../conf -f second-processor.properties -n a1 -Dflume.root.logger=INFO,console
[root@qianfeng03 flumeconf]# flume-ng agent -c ../conf -f third-processor.properties -n a1 -Dflume.root.logger=INFO,console
在启动上游的一个方案
[root@qianfeng03 flumeconf]# flume-ng agent -c ../conf -f first-processor.properties -n a1 -Dflume.root.logger=INFO,console

5)测试

[root@qianfeng01 ~]# echo "helloworld" | nc qianfeng01 10086
[root@qianfeng01 ~]# echo "helloworld" | nc qianfeng01 10086
[root@qianfeng01 ~]# echo "helloworld" | nc qianfeng01 10086

多发几条查看效果.....
发送数据的时候不要特别快,因为一个channel可以容纳多条Event,发送特别快就查看不到效果,所以慢慢发送,最好的操作方式就发送一条数据后等这条数据已经到某一个sink了,在发送下一条即可。

 更多大数据精彩内容欢迎B站搜索“千锋教育”或者扫码领取全套资料

【千锋教育】大数据开发全套教程,史上最全面的大数据学习视频

猜你喜欢

转载自blog.csdn.net/longz_org_cn/article/details/131653179