First, a basic introduction to Flume:
| Component | Description |
| --- | --- |
| Agent | Runs Flume inside a JVM. One agent runs per machine, but an agent can contain multiple sources and sinks. |
| Client | Produces data; runs in a separate thread. |
| Source | Collects data from the Client and passes it to the Channel. |
| Sink | Collects data from the Channel, performs related operations, and runs in a separate thread. |
| Channel | Connects sources and sinks; works somewhat like a queue. |
| Event | The basic unit of the data being transferred. |
Flume ships with a variety of source types. Among them it supports reading from JMS message queues, but it has no built-in support for RabbitMQ, so some secondary development of Flume is required. This post focuses on how to make Flume read data from RabbitMQ.
I found a Flume plugin on GitHub that reads data from RabbitMQ. The download address is: https://github.com/gmr/rabbitmq-flume-plugin. The README there describes the plugin in more detail and is worth a look.
Environment
CentOS 7.3, JDK 1.8, CDH 5.14.0
1. Package the project with mvn; two JAR packages will be generated.
2. Because I installed and integrated Flume through CDH, I put these two JARs under /usr/lib. With a standard (non-CDH) installation, copy the two JARs into the lib directory under the Flume installation directory instead.
3. Open the CDH management page and configure the Agent.
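The build-and-deploy steps above can be sketched as the following commands. The exact artifact names under `target/` depend on the plugin version, and `/usr/lib` is the path used in this CDH setup; adjust both to your environment.

```shell
# Fetch and build the plugin (produces the JARs under target/)
git clone https://github.com/gmr/rabbitmq-flume-plugin.git
cd rabbitmq-flume-plugin
mvn package

# CDH-managed Flume, as used in this post:
cp target/*.jar /usr/lib/

# For a plain Flume installation, use its lib directory instead:
# cp target/*.jar "$FLUME_HOME/lib/"
```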
The detailed configuration is below. It writes the messages directly to a Kafka cluster.
tier1.sources = source1
tier1.channels = channel1
tier1.sinks = sink1
tier1.sources.source1.type = com.aweber.flume.source.rabbitmq.RabbitMQSource
tier1.sources.source1.bind = 127.0.0.1
tier1.sources.source1.port = 5672
tier1.sources.source1.virtual-host = /
tier1.sources.source1.username = guest
tier1.sources.source1.password = guest
tier1.sources.source1.queue = test
tier1.sources.source1.prefetchCount = 10
tier1.sources.source1.channels = channel1
tier1.sources.source1.threads = 2
tier1.sources.source1.interceptors = i1
tier1.sources.source1.interceptors.i1.type = org.apache.flume.interceptor.TimestampInterceptor$Builder
tier1.sources.source1.interceptors.i1.preserveExisting = true
tier1.channels.channel1.type = memory
tier1.sinks.sink1.channel = channel1
tier1.sinks.sink1.type = org.apache.flume.sink.kafka.KafkaSink
tier1.sinks.sink1.topic = flume_out
tier1.sinks.sink1.brokerList = 127.0.0.1:9092,127.0.0.1:9093,127.0.0.1:9094
tier1.sinks.sink1.requiredAcks = 1
tier1.sinks.sink1.batchSize = 20
Once the configuration is complete, save the updated configuration and restart the Agent.
You can then see the received RabbitMQ messages arriving in Kafka.
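To verify the pipeline end to end, you can publish a test message to the `test` queue yourself and watch it arrive on the `flume_out` topic. This sketch assumes the RabbitMQ management plugin is enabled (it provides the `rabbitmqadmin` tool) and uses the guest credentials and broker address from the configuration above; older Kafka versions may need `--zookeeper` instead of `--bootstrap-server` for the console consumer.

```shell
# Publish a test message to the default exchange, routed to the "test" queue
rabbitmqadmin -u guest -p guest publish routing_key=test payload='hello flume'

# Watch the message come out the other end on the Kafka topic
kafka-console-consumer --bootstrap-server 127.0.0.1:9092 \
    --topic flume_out --from-beginning
```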
That's it. If you have any questions about the configuration, leave a comment and I will reply when I see it.