hbase hdfs sink

bin/flume-ng agent --conf conf --conf-file conf/hbase.conf --name a1 -Dflume.root.logger=INFO,console
 
 
# example.conf: A single-node Flume configuration
 
# Name the components on this agent
a1.sources = r1
a1.sinks = k1
a1.channels = c1
 
# Describe/configure the source
a1.sources.r1.type = netcat
a1.sources.r1.bind = localhost
a1.sources.r1.port = 12345
 
# Describe the sink
a1.sinks.k1.type = logger
 
# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100
 
# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
 
#HDFS sink
a1.channels = c1
a1.sinks = k1
a1.sinks.k1.type = hdfs
a1.sinks.k1.channel = c1
a1.sinks.k1.hdfs.path = /flume/%y-%m-%d/%H
a1.sinks.k1.hdfs.filePrefix = events-
a1.sinks.k1.hdfs.round = true
a1.sinks.k1.hdfs.roundValue = 10
a1.sinks.k1.hdfs.roundUnit = minute
a1.sinks.k1.hdfs.useLocalTimeStamp = true #sink is hdfs, then use directory auto-generate function. There is an error like the title. According to the official website documentation, there needs to be a timestamp at the beginning of each file record line, but the format of the timestamp may be difficult to adjust, so you can also set   the hdfs.useLocalTimeStamp parameter, such as every hour As a folder, then the configuration should look like this
## Solving the error:

java.lang.NullPointerException: Expected timestamp in the Flume event headers, but it was null

#HBASE
a1.channels = c1
a1.sinks = k1
a1.sinks.k1.type = hbase
a1.sinks.k1.table = flume
a1.sinks.k1.columnFamily = f1
a1.sinks.k1.serializer = org.apache.flume.sink.hbase.RegexHbaseEventSerializer
a1.sinks.k1.channel = c1
 
 
 
The way to read the data channel:
netcat
a1.sources.r1.type = netcat
a1.sources.r1.bind = localhost
a1.sources.r1.port = 12345
Pass data according to the port connection
telnet localhost 233333
 
euro
agent1.sources.source1.type = avro
agent1.sources.source1.bind = localhost
agent1.sources.source1.port = 44444
Handling serialized data
 
and check
a1.sources = r1
a1.channels = c1
a1.sources.r1.type = exec
a1.sources.r1.command = tail -F /var/log/secure
a1.sources.r1.channels = c1
Process the command line
 
test port
netstat -tnl | grep 23
tcp        0      0 0.0.0.0:36232               0.0.0.0:*                   LISTEN     
tcp        0      0 :::23                       :::*                        LISTEN     
 
access port
telnet localhost 23
 

View port tasks

ps -ef | grep 23
 
 

View port occupancy status

lsof -i:23
 
 
 
# example.conf: A single-node Flume configuration
 
# Name the components on this agent
a1.sources = r1
a1.sinks = k1
a1.channels = c1
 
# Describe/configure the source
a1.sources.r1.type = netcat
a1.sources.r1.bind = localhost
a1.sources.r1.port = 12345
 
# Describe the sink
a1.sinks.k1.type = logger
 
# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100
 
# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
 
#HDFS sink
a1.channels = c1
a1.sinks = k1
a1.sinks.k1.type = hdfs
a1.sinks.k1.channel = c1
a1.sinks.k1.hdfs.path = /flume/%y-%m-%d/%H
a1.sinks.k1.hdfs.filePrefix = events-
a1.sinks.k1.hdfs.round = true
a1.sinks.k1.hdfs.roundValue = 10
a1.sinks.k1.hdfs.roundUnit = minute
a1.sinks.k1.hdfs.useLocalTimeStamp = true
 
#HBASE
a1.channels = c1
a1.sinks = k1
a1.sinks.k1.type = hbase
a1.sinks.k1.table = flume
a1.sinks.k1.columnFamily = f1
a1.sinks.k1.serializer = org.apache.flume.sink.hbase.RegexHbaseEventSerializer
a1.sinks.k1.channel = c1

Guess you like

Origin http://43.154.161.224:23101/article/api/json?id=326117039&siteId=291194637