Big Data Operations and Maintenance: Flume

Flume questions:
1. On the master node where the Flume component is installed and started, run the flume-ng help command in the Linux shell to view the usage information for flume-ng.
[root@master ~]# flume-ng help
Usage: /usr/hdp/2.6.1.0-129/flume/bin/flume-ng.distro <command> [options]...

commands:
  help                      display this help text
  agent                     run a Flume agent
  avro-client               run an avro Flume client
  password                  create a password file for use in flume config
  version                   show Flume version info

global options:
  --conf,-c <conf>          use configs in <conf> directory
  --classpath,-C <cp>       append to the classpath
  --dryrun,-d               do not actually start Flume, just print the command
  --plugins-path <dirs>     colon-separated list of plugins.d directories. See the
                            plugins.d section in the user guide for more details.
                            Default: $FLUME_HOME/plugins.d
  -Dproperty=value          sets a Java system property value
  -Xproperty=value          sets a Java -X option

agent options:
  --conf-file,-f <file>     specify a config file (required)
  --name,-n <name>          the name of this agent (required)
  --help,-h                 display help text

avro-client options:
  --rpcProps,-P <file>      RPC client properties file with server connection params
  --host,-H <host>          hostname to which events will be sent
  --port,-p <port>          port of the avro source
  --dirname <dir>           directory to stream to avro source
  --filename,-F <file>      text file to stream to avro source (default: std input)
  --headerFile,-R <file>    File containing event headers as key/value pairs on each new line
  --help,-h                 display help text

  Either --rpcProps or both --host and --port must be specified.

password options:
  --outfile                 The file in which encoded password is stored

Note that if <conf> directory is specified, then it is always included first
in the classpath.
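
Before starting an agent, the --dryrun option listed above can be used to print the java command that would be launched without actually starting Flume. A quick sanity check might look like the following (the config file path here is only illustrative):
[root@master ~]# flume-ng agent --dryrun -c . -f /opt/log-example.conf -n a1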

2. Using the provided template file log-example.conf, use the Flume NG tool to collect the master node's system log /var/log/secure, prefix the names of the collected log files with "xiandian-sec", and store them in the HDFS file system under the /1daoyun/file/flume directory, with the files generated in HDFS timestamped and rolled at 10-minute intervals. After collection, list the contents of /1daoyun/file/flume in HDFS.
[root@master ~]# flume-ng agent -c . -f /opt/log-example.conf -n a1 -Dflume.root.logger=INFO,console
[root@master ~]# hadoop fs -ls /1daoyun/file/flume
Found 19 items
-rw-r--r--   3 root root       1172 2019-05-23 13:03 /1daoyun/file/flume/xiandian-sec.1558616567923
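To confirm that events were actually collected, a generated file can be inspected directly; for example (the timestamped file name will differ on every run):
[root@master ~]# hadoop fs -cat /1daoyun/file/flume/xiandian-sec.1558616567923 | head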
[root@master ~]# cat /opt/log-example.conf
# example.conf: A single-node Flume configuration
# Name the components on this agent
a1.sources = r1
a1.sinks = k1
a1.channels = c1
# Describe/configure the source
a1.sources.r1.type = exec
a1.sources.r1.command = tail -F /var/log/secure
a1.sources.r1.channels = c1
# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
# Describe the sink
a1.sinks.k1.type = hdfs
a1.sinks.k1.channel = c1
a1.sinks.k1.hdfs.path = hdfs://master:8020/1daoyun/file/flume
a1.sinks.k1.hdfs.filePrefix = xiandian-sec
a1.sinks.k1.hdfs.round = true
a1.sinks.k1.hdfs.roundValue = 10
a1.sinks.k1.hdfs.roundUnit = minute
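
Note that hdfs.round, roundValue and roundUnit only affect the time escape sequences used in hdfs.path or hdfs.filePrefix. As a minimal sketch (not part of the provided template), writing files into explicit 10-minute directories would also require a time escape in the path and a timestamp source such as hdfs.useLocalTimeStamp:
a1.sinks.k1.hdfs.path = hdfs://master:8020/1daoyun/file/flume/%Y%m%d/%H%M
a1.sinks.k1.hdfs.useLocalTimeStamp = true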

3. Using the provided template file hdfs-example.conf, use the Flume NG tool to upload files from the system path /opt/xiandian/ on the master node to the HDFS file system in real time. Set the HDFS storage path to /data/flume/, keep the original file names unchanged after upload, set the file type to DataStream, and then start the flume-ng agent.
[root@master ~]# flume-ng agent -c . -f /opt/hdfs-example.conf -n master -Dflume.root.logger=INFO,console
[root@master ~]# hadoop fs -ls -R /data/flume
drwxr-xr-x   - root root          0 2019-05-23 13:33 /data/flume/opt
drwxr-xr-x   - root root          0 2019-05-23 13:33 /data/flume/opt/xiandian
-rw-r--r--   3 root root        665 2019-05-23 13:33 /data/flume/opt/xiandian/log-example.conf.1558618396952
[root@master ~]# cat /opt/hdfs-example.conf
# example.conf: A single-node Flume configuration
# Name the components on this agent
master.sources = webmagic
master.sinks = k1
master.channels = c1
# Describe/configure the source
master.sources.webmagic.type = spooldir
master.sources.webmagic.fileHeader = true
master.sources.webmagic.fileHeaderKey = fileName
master.sources.webmagic.fileSuffix = .COMPLETED
master.sources.webmagic.deletePolicy = never
master.sources.webmagic.spoolDir = /opt/xiandian/
master.sources.webmagic.ignorePattern = ^$
master.sources.webmagic.consumeOrder = oldest
master.sources.webmagic.deserializer = org.apache.flume.sink.solr.morphline.BlobDeserializer$Builder
master.sources.webmagic.batchSize = 5
master.sources.webmagic.channels = c1
# Use a channel which buffers events in memory
master.channels.c1.type = memory
# Describe the sink
master.sinks.k1.type = hdfs
master.sinks.k1.channel = c1
master.sinks.k1.hdfs.path = hdfs://master:8020/data/flume/%{dicName}
master.sinks.k1.hdfs.filePrefix = %{fileName}
master.sinks.k1.hdfs.fileType = DataStream
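
With this agent running, any file copied into /opt/xiandian/ should be renamed locally with the .COMPLETED suffix and uploaded under /data/flume/ in HDFS. An illustrative test (the file chosen here is arbitrary) might be:
[root@master ~]# cp /etc/hosts /opt/xiandian/
[root@master ~]# hadoop fs -ls -R /data/flume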

