Big Data Real-Time Collection System

For real-time collection we use the Flume framework. As before, we install Flume on Windows.
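1. Download apache-flume-1.8.0-bin.tar.gz from the Apache Flume website (http://flume.apache.org/download.html).
2. Extract it to a directory, for example D:\software\apache-flume-1.8.0-bin.
3. Create a FLUME_HOME environment variable and set it to the Flume installation directory, D:\software\apache-flume-1.8.0-bin.
4. Edit the system Path variable and append %FLUME_HOME%\conf and %FLUME_HOME%\bin.
5. In Flume's conf directory, copy the three template files and rename the copies so that the .template suffix is dropped.
(Only needed if JAVA_HOME is not already configured) Edit flume-env.sh in Flume's conf folder
and set JAVA_HOME to the JDK installation path, for example: export JAVA_HOME=D:\software\java\jdk10
6. Press Win+R, type cmd to open a command prompt, and run flume-ng version to verify the installation.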
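
Step 5 can also be scripted. The following is a minimal sketch, assuming FLUME_HOME is set as in step 3; the class name FlumeTemplateCopier and the method copyTemplates are hypothetical helpers, not part of Flume. It copies every *.template file in the conf directory to a file without the suffix and leaves existing files untouched.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper (not part of Flume): automates step 5 of the installation.
class FlumeTemplateCopier {

    /** Copies each *.template file in confDir to the same name without the suffix.
     *  Returns the sorted names of the files that were created. */
    static List<String> copyTemplates(Path confDir) throws IOException {
        String suffix = ".template";
        List<String> created = new ArrayList<>();
        try (DirectoryStream<Path> templates = Files.newDirectoryStream(confDir, "*" + suffix)) {
            for (Path template : templates) {
                String name = template.getFileName().toString();
                Path target = confDir.resolve(name.substring(0, name.length() - suffix.length()));
                if (Files.notExists(target)) { // never overwrite an existing config file
                    Files.copy(template, target);
                    created.add(target.getFileName().toString());
                }
            }
        }
        Collections.sort(created);
        return created;
    }

    public static void main(String[] args) throws IOException {
        String home = System.getenv("FLUME_HOME");
        if (home == null) {
            System.err.println("FLUME_HOME is not set");
            return;
        }
        System.out.println("Created: " + copyTemplates(Paths.get(home, "conf")));
    }
}
```

Running it a second time creates nothing, because the target files already exist.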

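Test program that writes one log entry every two seconds (the class name FlumeLogDemo is illustrative):

import org.apache.log4j.Logger;

public class FlumeLogDemo {
    private static final Logger logger = Logger.getLogger(FlumeLogDemo.class);
    public static void main(String[] args) throws Exception {
        // Log one event every two seconds; the flume appender forwards it to the agent
        while (true) {
            logger.info("hello world:" + System.currentTimeMillis());
            Thread.sleep(2000);
        }
    }
}

log4j.properties configuration:

### set log levels ###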
log4j.rootLogger=INFO, stdout, file, flume
log4j.logger.per.flume=INFO

### flume ###
log4j.appender.flume=org.apache.flume.clients.log4jappender.Log4jAppender
log4j.appender.flume.layout=org.apache.log4j.PatternLayout
log4j.appender.flume.Hostname=localhost
log4j.appender.flume.Port=44444

### stdout ###
log4j.appender.stdout=org.apache.log4j.ConsoleAppender
log4j.appender.stdout.Threshold=INFO
log4j.appender.stdout.Target=System.out
log4j.appender.stdout.layout=org.apache.log4j.PatternLayout
log4j.appender.stdout.layout.ConversionPattern=%d{yyyy-MM-dd HH:mm:ss} %c{1} [%p] %m%n

### file ###
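log4j.appender.file=org.apache.log4j.DailyRollingFileAppender
# The File option must be set for this appender to write; logs/app.log is an example path (assumption)
log4j.appender.file.File=logs/app.log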
log4j.appender.file.Threshold=INFO
log4j.appender.file.Append=true
log4j.appender.file.DatePattern='.'yyyy-MM-dd
log4j.appender.file.layout=org.apache.log4j.PatternLayout
log4j.appender.file.layout.ConversionPattern=%d{yyyy-MM-dd HH:mm:ss} %c{1} [%p] %m%n
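
In log4j.rootLogger, the first value is the log level and every following value is an appender name that needs a matching log4j.appender.<name> entry. The sketch below illustrates that relationship with plain java.util.Properties; Log4jAppenderCheck and undefinedAppenders are hypothetical names, and the sample configuration in main is deliberately incomplete so that the check has something to report.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

// Hypothetical helper: checks that every appender named in rootLogger is defined.
class Log4jAppenderCheck {

    /** Returns the appender names listed in log4j.rootLogger that have
     *  no log4j.appender.<name> definition. */
    static List<String> undefinedAppenders(Properties props) {
        String[] parts = props.getProperty("log4j.rootLogger", "").split(",");
        List<String> missing = new ArrayList<>();
        // parts[0] is the level (for example INFO); the rest are appender names
        for (int i = 1; i < parts.length; i++) {
            String name = parts[i].trim();
            if (!name.isEmpty() && props.getProperty("log4j.appender." + name) == null) {
                missing.add(name);
            }
        }
        return missing;
    }

    public static void main(String[] args) throws IOException {
        // Sample configuration in which the "file" appender is intentionally left out
        String config = String.join("\n",
                "log4j.rootLogger=INFO, stdout, file, flume",
                "log4j.appender.flume=org.apache.flume.clients.log4jappender.Log4jAppender",
                "log4j.appender.stdout=org.apache.log4j.ConsoleAppender");
        Properties props = new Properties();
        props.load(new StringReader(config));
        System.out.println("Undefined appenders: " + undefinedAppenders(props));
    }
}
```

With the sample configuration in main, the program reports file as the only undefined appender.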

Start the agent from the Flume installation directory:

flume-ng agent --conf conf --conf-file conf/example-hdfs.conf --name a1

Contents of the example-hdfs.conf configuration file in the conf directory:

a1.sources=source1  
a1.channels=channel1  
a1.sinks=sink1  

a1.sources.source1.type=avro  
a1.sources.source1.bind=0.0.0.0  
a1.sources.source1.port=44444  
a1.sources.source1.channels=channel1  

a1.channels.channel1.type=memory  
a1.channels.channel1.capacity=10000  
a1.channels.channel1.transactionCapacity=1000  
a1.channels.channel1.keep-alive=30  

a1.sinks.sink1.type=hdfs  
a1.sinks.sink1.channel=channel1  
# HDFS directory for the Flume logs; create it first if it does not exist yet
a1.sinks.sink1.hdfs.path=hdfs://localhost:9000/flume
a1.sinks.sink1.hdfs.fileType=DataStream  
a1.sinks.sink1.hdfs.writeFormat=Text  
a1.sinks.sink1.hdfs.rollInterval=0  
a1.sinks.sink1.hdfs.rollSize=10240  
a1.sinks.sink1.hdfs.rollCount=0  
a1.sinks.sink1.hdfs.idleTimeout=60 
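
With rollInterval=0 and rollCount=0, the time and event-count triggers are disabled, so the sink starts a new HDFS file only once about 10240 bytes have been written; idleTimeout=60 additionally closes a file that has received no data for 60 seconds. The roll decision can be modeled roughly as below. This is a simplified sketch of the documented behavior, not Flume's actual code, and HdfsRollPolicy is a hypothetical class.

```java
// Simplified model of the HDFS sink roll triggers (hypothetical class, not Flume code).
class HdfsRollPolicy {
    private final long rollIntervalSec; // hdfs.rollInterval, 0 disables the time trigger
    private final long rollSizeBytes;   // hdfs.rollSize, 0 disables the size trigger
    private final long rollCount;       // hdfs.rollCount, 0 disables the event-count trigger

    HdfsRollPolicy(long rollIntervalSec, long rollSizeBytes, long rollCount) {
        this.rollIntervalSec = rollIntervalSec;
        this.rollSizeBytes = rollSizeBytes;
        this.rollCount = rollCount;
    }

    /** Returns true when any enabled trigger has reached its threshold. */
    boolean shouldRoll(long elapsedSec, long bytesWritten, long eventsWritten) {
        boolean byTime = rollIntervalSec > 0 && elapsedSec >= rollIntervalSec;
        boolean bySize = rollSizeBytes > 0 && bytesWritten >= rollSizeBytes;
        boolean byCount = rollCount > 0 && eventsWritten >= rollCount;
        return byTime || bySize || byCount;
    }

    public static void main(String[] args) {
        // Values taken from the sink1 configuration above
        HdfsRollPolicy policy = new HdfsRollPolicy(0, 10240, 0);
        // An hour has passed and many events were written, but the file is still small
        System.out.println(policy.shouldRoll(3600, 5000, 100000));
        // The size threshold has been reached
        System.out.println(policy.shouldRoll(1, 10240, 1));
    }
}
```

The first call prints false and the second prints true, since only the size trigger is active in this configuration.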

The source code is available for study at: http://47.98.237.162/detail/1/170


Reposted from blog.csdn.net/sinat_15153911/article/details/81462657