CDH组件的使用-Flume| Kafka| Oozie基于Hue的任务调度

日志采集Flume配置

 1)Flume配置分析

  

Flume直接读log日志的数据,log日志的格式是app-yyyy-mm-dd.log。

 2)Flume的具体配置如下:

    (1)在CM管理页面上点击Flume

在实例页面选择hadoop101上的Agent

  2)在CM管理页面hadoop101上Flume的配置中找到代理名称改为a1

 

  3)在配置文件如下内容(flume-kafka)

a1.sources=r1
a1.channels=c1 c2 
a1.sinks=k1 k2 

# configure source
a1.sources.r1.type = TAILDIR
a1.sources.r1.positionFile = /opt/module/flume/log_position.json
a1.sources.r1.filegroups = f1
a1.sources.r1.filegroups.f1 = /tmp/logs/app.+
a1.sources.r1.fileHeader = true
a1.sources.r1.channels = c1 c2

#interceptor
a1.sources.r1.interceptors = i1 i2
a1.sources.r1.interceptors.i1.type = com.atguigu.flume.interceptor.LogETLInterceptor$Builder
a1.sources.r1.interceptors.i2.type = com.atguigu.flume.interceptor.LogTypeInterceptor$Builder

# selector
a1.sources.r1.selector.type = multiplexing
a1.sources.r1.selector.header = topic
a1.sources.r1.selector.mapping.topic_start = c1
a1.sources.r1.selector.mapping.topic_event = c2

# configure channel
a1.channels.c1.type = memory
a1.channels.c1.capacity=10000
a1.channels.c1.byteCapacityBufferPercentage=20

a1.channels.c2.type = memory
a1.channels.c2.capacity=10000
a1.channels.c2.byteCapacityBufferPercentage=20

# configure sink
# start-sink
a1.sinks.k1.type = org.apache.flume.sink.kafka.KafkaSink
a1.sinks.k1.kafka.topic = topic_start
a1.sinks.k1.kafka.bootstrap.servers = hadoop101:9092,hadoop102:9092,hadoop103:9092
a1.sinks.k1.kafka.flumeBatchSize = 2000
a1.sinks.k1.kafka.producer.acks = 1
a1.sinks.k1.channel = c1

# event-sink
a1.sinks.k2.type = org.apache.flume.sink.kafka.KafkaSink
a1.sinks.k2.kafka.topic = topic_event
a1.sinks.k2.kafka.bootstrap.servers = hadoop101:9092,hadoop102:9092,hadoop103:9092
a1.sinks.k2.kafka.flumeBatchSize = 2000
a1.sinks.k2.kafka.producer.acks = 1
a1.sinks.k2.channel = c2
View Code

注意:com.xxx.flume.interceptor.LogETLInterceptor和com.xxx.flume.interceptor.LogTypeInterceptor是自定义的拦截器的全类名。需要根据用户自定义的拦截器做相应修改。

 4)修改/opt/module/flume/log_position.json的读写权限

     注意:Json文件的父目录一定要创建好,并改好权限

[root@hadoop101 ~]# mkdir -p /opt/module/flume
[root@hadoop101 ~]# cd /opt/module/flume/
[root@hadoop101 flume]# touch log_position.json
[root@hadoop101 flume]# ll
总用量 0
-rw-r--r-- 1 root root 0 4月   5 20:05 log_position.json
[root@hadoop101 flume]# chmod 777 log_position.json
[root@hadoop101 flume]# ll
总用量 0
-rwxrwxrwx 1 root root 0 4月   5 20:05 log_position.json
[root@hadoop101 flume]# xsync /opt/module/flume/
fname=flume
pdir=/opt/module
------------------- hadoop102 --------------
sending incremental file list
flume/
flume/log_position.json

sent 108 bytes  received 35 bytes  286.00 bytes/sec
total size is 0  speedup is 0.00
------------------- hadoop103 --------------
sending incremental file list
flume/
flume/log_position.json

sent 108 bytes  received 35 bytes  286.00 bytes/sec
total size is 0  speedup is 0.00

Flume拦截器

本项目中自定义了两个拦截器,分别是:ETL拦截器、日志类型区分拦截器。

ETL拦截器主要用于,过滤时间戳不合法和Json数据不完整的日志

日志类型区分拦截器主要用于,将启动日志和事件日志区分开来,方便发往Kafka的不同Topic。

打包,拦截器打包之后,只需要单独包,不需要将依赖的包上传。打包之后要放入flume的lib文件夹下面。

注意:为什么不需要依赖包?因为依赖包在flume的lib目录下面已经存在了。

  采用root用户将flume-interceptor-1.0-SNAPSHOT.jar包放入到hadoop101的/opt/cloudera/parcels/CDH-5.12.1-1.cdh5.12.1.p0.3/lib/flume-ng/lib/文件夹下面

[root@hadoop101 flume]# cd /opt/cloudera/parcels/CDH-5.12.1-1.cdh5.12.1.p0.3/lib/flume-ng/lib/
[root@hadoop101 lib]# ll
总用量 608
lrwxrwxrwx 1 root root     41 4月   5 18:31 apache-log4j-extras-1.1.jar -> ../../../jars/apache-log4j-extras-1.1.jar
lrwxrwxrwx 1 root root     29 4月   5 18:31 async-1.4.0.jar -> ../../../jars/async-1.4.0.jar
lrwxrwxrwx 1 root root     34 4月   5 18:31 asynchbase-1.7.0.jar -> ../../../jars/asynchbase-1.7.0.jar
lrwxrwxrwx 1 root root     30 4月   5 18:31 avro-ipc.jar -> ../../../lib/avro/avro-ipc.jar


[root@hadoop101 lib]# ls | grep interceptor
flume-interceptor-1.0-SNAPSHOT.jar

分发Flume到hadoop102

[root@hadoop101 lib]$ xsync flume-interceptor-1.0-SNAPSHOT.jar

Flume的具体配置如下:

1)在CM管理页面hadoop104上Flume的配置中找到代理名称

 

 

日志生成数据传输到HDFS
1)将log-collector-1.0-SNAPSHOT-jar-with-dependencies.jar上传都hadoop102的/opt/module目录
2)分发log-collector-1.0-SNAPSHOT-jar-with-dependencies.jar到hadoop103
[root@hadoop102 module]# xsync log-collector-1.0-SNAPSHOT-jar-with-dependencies.jar
3)在/root/bin目录下创建脚本lg.sh
[root@hadoop102 bin]$ vim lg.sh
    4)在脚本中编写如下内容
#! /bin/bash

    for i in hadoop102 hadoop103 
    do
        ssh $i "java -classpath /opt/module/log-collector-1.0-SNAPSHOT-jar-with-dependencies.jar com.atguigu.appclient.AppMain $1 $2 >/opt/module/test.log &"
    done
5)修改脚本执行权限
[root@hadoop102 bin]$ chmod 777 lg.sh
6)启动脚本
[root@hadoop102 module]$ lg.sh 

Kafka的使用

[root@hadoop101 cloudera]# ll
总用量 16
drwxr-xr-x 2 cloudera-scm cloudera-scm 4096 4月   5 20:26 csd
drwxr-xr-x 2 root         root         4096 4月   5 20:39 parcel-cache
drwxr-xr-x 2 cloudera-scm cloudera-scm 4096 4月   5 20:38 parcel-repo
drwxr-xr-x 6 cloudera-scm cloudera-scm 4096 4月   5 20:40 parcels
[root@hadoop101 cloudera]# cd parcels/
[root@hadoop101 parcels]# ll
总用量 12
lrwxrwxrwx  1 root root   27 4月   5 19:25 CDH -> CDH-5.12.1-1.cdh5.12.1.p0.3
drwxr-xr-x 11 root root 4096 8月  25 2017 CDH-5.12.1-1.cdh5.12.1.p0.3
drwxr-xr-x  4 root root 4096 12月 19 2014 HADOOP_LZO-0.4.15-1.gplextras.p0.123
lrwxrwxrwx  1 root root   25 4月   5 20:40 KAFKA -> KAFKA-3.0.0-1.3.0.0.p0.40
drwxr-xr-x  6 root root 4096 10月  6 2017 KAFKA-3.0.0-1.3.0.0.p0.40
[root@hadoop101 parcels]# cd KAFKA
[root@hadoop101 KAFKA]# ll
总用量 16
drwxr-xr-x 2 root root 4096 10月  6 2017 bin
drwxr-xr-x 5 root root 4096 10月  6 2017 etc
drwxr-xr-x 3 root root 4096 10月  6 2017 lib
drwxr-xr-x 2 root root 4096 10月  6 2017 meta
[root@hadoop101 KAFKA]# /opt/cloudera/parcels/KAFKA/bin/kafka-topics --zookeeper hadoop101:2181 --list

创建 Kafka Topic

进入到/opt/cloudera/parcels/KAFKA目录下分别创建:启动日志主题、事件日志主题。

1)创建启动日志主题

[root@hadoop101 KAFKA]# ll
总用量 16
drwxr-xr-x 2 root root 4096 10月  6 2017 bin
drwxr-xr-x 5 root root 4096 10月  6 2017 etc
drwxr-xr-x 3 root root 4096 10月  6 2017 lib
drwxr-xr-x 2 root root 4096 10月  6 2017 meta
[root@hadoop101 KAFKA]# bin/kafka-topics --zookeeper hadoop101:2181,hadoop102:2181,hadoop103:2181  --create --replication-factor 1 --partitions 1 --topic topic_start
[root@hadoop101 KAFKA]# bin/kafka-topics --zookeeper hadoop101:2181,hadoop102:2181,hadoop103:2181  --create --replication-factor 1 --partitions 1 --topic topic_event

 测试Kafka 生产者:

[root@hadoop101 KAFKA]# bin/kafka-console-producer --broker-list hadoop101:9092 --topic test
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/lib/kafka/libs/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/lib/kafka/libs/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
20/04/05 21:00:45 INFO producer.ProducerConfig: ProducerConfig values: 
    acks = 1
    batch.size = 16384
    bootstrap.servers = [hadoop101:9092]
    buffer.memory = 33554432
    client.id = console-producer
    compression.type = none
    connections.max.idle.ms = 540000
    enable.idempotence = false
    interceptor.classes = null
    key.serializer = class org.apache.kafka.common.serialization.ByteArraySerializer
    linger.ms = 1000
    max.block.ms = 60000
    max.in.flight.requests.per.connection = 5
    max.request.size = 1048576
    metadata.max.age.ms = 300000
    metric.reporters = []
    metrics.num.samples = 2
    metrics.recording.level = INFO
    metrics.sample.window.ms = 30000
    partitioner.class = class org.apache.kafka.clients.producer.internals.DefaultPartitioner
    receive.buffer.bytes = 32768
    reconnect.backoff.max.ms = 1000
    reconnect.backoff.ms = 50
    request.timeout.ms = 1500
    retries = 3
    retry.backoff.ms = 100
    sasl.jaas.config = null
    sasl.kerberos.kinit.cmd = /usr/bin/kinit
    sasl.kerberos.min.time.before.relogin = 60000
    sasl.kerberos.service.name = null
    sasl.kerberos.ticket.renew.jitter = 0.05
    sasl.kerberos.ticket.renew.window.factor = 0.8
    sasl.mechanism = GSSAPI
    security.protocol = PLAINTEXT
    send.buffer.bytes = 102400
    ssl.cipher.suites = null
    ssl.enabled.protocols = [TLSv1.2, TLSv1.1, TLSv1]
    ssl.endpoint.identification.algorithm = null
    ssl.key.password = null
    ssl.keymanager.algorithm = SunX509
    ssl.keystore.location = null
    ssl.keystore.password = null
    ssl.keystore.type = JKS
    ssl.protocol = TLS
    ssl.provider = null
    ssl.secure.random.implementation = null
    ssl.trustmanager.algorithm = PKIX
    ssl.truststore.location = null
    ssl.truststore.password = null
    ssl.truststore.type = JKS
    transaction.timeout.ms = 60000
    transactional.id = null
    value.serializer = class org.apache.kafka.common.serialization.ByteArraySerializer

20/04/05 21:00:45 INFO utils.AppInfoParser: Kafka version : 0.11.0-kafka-3.0.0
20/04/05 21:00:45 INFO utils.AppInfoParser: Kafka commitId : unknown
>Hello World
>Hello kris
View Code

Kafka消费者:

[root@hadoop101 KAFKA]# bin/kafka-console-consumer --bootstrap-server hadoop101:9092 --from-beginning --topic test
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/lib/kafka/libs/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/lib/kafka/libs/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
20/04/05 21:00:54 INFO consumer.ConsumerConfig: ConsumerConfig values: 
    auto.commit.interval.ms = 5000
    auto.offset.reset = earliest
    bootstrap.servers = [hadoop101:9092]
    check.crcs = true
    client.id = 
    connections.max.idle.ms = 540000
    enable.auto.commit = true
    exclude.internal.topics = true
    fetch.max.bytes = 52428800
    fetch.max.wait.ms = 500
    fetch.min.bytes = 1
    group.id = console-consumer-53066
    heartbeat.interval.ms = 3000
    interceptor.classes = null
    internal.leave.group.on.close = true
    isolation.level = read_uncommitted
    key.deserializer = class org.apache.kafka.common.serialization.ByteArrayDeserializer
    max.partition.fetch.bytes = 1048576
    max.poll.interval.ms = 300000
    max.poll.records = 500
    metadata.max.age.ms = 300000
    metric.reporters = []
    metrics.num.samples = 2
    metrics.recording.level = INFO
    metrics.sample.window.ms = 30000
    partition.assignment.strategy = [class org.apache.kafka.clients.consumer.RangeAssignor]
    receive.buffer.bytes = 65536
    reconnect.backoff.max.ms = 1000
    reconnect.backoff.ms = 50
    request.timeout.ms = 305000
    retry.backoff.ms = 100
    sasl.jaas.config = null
    sasl.kerberos.kinit.cmd = /usr/bin/kinit
    sasl.kerberos.min.time.before.relogin = 60000
    sasl.kerberos.service.name = null
    sasl.kerberos.ticket.renew.jitter = 0.05
    sasl.kerberos.ticket.renew.window.factor = 0.8
    sasl.mechanism = GSSAPI
    security.protocol = PLAINTEXT
    send.buffer.bytes = 131072
    session.timeout.ms = 10000
    ssl.cipher.suites = null
    ssl.enabled.protocols = [TLSv1.2, TLSv1.1, TLSv1]
    ssl.endpoint.identification.algorithm = null
    ssl.key.password = null
    ssl.keymanager.algorithm = SunX509
    ssl.keystore.location = null
    ssl.keystore.password = null
    ssl.keystore.type = JKS
    ssl.protocol = TLS
    ssl.provider = null
    ssl.secure.random.implementation = null
    ssl.trustmanager.algorithm = PKIX
    ssl.truststore.location = null
    ssl.truststore.password = null
    ssl.truststore.type = JKS
    value.deserializer = class org.apache.kafka.common.serialization.ByteArrayDeserializer

20/04/05 21:00:54 INFO utils.AppInfoParser: Kafka version : 0.11.0-kafka-3.0.0
20/04/05 21:00:54 INFO utils.AppInfoParser: Kafka commitId : unknown
20/04/05 21:00:55 INFO internals.AbstractCoordinator: Discovered coordinator hadoop101:9092 (id: 2147483594 rack: null) for group console-consumer-53066.
20/04/05 21:00:55 INFO internals.ConsumerCoordinator: Revoking previously assigned partitions [] for group console-consumer-53066
20/04/05 21:00:55 INFO internals.AbstractCoordinator: (Re-)joining group console-consumer-53066
20/04/05 21:00:58 INFO internals.AbstractCoordinator: Successfully joined group console-consumer-53066 with generation 1
20/04/05 21:00:58 INFO internals.ConsumerCoordinator: Setting newly assigned partitions [test-0] for group console-consumer-53066
Hello World
Hello kris
View Code

查看某个Topic的详情

[root@hadoop101 KAFKA]# bin/kafka-topics --zookeeper hadoop101:2181  --describe --topic test
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/lib/kafka/libs/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/lib/kafka/libs/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
20/04/05 21:02:39 INFO zkclient.ZkEventThread: Starting ZkClient event thread.
20/04/05 21:02:39 INFO zookeeper.ZooKeeper: Client environment:zookeeper.version=3.4.10-39d3a4f269333c922ed3db283be479f9deacaa0f, built on 03/23/2017 10:13 GMT
20/04/05 21:02:39 INFO zookeeper.ZooKeeper: Client environment:host.name=hadoop101
20/04/05 21:02:39 INFO zookeeper.ZooKeeper: Client environment:java.version=1.8.0_144
20/04/05 21:02:39 INFO zookeeper.ZooKeeper: Client environment:java.vendor=Oracle Corporation
20/04/05 21:02:39 INFO zookeeper.ZooKeeper: Client environment:java.home=/opt/module/jdk1.8.0_144/jre
20/04/05 21:02:39 INFO zookeeper.ZooKeeper: Client environment:java.class.path=:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/activation-1.1.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/aopalliance-1.0.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/aopalliance-repackaged-2.5.0-b05.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/apacheds-i18n-2.0.0-M15.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/apacheds-kerberos-codec-2.0.0-M15.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/api-asn1-api-1.0.0-M20.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/api-util-1.0.0-M20.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/argparse4j-0.7.0.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/asm-3.1.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/avro-1.7.6-cdh5.11.0.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/cglib-2.2.1-v20090111.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/commons-beanutils-1.8.3.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/commons-beanutils-core-1.8.0.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/commons-cli-1.2.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/commons-codec-1.9.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/commons-collections-3.2.2.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/commons-compress-1.4.1.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/commons-configuration-1.6.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/commons-digester-1.8.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/commons-el-1.0.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/commons-httpclient-3.1.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/commons-io-2.4.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/commons-lang-2.6.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/commons-lang3-3.5.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/commons-logging-1.2.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/commons-math3-3.1.1.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/commons-net-3.1.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/commons-pool2-2.2.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/connect-api-0.11.0-kafka-3.0.0.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/connect-file-0.11.0-kafka-3.0.0.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/connect-json-0.11.0-kafka-3.0.0.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/connect-runtime-0.11.0-kafka-3.0.0.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/connect-transforms-0.11.0-kafka-3.0.0.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/gson-2.2.4.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/guava-20.0.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/guice-3.0.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/guice-servlet-3.0.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/hadoop-annotations-2.6.0-cdh5.11.0.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/hadoop-auth-2.6.0-cdh5.11.0.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/hadoop-common-2.6.0-cdh5.11.0.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/hadoop-mapreduce-client-common-2.6.0-cdh5.11.0.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/hadoop-mapreduce-client-core-2.6.0-cdh5.11.0.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/hadoop-mapreduce-client-jobclient-2.6.0-cdh5.11.0.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/hadoop-mapreduce-client-shuffle-2.6.0-cdh5.11.0.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/hadoop-yarn-api-2.6.0-cdh5.11.0.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/hadoop-yarn-client-2.6.0-cdh5.11.0.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/hadoop-yarn-common-2.6.0-cdh5.11.0.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/hadoop-yarn-server-common-2.6.0-cdh5.11.0.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/hadoop-yarn-server-nodemanager-2.6.0-cdh5.11.0.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/hk2-api-2.5.0-b05.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/hk2-locator-2.5.0-b05.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/hk2-utils-2.5.0-b05.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/htrace-core4-4.0.1-incubating.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/httpclient-4.4.1.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/httpcore-4.4.1.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/jackson-annotations-2.8.5.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/jackson-core-2.8.5.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/jackson-core-asl-1.8.8.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/jackson-databind-2.8.5.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/jackson-jaxrs-1.8.8.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/jackson-jaxrs-base-2.8.5.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/jackson-jaxrs-json-provider-2.8.5.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/jackson-mapper-asl-1.8.8.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/jackson-module-jaxb-annotations-2.8.5.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/jackson-xc-1.8.8.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/jasper-compiler-5.5.23.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/jasper-runtime-5.5.23.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/javassist-3.20.0-GA.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/javassist-3.21.0-GA.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/javax.annotation-api-1.2.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/javax.inject-1.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/javax.inject-2.5.0-b05.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/java-xmlbuilder-0.4.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/javax.servlet-api-3.1.0.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/javax.ws.rs-api-2.0.1.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/jaxb-api-2.2.2.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/jersey-client-2.24.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/jersey-common-2.24.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/jersey-container-servlet-2.24.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/jersey-container-servlet-core-2.24.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/jersey-guava-2.24.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/jersey-guice-1.9.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/jersey-media-jaxb-2.24.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/jersey-server-2.24.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/jets3t-0.9.0.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/jettison-1.1.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/jetty-6.1.26.cloudera.4.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/jetty-continuation-9.2.15.v20160210.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/jetty-http-9.2.15.v20160210.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/jetty-io-9.2.15.v20160210.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/jetty-security-9.2.15.v20160210.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/jetty-server-9.2.15.v20160210.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/jetty-servlet-9.2.15.v20160210.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/jetty-servlets-9.2.15.v20160210.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/jetty-util-6.1.26.cloudera.4.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/jetty-util-9.2.15.v20160210.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/jopt-simple-5.0.3.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/jsch-0.1.42.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/jsp-api-2.1.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/jsr305-3.0.0.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/kafka_2.11-0.11.0-kafka-3.0.0.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/kafka_2.11-0.11.0-kafka-3.0.0-sources.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/kafka_2.11-0.11.0-kafka-3.0.0-test-sources.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/kafka-clients-0.11.0-kafka-3.0.0.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/kafka-log4j-appender-0.11.0-kafka-3.0.0.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/kafka-streams-0.11.0-kafka-3.0.0.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/kafka-streams-examples-0.11.0-kafka-3.0.0.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/kafka-tools-0.11.0-kafka-3.0.0.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/leveldbjni-all-1.8.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/libthrift-0.9.3.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/log4j-1.2.17.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/lz4-1.3.0.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/maven-artifact-3.5.0.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/metrics-core-2.2.0.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/metrics-servlet-2.2.0.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/osgi-resource-locator-1.0.1.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/paranamer-2.3.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/plexus-utils-3.0.24.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/protobuf-java-2.5.0.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/reflections-0.9.11.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/rocksdbjni-5.0.1.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/scala-library-2.11.11.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/scala-parser-combinators_2.11-1.0.4.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/sentry-binding-kafka-1.5.1-cdh5.11.0.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/sentry-core-common-1.5.1-cdh5.11.0.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/sentry-core-model-db-1.5.1-cdh5.11.0.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/sentry-core-model-kafka-1.5.1-cdh5.11.0.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/sentry-policy-common-1.5.1-cdh5.11.0.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/sentry-policy-kafka-1.5.1-cdh5.11.0.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/sentry-provider-common-1.5.1-cdh5.11.0.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/sentry-provider-db-1.5.1-cdh5.11.0.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/sentry-provider-file-1.5.1-cdh5.11.0.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/servlet-api-2.5.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/shiro-core-1.2.3.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/slf4j-api-1.7.25.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/slf4j-api-1.7.5.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/slf4j-log4j12-1.7.25.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/slf4j-log4j12-1.7.5.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/snappy-java-1.1.2.6.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/stax-api-1.0-2.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/validation-api-1.1.0.Final.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/xmlenc-0.52.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/xz-1.0.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/zkclient-0.10.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/zookeeper-3.4.10.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/zookeeper-3.4.5-cdh5.13.0.jar
20/04/05 21:02:39 INFO zookeeper.ZooKeeper: Client environment:java.library.path=/usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib
20/04/05 21:02:39 INFO zookeeper.ZooKeeper: Client environment:java.io.tmpdir=/tmp
20/04/05 21:02:39 INFO zookeeper.ZooKeeper: Client environment:java.compiler=<NA>
20/04/05 21:02:39 INFO zookeeper.ZooKeeper: Client environment:os.name=Linux
20/04/05 21:02:39 INFO zookeeper.ZooKeeper: Client environment:os.arch=amd64
20/04/05 21:02:39 INFO zookeeper.ZooKeeper: Client environment:os.version=2.6.32-642.el6.x86_64
20/04/05 21:02:39 INFO zookeeper.ZooKeeper: Client environment:user.name=root
20/04/05 21:02:39 INFO zookeeper.ZooKeeper: Client environment:user.home=/root
20/04/05 21:02:39 INFO zookeeper.ZooKeeper: Client environment:user.dir=/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40
20/04/05 21:02:39 INFO zookeeper.ZooKeeper: Initiating client connection, connectString=hadoop101:2181 sessionTimeout=30000 watcher=org.I0Itec.zkclient.ZkClient@52e6fdee
20/04/05 21:02:39 INFO zkclient.ZkClient: Waiting for keeper state SyncConnected
20/04/05 21:02:39 INFO zookeeper.ClientCnxn: Opening socket connection to server hadoop101/192.168.1.101:2181. Will not attempt to authenticate using SASL (unknown error)
20/04/05 21:02:39 INFO zookeeper.ClientCnxn: Socket connection established to hadoop101/192.168.1.101:2181, initiating session
20/04/05 21:02:39 INFO zookeeper.ClientCnxn: Session establishment complete on server hadoop101/192.168.1.101:2181, sessionid = 0x3714a18244c0005, negotiated timeout = 30000
20/04/05 21:02:39 INFO zkclient.ZkClient: zookeeper state changed (SyncConnected)
Topic:test    PartitionCount:1    ReplicationFactor:1    Configs:
    Topic: test    Partition: 0    Leader: 52    Replicas: 52    Isr: 52
20/04/05 21:02:39 INFO zkclient.ZkEventThread: Terminate ZkClient event thread.
20/04/05 21:02:39 INFO zookeeper.ZooKeeper: Session: 0x3714a18244c0005 closed
20/04/05 21:02:39 INFO zookeeper.ClientCnxn: EventThread shut down for session: 0x3714a18244c0005
View Code
删除 Kafka Topic
1)删除启动日志主题
[root@hadoop101 KAFKA]$ bin/kafka-topics --delete --zookeeper hadoop101:2181,hadoop102:2181,hadoop103:2181 --topic topic_start
2)删除事件日志主题
[root@hadoop101 KAFKA]$ bin/kafka-topics --delete --zookeeper hadoop101:2181,hadoop102:2181,hadoop103:2181 --topic topic_event

Oozie基于Hue实现GMV指标流程调度

 1 执行前的准备

1)添加MySQL驱动文件

[root@hadoop101 ~]# cp /opt/software/mysql-libs/mysql-connector-java-5.1.27/mysql-connector-java-5.1.27-bin.jar /opt/cloudera/parcels/CDH
CDH/                         CDH-5.12.1-1.cdh5.12.1.p0.3/ 
[root@hadoop101 ~]# cp /opt/software/mysql-libs/mysql-connector-java-5.1.27/mysql-connector-java-5.1.27-bin.jar /opt/cloudera/parcels/CDH-5.12.1-1.cdh5.12.1.p0.3/lib
lib/     lib64/   libexec/ 
[root@hadoop101 ~]# cp /opt/software/mysql-libs/mysql-connector-java-5.1.27/mysql-connector-java-5.1.27-bin.jar /opt/cloudera/parcels/CDH-5.12.1-1.cdh5.12.1.p0.3/lib/hadoop/lib
[root@hadoop101 ~]# cp /opt/software/mysql-libs/mysql-connector-java-5.1.27/mysql-connector-java-5.1.27-bin.jar /opt/cloudera/parcels/CDH-5.12.1-1.cdh5.12.1.p0.3/lib/sqoop/lib/
[root@hadoop101 ~]# hadoop fs -put /opt/software/mysql-libs/mysql-connector-java-5.1.27/mysql-connector-java-5.1.27-bin.jar /user/oozie/share/lib/lib_20190903103134/sqoop
put: `/user/oozie/share/lib/lib_20190903103134/sqoop': No such file or directory
[root@hadoop101 ~]# hadoop fs -put /opt/software/mysql-libs/mysql-connector-java-5.1.27/mysql-connector-java-5.1.27-bin.jar /user/oozie/share/lib/lib_20200405214946/sqoop
[root@hadoop101 ~]# xsync /opt/cloudera/parcels/CDH-5.12.1-1.cdh5.12.1.p0.3/lib/hadoop/lib 
fname=lib
pdir=/opt/cloudera/parcels/CDH-5.12.1-1.cdh5.12.1.p0.3/lib/hadoop
------------------- hadoop102 --------------
sending incremental file list
lib/
lib/mysql-connector-java-5.1.27-bin.jar

sent 877414 bytes  received 246 bytes  1755320.00 bytes/sec
total size is 3041243  speedup is 3.47
------------------- hadoop103 --------------
sending incremental file list
lib/
lib/mysql-connector-java-5.1.27-bin.jar

sent 877414 bytes  received 246 bytes  585106.67 bytes/sec
total size is 3041243  speedup is 3.47
[root@hadoop101 ~]# xsync /opt/cloudera/parcels/CDH-5.12.1-1.cdh5.12.1.p0.3/lib/sqoop/lib/
fname=lib
pdir=/opt/cloudera/parcels/CDH-5.12.1-1.cdh5.12.1.p0.3/lib/sqoop
------------------- hadoop102 --------------
sending incremental file list
lib/
lib/mysql-connector-java-5.1.27-bin.jar

sent 874734 bytes  received 134 bytes  1749736.00 bytes/sec
total size is 873514  speedup is 1.00
------------------- hadoop103 --------------
sending incremental file list
lib/
lib/mysql-connector-java-5.1.27-bin.jar

sent 874734 bytes  received 134 bytes  1749736.00 bytes/sec
total size is 873514  speedup is 1.00

2)修改YARN的容器内存yarn.nodemanager.resource.memory-mb为4G

 

 

 

 

Hue中创建Oozie任务GMV

1)生成数据

安装Oozie可视化js,复制ext-2.2.zip到/opt/cloudera/parcels/CDH/lib/oozie/libext(或/var/lib/oozie)中,解压

[root@hadoop101 libext]# ll
总用量 860
drwxr-xr-x 9 root  root    4096 8月   4 2008 ext-2.2
-rw-r--r-- 1 oozie oozie 872303 4月   5 22:24 mysql-connector-java.jar
drwxr-xr-x 5 oozie oozie   4096 4月   5 22:24 tomcat-deployment
[root@hadoop101 libext]# unzip /opt/software/ext-2.2.zip

查看Hue的web页面

选择query->scheduler->workflow

 点击My Workflow->输入gmv

点击保存

编写任务脚本并上传到HDFS

1)点击workspace,查看上传情况

2)上传要执行的脚本导HDFS路径

 [root@hadoop101 bin]# hadoop fs -put /root/bin/*.sh /user/hue/oozie/workspaces/hue-oozie-1559457755.83/lib

3)点击左侧的->Documents->gmv

编写任务调度

1)点击编辑

2)选择actions

3)拖拽控件编写任务

4)选择执行脚本

5)Files再选择执行脚本

6)定义脚本参数名称

 

7)按顺序执行3-6步骤。

ods_db.sh、dwd_db.sh、dws_db_wide.sh、ads_db_gmv.sh和sqoop_export.sh

其中ods_db.sh、dwd_db.sh、dws_db_wide.sh、ads_db_gmv.sh只有一个参数do_date;

sqoop_export.sh只有一个参数all。

8)点击保存

执行任务调度

1)点击执行

2)点击提交

 

猜你喜欢

转载自www.cnblogs.com/shengyang17/p/12638186.html