Integrating Storm with Kafka: A Programming Model


This article describes how to integrate Storm with Kafka in code.

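  I. Implementation Model

   Data flow:

    1. A Kafka producer publishes messages to the topic topic1.

    2. A Storm topology contains three components: KafkaSpout, SenqueceBolt, and KafkaBolt. KafkaSpout subscribes to topic1 and passes each message to SenqueceBolt for processing. KafkaBolt then publishes the processed message to Kafka under the topic topic2.

    3. A Kafka consumer consumes the messages from topic2.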
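
To make the three-step flow concrete, here is a minimal plain-Java sketch that does not use Storm or Kafka at all. It is an illustration under stated assumptions: in-memory queues stand in for topic1 and topic2, and the hypothetical class PipelineSketch and its methods exist only for this example.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Illustration only: in-memory queues stand in for the Kafka topics and a
// plain method stands in for the bolt. No Storm or Kafka API is used here.
final class PipelineSketch {

    // The same transformation that SenqueceBolt applies to each message.
    static String decorate(String word) {
        return "I'm " + word + "!";
    }

    // Drains "topic1", transforms every message, and appends it to "topic2".
    static void runOnce(Queue<String> topic1, Queue<String> topic2) {
        String msg;
        while ((msg = topic1.poll()) != null) {
            topic2.add(decorate(msg));
        }
    }

    public static void main(String[] args) {
        Queue<String> topic1 = new ArrayDeque<>();
        Queue<String> topic2 = new ArrayDeque<>();

        // Step 1: the producer writes to topic1.
        topic1.add("storm");
        topic1.add("kafka");

        // Step 2: spout -> SenqueceBolt -> KafkaBolt, collapsed into one call.
        runOnce(topic1, topic2);

        // Step 3: the consumer reads everything from topic2.
        List<String> consumed = new ArrayList<>(topic2);
        System.out.println(consumed);
    }
}
```

In the real topology each stage runs as a separate, parallel component and the queues are durable Kafka topics; the sketch only shows the order in which a message moves through the stages.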

    

    

  II. Topology Implementation

    1. Create a Maven project and configure pom.xml

      The project depends on three artifacts: storm-core, kafka_2.10, and storm-kafka.

<dependencies>
  <dependency>
    <groupId>org.apache.storm</groupId>
    <artifactId>storm-core</artifactId>
    <version>0.9.2-incubating</version>
    <scope>provided</scope>
  </dependency>
  <dependency>
    <groupId>org.apache.kafka</groupId>
    <artifactId>kafka_2.10</artifactId>
    <version>0.8.1.1</version>
    <exclusions>
      <exclusion>
        <groupId>org.apache.zookeeper</groupId>
        <artifactId>zookeeper</artifactId>
      </exclusion>
      <exclusion>
        <groupId>log4j</groupId>
        <artifactId>log4j</artifactId>
      </exclusion>
    </exclusions>
  </dependency>
  <dependency>
    <groupId>org.apache.storm</groupId>
    <artifactId>storm-kafka</artifactId>
    <version>0.9.2-incubating</version>
  </dependency>
</dependencies>
<build>
  <plugins>
    <plugin>
      <artifactId>maven-assembly-plugin</artifactId>
      <version>2.4</version>
      <configuration>
        <descriptorRefs>
          <descriptorRef>jar-with-dependencies</descriptorRef>
        </descriptorRefs>
      </configuration>
      <executions>
        <execution>
          <id>make-assembly</id>
          <phase>package</phase>
          <goals>
            <goal>single</goal>
          </goals>
        </execution>
      </executions>
    </plugin>
  </plugins>
</build>

 

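    2. KafkaSpout

      KafkaSpout is a spout that ships with Storm. Its source is at https://github.com/apache/incubator-storm/tree/master/external

      To use KafkaSpout you need to implement the Scheme interface yourself. The Scheme is responsible for extracting the data you need from the raw message bytes.

Note: the parameter type of the Scheme interface's deserialize method may differ between versions, for example deserialize(byte[] ser) versus deserialize(ByteBuffer ser).

In either case, converting the payload to a String is all that is needed here.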

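import java.nio.charset.StandardCharsets;
import java.util.List;

import backtype.storm.spout.Scheme;
import backtype.storm.tuple.Fields;
import backtype.storm.tuple.Values;

public class MessageScheme implements Scheme {

    /**
     * Decodes the raw Kafka message bytes into a single-field tuple.
     *
     * @see backtype.storm.spout.Scheme#deserialize(byte[])
     */
    public List<Object> deserialize(byte[] ser) {
        // StandardCharsets.UTF_8 avoids the checked UnsupportedEncodingException,
        // so there is no exception to swallow and no null to return.
        String msg = new String(ser, StandardCharsets.UTF_8);
        return new Values(msg);
    }

    /**
     * Names the single output field emitted by this scheme.
     *
     * @see backtype.storm.spout.Scheme#getOutputFields()
     */
    public Fields getOutputFields() {
        return new Fields("msg");
    }
}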
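
Since the deserialize parameter may be a byte[] or a ByteBuffer depending on the version, the decoding step can be kept in one place. The following is a minimal sketch using only the JDK; the class name SchemeDecodeSketch and its decode methods are hypothetical helpers for illustration, not part of Storm.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: decodes a message payload to a String for either
// parameter type that a Scheme's deserialize method may receive.
final class SchemeDecodeSketch {

    // For versions whose deserialize method receives a byte[].
    static String decode(byte[] ser) {
        return new String(ser, StandardCharsets.UTF_8);
    }

    // For versions whose deserialize method receives a ByteBuffer.
    // duplicate() leaves the caller's position and limit untouched.
    static String decode(ByteBuffer ser) {
        return StandardCharsets.UTF_8.decode(ser.duplicate()).toString();
    }

    public static void main(String[] args) {
        byte[] raw = "hello".getBytes(StandardCharsets.UTF_8);
        System.out.println(decode(raw));
        System.out.println(decode(ByteBuffer.wrap(raw)));
    }
}
```

A Scheme implementation could call whichever overload matches its deserialize signature and wrap the result in a Values object, as MessageScheme does.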

    3. SenqueceBolt

       SenqueceBolt is simple: it prepends "I'm " and appends "!" to each message received from the spout.

import backtype.storm.topology.BasicOutputCollector;
import backtype.storm.topology.OutputFieldsDeclarer;
import backtype.storm.topology.base.BaseBasicBolt;
import backtype.storm.tuple.Fields;
import backtype.storm.tuple.Tuple;
import backtype.storm.tuple.Values;

public class SenqueceBolt extends BaseBasicBolt {

    /**
     * Wraps each incoming message as "I'm <message>!" and emits it.
     *
     * @see backtype.storm.topology.IBasicBolt#execute(backtype.storm.tuple.Tuple, backtype.storm.topology.BasicOutputCollector)
     */
    public void execute(Tuple input, BasicOutputCollector collector) {
        String word = (String) input.getValue(0);
        String out = "I'm " + word + "!";
        System.out.println("out=" + out);
        collector.emit(new Values(out));
    }

    /**
     * Declares the single output field, named "message".
     *
     * @see backtype.storm.topology.IComponent#declareOutputFields(backtype.storm.topology.OutputFieldsDeclarer)
     */
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        declarer.declare(new Fields("message"));
    }
}

    4. KafkaBolt

      KafkaBolt is a bolt that ships with Storm. It publishes messages to a Kafka topic.

    5. Topology

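import java.util.HashMap;
import java.util.Map;

import backtype.storm.Config;
import backtype.storm.LocalCluster;
import backtype.storm.StormSubmitter;
import backtype.storm.spout.SchemeAsMultiScheme;
import backtype.storm.topology.TopologyBuilder;
import backtype.storm.utils.Utils;
import storm.kafka.BrokerHosts;
import storm.kafka.KafkaSpout;
import storm.kafka.SpoutConfig;
import storm.kafka.ZkHosts;
import storm.kafka.bolt.KafkaBolt;

public class StormKafkaTopo {
    public static void main(String[] args) throws Exception {
        // ZooKeeper addresses
        BrokerHosts brokerHosts = new ZkHosts("node04:2181,node05:2181,node06:2181");
        // Topic to subscribe to, plus the ZooKeeper root path and node name for the spout's data
        SpoutConfig spoutConfig = new SpoutConfig(brokerHosts, "topic1", "/zkkafkaspout", "kafkaspout");
        spoutConfig.scheme = new SchemeAsMultiScheme(new MessageScheme());

        // kafka.broker.properties used by KafkaBolt
        Config conf = new Config();
        Map<String, String> map = new HashMap<String, String>();
        // Kafka broker address
        map.put("metadata.broker.list", "node04:9092");
        // serializer.class is the class used to serialize messages
        map.put("serializer.class", "kafka.serializer.StringEncoder");
        conf.put("kafka.broker.properties", map);
        // Topic that KafkaBolt publishes to
        conf.put("topic", "topic2");

        TopologyBuilder builder = new TopologyBuilder();
        builder.setSpout("spout", new KafkaSpout(spoutConfig));
        builder.setBolt("bolt", new SenqueceBolt()).shuffleGrouping("spout");
        // The message is a String (see StringEncoder above), so the value type is String
        builder.setBolt("kafkabolt", new KafkaBolt<String, String>()).shuffleGrouping("bolt");

        if (args != null && args.length > 0) {
            // A topology name was passed in: submit to the cluster
            conf.setNumWorkers(3);
            StormSubmitter.submitTopology(args[0], conf, builder.createTopology());
        } else {
            // No arguments: run in local mode for 100 seconds, then shut down
            LocalCluster cluster = new LocalCluster();
            cluster.submitTopology("Topo", conf, builder.createTopology());
            Utils.sleep(100000);
            cluster.killTopology("Topo");
            cluster.shutdown();
        }
    }
}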
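
The topology above passes three settings to KafkaBolt through the configuration map: "topic", and inside "kafka.broker.properties" the keys "metadata.broker.list" and "serializer.class". A forgotten key typically only shows up once the topology is running, so a small check before submission can help. The sketch below is a hypothetical helper (KafkaBoltConfigCheck is not part of Storm or storm-kafka) and takes a plain Map so that it runs without Storm on the classpath.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical pre-submission check for the settings used in the topology above.
final class KafkaBoltConfigCheck {

    // Returns the names of the settings that are missing from conf.
    static List<String> missingSettings(Map<String, Object> conf) {
        List<String> missing = new ArrayList<>();
        if (!(conf.get("topic") instanceof String)) {
            missing.add("topic");
        }
        Object props = conf.get("kafka.broker.properties");
        if (!(props instanceof Map)) {
            missing.add("kafka.broker.properties");
        } else {
            Map<?, ?> broker = (Map<?, ?>) props;
            for (String key : new String[] {"metadata.broker.list", "serializer.class"}) {
                if (!broker.containsKey(key)) {
                    missing.add("kafka.broker.properties/" + key);
                }
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        Map<String, String> broker = new HashMap<>();
        broker.put("metadata.broker.list", "node04:9092");
        broker.put("serializer.class", "kafka.serializer.StringEncoder");

        Map<String, Object> conf = new HashMap<>();
        conf.put("kafka.broker.properties", broker);
        conf.put("topic", "topic2");

        // An empty list means all three settings are present.
        System.out.println(missingSettings(conf));
    }
}
```

Storm's Config class is itself a map of String keys to Object values, so a check like this could be called on conf just before submitTopology.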


       

  III. Testing and Verification

    1. Use the Kafka console client as the producer and publish messages to topic1

      bin/kafka-console-producer.sh --broker-list node04:9092 --topic topic1

    2. Use the Kafka console client as the consumer and subscribe to topic2

      bin/kafka-console-consumer.sh --zookeeper localhost:2181 --topic topic2 --from-beginning

    3. Run the Storm topology

      bin/storm jar storm-kafka-0.0.1-SNAPSHOT-jar-with-dependencies.jar  StormKafkaTopo KafkaStorm

    4. Result

        



Reposted from blog.csdn.net/qq_26418435/article/details/51699711