Getting Started with Storm
Storm in Practice: Building Real-Time Big Data Computation
DRPC Topologies
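Distributed remote procedure call (DRPC) uses Storm's distributed nature to execute remote procedure calls (RPC). For each function call, the topology running on the Storm cluster receives the function name and arguments as an input stream, runs a series of computations, and emits the result as an output stream that is returned to the caller.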
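The round trip described above can be modelled in plain Java without a Storm cluster. The following is a minimal sketch, assuming a single in-memory server and one worker thread standing in for the topology. The class `MiniDrpc` and its helper `callOnce` are hypothetical names introduced for illustration; this is not Storm's implementation.

```java
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.UnaryOperator;

// Hypothetical in-memory model of a DRPC round trip (illustration only).
final class MiniDrpc {
    // A pending call as the topology side would see it: request id plus arguments.
    static final class Request {
        final String id;
        final String args;
        Request(String id, String args) { this.id = id; this.args = args; }
    }

    private final AtomicLong ids = new AtomicLong();
    private final BlockingQueue<Request> requests = new LinkedBlockingQueue<>();
    private final Map<String, CompletableFuture<String>> pending = new ConcurrentHashMap<>();

    // Caller side: register a future, enqueue the request, block until the result arrives.
    String execute(String args) {
        String id = Long.toString(ids.incrementAndGet());
        CompletableFuture<String> future = new CompletableFuture<>();
        pending.put(id, future);
        requests.add(new Request(id, args));
        try {
            return future.get(5, TimeUnit.SECONDS);
        } catch (Exception e) {
            throw new IllegalStateException("DRPC call failed", e);
        }
    }

    // Topology input side: wait for the next request.
    Request fetchRequest() {
        try {
            return requests.take();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    // Topology output side: hand the result back to the caller waiting on this id.
    void result(String id, String value) {
        CompletableFuture<String> future = pending.remove(id);
        if (future != null) {
            future.complete(value);
        }
    }

    // Runs one call end to end; the worker thread plays the role of the topology.
    static String callOnce(String args, UnaryOperator<String> function) {
        MiniDrpc server = new MiniDrpc();
        Thread topology = new Thread(() -> {
            Request req = server.fetchRequest();
            server.result(req.id, function.apply(req.args));
        });
        topology.setDaemon(true);
        topology.start();
        return server.execute(args);
    }

    public static void main(String[] args) {
        System.out.println(callOnce("hello", s -> s + "!"));
    }
}
```

The request id is what ties the emitted result back to the blocked caller, which is why it travels with the tuple through every step of a DRPC topology.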
LinearDRPCTopologyBuilder
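LinearDRPCTopologyBuilder is a linear topology builder provided by Storm that automates nearly all of the DRPC steps. These include:
- Setting up the spout
- Returning the results to the DRPC server
- Giving bolts a limited ability to aggregate over groups of tuples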
// org.apache.storm.drpc.LinearDRPCTopologyBuilder (abridged outline)
public class LinearDRPCTopologyBuilder {
    // Constructor: takes the name of the DRPC function
    public LinearDRPCTopologyBuilder(String function);
    // Adds a bolt together with its parallelism
    public LinearDRPCInputDeclarer addBolt(IBatchBolt bolt, Number parallelism);
    // Builds a topology for local mode; requires a drpc argument that implements ILocalDRPC
    public StormTopology createLocalTopology(ILocalDRPC drpc) {
        return createTopology(new DRPCSpout(_function, drpc));
    }
    // Builds a topology for remote mode; takes no arguments
    public StormTopology createRemoteTopology() {
        return createTopology(new DRPCSpout(_function));
    }
    // Builds the topology
    private StormTopology createTopology(DRPCSpout spout);
}
Let us look closely at createTopology(DRPCSpout spout). The spout passed to it emits tuples as follows:
@Override
public void nextTuple() {
    // Excerpt: the surrounding code that defines `client` and `i` (and the loop that `break` exits) is not shown.
    DRPCRequest req = client.fetchRequest(_function);
    if (req.get_request_id().length() > 0) {
        Map returnInfo = new HashMap();
        returnInfo.put("id", req.get_request_id());
        returnInfo.put("host", client.getHost());
        returnInfo.put("port", client.getPort());
        // Tuple values: (1) the request arguments, (2) the return info as JSON (request id + host + port).
        // Message id: wraps the request id.
        _collector.emit(new Values(req.get_func_args(), JSONValue.toJSONString(returnInfo)), new DRPCMessageId(req.get_request_id(), i));
        break;
    }
}
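To make the shape of the second tuple value concrete, here is a minimal sketch of the return-info payload in plain Java. The class `ReturnInfoSketch`, the tiny JSON renderer, and the example id, host and port are assumptions for illustration; the spout itself serializes the map with JSONValue.toJSONString, whose key order may differ.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper showing the shape of the return-info payload (not Storm code).
final class ReturnInfoSketch {
    // Collects the fields a later bolt needs in order to route the result back.
    static Map<String, Object> returnInfo(String requestId, String host, int port) {
        Map<String, Object> info = new LinkedHashMap<>(); // insertion order gives a stable string
        info.put("id", requestId);
        info.put("host", host);
        info.put("port", port);
        return info;
    }

    // Minimal JSON rendering for string and numeric values only (no escaping).
    static String toJson(Map<String, Object> map) {
        StringJoiner json = new StringJoiner(",", "{", "}");
        for (Map.Entry<String, Object> entry : map.entrySet()) {
            Object value = entry.getValue();
            String rendered = (value instanceof Number) ? value.toString() : "\"" + value + "\"";
            json.add("\"" + entry.getKey() + "\":" + rendered);
        }
        return json.toString();
    }

    public static void main(String[] args) {
        // Example values only; prints {"id":"42","host":"s156","port":3773}
        System.out.println(toJson(returnInfo("42", "s156", 3773)));
    }
}
```

Because the return info travels as an opaque string alongside the arguments, the user-supplied bolts never need to know where the result will eventually be sent.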
private StormTopology createTopology(DRPCSpout spout) {
    final String SPOUT_ID = "spout";
    final String PREPARE_ID = "prepare-request";
    TopologyBuilder builder = new TopologyBuilder();
    // Register the DRPC spout first
    builder.setSpout(SPOUT_ID, spout);
    // PrepareRequest generates a request id and creates one stream for the return info and one for the args
    builder.setBolt(PREPARE_ID, new PrepareRequest())
        .noneGrouping(SPOUT_ID);
    // ... intermediate steps omitted ...
    // Wrap the user bolt in a CoordinatedBolt
    BoltDeclarer declarer = builder.setBolt(
        boltId(i),
        new CoordinatedBolt(component.bolt, source, idSpec),
        component.parallelism);
    // Add a direct grouping on the coordination stream
    declarer.directGrouping(boltId(i-1), Constants.COORDINATED_STREAM_ID);
    // ... intermediate steps omitted ...
    // JoinResult joins each result with its return info
    builder.setBolt(boltId(i), new JoinResult(PREPARE_ID))
        .fieldsGrouping(boltId(i-1), outputStream, new Fields(fields.get(0)))
        .fieldsGrouping(PREPARE_ID, PrepareRequest.RETURN_STREAM, new Fields("request"));
    i++;
    // ReturnResults connects to the DRPC server and returns the result
    builder.setBolt(boltId(i), new ReturnResults())
        .noneGrouping(boltId(i-1));
    return builder.createTopology();
}
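The join step in the middle of this pipeline can be sketched in isolation. The following plain-Java model is a simplified illustration, assuming that the result and the return info for one request may arrive in either order and that both are routed to the same task by request id (which is what the two fields groupings arrange). The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Simplified, hypothetical model of the join step; Storm's JoinResult bolt differs in detail.
final class JoinResultSketch {
    private final Map<Long, String> returnInfos = new HashMap<>();
    private final Map<Long, String> results = new HashMap<>();
    // Joined {result, returnInfo} pairs, ready to be handed to the return step.
    final List<String[]> joined = new ArrayList<>();

    // Called when the return info for a request arrives.
    void onReturnInfo(long requestId, String returnInfo) {
        String result = results.remove(requestId);
        if (result != null) {
            joined.add(new String[] {result, returnInfo});
        } else {
            returnInfos.put(requestId, returnInfo); // wait for the matching result
        }
    }

    // Called when the computed result for a request arrives.
    void onResult(long requestId, String result) {
        String returnInfo = returnInfos.remove(requestId);
        if (returnInfo != null) {
            joined.add(new String[] {result, returnInfo});
        } else {
            results.put(requestId, result); // wait for the matching return info
        }
    }

    public static void main(String[] args) {
        JoinResultSketch join = new JoinResultSketch();
        join.onResult(1L, "17");
        join.onReturnInfo(1L, "{\"id\":\"1\"}");
        System.out.println(join.joined.size()); // one joined pair
    }
}
```

Whichever half arrives first is parked in a map; the pair is emitted only once both halves for the same request id are present.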
Local-mode DRPC
import org.apache.storm.Config;
import org.apache.storm.LocalCluster;
import org.apache.storm.LocalDRPC;
import org.apache.storm.drpc.LinearDRPCTopologyBuilder;
import org.apache.storm.topology.BasicOutputCollector;
import org.apache.storm.topology.OutputFieldsDeclarer;
import org.apache.storm.topology.base.BaseBasicBolt;
import org.apache.storm.tuple.Fields;
import org.apache.storm.tuple.Tuple;
import org.apache.storm.tuple.Values;
import java.security.InvalidParameterException;
public class Chap03DrpcApp {
    public static void main(String[] args) throws Exception {
        // Create a LocalDRPC object that simulates a DRPC server in-process
        LocalDRPC drpc = new LocalDRPC();
        // Register the DRPC function "add" and attach the bolt with a parallelism of 2
        LinearDRPCTopologyBuilder builder = new LinearDRPCTopologyBuilder("add");
        builder.addBolt(new AdderBolt(), 2);
        Config conf = new Config();
        conf.setDebug(true);
        LocalCluster cluster = new LocalCluster();
        cluster.submitTopology("drpcder-topology", conf,
            builder.createLocalTopology(drpc));
        // Call the function through the local DRPC server and print each result
        String result = drpc.execute("add", "1+-1");
        System.out.println("result===" + result);
        result = drpc.execute("add", "1+1+5+10");
        System.out.println("result===" + result);
        cluster.shutdown();
        drpc.shutdown();
    }
    static class AdderBolt extends BaseBasicBolt {
        @Override
        public void execute(Tuple input, BasicOutputCollector collector) {
            // Field 0 is the request id; field 1 holds the arguments, e.g. "1+1+5+10"
            String[] numbers = input.getString(1).split("\\+");
            int added = 0;
            if (numbers.length < 2) {
                throw new InvalidParameterException("Should be at least 2 numbers");
            }
            for (String num : numbers) {
                added += Integer.parseInt(num);
            }
            // Emit the request id first so the result can be matched to its return info
            collector.emit(new Values(input.getValue(0), added));
        }
        @Override
        public void declareOutputFields(OutputFieldsDeclarer declarer) {
            declarer.declare(new Fields("id", "result"));
        }
    }
}
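The parsing logic inside AdderBolt can be checked without starting a cluster. Below is a minimal sketch that extracts the same steps into a standalone method; the class `AdderLogic` and method `sum` are hypothetical names, and IllegalArgumentException is used here in place of the bolt's InvalidParameterException purely for the sketch.

```java
// Hypothetical standalone version of the AdderBolt arithmetic, for quick checks without Storm.
final class AdderLogic {
    // Splits the argument string on '+' and adds up the integer terms.
    static int sum(String expression) {
        String[] numbers = expression.split("\\+");
        if (numbers.length < 2) {
            throw new IllegalArgumentException("Should be at least 2 numbers");
        }
        int total = 0;
        for (String number : numbers) {
            total += Integer.parseInt(number); // a term such as "-1" parses as a negative number
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sum("1+-1"));     // prints 0
        System.out.println(sum("1+1+5+10")); // prints 17
    }
}
```

Splitting on '+' leaves the minus sign attached to the following term, which is why "1+-1" evaluates to 0 rather than failing to parse.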
Remote-mode DRPC
- Create the topology
public static void main(String[] args) throws InvalidTopologyException, AuthorizationException, AlreadyAliveException {
    LinearDRPCTopologyBuilder builder = new LinearDRPCTopologyBuilder("add");
    builder.addBolt(new AdderBolt(), 2);
    Config conf = new Config();
    conf.setDebug(false);
    // Submit the topology in remote mode
    StormSubmitter.submitTopology("remote", conf, builder.createRemoteTopology());
}
- Enable the DRPC server configuration. It only needs to be enabled on the nimbus node.
drpc.servers:
- "s156"
- Deploy the topology:
storm jar xxx.jar xxx.MainClazz
- Call the function from a client:
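public void testDRPC() throws Exception {
    Config conf = new Config();
    // Address of a DRPC server listed under drpc.servers; the default port is 3772
    DRPCClient client = new DRPCClient(conf, "s156", 3772);
    // Call the remote function and fetch the result; the name must match the one
    // passed to LinearDRPCTopologyBuilder ("add"), and the arguments must be in the form AdderBolt expects
    String result = client.execute("add", "1+1");
}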
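LinearDRPCTopologyBuilder can only handle "linear" DRPC topologies. If a DRPC call involves a more complex bolt topology with branches and merges, you need to use CoordinatedBolt directly to carry out the computation for such a non-linear topology.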
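The hard part of a non-linear topology is knowing when a task has seen every tuple for a given request. One way such coordination can work is by counting: each upstream task reports how many tuples it sent, and the receiving task treats the request as complete once all reports are in and the counts match. The sketch below is a simplified model of that idea under those assumptions; the class `BatchTracker` is hypothetical and does not reproduce CoordinatedBolt's actual code.

```java
// Simplified, hypothetical model of count-based completion tracking for one request id.
final class BatchTracker {
    private final int upstreamTasks; // how many upstream tasks must report
    private int reports;             // count reports received so far
    private int expectedTuples;      // sum of the tuple counts reported so far
    private int receivedTuples;      // data tuples actually received so far

    BatchTracker(int upstreamTasks) {
        this.upstreamTasks = upstreamTasks;
    }

    // A data tuple for this request arrived.
    void onTuple() {
        receivedTuples++;
    }

    // An upstream task announced how many tuples it sent to this task.
    void onCountReport(int tuplesSentToThisTask) {
        reports++;
        expectedTuples += tuplesSentToThisTask;
    }

    // Complete only when every upstream task has reported and no tuple is still in flight.
    boolean isComplete() {
        return reports == upstreamTasks && receivedTuples == expectedTuples;
    }

    public static void main(String[] args) {
        BatchTracker tracker = new BatchTracker(2);
        tracker.onTuple();
        tracker.onCountReport(1); // first upstream task sent one tuple
        tracker.onCountReport(1); // second upstream task sent one tuple
        System.out.println(tracker.isComplete()); // false: one tuple has not arrived yet
        tracker.onTuple();
        System.out.println(tracker.isComplete()); // true
    }
}
```

With branches and merges, a task may have several upstream components, so the number of reports to wait for has to be derived from the topology rather than from a single predecessor.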