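1. Problem background:
The load-test environment has two nameservers and two broker masters, 10.255.255.142 (broker-b) and 10.255.255.151 (broker-a). Monitoring showed that 151 had not received any messages since 2015-3-13.
The two masters in the test environment handle about 4,000 TPS in total, with a message size of 2 KB.
2. Locating the problem:
1. Connected to the load-test environment from Eclipse and found that messages were sent only to broker-b, never to broker-a.
2. Suspected that the producer had not connected to broker-a. Checking broker-a's connections with netstat showed that the producer was connected to broker-a.
3. Suspected that the producer was not getting broker-a's message queues from the nameserver. Using a MessageQueueSelector showed that the nameserver did return broker-a's message queues.
4. Sent messages only to broker-a's message queues, which produced the following error: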
- com.alibaba.rocketmq.client.exception.MQBrokerException: CODE: 14 DESC: service not available now, maybe disk full, CL: 0.87 CQ: 0.87 INDEX: 0.87, maybe your broker machine memory too small.
- For more information, please visit the url, https://github.com/alibaba/RocketMQ/issues/64
- at com.alibaba.rocketmq.client.impl.MQClientAPIImpl.processSendResponse(MQClientAPIImpl.java:492)
- at com.alibaba.rocketmq.client.impl.MQClientAPIImpl.sendMessageSync(MQClientAPIImpl.java:398)
- at com.alibaba.rocketmq.client.impl.MQClientAPIImpl.sendMessage(MQClientAPIImpl.java:379)
- at com.alibaba.rocketmq.client.impl.producer.DefaultMQProducerImpl.sendKernelImpl(DefaultMQProducerImpl.java:698)
- at com.alibaba.rocketmq.client.impl.producer.DefaultMQProducerImpl.sendSelectImpl(DefaultMQProducerImpl.java:877)
- at com.alibaba.rocketmq.client.impl.producer.DefaultMQProducerImpl.send(DefaultMQProducerImpl.java:851)
- at com.alibaba.rocketmq.client.producer.DefaultMQProducer.send(DefaultMQProducer.java:163)
- at com.ruishenh.rocketmq.example.Producer.main(Producer.java:78)
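The DESC text of this exception carries three numbers (CL, CQ, INDEX), which appear to be the disk usage ratios of the commit log, consume queue, and index storage. As a rough sketch for pulling them out of a logged message, the hypothetical helper below (DiskRatioParser is not part of RocketMQ) uses only the JDK and assumes the message keeps the "NAME: value" layout shown above.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not a RocketMQ class: extracts the ratios
// embedded in the broker's "service not available" description.
class DiskRatioParser {
    // Matches fragments such as "CL: 0.87" or "INDEX: 0.87".
    private static final Pattern RATIO =
            Pattern.compile("\\b(CL|CQ|INDEX):\\s*(\\d+(?:\\.\\d+)?)");

    static Map<String, Double> parse(String desc) {
        Map<String, Double> ratios = new LinkedHashMap<>();
        Matcher m = RATIO.matcher(desc);
        while (m.find()) {
            ratios.put(m.group(1), Double.parseDouble(m.group(2)));
        }
        return ratios;
    }

    public static void main(String[] args) {
        String desc = "service not available now, maybe disk full, "
                + "CL: 0.87 CQ: 0.87 INDEX: 0.87, "
                + "maybe your broker machine memory too small.";
        // Prints the three parsed ratios keyed by name.
        System.out.println(parse(desc));
    }
}
```

A ratio of 0.87 here means the disk was about 87% used, which is why the disk can look "not full" to a person while the broker still refuses writes.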
5. The error suggests the disk is full, but checking the disk on broker-a showed that there was still free space.
6. Looked at the RocketMQ source code to find where the error comes from:
In DefaultMessageStore, public PutMessageResult putMessage(MessageExtBrokerInner msg):
- if (!this.runningFlags.isWriteable()) {
- long value = this.printTimes.getAndIncrement();
- if ((value % 50000) == 0) {
- log.warn("message store is not writeable, so putMessage is forbidden "
- + this.runningFlags.getFlagBits());
- }
- return new PutMessageResult(PutMessageStatus.SERVICE_NOT_AVAILABLE, null);
- }
- else {
- this.printTimes.set(0);
- }
The method in the RunningFlags class:
- public boolean isWriteable() {
- if ((this.flagBits & (NotWriteableBit | WriteLogicsQueueErrorBit | DiskFullBit | WriteIndexFileErrorBit)) == 0) {
- return true;
- }
- return false;
- }
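To make the bitmask check above concrete, here is a minimal standalone sketch of the same idea. The class name RunningFlagsSketch, the bit values, and the setDiskFull method are assumptions made for illustration; only the shape of isWriteable follows the snippet above. It is single-threaded on purpose and makes no claim about how RocketMQ sets the flag.

```java
// Illustrative stand-in for the RunningFlags logic quoted above.
// Bit values are hypothetical; each flag just needs its own bit.
class RunningFlagsSketch {
    static final int NOT_WRITEABLE_BIT = 1 << 1;
    static final int WRITE_LOGICS_QUEUE_ERROR_BIT = 1 << 2;
    static final int WRITE_INDEX_FILE_ERROR_BIT = 1 << 3;
    static final int DISK_FULL_BIT = 1 << 4;

    private int flagBits = 0;

    // Writeable only when none of the four blocking bits is set.
    boolean isWriteable() {
        int mask = NOT_WRITEABLE_BIT | WRITE_LOGICS_QUEUE_ERROR_BIT
                | DISK_FULL_BIT | WRITE_INDEX_FILE_ERROR_BIT;
        return (flagBits & mask) == 0;
    }

    // Hypothetical setter: raises or clears the disk-full bit.
    void setDiskFull(boolean full) {
        if (full) {
            flagBits |= DISK_FULL_BIT;
        } else {
            flagBits &= ~DISK_FULL_BIT;
        }
    }

    public static void main(String[] args) {
        RunningFlagsSketch flags = new RunningFlagsSketch();
        System.out.println(flags.isWriteable()); // no bits set
        flags.setDiskFull(true);
        System.out.println(flags.isWriteable()); // disk-full bit set
    }
}
```

Because any single bit in the mask is enough to fail the check, a disk-full flag alone makes putMessage return SERVICE_NOT_AVAILABLE, which matches the CODE: 14 error seen by the producer.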
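7. The conclusion was that disk space was indeed insufficient. The testers freed up some disk space, and once free space rose above 4 GB, broker-a worked normally again. When the problem occurred, free disk space was 2.5 GB.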
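A quick way to see the same kind of ratio the broker reports is to compute the used fraction of the partition that holds the store directory. The sketch below uses only java.io.File; the class name DiskUsageCheck, the default path, and the 0.85 warning level are assumptions for illustration and are not taken from RocketMQ's configuration.

```java
import java.io.File;

// Hypothetical standalone check, not part of RocketMQ.
class DiskUsageCheck {
    // Fraction of the partition in use, or -1 if the size is unknown.
    static double usedRatio(long totalBytes, long freeBytes) {
        if (totalBytes <= 0) {
            return -1;
        }
        return (totalBytes - freeBytes) / (double) totalBytes;
    }

    static double usedRatio(File path) {
        return usedRatio(path.getTotalSpace(), path.getFreeSpace());
    }

    public static void main(String[] args) {
        // Pass the broker's store directory as the first argument.
        File path = new File(args.length > 0 ? args[0] : ".");
        // Assumed warning level; adjust to the broker's real settings.
        double warnLevel = args.length > 1 ? Double.parseDouble(args[1]) : 0.85;

        double ratio = usedRatio(path);
        System.out.printf("%s used ratio: %.2f%n", path.getAbsolutePath(), ratio);
        if (ratio > warnLevel) {
            System.out.println("Disk usage is above the assumed warning level.");
        }
    }
}
```

Running this on the broker host against the store directory gives a number that can be compared directly with the CL, CQ, and INDEX values in the error message, instead of judging by free gigabytes alone.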
http://blog.csdn.net/xuhaifang_9856/article/details/44309123