RocketMQ原理解析-Index

1. 介绍
  索引文件指,发送一条消息后,MQ通过(topic和uniqKey)或(topic和keys)构建的索引,然后通过queryMsgByKey可以查询到结果。注意,通过queryMsgById的查询,不是索引查询。索引文件存在于/store/index/文件夹下,以时间戳命名,如20151209213520685。每个索引文件,默认存储2000w条数据,文件大小默认为420000040字节。索引文件由头信息,槽位,内容组成。

2. 组成
  头信息包含6个字段,分别为:
  • beginTimestamp:long型,第一条消息的存储时间。
  • endTimestamp:long型, 最后一条消息的存储时间。broker异常关闭下判断是否删除索引文件。
  • beginPhyOffset:long型,第一条消息的offset。
  • endPhyOffset:long型, 最后一条消息的offset。
  • hashSlotCount:int型,slot数量。从零递增,有消息即+1。
  • indexCount:int型,index数量。从1递增。有消息即+1。
  •     所以,头信息占用40字节。

    内容包括四个字段,分别为:
  • keyHash:int型,key的hash值,key为topic和uniqueKey或topic和keys组合。
  • phyOffset: long型,offset值。
  • timeDiff: int型,消息的存储时间与beginTimestamp差值。
  • nextIndexOffset:int型,即key的hashcode或取模冲突后,指向的下一个index offset。
  •     所以每条索引消息占用20字节。槽位(slotNum)默认有500w,int型。

    说明:如果uniqueKey不为空,以topic和uniqueKey创建索引。然后判断keys是否为空,不为空,以”空格”分隔keys得到key组数,以topic和key值进行索引:

if (req.getUniqKey() != null) {
                indexFile = putKey(indexFile, msg, buildKey(topic, req.getUniqKey()));
                if (indexFile == null) {
                    log.error("putKey error commitlog {} uniqkey {}", req.getCommitLogOffset(), req.getUniqKey());
                    return;
                }
            }

            if ((keys != null && keys.length() > 0)) {
                String[] keyset = keys.split(MessageConst.KEY_SEPARATOR);
                for (int i = 0; i <  keyset.length; i++) {
                    String key = keyset[i];
                    if (key.length() > 0) {
                            indexFile = putKey(indexFile, msg, buildKey(topic, key));
                            if (indexFile == null) {
                                log.error("putKey error commitlog {} uniqkey {}", req.getCommitLogOffset(), req.getUniqKey());
                                return;
                            }
                        }
                 }
            }


3. 创建
  当producer发送消息到broker后,MQ通过ReputMessageService线程异步构建consumequeue和index。

4. 插入
  当有索引消息时,先计算key的hashcode值,然后hashcode%slotNum得到槽位,由于key的hashcode和取模都会导致冲突,所以槽值总是指向最新的一个索引项。为了节省空间,存储的时间是存储时间-开始时间。
public boolean putKey(final String key, final long phyOffset, final long storeTimestamp) {
        if (this.indexHeader.getIndexCount() < this.indexNum) {
            int keyHash = indexKeyHashMethod(key);
            int slotPos = keyHash % this.hashSlotNum;
            int absSlotPos = IndexHeader.INDEX_HEADER_SIZE + slotPos * HASH_SLOT_SIZE;

            FileLock fileLock = null;

            try {

                // fileLock = this.fileChannel.lock(absSlotPos, HASH_SLOT_SIZE,
                // false);
                int slotValue = this.mappedByteBuffer.getInt(absSlotPos);
                if (slotValue <= INVALID_INDEX || slotValue > this.indexHeader.getIndexCount()) {
                    slotValue = INVALID_INDEX;
                }

                long timeDiff = storeTimestamp - this.indexHeader.getBeginTimestamp();


                timeDiff = timeDiff / 1000;


                if (this.indexHeader.getBeginTimestamp() <= 0) {
                    timeDiff = 0;
                } else if (timeDiff > Integer.MAX_VALUE) {
                    timeDiff = Integer.MAX_VALUE;
                } else if (timeDiff < 0) {
                    timeDiff = 0;
                }

                int absIndexPos =
                        IndexHeader.INDEX_HEADER_SIZE + this.hashSlotNum * HASH_SLOT_SIZE
                                + this.indexHeader.getIndexCount() * INDEX_SIZE;


                this.mappedByteBuffer.putInt(absIndexPos, keyHash);
                this.mappedByteBuffer.putLong(absIndexPos + 4, phyOffset);
                this.mappedByteBuffer.putInt(absIndexPos + 4 + 8, (int) timeDiff);
                this.mappedByteBuffer.putInt(absIndexPos + 4 + 8 + 4, slotValue);


                this.mappedByteBuffer.putInt(absSlotPos, this.indexHeader.getIndexCount());


                if (this.indexHeader.getIndexCount() <= 1) {
                    this.indexHeader.setBeginPhyOffset(phyOffset);
                    this.indexHeader.setBeginTimestamp(storeTimestamp);
                }

                this.indexHeader.incHashSlotCount();
                this.indexHeader.incIndexCount();
                this.indexHeader.setEndPhyOffset(phyOffset);
                this.indexHeader.setEndTimestamp(storeTimestamp);

                return true;
            } catch (Exception e) {
                log.error("putKey exception, Key: " + key + " KeyHashCode: " + key.hashCode(), e);
            } finally {
                if (fileLock != null) {
                    try {
                        fileLock.release();
                    } catch (IOException e) {
                        e.printStackTrace();
                    }
                }
            }
        } else {
            log.warn("putKey index count " + this.indexHeader.getIndexCount() + " index max num "
                    + this.indexNum);
        }

        return false;
    }


5. 查询
  先计算出对应的slot,由于key的hash值不同但模数相同,所以在查询时会比较一次key的hash值,然后加入返回列表,每次最多返回32条索引信息。这里需要注意,由于hash值相同但key不等下产生的相同slot,也会被返回给客户端,所以在客户端又进行了一次处理。
 
public void selectPhyOffset(final List<Long> phyOffsets, final String key, final int maxNum,
                                final long begin, final long end, boolean lock) {
        if (this.mapedFile.hold()) {
            int keyHash = indexKeyHashMethod(key);
            int slotPos = keyHash % this.hashSlotNum;
            int absSlotPos = IndexHeader.INDEX_HEADER_SIZE + slotPos * HASH_SLOT_SIZE;

            FileLock fileLock = null;
            try {
                if (lock) {
                    // fileLock = this.fileChannel.lock(absSlotPos,
                    // HASH_SLOT_SIZE, true);
                }

                int slotValue = this.mappedByteBuffer.getInt(absSlotPos);
                // if (fileLock != null) {
                // fileLock.release();
                // fileLock = null;
                // }

                if (slotValue <= INVALID_INDEX || slotValue > this.indexHeader.getIndexCount()
                        || this.indexHeader.getIndexCount() <= 1) {
                    // TODO NOTFOUND
                } else {
                    for (int nextIndexToRead = slotValue; ; ) {
                        if (phyOffsets.size() >= maxNum) {
                            break;
                        }

                        int absIndexPos =
                                IndexHeader.INDEX_HEADER_SIZE + this.hashSlotNum * HASH_SLOT_SIZE
                                        + nextIndexToRead * INDEX_SIZE;

                        int keyHashRead = this.mappedByteBuffer.getInt(absIndexPos);
                        long phyOffsetRead = this.mappedByteBuffer.getLong(absIndexPos + 4);

                        long timeDiff = (long) this.mappedByteBuffer.getInt(absIndexPos + 4 + 8);
                        int prevIndexRead = this.mappedByteBuffer.getInt(absIndexPos + 4 + 8 + 4);


                        if (timeDiff < 0) {
                            break;
                        }

                        timeDiff *= 1000L;

                        long timeRead = this.indexHeader.getBeginTimestamp() + timeDiff;
                        boolean timeMatched = (timeRead >= begin) && (timeRead <= end);

                        if (keyHash == keyHashRead && timeMatched) {
                            phyOffsets.add(phyOffsetRead);
                        }

                        if (prevIndexRead <= INVALID_INDEX
                                || prevIndexRead > this.indexHeader.getIndexCount()
                                || prevIndexRead == nextIndexToRead || timeRead < begin) {
                            break;
                        }

                        nextIndexToRead = prevIndexRead;
                    }
                }
            } catch (Exception e) {
                log.error("selectPhyOffset exception ", e);
            } finally {
                if (fileLock != null) {
                    try {
                        fileLock.release();
                    } catch (IOException e) {
                        e.printStackTrace();
                    }
                }

                this.mapedFile.release();
            }
        }
    }


6. 注意
    由于每条索引消息的构建并未同步更新checkpoint文件的indexMsgTimestamp信息,所以在broker异常关闭情形下,broker重启后会删除最后一个索引文件,这将导致消息通过索引查询失败。确实不太清楚,作者的本意。也许是遗漏同步更新checkpoint的bug吧。

猜你喜欢

转载自technoboy.iteye.com/blog/2367786