Split after flush and split after compact

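When does a split happen?
A region is split when the total size of all files in one of its stores exceeds a threshold. Note that the check is on the store's total size, not on the size of any single storefile.
When does a compaction happen?
All stores in the region are checked; if any one of them holds too many storefiles (more than hbase.hstore.blockingStoreFiles, 7 here), a compaction is run first.

A flush iterates over all stores of the region and flushes them one by one.
A compaction iterates over the region and compacts only the stores that meet the condition.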
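
To make the two rules above concrete, here is a minimal sketch on a toy model. It is not HBase code: the class StoreCheckSketch, its method names, and the sizes in MB are made up for illustration, and the thresholds are simply passed in by the caller.

```java
import java.util.Arrays;
import java.util.List;

// Toy model only (hypothetical names, not HBase classes). Sizes are in MB.
final class StoreCheckSketch {

    // Split rule: compare the SUM of a store's file sizes with the threshold,
    // not the size of any single file.
    static boolean storeTooBig(List<Long> storeFileSizesMb, long thresholdMb) {
        long total = 0;
        for (long size : storeFileSizesMb) {
            total += size;
        }
        return total > thresholdMb;
    }

    // Compaction rule: true if ANY store of the region holds more files than the limit.
    static boolean tooManyStoreFiles(List<Integer> fileCountPerStore, int blockingStoreFiles) {
        for (int count : fileCountPerStore) {
            if (count > blockingStoreFiles) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Files of 40, 50 and 60 MB: no single file exceeds 128 MB, but the total (150 MB) does.
        System.out.println(storeTooBig(Arrays.asList(40L, 50L, 60L), 128L));
        // Stores with 3 and 8 files against a limit of 7: the second store trips the check.
        System.out.println(tooManyStoreFiles(Arrays.asList(3, 8), 7));
    }
}
```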


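1. After a flush, check whether a split or a compaction is needed
The split check first computes tableRegionsCount (the number of online regions of this table on this regionserver),
then loops over all stores of the region to see whether any of them is too large: getSizeToCheck computes a size, and if the total size of a store is greater than that value, the region needs to be split.

getSizeToCheck first checks whether tableRegionsCount equals 0. If so, it returns hbase.hregion.max.filesize; otherwise it returns Math.min(getDesiredMaxFileSize(),
        this.flushSize * (tableRegionsCount * tableRegionsCount)).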
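
To see how this threshold grows with the number of regions, here is a small standalone sketch of the same formula. The class SplitSizeSketch is hypothetical, and the 10 GB max file size and 128 MB flush size are assumed values chosen for illustration; the real values come from the table descriptor and the configuration.

```java
// Standalone sketch of the getSizeToCheck formula (hypothetical class, not HBase code).
final class SplitSizeSketch {
    static final long MB = 1024L * 1024L;

    // Same formula as getSizeToCheck, with both sizes passed in explicitly.
    // The cast to long is an addition here to avoid int overflow for large counts.
    static long sizeToCheck(int tableRegionsCount, long maxFileSize, long flushSize) {
        if (tableRegionsCount == 0) {
            return maxFileSize;
        }
        return Math.min(maxFileSize,
            flushSize * ((long) tableRegionsCount * tableRegionsCount));
    }

    public static void main(String[] args) {
        long maxFileSize = 10L * 1024L * MB; // assumed: 10 GB
        long flushSize = 128L * MB;          // assumed: 128 MB
        for (int n = 0; n <= 10; n++) {
            System.out.println(n + " region(s): "
                + sizeToCheck(n, maxFileSize, flushSize) / MB + " MB");
        }
    }
}
```

With these assumed values the threshold is 128 MB for one region, 512 MB for two, 1152 MB for three, and it is capped at the 10 GB maximum from nine regions on, so a table with few regions on a server splits early and later splits become rarer.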

      boolean shouldCompact = region.flushcache();
      // We just want to check the size
      boolean shouldSplit = region.checkSplit() != null;
      if (shouldSplit) {
        this.server.compactSplitThread.requestSplit(region);
      } else if (shouldCompact) {
        server.compactSplitThread.requestCompaction(region, getName());
      }



  private long flushSize;

  @Override
  protected void configureForRegion(HRegion region) {
    super.configureForRegion(region);
    this.flushSize = region.getTableDesc() != null?
      region.getTableDesc().getMemStoreFlushSize():
      getConf().getLong(HConstants.HREGION_MEMSTORE_FLUSH_SIZE,
        HTableDescriptor.DEFAULT_MEMSTORE_FLUSH_SIZE);
  }

  @Override
  protected boolean shouldSplit() {
    if (region.shouldForceSplit()) return true;
    boolean foundABigStore = false;
    // Get count of regions that have the same common table as this.region
    int tableRegionsCount = getCountOfCommonTableRegions();
    // Get size to check
    long sizeToCheck = getSizeToCheck(tableRegionsCount);

    for (Store store : region.getStores().values()) {
      // If any of the stores is unable to split (eg they contain reference files)
      // then don't split
      if ((!store.canSplit())) {
        return false;
      }

      // Mark if any store is big enough
      long size = store.getSize();
      if (size > sizeToCheck) {
        LOG.debug("ShouldSplit because " + store.getColumnFamilyName() +
          " size=" + size + ", sizeToCheck=" + sizeToCheck +
          ", regionsWithCommonTable=" + tableRegionsCount);
        foundABigStore = true;
        break;
      }
    }

    return foundABigStore;
  }

  /**
   * @return Region max size or <code>count of regions squared * flushsize</code>, which ever is
   * smaller; guard against there being zero regions on this server.
   */
  long getSizeToCheck(final int tableRegionsCount) {
    return tableRegionsCount == 0? getDesiredMaxFileSize():
      Math.min(getDesiredMaxFileSize(),
        this.flushSize * (tableRegionsCount * tableRegionsCount));
  }

  /**
   * @return Count of regions on this server that share the table this.region
   * belongs to
   */
  private int getCountOfCommonTableRegions() {
    RegionServerServices rss = this.region.getRegionServerServices();
    // Can be null in tests
    if (rss == null) return 0;
    byte [] tablename = this.region.getTableDesc().getName();
    int tableRegionsCount = 0;
    try {
      List<HRegion> hri = rss.getOnlineRegions(tablename);
      tableRegionsCount = hri == null || hri.isEmpty()? 0: hri.size();
    } catch (IOException e) {
      LOG.debug("Failed getOnlineRegions " + Bytes.toString(tablename), e);
    }
    return tableRegionsCount;
  }


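2. Split after compact
In CompactionRequest.run, once a compaction has completed, the code first checks whether another compaction is needed. The test is if (s.getCompactPriority() <= 0), that is, whether 7 minus the store's current number of storefiles is <= 0, which means the store still has too many files and must be compacted again.
Otherwise a split is requested. CompactSplitThread.requestSplit tests if (shouldSplitRegion() && r.getCompactPriority() >= PRIORITY_USER). First, hbase.regionserver.regionSplitLimit (a setting that caps the number of regions on the regionserver) must be greater than the number of regions currently online; if it is not, no split happens. Second, because the region's priority is the minimum over its stores, every store in the region must satisfy 7 - file count >= 1. Only when both conditions hold does the code go on to r.checkSplit().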

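One open question: shouldn't the file size be checked before splitting? Judging from the code below, it is: requestSplit calls r.checkSplit(), the same check used after a flush, and only splits when it returns a non-null midKey.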

        boolean completed = r.compact(this);
        long now = EnvironmentEdgeManager.currentTimeMillis();
        LOG.info(((completed) ? "completed" : "aborted") + " compaction: " +
              this + "; duration=" + StringUtils.formatTimeDiff(now, start));
        if (completed) {
          server.getMetrics().addCompaction(now - start, this.totalSize);
          // degenerate case: blocked regions require recursive enqueues
          if (s.getCompactPriority() <= 0) {
            server.compactSplitThread
              .requestCompaction(r, s, "Recursive enqueue");
          } else {
            // see if the compaction has caused us to exceed max region size
            server.compactSplitThread.requestSplit(r);
          }
        }


  public synchronized boolean requestSplit(final HRegion r) {
    // don't split regions that are blocking
    if (shouldSplitRegion() && r.getCompactPriority() >= PRIORITY_USER) {
      byte[] midKey = r.checkSplit();
      if (midKey != null) {
        requestSplit(r, midKey);
        return true;
      }
    }
    return false;
  }


  private boolean shouldSplitRegion() {
    return (regionSplitLimit > server.getNumberOfOnlineRegions());
  }

    this.regionSplitLimit = conf.getInt("hbase.regionserver.regionSplitLimit",
        Integer.MAX_VALUE);


  public int getCompactPriority() {
    int count = Integer.MAX_VALUE;
    for(Store store : stores.values()) {
      count = Math.min(count, store.getCompactPriority());
    }
    return count;
  }
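
The decision made after a compaction in section 2 can be condensed into a toy model. This is a sketch, not HBase code: the class PostCompactionSketch, the Action enum, and the method names are hypothetical, PRIORITY_USER = 1 is assumed from the ">= 1" reading above, and TRY_SPLIT only means that checkSplit() would still be consulted.

```java
// Toy model of the post-compaction decision (hypothetical names, not HBase classes).
final class PostCompactionSketch {
    static final int PRIORITY_USER = 1; // assumed value

    enum Action { RECOMPACT, TRY_SPLIT, NONE }

    // Store priority: the blocking limit minus the number of storefiles.
    static int storePriority(int blockingStoreFiles, int storeFileCount) {
        return blockingStoreFiles - storeFileCount;
    }

    // Region priority: the minimum priority over all of its stores.
    static int regionPriority(int blockingStoreFiles, int[] storeFileCounts) {
        int min = Integer.MAX_VALUE;
        for (int count : storeFileCounts) {
            min = Math.min(min, storePriority(blockingStoreFiles, count));
        }
        return min;
    }

    static Action afterCompaction(int blockingStoreFiles, int compactedStoreFiles,
            int[] allStoreFileCounts, int regionSplitLimit, int onlineRegions) {
        // The compacted store still has too many files: enqueue another compaction.
        if (storePriority(blockingStoreFiles, compactedStoreFiles) <= 0) {
            return Action.RECOMPACT;
        }
        // A split is only attempted below the region limit and when no store is blocking.
        boolean underLimit = regionSplitLimit > onlineRegions;
        boolean notBlocking =
            regionPriority(blockingStoreFiles, allStoreFileCounts) >= PRIORITY_USER;
        return (underLimit && notBlocking) ? Action.TRY_SPLIT : Action.NONE;
    }

    public static void main(String[] args) {
        // Stores with 2 and 3 files, limit 7, no region cap: the split check would run.
        System.out.println(afterCompaction(7, 2, new int[] {2, 3}, Integer.MAX_VALUE, 10));
    }
}
```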




Reposted from brianf.iteye.com/blog/1696626