HBase scannerCaching&caching

HBase Version: hbase-0.94.6-cdh4.3.0

HBase scans offer one optimization: scannerCaching and caching.

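These settings control how many rows the HBase client fetches from the HBase server in one request. Fetching more rows per request reduces the number of round trips to the server; you can have the client fetch scannerCaching (or caching) rows at a time.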
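To make the round-trip saving concrete, here is a minimal, self-contained sketch that does not use HBase at all. FakeServer and BatchingScanner are hypothetical names invented for this illustration. The model assumes every fetch returns up to `caching` rows and that the client only learns the scan is over when a fetch comes back short; it ignores region boundaries, result-size limits, and anything else a real client does.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

public class ScannerCachingDemo {

    /** Hypothetical stand-in for a server that holds totalRows rows. */
    static class FakeServer {
        private final int totalRows;
        private int nextRow = 0;
        int roundTrips = 0; // number of simulated requests

        FakeServer(int totalRows) {
            this.totalRows = totalRows;
        }

        /** One simulated request: returns up to 'limit' rows. */
        List<Integer> fetch(int limit) {
            roundTrips++;
            List<Integer> batch = new ArrayList<>();
            while (batch.size() < limit && nextRow < totalRows) {
                batch.add(nextRow++);
            }
            return batch;
        }
    }

    /** Hypothetical client-side scanner that buffers 'caching' rows per fetch. */
    static class BatchingScanner {
        private final FakeServer server;
        private final int caching;
        private final Deque<Integer> cache = new ArrayDeque<>();
        private boolean exhausted = false;

        BatchingScanner(FakeServer server, int caching) {
            if (caching < 1) {
                throw new IllegalArgumentException("caching must be >= 1");
            }
            this.server = server;
            this.caching = caching;
        }

        /** Returns the next row, or null when the scan is finished. */
        Integer next() {
            if (cache.isEmpty() && !exhausted) {
                List<Integer> batch = server.fetch(caching);
                if (batch.size() < caching) {
                    exhausted = true; // a short batch means no more rows
                }
                cache.addAll(batch);
            }
            return cache.poll();
        }
    }

    /** Scans every row and returns how many simulated requests were needed. */
    static int countRoundTrips(int totalRows, int caching) {
        FakeServer server = new FakeServer(totalRows);
        BatchingScanner scanner = new BatchingScanner(server, caching);
        int seen = 0;
        while (scanner.next() != null) {
            seen++;
        }
        if (seen != totalRows) {
            throw new IllegalStateException("expected " + totalRows + " rows, saw " + seen);
        }
        return server.roundTrips;
    }

    public static void main(String[] args) {
        System.out.println("caching=1   -> " + countRoundTrips(1000, 1) + " round trips");
        System.out.println("caching=100 -> " + countRoundTrips(1000, 100) + " round trips");
    }
}
```

In this simplified model, scanning 1000 rows takes 1001 requests with caching=1 and 11 requests with caching=100; the last request in each case is the empty one that signals the end of the scan. The price is that up to `caching` rows sit in client memory at once.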

scannerCaching is a property of HTable; caching is a property of Scan.

The HTable source shows that both scannerCaching methods (the getter and the setter) are already deprecated.

public class HTable implements HTableInterface {
  protected int scannerCaching;
  /**
   * Gets the number of rows that a scanner will fetch at once.
   * <p>
   * The default value comes from {@code hbase.client.scanner.caching}.
   * @deprecated Use {@link Scan#setCaching(int)} and {@link Scan#getCaching()}
   */
  public int getScannerCaching() {
    return scannerCaching;
  }

  /**
   * Sets the number of rows that a scanner will fetch at once.
   * <p>
   * This will override the value specified by
   * {@code hbase.client.scanner.caching}.
   * Increasing this value will reduce the amount of work needed each time
   * {@code next()} is called on a scanner, at the expense of memory use
   * (since more rows will need to be maintained in memory by the scanners).
   * @param scannerCaching the number of rows a scanner will fetch at once.
   * @deprecated Use {@link Scan#setCaching(int)}
   */
  public void setScannerCaching(int scannerCaching) {
    this.scannerCaching = scannerCaching;
  }
}

In other words, we are expected to use Scan#setCaching(int) and Scan#getCaching() instead.

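The HTable source also shows what happens on a scan: HTable#getScanner is called and returns a ResultScanner, and the caller then processes the results from that ResultScanner.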

   /**
    * {@inheritDoc}
    */
  @Override
  public ResultScanner getScanner(final Scan scan) throws IOException {
    if (scan.getCaching() <= 0) {
      scan.setCaching(getScannerCaching());
    }
    return new ClientScanner(getConfiguration(), scan, getTableName(),
        this.connection);
  }

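The code above shows that the scannerCaching set on the HTable is assigned to the Scan whenever the Scan has no caching value of its own (scan.getCaching() <= 0).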

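1. The ClientScanner code reads the caching value from the Scan, which may be the scannerCaching handed down from HTable.

2. If the scannerCaching from HTable was not set either (scannerCaching=0), scan.getCaching() is still not positive, so ClientScanner does not take it and its caching comes from the configuration instead.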

    /**
     * Create a new ClientScanner for the specified table
     * Note that the passed {@link Scan}'s start row may be changed.
     * 
     * @param conf The {@link Configuration} to use.
     * @param scan {@link Scan} to use in this scanner
     * @param tableName The table that we wish to scan
     * @param connection Connection identifying the cluster
     * @throws IOException
     */
    public ClientScanner(final Configuration conf, final Scan scan,
      final byte[] tableName, HConnection connection) throws IOException {
      if (LOG.isDebugEnabled()) {
        LOG.debug("Creating scanner over "
            + Bytes.toString(tableName)
            + " starting at key '" + Bytes.toStringBinary(scan.getStartRow()) + "'");
      }
      this.scan = scan;
      this.tableName = tableName;
      this.lastNext = System.currentTimeMillis();
      this.connection = connection;
      this.maxScannerResultSize = conf.getLong(
          HConstants.HBASE_CLIENT_SCANNER_MAX_RESULT_SIZE_KEY,
          HConstants.DEFAULT_HBASE_CLIENT_SCANNER_MAX_RESULT_SIZE);
      this.scannerTimeout = (int) conf.getLong(
          HConstants.HBASE_REGIONSERVER_LEASE_PERIOD_KEY,
          HConstants.DEFAULT_HBASE_REGIONSERVER_LEASE_PERIOD);

      // check if application wants to collect scan metrics
      byte[] enableMetrics = scan.getAttribute(
        Scan.SCAN_ATTRIBUTES_METRICS_ENABLE);
      if (enableMetrics != null && Bytes.toBoolean(enableMetrics)) {
        scanMetrics = new ScanMetrics();
      }

      // Use the caching from the Scan.  If not set, use the default cache setting for this table.
      if (this.scan.getCaching() > 0) {
        this.caching = this.scan.getCaching();
      } else {
        this.caching = conf.getInt("hbase.client.scanner.caching", 1);
      }

      // initialize the scanner
      nextScanner(this.caching, false);
    }

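 3. The remaining case is when neither 1 nor 2 applies (neither HTable nor Scan has scannerCaching or caching set).

     That is where the trouble starts: the default hbase.client.scanner.caching=1 is used, so the scanner fetches only one row per request.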
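The three cases above can be summarized in a small stand-alone sketch. It is not HBase code: effectiveCaching is a hypothetical helper that only mirrors the two checks quoted from HTable#getScanner and the ClientScanner constructor, with plain ints standing in for the Scan value, the HTable value, and the configured hbase.client.scanner.caching.

```java
public class CachingPrecedenceDemo {

    /**
     * Hypothetical helper that mirrors the quoted checks.
     *
     * @param scanCaching         value of Scan#getCaching() (<= 0 means "not set")
     * @param tableScannerCaching value of HTable#getScannerCaching() (<= 0 means "not set")
     * @param configuredCaching   value of hbase.client.scanner.caching (1 when unset)
     */
    static int effectiveCaching(int scanCaching, int tableScannerCaching, int configuredCaching) {
        // HTable#getScanner: a Scan without its own caching inherits the table's value.
        if (scanCaching <= 0) {
            scanCaching = tableScannerCaching;
        }
        // ClientScanner: a positive Scan value wins, otherwise use the configuration.
        if (scanCaching > 0) {
            return scanCaching;
        }
        return configuredCaching;
    }

    public static void main(String[] args) {
        int configured = 1; // assumption: hbase.client.scanner.caching left at its default

        System.out.println(effectiveCaching(500, 100, configured)); // Scan value is used
        System.out.println(effectiveCaching(0, 100, configured));   // HTable value is used
        System.out.println(effectiveCaching(0, 0, configured));     // falls back to the default
    }
}
```

The three calls print 500, 100, and 1. So unless the configuration is changed, calling Scan#setCaching(int) on each Scan is the non-deprecated way to avoid the one-row-per-request default.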


Reposted from greatwqs.iteye.com/blog/1894428