Common HBase filters

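1. SingleColumnValueFilter

    Single-column value filter: names a column family and a qualifier, then filters rows on the value of that column. It is comparable to where t.username = 'xxx' or where t.username like '%xxx%' in a SQL query.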

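    Note: a row that does not contain the column at all is still returned, unless setFilterIfMissing(true) is called.

    Method: public void setFilterIfMissing(boolean filterIfMissing)

        true: if the column is not found, the whole row is skipped

        false: if the column is not found, the whole row passes (default)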
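The rule above can be sketched in plain Java. This is a model of the behaviour only, not HBase client code: the class `FilterIfMissingSketch`, the method `rowPasses`, and the representation of a row as a map from column name to value are all assumptions made for illustration.

```java
import java.util.Collections;
import java.util.Map;

// Plain-Java model of the filterIfMissing rule; not the HBase API.
class FilterIfMissingSketch {
    // A row is modelled as column name -> value.
    // Returns true when the row would be emitted.
    static boolean rowPasses(Map<String, String> row, String column,
                             String expected, boolean filterIfMissing) {
        String actual = row.get(column);
        if (actual == null) {
            // Column absent: the row is skipped only when filterIfMissing is true.
            return !filterIfMissing;
        }
        // Column present: the row passes only if the value matches.
        return actual.equals(expected);
    }

    public static void main(String[] args) {
        Map<String, String> noUsername = Collections.singletonMap("age", "30");
        System.out.println(rowPasses(noUsername, "username", "alice", false));
        System.out.println(rowPasses(noUsername, "username", "alice", true));
    }
}
```

With the default (false), a table whose rows only sometimes carry the column will return the rows that lack it, which is often not what a SQL-style where clause would suggest.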

1.1. Subclass: SingleColumnValueExcludeFilter

    Applies the same test as the filter above, but the column used for the test is left out of the returned row.
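A small plain-Java model may make the behaviour of the subclass easier to see: the row is tested on one column, and that column is then removed from what is emitted. The names `ExcludeTestedColumnSketch` and `apply` are hypothetical, and the sketch assumes the default of letting rows without the column pass.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Plain-Java model of "test a column, then omit it"; not the HBase API.
class ExcludeTestedColumnSketch {
    // Returns the emitted columns, or null when the row is filtered out.
    static Map<String, String> apply(Map<String, String> row,
                                     String column, String expected) {
        String actual = row.get(column);
        if (actual != null && !actual.equals(expected)) {
            return null; // value present but not matching: row dropped
        }
        Map<String, String> emitted = new LinkedHashMap<>(row);
        emitted.remove(column); // the tested column is not returned
        return emitted;
    }

    public static void main(String[] args) {
        Map<String, String> row = new LinkedHashMap<>();
        row.put("username", "alice");
        row.put("age", "30");
        System.out.println(apply(row, "username", "alice"));
    }
}
```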

 

2. ValueFilter

    Value filter: works on individual cell values and drops every cell that does not meet the given condition.
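Unlike SingleColumnValueFilter, which decides on whole rows, this filter decides cell by cell. A plain-Java model of that difference, with the hypothetical names `ValueFilterSketch` and `keepMatchingCells` and an equality test standing in for the comparison, might look like this:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Plain-Java model of cell-level value filtering; not the HBase API.
class ValueFilterSketch {
    // Keeps only the cells whose value equals `wanted`;
    // the other cells of the same row are dropped.
    static Map<String, String> keepMatchingCells(Map<String, String> row,
                                                 String wanted) {
        Map<String, String> kept = new LinkedHashMap<>();
        for (Map.Entry<String, String> cell : row.entrySet()) {
            if (cell.getValue().equals(wanted)) {
                kept.put(cell.getKey(), cell.getValue());
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        Map<String, String> row = new LinkedHashMap<>();
        row.put("a", "x");
        row.put("b", "y");
        row.put("c", "x");
        System.out.println(keepMatchingCells(row, "x").keySet());
    }
}
```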

 

3. RowFilter

    Filters data by matching on the row key. If an already known row range needs to be scanned, use Scan start and stop rows directly rather than a filter.
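The advice about start and stop rows can be illustrated with a sorted map standing in for a table, since HBase also keeps rows sorted by key. This is only a model under that assumption; `RowRangeSketch`, `byFilter`, and `byRange` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Plain-Java model: a TreeMap stands in for a table sorted by row key.
class RowRangeSketch {
    // Filter style: every row is visited and its key is tested.
    static List<String> byFilter(TreeMap<String, String> table,
                                 String start, String stop) {
        List<String> hits = new ArrayList<>();
        for (String key : table.keySet()) {
            if (key.compareTo(start) >= 0 && key.compareTo(stop) < 0) {
                hits.add(key);
            }
        }
        return hits;
    }

    // Range style: jump to the start key and end at the stop key (exclusive).
    static List<String> byRange(TreeMap<String, String> table,
                                String start, String stop) {
        return new ArrayList<>(table.subMap(start, true, stop, false).keySet());
    }

    public static void main(String[] args) {
        TreeMap<String, String> table = new TreeMap<>();
        for (int i = 1; i <= 5; i++) {
            table.put("row-0" + i, "v" + i);
        }
        System.out.println(byFilter(table, "row-02", "row-04"));
        System.out.println(byRange(table, "row-02", "row-04"));
    }
}
```

Both methods return the same keys, but the range version never looks at rows outside the requested interval.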

 

4. PrefixFilter

    Another row-key filter: it passes only the rows whose key starts with the given prefix.
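Because row keys are sorted, all keys sharing a prefix sit next to each other. The following plain-Java sketch models a prefix scan on a sorted map; it is not HBase client code, and `RowPrefixSketch` and `scanPrefix` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Plain-Java model of a row-key prefix scan over sorted keys.
class RowPrefixSketch {
    static List<String> scanPrefix(TreeMap<String, String> table, String prefix) {
        List<String> hits = new ArrayList<>();
        // Start at the first key >= prefix. Keys are sorted, so the first
        // key that no longer starts with the prefix ends the scan.
        for (String key : table.tailMap(prefix, true).keySet()) {
            if (!key.startsWith(prefix)) {
                break;
            }
            hits.add(key);
        }
        return hits;
    }

    public static void main(String[] args) {
        TreeMap<String, String> table = new TreeMap<>();
        table.put("order-1", "a");
        table.put("user-1", "b");
        table.put("user-2", "c");
        table.put("video-1", "d");
        System.out.println(scanPrefix(table, "user-"));
    }
}
```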

=================================================================

Column filters:

5. FamilyFilter

    Filters on column family. Selecting column families on the Scan itself is usually preferable to doing it in a filter. If an already known column family is looked for, use Get.addFamily(byte[]) directly rather than a filter.

6. QualifierFilter

    Qualifier filter: a comparison filter that selects cells by column qualifier (the column name).

    If an already known column qualifier is looked for, use Get.addColumn(byte[], byte[]) directly rather than a filter.

7. ColumnPrefixFilter

    Filters on the prefix of the column name (that is, the qualifier).


8. MultipleColumnPrefixFilter

    Behaves much like ColumnPrefixFilter, but accepts several prefixes.
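As a rough model of the two column-prefix filters, the sketch below keeps the qualifiers of one row that start with any of the given prefixes; passing a single prefix corresponds to the ColumnPrefixFilter case. `ColumnPrefixSketch` and `keep` are hypothetical names, and this is plain Java, not the HBase API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Plain-Java model of qualifier prefix matching with one or more prefixes.
class ColumnPrefixSketch {
    static List<String> keep(List<String> qualifiers, String... prefixes) {
        List<String> kept = new ArrayList<>();
        for (String qualifier : qualifiers) {
            for (String prefix : prefixes) {
                if (qualifier.startsWith(prefix)) {
                    kept.add(qualifier);
                    break; // one matching prefix is enough
                }
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> row = Arrays.asList("addr_city", "addr_zip", "name", "tel_home");
        System.out.println(keep(row, "addr_"));
        System.out.println(keep(row, "addr_", "tel_"));
    }
}
```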


9. ColumnRangeFilter

    Filters on a range of column names and allows efficient scanning within a row. For example, you have a million columns in a row but you only want to look at columns bbbb-bbdd.
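The bbbb-bbdd example can be modelled with a sorted map of the columns in one row. This is a sketch under the assumption that columns are kept sorted by qualifier; `ColumnRangeSketch` and `range` are hypothetical names, and the two boolean flags stand for inclusive or exclusive bounds.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Plain-Java model of selecting a qualifier range inside one row.
class ColumnRangeSketch {
    static List<String> range(TreeMap<String, String> columns,
                              String min, boolean minInclusive,
                              String max, boolean maxInclusive) {
        // subMap seeks straight to `min` instead of walking every column.
        return new ArrayList<>(
                columns.subMap(min, minInclusive, max, maxInclusive).keySet());
    }

    public static void main(String[] args) {
        TreeMap<String, String> columns = new TreeMap<>();
        for (String q : new String[] {"aaaa", "bbbb", "bbcc", "bbdd", "cccc"}) {
            columns.put(q, "v");
        }
        System.out.println(range(columns, "bbbb", true, "bbdd", true));
    }
}
```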

==============================================================

10. FirstKeyOnlyFilter

    Returns only the first KeyValue from each row, which makes row count operations more efficient.
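Why this helps with counting can be shown with a small plain-Java model: a row count needs only one cell per row, so the remaining cells never have to be read. `FirstKeyOnlySketch` and `countRows` are hypothetical names, and the nested maps are just a stand-in for a table.

```java
import java.util.Map;
import java.util.TreeMap;

// Plain-Java model of counting rows by reading one cell per row.
class FirstKeyOnlySketch {
    // The table is modelled as row key -> (qualifier -> value).
    static long countRows(Map<String, TreeMap<String, String>> table) {
        long rows = 0;
        for (TreeMap<String, String> cells : table.values()) {
            // Only the first cell is looked at; the rest of the row is ignored.
            if (cells.firstEntry() != null) {
                rows++;
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        Map<String, TreeMap<String, String>> table = new TreeMap<>();
        TreeMap<String, String> wide = new TreeMap<>();
        wide.put("a", "1");
        wide.put("b", "2");
        wide.put("c", "3");
        TreeMap<String, String> narrow = new TreeMap<>();
        narrow.put("a", "1");
        table.put("row-1", wide);
        table.put("row-2", narrow);
        System.out.println(countRows(table));
    }
}
```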

 

Reposted from weisu.iteye.com/blog/1944928