HBase interview questions

1. HBase region splitting

  1. Automatic splitting (the default)
    1. In version 2.0, a region is split the first time its data reaches 256 MB; after that, a region is split each time it reaches 10 GB. Once a split completes, the new regions are load-balanced to other RegionServers.
  2. Pre-partitioning + custom rowkey
    1. Can be understood as splitting the table in advance
    2. For example, with pre-partitioning each RegionServer holds 10 regions, and each region has a startrow and an endrow
    3. Production should always use pre-partitioning + a custom rowkey
    4. Once pre-partitioning is complete, the 10 regions are created as empty region files even if there is no data yet
    5. Data written later is then spread evenly across the regions, as long as the rowkeys are designed to fall across the region boundaries
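The two splitting modes above can be sketched in plain Python, without an HBase cluster. This is only a simplified model: split_threshold mirrors the 256 MB / 10 GB rule described above, while the two-digit split keys, the crc32-based salt, and the names salted_rowkey and region_index are assumptions made for illustration, not HBase APIs.

```python
import bisect
import zlib

MB = 1024 ** 2
GB = 1024 ** 3
FLUSH_SIZE = 128 * MB     # default MemStore flush size
MAX_FILE_SIZE = 10 * GB   # default maximum region size


def split_threshold(regions_on_server):
    """Simplified automatic-split rule: a table's only region on a server
    splits at 2 x flush size (256 MB), any later region at 10 GB."""
    return 2 * FLUSH_SIZE if regions_on_server == 1 else MAX_FILE_SIZE


def split_keys(num_regions):
    """Region boundaries "01".."09" for 10 regions keyed by a 2-digit prefix."""
    return [f"{i:02d}" for i in range(1, num_regions)]


def salted_rowkey(business_id, num_regions):
    """Hypothetical rowkey design: stable hash bucket + "_" + original id."""
    bucket = zlib.crc32(business_id.encode("utf-8")) % num_regions
    return f"{bucket:02d}_{business_id}"


def region_index(rowkey, keys):
    """Index of the region whose [startrow, endrow) range holds the rowkey."""
    return bisect.bisect_right(keys, rowkey)


keys = split_keys(10)
counts = [0] * 10
for i in range(10000):
    counts[region_index(salted_rowkey(f"user{i}", 10), keys)] += 1
print(counts)  # rows per region; the salt keeps any one region from taking all writes
```

The salt prefix trades scan locality for write balance: rows of one logical key range end up in different regions, so a range scan has to query every bucket.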

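2. HBase major and minor compaction

Major compaction: merges all the files of a store into one and physically removes expired and deleted data. It runs about once every 7 days (the default period, hbase.hregion.majorcompaction).
Minor compaction: merges only a few adjacent small files into a larger one. Data that has been marked for deletion is kept, not physically removed.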
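The difference between the two can be illustrated with a toy model in Python. It is a sketch under strong assumptions: each store file is a list of cells, a delete marker hides every older version of its row (real HBase also distinguishes columns and versions), and the names Cell, minor_compact, and major_compact are invented for this example.

```python
from collections import namedtuple

# One cell version; kind is "put" or "delete" (a delete marker).
Cell = namedtuple("Cell", "row ts kind value")


def merge(files):
    """Merge files into one, sorted by row, newest first, markers first on ties."""
    cells = [c for f in files for c in f]
    return sorted(cells, key=lambda c: (c.row, -c.ts, c.kind != "delete"))


def minor_compact(files, start, count):
    """Merge `count` adjacent files; delete markers and hidden cells are kept."""
    merged = merge(files[start:start + count])
    return files[:start] + [merged] + files[start + count:]


def major_compact(files, now, ttl):
    """Merge all files into one and drop markers, hidden cells, expired cells."""
    result = []
    deleted_up_to = {}  # row -> newest delete-marker timestamp
    for c in merge(files):
        if c.kind == "delete":
            deleted_up_to[c.row] = max(c.ts, deleted_up_to.get(c.row, c.ts))
            continue
        if c.ts <= deleted_up_to.get(c.row, -1):
            continue  # hidden by a delete marker
        if now - c.ts > ttl:
            continue  # older than the time-to-live
        result.append(c)
    return [result]


files = [
    [Cell("r1", 1, "put", "a")],
    [Cell("r1", 2, "delete", None), Cell("r2", 2, "put", "b")],
    [Cell("r3", 3, "put", "c")],
]
print(len(minor_compact(files, 0, 2)))        # 2 files remain, marker still present
print(major_compact(files, now=10, ttl=100))  # one file holding only r2 and r3
```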

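3. HBase MemStore flush

  1. Manual flush
    1. Flush with the flush command in the HBase shell
  2. Scheduled (periodic) flush
  3. Flush triggered by size parameters
    1. A MemStore reaches 128 MB (hbase.hregion.memstore.flush.size)
    2. If a region has many MemStores and none of them reaches 128 MB on its own, the region-level limit applies: once the region's MemStores total 512 MB (128 MB × hbase.hregion.memstore.block.multiplier, default 4), writes are blocked until the flush completes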
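The size-based triggers can be written down as a small decision function. This Python sketch models only the two thresholds as the list above describes them, with the default values hard-coded; flush_reason is a made-up name, and a real RegionServer has further triggers (for example global MemStore pressure) that are left out here.

```python
MB = 1024 ** 2
FLUSH_SIZE = 128 * MB   # per-MemStore threshold from the list above
BLOCK_MULTIPLIER = 4    # region-level limit = 4 x 128 MB = 512 MB


def flush_reason(memstore_sizes):
    """Return why a region should flush, or None if no threshold is hit.

    memstore_sizes: sizes in bytes of all MemStores of one region.
    """
    if any(size >= FLUSH_SIZE for size in memstore_sizes):
        return "a MemStore reached 128 MB"
    if sum(memstore_sizes) >= FLUSH_SIZE * BLOCK_MULTIPLIER:
        return "region total reached 512 MB"
    return None


print(flush_reason([130 * MB]))           # a MemStore reached 128 MB
print(flush_reason([100 * MB] * 6))       # region total reached 512 MB
print(flush_reason([10 * MB, 20 * MB]))   # None
```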

4. HBase secondary index

4.1. Problem

If the filter condition of an HBase query is not the rowkey, the query has to scan the whole table.

Example:

Filtering by name scans the whole table:

id    name    age  
1     ikun    19   

4.2. Solution

Adding a secondary index really means creating a new table that uses name as the rowkey and stores the id of the matching row:

name  id  
ikun  1 
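The lookup path through the index table can be sketched with two Python dictionaries standing in for the two HBase tables. Everything beyond the ikun row is hypothetical sample data, and storing a list of ids per name is an assumption made here so that duplicate names do not overwrite each other.

```python
# Main table: rowkey (id) -> columns.
users = {
    "1": {"name": "ikun", "age": 19},
    "2": {"name": "alice", "age": 30},  # hypothetical extra row
}


def build_index(table, column):
    """Build the index table: column value -> list of main-table rowkeys."""
    index = {}
    for rowkey, cols in table.items():
        index.setdefault(cols[column], []).append(rowkey)
    return index


def find_by_name_scan(table, name):
    """Without an index: every row of the main table is visited."""
    return [rk for rk, cols in table.items() if cols["name"] == name]


def find_by_name_index(table, index, name):
    """With the index: one lookup by name, then one get per matching id."""
    return [(rk, table[rk]) for rk in index.get(name, [])]


name_index = build_index(users, "name")
print(find_by_name_scan(users, "ikun"))               # ['1']
print(find_by_name_index(users, name_index, "ikun"))  # [('1', {'name': 'ikun', 'age': 19})]
```

The cost of this design is that every write to the main table must also update the index table, and the two can drift apart if one of the writes fails.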

Origin blog.csdn.net/qq_40382400/article/details/132150247