The difference between HBase and an RDBMS: HBase cells are versioned, rows are stored in sorted order, and columns (qualifiers) live inside the column families they belong to and can be added by the client at write time.
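To make these three properties concrete, here is a minimal in-memory sketch. It is not the HBase client API: `ToyTable` and its methods are hypothetical names used only for illustration, and timestamps are passed in by hand.

```python
from collections import defaultdict


class ToyTable:
    """Hypothetical in-memory stand-in for an HBase table (not the real API)."""

    def __init__(self, families):
        # Column families are declared when the table is created.
        self.families = set(families)
        # row key -> (family, qualifier) -> {timestamp: value}
        self.rows = defaultdict(lambda: defaultdict(dict))

    def put(self, row, family, qualifier, value, ts):
        if family not in self.families:
            raise KeyError(f"unknown column family: {family}")
        # Qualifiers are not declared anywhere; writing one creates it.
        self.rows[row][(family, qualifier)][ts] = value

    def get(self, row, family, qualifier, versions=1):
        cell = self.rows.get(row, {}).get((family, qualifier), {})
        # Return the newest versions first.
        newest_first = sorted(cell.items(), reverse=True)
        return [value for _, value in newest_first[:versions]]

    def scan(self):
        # Rows are returned in sorted row key order.
        return sorted(self.rows)


table = ToyTable(["info"])
table.put(b"row2", "info", "name", "Ann", ts=1)
table.put(b"row1", "info", "name", "Bob", ts=1)
table.put(b"row1", "info", "name", "Bobby", ts=2)  # new version of the same cell
table.put(b"row1", "info", "email", "bob@example.com", ts=2)  # new qualifier

latest = table.get(b"row1", "info", "name")
history = table.get(b"row1", "info", "name", versions=2)
row_order = table.scan()
```

Writing to the same cell twice keeps both values under different timestamps, and the scan order depends only on the row keys, not on insertion order.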
- HBase has no concept of joins. If you need join-like access, plan for it at design time by denormalizing the data into one wide table.
- Row key design: the row key acts as the primary key, and rows within a region are sorted lexicographically by row key (compared as byte arrays). Writes should be spread out so that data does not end up on only a few nodes. For example, in an order table, the reversed order id can be used as the row key.
Note: for queries on multiple conditions, you can build a composite row key from the fields being queried.
When reading data, you can only look rows up by row key (a get, or a scan over a row key range) or scan the whole table.
- Column family (CF) design: try to keep to 1-2 CFs. When designing an HBase schema, prefer a single column family where possible.
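The row key advice above (reversing the order id, combining fields into one key) can be sketched in a few lines. The helper names `reversed_key` and `composite_key`, the `|` separator, and the sample ids are assumptions made for this example; sorting a Python list stands in for how HBase orders keys by their bytes.

```python
def reversed_key(order_id: str) -> bytes:
    """Reverse the id so the fast-changing low digits come first."""
    return order_id[::-1].encode("utf-8")


def composite_key(*parts: str, sep: str = "|") -> bytes:
    """Join several query fields into one row key, leading field first."""
    return sep.join(parts).encode("utf-8")


# Sequential ids share a long prefix, so they sort next to each other.
order_ids = ["20240001", "20240002", "20240003"]
plain = sorted(oid.encode("utf-8") for oid in order_ids)
spread = sorted(reversed_key(oid) for oid in order_ids)

# A composite key lets a prefix match answer "all orders of user42".
keys = sorted([
    composite_key("user42", "20240115"),
    composite_key("user7", "20240101"),
    composite_key("user42", "20240102"),
])
prefix = b"user42|"
hits = [k for k in keys if k.startswith(prefix)]
```

With the plain ids every key starts with the same bytes, while the reversed keys start with different bytes and so land in different parts of the key space. The trade-off is that reversed keys no longer support range scans by order id. In the composite key, the field placed first decides which prefix lookups are cheap.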
Flush: when the data in a MemStore reaches a certain threshold, it is flushed to an HFile in HDFS.
Compaction: it merges multiple HFiles, whose keys interleave and are not ordered relative to one another, into a single sorted HFile, which reduces read latency.
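The flush and compaction steps can be illustrated with a toy model. This is only a sketch under simplifying assumptions: `ToyStore` is a hypothetical class, the threshold counts entries rather than bytes, and each "HFile" is just a sorted Python list held in memory instead of a file in HDFS.

```python
import heapq


class ToyStore:
    """Hypothetical model of one store: a MemStore plus a list of HFiles."""

    def __init__(self, flush_threshold=3):
        self.flush_threshold = flush_threshold  # assumption: number of entries
        self.memstore = {}   # key -> (sequence number, value)
        self.hfiles = []     # each HFile: sorted list of (key, seq, value)
        self.seq = 0

    def put(self, key, value):
        self.seq += 1
        self.memstore[key] = (self.seq, value)
        if len(self.memstore) >= self.flush_threshold:
            self.flush()

    def flush(self):
        # Write the MemStore out as one immutable, sorted file.
        if self.memstore:
            entries = sorted((k, s, v) for k, (s, v) in self.memstore.items())
            self.hfiles.append(entries)
            self.memstore = {}

    def get(self, key):
        if key in self.memstore:
            return self.memstore[key][1]
        # Without compaction, every file may have to be checked, newest first.
        for hfile in reversed(self.hfiles):
            for k, _, v in hfile:
                if k == key:
                    return v
        return None

    def compact(self):
        if len(self.hfiles) < 2:
            return
        merged = {}
        # heapq.merge yields entries by key, then by ascending sequence number,
        # so the last entry seen for a key is its newest version.
        for key, seq, value in heapq.merge(*self.hfiles):
            merged[key] = (seq, value)
        self.hfiles = [[(k, s, v) for k, (s, v) in merged.items()]]


store = ToyStore(flush_threshold=2)
for key, value in [("b", 1), ("a", 2), ("c", 3), ("a", 4)]:
    store.put(key, value)

files_before = len(store.hfiles)
store.compact()
files_after = len(store.hfiles)
compacted_keys = [k for k, _, _ in store.hfiles[0]]
```

Before compaction a read for `"a"` may need to look through both files; afterwards there is a single sorted file holding only the newest value for each key, which is where the lower read latency comes from.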
Reference: http://baijiahao.baidu.com/s?id=1596690073555129451&wfr=spider&for=pc
Case:
1. Student table and course table, many-to-many.
RDBMS:
HBase:
2. Person table and ID card table, 1:1.
RDBMS:
HBase:
3. Orders: order table and order detail table, 1:N.
RDBMS: