6. Frequently asked questions about Phoenix

1. How to pre-split a table

Pre-splitting a table means specifying split points (row key boundary values) at creation time, so that the table starts out divided into several regions. This effectively avoids the hotspot problem where all reads and writes land on a single region.


create table test
(id integer primary key, 
name varchar,
age integer,
address varchar) split on (10, 20, 30, 40)
-- the values after SPLIT ON are the split points

The statement above creates 5 regions. Because id is the only primary key column, the row key is the id value, and the regions cover the following ranges:
Region 1: id < 10
Region 2: 10 <= id < 20
Region 3: 20 <= id < 30
Region 4: 30 <= id < 40
Region 5: id >= 40
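
To make the region boundaries concrete, here is a minimal Java sketch (plain Java, not Phoenix code) that maps an id to the region it would fall into for the split points 10, 20, 30, 40. It assumes each region covers its start key inclusively and its end key exclusively, as HBase regions do. The class and method names are hypothetical and exist only for illustration.

```java
import java.util.Arrays;

class SplitPointDemo {
    // Returns the 1-based region number for a row key, assuming each region
    // covers [startKey, endKey): start key inclusive, end key exclusive.
    static int regionFor(int id, int[] sortedSplitPoints) {
        int pos = Arrays.binarySearch(sortedSplitPoints, id);
        // Found at index i: the key equals a split point, which starts region i + 2.
        // Not found: binarySearch returns -(insertionPoint) - 1, and the key
        // belongs to region insertionPoint + 1, which equals -pos.
        return pos >= 0 ? pos + 2 : -pos;
    }

    public static void main(String[] args) {
        int[] splitPoints = {10, 20, 30, 40};
        for (int id : new int[] {5, 10, 25, 40, 99}) {
            System.out.println("id " + id + " -> region " + regionFor(id, splitPoints));
        }
    }
}
```

Under these assumptions, id 5 lands in region 1, id 10 in region 2, id 25 in region 3, and ids 40 and 99 both land in region 5.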

2. Specify compression to improve the performance of large tables


create table test
(id integer primary key, 
name varchar,
age integer,
address varchar) COMPRESSION='GZ' split on (10, 20, 30, 40)
-- table options such as COMPRESSION go before SPLIT ON; the values after SPLIT ON are the split points

3. Should the application use a JDBC connection pool with Phoenix?

Using a connection pool is not recommended. Phoenix connection objects differ from most JDBC connections because they sit on top of an underlying HBase connection. A Phoenix connection is designed to be a thin object that is cheap to create. If Phoenix connections are reused, the underlying HBase connection may not always be left in a healthy state by the previous user. It is better to create a new Phoenix connection each time, which avoids these potential problems.
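
As a rough illustration of this advice, the following sketch opens a fresh connection for each unit of work and closes it with try-with-resources instead of borrowing one from a pool. The JDBC URL (including the host name zk-host), the class name, and the method name are assumptions made for the example. The query runs against the test table created earlier.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

class PhoenixPerRequestConnection {
    // Hypothetical ZooKeeper quorum; replace with the address of your own cluster.
    static final String URL = "jdbc:phoenix:zk-host:2181";

    // Opens a new connection, runs one query, and closes everything on exit.
    static long countRows(String url) throws SQLException {
        try (Connection conn = DriverManager.getConnection(url);
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery("SELECT COUNT(*) FROM test")) {
            rs.next(); // COUNT(*) always returns exactly one row
            return rs.getLong(1);
        }
    }

    public static void main(String[] args) throws SQLException {
        System.out.println("rows: " + countRows(URL));
    }
}
```

Each call pays the small cost of creating a connection, and no connection state is carried over from one caller to the next.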

4. Write optimization

When using upsert to write large amounts of data, turn off autocommit and commit in smaller batches.

try (Connection conn = DriverManager.getConnection(url)) {
  conn.setAutoCommit(false);
  int batchSize = 0;
  int commitSize = 1000; // number of rows to commit per batch
  try (PreparedStatement stmt = conn.prepareStatement(upsert)) {
    while (there are records to upsert) {
      stmt.set ... // bind the values of the current record
      stmt.executeUpdate();
      batchSize++;
      if (batchSize % commitSize == 0) {
        conn.commit();
      }
    }
    conn.commit(); // commit the last batch of records
  }
}

NOTE: With the thin client, it is very important to use executeBatch(), because it minimizes the number of RPCs between the client and the query server.
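
The batching approach can be sketched with the standard JDBC addBatch() and executeBatch() calls as follows. This is an illustration under assumptions, not a definitive implementation: the class BatchUpsertSketch, its Row type, and the choice of upserting only the id and name columns of the test table are hypothetical.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

class BatchUpsertSketch {
    // Hypothetical row type holding two columns of the `test` table.
    static final class Row {
        final int id;
        final String name;
        Row(int id, String name) { this.id = id; this.name = name; }
    }

    // Upserts all rows, sending and committing them in groups of commitSize.
    // Returns the number of commits issued.
    static int upsertAll(Connection conn, List<Row> rows, int commitSize) throws SQLException {
        conn.setAutoCommit(false);
        int commits = 0;
        int pending = 0;
        String sql = "UPSERT INTO test (id, name) VALUES (?, ?)";
        try (PreparedStatement stmt = conn.prepareStatement(sql)) {
            for (Row row : rows) {
                stmt.setInt(1, row.id);
                stmt.setString(2, row.name);
                stmt.addBatch(); // queue the row on the client side
                if (++pending == commitSize) {
                    stmt.executeBatch(); // send the queued rows together
                    conn.commit();
                    commits++;
                    pending = 0;
                }
            }
            if (pending > 0) { // flush the last, partially filled batch
                stmt.executeBatch();
                conn.commit();
                commits++;
            }
        }
        return commits;
    }
}
```

With 2,500 rows and a commitSize of 1000, this sketch sends three batches and commits three times (1000, 1000, and 500 rows).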

5. Reduce RPC interaction

Phoenix provides a table-level property, UPDATE_CACHE_FREQUENCY, that can be set when a table is designed. Its default value is ALWAYS, which means that every SQL statement first requests the table's metadata from the server. These extra requests add load and have a measurable impact on performance.

It is recommended to set this property when creating a table so that the metadata is refreshed only at a fixed interval, for example:

CREATE TABLE IF NOT EXISTS test_user (
id VARCHAR NOT NULL PRIMARY KEY,
username VARCHAR  ,
phoen VARCHAR  ,
addr  VARCHAR,
times bigint)  UPDATE_CACHE_FREQUENCY = 900000;
-- the client checks for updates to the table or its statistics every 15 minutes


-- modify an existing table
alter table <table_name> set UPDATE_CACHE_FREQUENCY = <interval in milliseconds>;
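
Because the interval is given in milliseconds, it is easy to get the magnitude wrong. The small sketch below derives the value from minutes and builds the corresponding ALTER TABLE statement. The helper class and method names are hypothetical, and test_user is the table from the example above.

```java
import java.util.concurrent.TimeUnit;

class UpdateCacheFrequencySketch {
    // Builds an ALTER TABLE statement with the interval converted to milliseconds.
    static String alterStatement(String table, long minutes) {
        long millis = TimeUnit.MINUTES.toMillis(minutes);
        return "ALTER TABLE " + table + " SET UPDATE_CACHE_FREQUENCY = " + millis;
    }

    public static void main(String[] args) {
        System.out.println(alterStatement("test_user", 15));
    }
}
```

For 15 minutes the computed value is 900000, which matches the CREATE TABLE example.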

Origin blog.csdn.net/lzzyok/article/details/119705101