Hive data warehouse: choosing data types

 

Hive offers many primitive data types. With so many to choose from, which ones should you actually use when building a data warehouse?

 

If your company is large enough, consider the following recommendations:

 

Floating point recommendations:

 

1) Use the DOUBLE type in Hive with caution, because DOUBLE values can lose precision.

For example: a value of 10000 in the original data may come out of Hive as something like 10000.0001.


 
2) To avoid overflow or loss of precision in floating-point numbers, store the data as DECIMAL, regardless of whether the original data is float, double, or decimal.
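
The precision problem is not specific to Hive; it comes from binary floating point itself. The following is a minimal Python sketch, not Hive code, that contrasts double arithmetic with exact decimal arithmetic. The amounts are made-up values used only for illustration.

```python
from decimal import Decimal

# Hypothetical amounts, as they might arrive from a source system as text.
raw_amounts = ["0.10", "0.20", "0.30"]

# Summing as binary doubles accumulates representation error.
double_total = sum(float(a) for a in raw_amounts)

# Summing as exact decimals keeps every digit.
decimal_total = sum(Decimal(a) for a in raw_amounts)

# Compare each total with the value a person would expect.
double_is_exact = (double_total == 0.6)
decimal_is_exact = (decimal_total == Decimal("0.60"))

print(double_total, double_is_exact)
print(decimal_total, decimal_is_exact)
```

The double total drifts away from the expected value, while the decimal total matches it exactly, which is the behavior the DECIMAL recommendation relies on.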
 

 

Integer type recommendations:

It is recommended that all integer columns be stored as BIGINT. The purpose is to avoid overflow when the range of the values later grows beyond what a smaller integer type can hold.
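
To make the overflow risk concrete, here is a small Python sketch that checks a value against the signed ranges of Hive's 4-byte INT and 8-byte BIGINT. The `fits` helper and the `order_id` value are hypothetical and exist only for this illustration.

```python
# Largest values of Hive's signed INT (4 bytes) and BIGINT (8 bytes).
INT_MAX = 2**31 - 1
BIGINT_MAX = 2**63 - 1


def fits(value, max_value):
    """Return True if value fits a signed type whose maximum is max_value."""
    return -max_value - 1 <= value <= max_value


# Hypothetical order ID that has grown past the INT range.
order_id = 3_000_000_000

fits_int = fits(order_id, INT_MAX)
fits_bigint = fits(order_id, BIGINT_MAX)

print(fits_int, fits_bigint)
```

A key that starts small can outgrow INT as the business grows; declaring the column as BIGINT from the start leaves far more headroom.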
   

 

 

Character, string type:

 

Define CHAR and VARCHAR columns as STRING. Although newer versions of Hive support CHAR and VARCHAR, storing the data as STRING avoids values that exceed the declared length and keeps the schema simple.
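
The sketch below imitates in Python what happens when a value longer than the declared length is assigned to a VARCHAR column, where Hive silently truncates the string. The `as_varchar` helper and the sample value are hypothetical stand-ins, not a Hive API.

```python
def as_varchar(value: str, length: int) -> str:
    """Mimic assigning a string to VARCHAR(length): extra characters are dropped."""
    return value[:length]


# Hypothetical column originally declared as VARCHAR(10).
city = "San Francisco"

stored_as_varchar = as_varchar(city, 10)  # truncated copy
stored_as_string = city                   # STRING keeps the full value

print(repr(stored_as_varchar), repr(stored_as_string))
```

Because the truncation raises no error, the loss is easy to miss; a STRING column has no declared length, so the value is kept whole.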

 

 

Date type recommendations:

 

It is recommended that DATE columns be uniformly defined as the STRING type (at least in the ODS layer).
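
One plausible benefit of keeping dates as strings in the ODS layer is that the raw text survives unchanged and can be validated in a later layer. The Python sketch below shows such a downstream check; the `parse_date_or_none` helper, the `yyyy-MM-dd` format, and the sample values are assumptions made for illustration.

```python
from datetime import datetime


def parse_date_or_none(text, fmt="%Y-%m-%d"):
    """Return a date if text matches fmt and is a real calendar date, else None."""
    try:
        return datetime.strptime(text, fmt).date()
    except ValueError:
        return None


# Hypothetical raw values as they might sit in a STRING column in ODS.
raw_dates = ["2020-03-31", "2020-02-30", "31/03/2020"]

parsed = [parse_date_or_none(d) for d in raw_dates]

for raw, value in zip(raw_dates, parsed):
    print(raw, "->", value)
```

Invalid or unexpectedly formatted values show up as `None` here while the original strings stay available for inspection and repair.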


Origin blog.csdn.net/u010003835/article/details/105233864