The fields of the test table are queried as follows:
Hive, first way of extracting the first 5 characters:
Hive, second way of extracting the first 5 characters:
Impala, first way of extracting the first 5 characters:
Impala, second way of extracting the first 5 characters:
Results:
1. In Hive, substr returns the same result whether the start position is 0 or 1. That is,
select substr(name,0,5) from bdl_substr_test;
select substr(name,1,5) from bdl_substr_test;
return the same result.
2. In Impala, substr returns different results depending on whether the start position is 0 or 1. That is,
select substr(name,0,5) from bdl_substr_test;
select substr(name,1,5) from bdl_substr_test;
return different results.
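One way to see the difference directly is to put both forms side by side in a single statement and run it once in each engine. This is only a minimal sketch using the table and column from the tests above; the aliases from_zero and from_one are made up for illustration. In Hive the two columns should match. Impala indexes characters from 1, and a start position of 0 typically yields an empty string there, though it is worth confirming on your own version.

```sql
-- Run the same statement in Hive and in Impala, then compare the two columns.
-- from_zero / from_one are hypothetical aliases, not part of the original test.
select
  name,
  substr(name, 0, 5) as from_zero,  -- start position 0
  substr(name, 1, 5) as from_one    -- start position 1
from bdl_substr_test
limit 10;
```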
Conclusion:
The substr function behaves differently in Hive and Impala, and the two must not be treated as interchangeable. Impala accepts a start position of 0 without reporting an error, but the result differs from the one returned for start position 1. In real business calculations this is a serious pitfall, because nothing signals that the result is wrong.
When using substr in Impala to extract the first few characters, always start from position 1!
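As a rough sketch of how this advice can be applied, the first statement below uses start position 1, which gave the expected result in both engines in the tests above. The second is a sanity check that counts rows where the two start positions disagree. The aliases name_prefix and mismatched_rows are hypothetical, and the check is an assumption about how one might audit an existing query, not something from the original tests.

```sql
-- Form that behaves the same in Hive and Impala: start position 1.
-- name_prefix is a hypothetical alias.
select substr(name, 1, 5) as name_prefix
from bdl_substr_test;

-- Sanity check: count rows where start position 0 and 1 give different results.
-- Based on the tests above, Hive should report 0; a non-zero count means
-- start position 0 is not safe on the engine running the query.
-- Rows where name is NULL are not counted, since the comparison yields NULL.
select count(*) as mismatched_rows
from bdl_substr_test
where substr(name, 0, 5) <> substr(name, 1, 5);
```

If existing scripts were written for Hive with a start position of 0, running the second query in Impala before migrating them gives a quick indication of how many rows would be affected.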