Spark SQL cannot read a Hive table in Parquet format

When reading and writing Parquet tables registered in the Hive metastore, Spark SQL uses its own Parquet SerDe (SerDe: short for Serializer/Deserializer, the component that handles serialization and deserialization) instead of Hive's SerDe, because the built-in one performs better. This behavior is controlled by the configuration parameter spark.sql.hive.convertMetastoreParquet, which is enabled by default.
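You can confirm the current value of this flag from a running session. A minimal sketch in Scala, assuming a SparkSession built with Hive support:

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("parquet-serde-check")
  .enableHiveSupport()
  .getOrCreate()

// "true" by default: Spark uses its built-in Parquet reader/writer
// for metastore Parquet tables instead of Hive's SerDe
println(spark.conf.get("spark.sql.hive.convertMetastoreParquet"))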

Sometimes, however, Spark's built-in deserializer cannot parse the Parquet data written by Hive, and the table cannot be read. In that case, set this parameter to false so that Spark falls back to Hive's SerDe:

SET spark.sql.hive.convertMetastoreParquet = false;
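The same switch can be flipped programmatically before querying the table. A hedged sketch in Scala, reusing the spark session from above (the table name hive_db.parquet_table is a placeholder):

// Fall back to Hive's SerDe for metastore Parquet tables
spark.conf.set("spark.sql.hive.convertMetastoreParquet", "false")

// Subsequent reads of the Hive table now go through Hive's SerDe
val df = spark.sql("SELECT * FROM hive_db.parquet_table")
df.show()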

 

Origin: blog.csdn.net/x950913/article/details/106211587