Why add MySQL when using Hive?

  • The metastore is the central store for Hive metadata.
  • By default, the metastore uses an embedded Derby database as its storage engine.
  • Drawback of embedded Derby: only one session can be open at a time.
  • With MySQL as an external storage engine, multiple users can connect at the same time.

Hive installation modes

Embedded mode: the metadata stays in the embedded Derby database, which allows only one session to connect.

Local mode: MySQL is installed on the same machine, and the metadata is stored in MySQL.

Remote mode: the metadata is stored in a MySQL database on a remote machine.
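As a rough illustration of how the three modes differ, the sketch below maps each mode to the JDBC URL the metastore would connect to. The Derby URL is Hive's default value; the function name, the database name "hive", and the host "db.example.com" are assumptions made up for this example.

```python
# Hive's default connection URL for the embedded Derby metastore.
DERBY_URL = "jdbc:derby:;databaseName=metastore_db;create=true"


def metastore_jdbc_url(mode, host="localhost", port=3306, database="hive"):
    """Hypothetical helper: return the JDBC URL for one of the three modes."""
    if mode == "embedded":
        # Derby files live next to the Hive process; single session only.
        return DERBY_URL
    if mode == "local":
        # MySQL runs on the same machine as Hive.
        return f"jdbc:mysql://localhost:{port}/{database}"
    if mode == "remote":
        # MySQL runs on another machine, so a real host name is required.
        if host == "localhost":
            raise ValueError("remote mode needs a remote host name")
        return f"jdbc:mysql://{host}:{port}/{database}"
    raise ValueError(f"unknown mode: {mode}")


for mode, host in [
    ("embedded", "localhost"),
    ("local", "localhost"),
    ("remote", "db.example.com"),  # assumed host, for illustration only
]:
    print(mode, "->", metastore_jdbc_url(mode, host))
```

In this sketch, the only difference between local and remote mode is where the MySQL server lives, which matches the descriptions above.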


The point is that Hive is only a tool: its data analysis depends on MapReduce, and its data management depends on external systems.

Configuring MySQL is not strictly necessary, because Hive stores its metadata in Derby by default. The drawback is that only one Hive instance can access the metastore at a time, so the default setup is suitable only for local testing during development.

Hive's configuration allows the metastore database to be replaced with a relational database such as MySQL, so that the metadata is stored separately and can be shared among multiple service instances.
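The replacement is normally done through four connection properties in hive-site.xml. The sketch below generates such a file; the property names are the standard javax.jdo.option.* keys, while the function name, the user "hive", the password, and the database name are placeholders assumed for this example. The driver class shown is the one used by older MySQL Connector/J releases; newer releases use com.mysql.cj.jdbc.Driver.

```python
import xml.etree.ElementTree as ET


def build_hive_site(jdbc_url, driver, user, password):
    """Hypothetical helper: render hive-site.xml content for an external metastore DB."""
    properties = {
        "javax.jdo.option.ConnectionURL": jdbc_url,
        "javax.jdo.option.ConnectionDriverName": driver,
        "javax.jdo.option.ConnectionUserName": user,
        "javax.jdo.option.ConnectionPassword": password,
    }
    root = ET.Element("configuration")
    for name, value in properties.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    # Pretty-print when available (ET.indent was added in Python 3.9).
    if hasattr(ET, "indent"):
        ET.indent(root)
    return ET.tostring(root, encoding="unicode")


xml_text = build_hive_site(
    jdbc_url="jdbc:mysql://localhost:3306/hive?createDatabaseIfNotExist=true",
    driver="com.mysql.jdbc.Driver",
    user="hive",           # placeholder account
    password="change-me",  # placeholder password
)
print(xml_text)
```

The printed text would go into hive-site.xml in Hive's conf directory. The MySQL JDBC driver jar also has to be available on Hive's classpath, which this sketch does not cover.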


With the default Derby setup, a metastore_db directory is created in whatever directory you run the hive command from. Each location therefore gets its own set of database files, which is far from ideal: if everyone in a company works from a different directory, the metadata becomes fragmented and nothing is shared between colleagues.
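The reason is that the default Derby URL names the database with a relative path (metastore_db), which is resolved against the current working directory. The sketch below imitates that resolution without running Hive at all; the two temporary directories stand in for two users' working directories, and the function name is made up for this example.

```python
import os
import tempfile

# Relative database name taken from Hive's default Derby connection URL.
DERBY_DB_NAME = "metastore_db"


def resolved_metastore_path(working_dir):
    """Show where the relative name would land if Hive were started in working_dir."""
    previous = os.getcwd()
    os.chdir(working_dir)
    try:
        return os.path.abspath(DERBY_DB_NAME)
    finally:
        # Always return to the original directory.
        os.chdir(previous)


# Two temporary directories stand in for two users' working directories.
with tempfile.TemporaryDirectory() as dir_a, tempfile.TemporaryDirectory() as dir_b:
    path_a = resolved_metastore_path(dir_a)
    path_b = resolved_metastore_path(dir_b)
    print(path_a)
    print(path_b)
    print("same metastore?", path_a == path_b)
```

Because the two paths differ, each starting directory ends up with its own, unrelated metastore.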

For this reason, a single shared metastore database is needed, and MySQL fills that role.

This is why you also need to configure MySQL when installing Hive.


http://www.mamicode.com/info-detail-1462753.html

https://blog.csdn.net/nxw_tsp/article/details/54314886


