- The metastore is the central store for Hive metadata.
- By default, the metastore uses an embedded Derby database as its storage engine.
- Drawback of the embedded Derby engine: only one session can be open at a time.
- With MySQL as an external storage engine, multiple users can connect at the same time.
Hive installation
Embedded mode: the metadata stays in the embedded Derby database, which allows only one session connection.
Local mode: MySQL is installed on the same machine and the metadata is stored in MySQL.
Remote mode: the metadata is stored in a MySQL database on a remote machine.
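
In practice, the three modes differ mainly in the connection properties that end up in hive-site.xml. The following is a minimal Python sketch of that difference, not a definitive setup. It assumes the standard `javax.jdo.option.*` properties; the host `db.example.com`, the database name `hive_meta`, the user, the password, and the helper function names are all hypothetical placeholders. The MySQL driver class shown is the older Connector/J name and may differ with your connector version.

```python
import xml.etree.ElementTree as ET

# Default embedded Derby settings. The relative databaseName is why
# metastore_db appears in the directory where hive is started.
DERBY = {
    "javax.jdo.option.ConnectionURL": "jdbc:derby:;databaseName=metastore_db;create=true",
    "javax.jdo.option.ConnectionDriverName": "org.apache.derby.jdbc.EmbeddedDriver",
}


def mysql_props(host, user, password, db="hive_meta"):
    """Connection properties for a MySQL-backed metastore (placeholder values)."""
    return {
        "javax.jdo.option.ConnectionURL":
            f"jdbc:mysql://{host}:3306/{db}?createDatabaseIfNotExist=true",
        # Assumption: older Connector/J class name; newer versions use a different one.
        "javax.jdo.option.ConnectionDriverName": "com.mysql.jdbc.Driver",
        "javax.jdo.option.ConnectionUserName": user,
        "javax.jdo.option.ConnectionPassword": password,
    }


def metastore_props(mode, host="db.example.com", user="hive", password="change-me"):
    """Map one of the three modes described above to its property set."""
    if mode == "embedded":
        return dict(DERBY)
    if mode == "local":
        return mysql_props("localhost", user, password)
    if mode == "remote":
        return mysql_props(host, user, password)
    raise ValueError(f"unknown mode: {mode}")


def to_hive_site(props):
    """Render a property dict as a hive-site.xml style <configuration> document."""
    root = ET.Element("configuration")
    for name, value in props.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(to_hive_site(metastore_props("local")))
```

Only the JDBC URL changes between local and remote mode in this sketch; the MySQL JDBC driver jar would also need to be on Hive's classpath.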
The point is that Hive is only a tool: its data analysis depends on MapReduce, and its data management depends on external systems.
Configuring MySQL is not strictly necessary, because Hive stores its metadata in Derby by default. The drawback is that only one Hive instance can access it at a time, so the default setup is suitable only for local testing during development.
Hive's configuration lets you replace Derby with a relational database such as MySQL, so that the metadata is stored separately and shared among multiple service instances.
As can be seen, metastore_db is created in whichever directory you run the hive command from. Building a separate set of database files for each directory is a poor fit: if everyone in the company has a different one, the metadata becomes fragmented and nothing is shared between colleagues.
A shared database is needed for this purpose: MySQL.
This is why installing Hive also involves configuring MySQL.
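
The per-directory behaviour comes from how the relative database path in the Derby URL is resolved. Here is a small Python sketch of that resolution, assuming the default URL `jdbc:derby:;databaseName=metastore_db;create=true` and assuming `derby.system.home` has not been overridden. The function name and the example directories are hypothetical.

```python
from pathlib import PurePosixPath

# Default embedded-Derby connection URL used by the Hive metastore.
DEFAULT_URL = "jdbc:derby:;databaseName=metastore_db;create=true"


def derby_db_dir(jdbc_url, working_dir):
    """Return where the embedded Derby database would live for a given start directory."""
    # Collect the ;key=value attributes that follow the jdbc:derby: prefix.
    attrs = dict(part.split("=", 1) for part in jdbc_url.split(";") if "=" in part)
    name = PurePosixPath(attrs["databaseName"])
    # A relative name is resolved against the directory hive was started in;
    # an absolute name is used as-is.
    return name if name.is_absolute() else PurePosixPath(working_dir) / name


if __name__ == "__main__":
    # Two users starting hive in different directories get unrelated metastores.
    print(derby_db_dir(DEFAULT_URL, "/home/alice"))
    print(derby_db_dir(DEFAULT_URL, "/home/bob/project"))
```

An absolute `databaseName` would pin the location on one machine, but it would still be a single-session Derby database, which is why a shared MySQL instance is used instead.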
http://www.mamicode.com/info-detail-1462753.html
https://blog.csdn.net/nxw_tsp/article/details/54314886