Hive Study Notes, Part 2

Common Hive property configuration

  • 1. Configuring the Hive data warehouse location

    1) By default, the data warehouse is located at /user/hive/warehouse on HDFS.

    2) No folder is created under the warehouse directory for the default database (default). A table that belongs to the default database gets its folder directly under the warehouse directory.

    3) To change the default warehouse location, copy the following property from hive-default.xml.template into hive-site.xml and edit its value:

    <property>
    <name>hive.metastore.warehouse.dir</name>
    <value>/user/hive/warehouse</value>
    <description>location of default database for the warehouse</description>
    </property>

    4) Grant users in the same group write permission on the warehouse directory

    bin/hdfs dfs -chmod g+w /user/hive/warehouse
  • 2. Displaying the current database name and column headers

    1) Add the following properties to hive-site.xml so that the CLI prompt shows the current database and query results include the column headers.

    <property>
        <name>hive.cli.print.header</name>
        <value>true</value>
    </property>
    
    <property>
        <name>hive.cli.print.current.db</name>
        <value>true</value>
    </property>

    2) Restart hive and compare the output before and after the change

Before configuration:

After configuration:

Example: loading a local file into Hive

Requirement: load the data in the local file /opt/module/datas/student.txt into the Hive table student(id int, name string).
  • 1. Data preparation: create the data file /opt/module/datas/student.txt
//1) Create the datas directory under /opt/module/
[itstar@bigdata111 module]$ mkdir datas

//2) Create the student.txt file under /opt/module/datas/ and add the data
[itstar@bigdata111 datas]$ touch student.txt
[itstar@bigdata111 datas]$ vi student.txt
1001    zhangshan
1002    lishi
1003    zhaoliu
//Note: separate the fields with a tab character.
  • 2. Hive in practice
//1) Start hive
[itstar@bigdata111 hive]$ bin/hive

//2) Show the databases
hive>show databases;

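//3) Use the default database
hive>use default;

//4) Show the tables in the default database
hive>show tables;

//5) Drop the student table if it was created earlier
hive> drop table student;

//6) Create the student table and declare '\t' as the field delimiter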
hive> create table student(id int, name string) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';

//7) Load the file /opt/module/datas/student.txt into the student table.
hive> load data local inpath '/opt/module/datas/student.txt' into table student;

//8) Query the result in Hive
hive> select * from student;
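
Typing literal tabs in vi is easy to get wrong: if the editor expands tabs to spaces, the '\t' delimiter never matches and the rows would typically load as NULL. As a minimal sketch, the data file from step 1 could instead be written with printf, which emits a real tab for \t. DATA_DIR is a variable introduced here for illustration; it defaults to ./datas, which matches the notes when run from /opt/module.

```shell
# Assumption: DATA_DIR is a name introduced for this sketch.
# Run from /opt/module, the default ./datas is the directory used in the notes.
DATA_DIR="${DATA_DIR:-./datas}"
mkdir -p "$DATA_DIR"

# printf expands \t to a real tab and \n to a newline.
printf '1001\tzhangshan\n1002\tlishi\n1003\tzhaoliu\n' > "$DATA_DIR/student.txt"

# Sanity check: count the lines that have exactly two tab-separated fields.
awk -F'\t' 'NF == 2' "$DATA_DIR/student.txt" | wc -l
```

If the count printed at the end is lower than the number of rows, at least one line is missing its tab.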

Commonly used Hive interactive commands

  • "-e": execute a SQL statement without entering the hive interactive window
[itstar@bigdata111 hive]$ bin/hive -e "select id from student;"
  • "-f": execute the SQL statements in a script file
//1) Create the hivef.sql file in the /opt/module/datas directory
[itstar@bigdata111 datas]$ touch hivef.sql
----------------------
//Write a valid SQL statement into the file
select * from student;

//2) Execute the SQL statements in the file
[itstar@bigdata111 hive]$ bin/hive -f /opt/module/datas/hivef.sql

//3) Execute the SQL statements in the file and write the result to a file
[itstar@bigdata111 hive]$ bin/hive -f /opt/module/datas/hivef.sql  > /opt/module/datas/hive_result.txt
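
The -f workflow can also be scripted end to end. The sketch below writes the script file with a here-document instead of an editor, then runs it only when a hive executable is found. SQL_FILE and RESULT_FILE are names introduced for this example, and it assumes hive is on the PATH rather than started as bin/hive from the install directory.

```shell
# Assumptions: SQL_FILE and RESULT_FILE are names introduced for this sketch.
SQL_FILE="${SQL_FILE:-./hivef.sql}"
RESULT_FILE="${RESULT_FILE:-./hive_result.txt}"

# Write the query into the script file without opening an editor.
cat > "$SQL_FILE" <<'EOF'
select * from student;
EOF

# Run the script only if a hive executable is on the PATH.
if command -v hive >/dev/null 2>&1; then
  # Only stdout is redirected, so anything hive writes to stderr
  # still appears on the terminal instead of in RESULT_FILE.
  hive -f "$SQL_FILE" > "$RESULT_FILE"
else
  echo "hive not found on PATH; wrote $SQL_FILE only" >&2
fi
```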

Other Hive command operations

  • Exiting the hive window:
hive(default)>exit;
hive(default)>quit;

//In newer versions of Hive the two commands behave the same; in earlier versions they differed:
//exit: implicitly commits the data, then exits;
//quit: exits without committing the data;
  • Viewing the HDFS file system from the hive CLI window
hive(default)>dfs -ls /;
  • Viewing the local file system from the hive CLI window
hive(default)>! ls /opt/module/datas;
  • Viewing all the commands previously entered in hive
//1) Go to the current user's home directory, /root or /home/itstar
//2) View the .hivehistory file
[itstar@bigdata111 ~]$ cat .hivehistory

Parameter configuration

  • Viewing all current configuration settings
hive>set;
  • Three ways to configure parameters

    • 1) Configuration files
      Default configuration file: hive-default.xml
      User-defined configuration file: hive-site.xml
      Note: the user-defined configuration overrides the default configuration. Hive also reads the Hadoop configuration, because Hive starts as a Hadoop client, and the Hive configuration overrides the Hadoop configuration. Settings made in configuration files apply to every Hive process started on the local machine.

    • 2) Command-line parameters
      When starting Hive, add -hiveconf param=value on the command line to set a parameter.
      For example:

      [itstar@bigdata112 hive]$ bin/hive -hiveconf mapred.reduce.tasks=10;
      //Note: this is effective only for the current hive session.
      
      //View the parameter setting:
      hive (default)> set mapred.reduce.tasks;
    • 3) Parameter declaration
      Use the SET keyword in HQL to set a parameter.
      For example:

      hive (default)> set mapred.reduce.tasks=10;
      
      //Note: this is effective only for the current hive session.
      //View the parameter setting:
      hive (default)> set mapred.reduce.tasks;

  The priority of the three methods increases in that order: configuration file < command-line parameter < parameter declaration. Note that some system-level parameters, such as log4j-related settings, must be set with one of the first two methods, because they are read before the session is established.
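
The precedence rule can be illustrated with a small shell function. This is only a model of the ordering described above, not how Hive resolves settings internally; effective_value is a hypothetical helper introduced for this sketch, and an empty argument stands for "not set at that level".

```shell
# Hypothetical helper for illustration; not part of Hive.
# Arguments: value from hive-site.xml, value from -hiveconf, value from SET.
# An empty argument means "not set at that level".
effective_value() {
  file_val="$1"
  cli_val="$2"
  set_val="$3"
  if [ -n "$set_val" ]; then
    echo "$set_val"      # parameter declaration has the highest priority
  elif [ -n "$cli_val" ]; then
    echo "$cli_val"      # command-line parameter comes next
  else
    echo "$file_val"     # configuration file has the lowest priority
  fi
}

effective_value 1 10 20   # all three levels set
effective_value 1 10 ""   # no SET statement issued
effective_value 1 "" ""   # configuration file only
```

The three calls print 20, 10, and 1 respectively, mirroring the case where mapred.reduce.tasks is given a different value at each level.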

Origin www.cnblogs.com/nthforsth/p/12232322.html