Hive的表操作

表的修改

  1.alter table 修改表的元数据:

    a.修改表名称

      hive>alter table student rename to student1;

    b.修改表分区

      hive>alter table hive.logs add partition (dt='2018-9-1',country='beijng'); 

      hive>alter table hive.logs drop partition (dt='2018-9-1',country='beijng'); 

    c.添加列

      hive>alter table hive.logs add columns(id int,name string);

    d.删除和替换列

      hive>alter table hive.logs replace columns (id int,name string);

向管理表装载数据

  1.装载数据

    hive>load data [local] inpath '/data/hive' [overwrite] into table tbname;

    说明:含local关键字:加载本地目录数据到warehouse下;如果不含local关键字,加载数据目录默认为hdfs路径。

  2.通过查询语句向表中插入数据:在原始数据的基础上,进行业务分析后生成的表数据

    hive>insert overwrite table new_table select [column字段] from 原始数据表 where 条件;

      或
      hive> from 原始数据表 别名 insert overwrite table new_table select [column字段] where 条件;

  3.创建表时,也可以执行插入数据

    hive>create table new_table as select id,name,age from student where age>23;

  4.导出数据 

    hive>insert overwrite directory '/data/stocks/2009-6-23' select * from stocks where ymd ='2009=6-23';

查询

  1.对array查询 

    hive>select name,subordinates[1] from hive.employees where name='John Doe';

  2.对map查询

    hive>select name,deductions['State Taxes'] from hive.employees where name = 'John Doe';

  3.对struct查询

    hive>select name,address.state from hive.employees where name ='John Doe';

  4.支持运算符

  5.内置函数,比如sum(),count(),avg()等。

  6.limit

    hive>select upper(name),salary,deductions['Federal Taxes'],round(salary*(1-deductions['Federal Taxes'])) from hive.employees limit 6;

  7.列别名

    hive>select upper(name),salary,deductions['Federal Taxes'] as Taxes,round(salary*(1-deductions['Federal Taxes'])) as sal from hive.employees limit 6;

  8.嵌套

     hive>from (select upper(name) as name,salary,deductions['Federal Taxes'] as Taxes,round(salary*(1-deductions['Federal Taxes'])) as sal from hive.employees) e select e.name,e.Taxes,e.sal where e.sal >70000;

本地执行条件 

  1.简单查询:select * from table;

  2.将分区列作为查询条件,但查询列应为“*”;

  3.调优方案:建议将hive.exec.mode.local.auto=true;并添加至{/home/hyxy/.hiverc}中;

group by

  hive>select year(ymd),avg(price_close) from stocks where exchange1='NASDAQ' and symbol = 'AAPL' group by year(ymd);

  说明:分组属性列在mysql中一定要出现在select子句之后;但在hive中不需要。  

猜你喜欢

转载自www.cnblogs.com/lyr999736/p/9473935.html