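1. Create a file named hivefile on Linux
It holds student scores, one record per line: student ID, course ID, and score, separated by single spaces. (The header below is shown for reference; the file itself contains only the data rows.)
sid cid score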
95001 1 81
95001 2 85
95004 3 88
95001 4 70
95002 2 92
95003 3 88
95002 4 90
95003 2 90
2. Upload the data above to Hive.
3. Use Hive to calculate the average score for each course and display it. (Because the courses are electives, each student has scores for only some of the courses.)
hive> create table lx1(sid int ,cid int,score int)
> row format delimited
> fields terminated by ' ';
OK
Time taken: 1.999 seconds
hive> load data local inpath '/root/hivefile' into table default.lx1;
Loading data to table default.lx1
Table default.lx1 stats: [numFiles=1, totalSize=88]
OK
Time taken: 1.713 seconds
hive> select * from lx1;
OK
95001 1 81
95001 2 85
95004 3 88
95001 4 70
95002 2 92
95003 3 88
95002 4 90
95003 2 90
hive> select avg(score) from lx1 where sid=95001;
Query ID = root_20190624021248_c07ae78a-bbad-49a0-9a60-d034ac21a751
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
set mapreduce.job.reduces=<number>
Job running in-process (local Hadoop)
2019-06-24 02:12:52,100 Stage-1 map = 0%, reduce = 0%
2019-06-24 02:12:53,144 Stage-1 map = 100%, reduce = 0%
2019-06-24 02:12:54,174 Stage-1 map = 100%, reduce = 100%
Ended Job = job_local1517796820_0001
MapReduce Jobs Launched:
Stage-Stage-1: HDFS Read: 352 HDFS Write: 176 SUCCESS
Total MapReduce CPU Time Spent: 0 msec
OK
78.66666666666667
Time taken: 5.834 seconds, Fetched: 1 row(s)
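Note that the query above averages the scores of one student (sid 95001), not of a course. For the per-course average that step 3 asks for, grouping by cid is one straightforward approach. A minimal sketch against the same lx1 table (the alias avg_score is only illustrative):

```sql
-- Average score per course; each cid forms one group.
select cid, avg(score) as avg_score
from lx1
group by cid
order by cid;
```

Working by hand from the eight rows loaded above, the averages should come out as 81.0 for course 1, 89.0 for course 2, 88.0 for course 3, and 80.0 for course 4.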
Create a permanent function:
Add the following property to conf/hive-site.xml.
Here Test-1.0.jar is the jar package that was uploaded; if there are several jars, separate their paths with commas.
<property>
<name>hive.aux.jars.path</name>
<value>file:///opt/software/hive-1.2.1/lib/Test-1.0.jar</value>
</property>
hive> add jar /opt/software/hive-1.2.1/lib/Test-1.0.jar;
Added [/opt/software/hive-1.2.1/lib/Test-1.0.jar] to class path
Added resources: [/opt/software/hive-1.2.1/lib/Test-1.0.jar]
hive> create function a as "com.dingyabin.udf.TestUDF";
OK
Time taken: 0.155 seconds
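Hive 0.13 and later also accept a USING JAR clause on create function, which records the jar location together with the function so that a separate add jar is not needed in each session. A sketch, assuming the jar has been copied to HDFS; the path hdfs:///udf/Test-1.0.jar and the function name b are hypothetical:

```sql
-- Register a permanent function and tie it to a jar on HDFS.
-- The HDFS path is an assumed location, not one taken from the session above.
create function default.b as 'com.dingyabin.udf.TestUDF'
using jar 'hdfs:///udf/Test-1.0.jar';
```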
In the query below, a is the function name and its argument (name) is a column of the table:
hive> select a(name) from 1701C.test1;
OK
"asj"
"sd"
"ase"
Time taken: 0.232 seconds, Fetched: 3 row(s)
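To inspect or remove the function afterwards, the standard statements below can be used. This is a sketch that assumes the function was created in the current database, as in the session above:

```sql
-- Show the registered description of function a.
describe function a;

-- Remove the permanent function when it is no longer needed.
drop function if exists a;
```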