Hive WordCount

1. Create a table to hold the raw log-file data, using the newline character as the line delimiter

  create table file_data(content string)

  row format delimited lines terminated by '\n'

2. Load the prepared data file (/home/hadoop/wordcount.tx) into the table file_data

  load data local inpath '/home/hadoop/wordcount.tx' into table file_data
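As a hypothetical illustration (the file location and its contents below are assumptions, not taken from the article), the input file consumed by the LOAD DATA statement could be prepared like this:

```shell
# Create a small sample input file for the wordcount demo.
# Path and contents are invented for illustration; the article
# loads from /home/hadoop/wordcount.tx instead.
mkdir -p /tmp/hive-demo
cat > /tmp/hive-demo/wordcount.tx <<'EOF'
hello world
hello hive
EOF
# Each line of this file becomes one row of file_data after LOAD DATA.
cat /tmp/hive-demo/wordcount.tx
```

With `load data local inpath`, Hive copies the file from the local filesystem into the table's warehouse directory; without `local`, the path is read from HDFS.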

3. The "" segmentation data, each word is recorded in the segmentation result table out as a row.

  (1) Create a result table and insert each word produced by the split as one row

    create table words(word string)

    insert into table words select explode(split(content," ")) from file_data

  (2) Aggregate with the count function

    select word,count(word)

    from words

    group by word

    (You can give count(word) an alias, e.g. cnt, and then sort with order by cnt.)
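The whole pipeline above can be sketched outside Hive. This is a minimal Python analogue (the sample rows are invented): the list comprehension plays the role of explode(split(content, " ")), and Counter plays the role of group by word with count(word).

```python
from collections import Counter

# Stand-in for the rows of file_data (one string per line of the input file).
lines = ["hello world", "hello hive"]

# explode(split(content, " ")): split each line on spaces and flatten
# the result into one word per "row".
words = [w for line in lines for w in line.split(" ")]

# group by word + count(word): tally occurrences of each word.
counts = Counter(words)

# order by cnt (descending), analogous to sorting by the aliased count.
print(sorted(counts.items(), key=lambda kv: kv[1], reverse=True))
```

Running this prints each distinct word with its frequency, with "hello" counted twice, which is exactly the result set the final Hive query produces.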

    


Origin www.cnblogs.com/hdc520/p/11416382.html