1. Create a table to hold the file data, using the newline character as the delimiter (the table has a single column, so each line of the file becomes one row):
create table file_data(content string)
row format delimited fields terminated by '\n';
2. Load the data file (/home/hadoop/wordcount.tx) into the prepared table file_data:
load data local inpath '/home/hadoop/wordcount.tx' into table file_data;
3. Split the data on the space character (" "), recording each word from the split result as one row in a result table.
(1) Create a result table and insert each word produced by the split as a separate row (note the column of file_data is content):
create table words(word string);
insert into table words select explode(split(content," ")) from file_data;
(2) Aggregate with the statistical function count:
select word,count(word)
from words
group by word;
(count(word) can be given an alias, e.g. cnt, which can then be used to sort the output with order by cnt.)
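The aliasing note above can be written out as a full query; this is a sketch in which the alias name cnt and the descending sort direction are illustrative choices, not from the original:

```sql
-- Word count with an alias on the aggregate, sorted by frequency.
-- `cnt` is an illustrative alias name; `desc` puts the most
-- frequent words first (omit it for ascending order).
select word, count(word) as cnt
from words
group by word
order by cnt desc;
```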