Getting familiar with common HBase operations and writing MapReduce jobs

1. Convert the following relational database table and its data into a table suitable for HBase storage, then insert the data:

Student table (Student), excluding the last column

Student ID (S_No)   Name (S_Name)   Gender (S_Sex)   Age (S_Age)   course
2015001             Zhangsan        male             23
2015003             Mary            female           22
2015003             Lisi            male             24            Math 85

2. Use HBase Shell commands to accomplish the following tasks:

  • List information about all HBase tables (list);
  • Print all records of the Student table in the terminal;
  • Add a course column family to the Student table;
  • Add a Math column to the course column family and record a grade of 85;
  • Delete the course column family;
  • Count the number of rows in the table (for example, count 's1');
  • Clear all records of the specified table (for example, truncate 's1').

 

3. Write a WordCount program in Python

Program: WordCount

Input: a text file containing a large number of words

Output: each word in the file and its number of occurrences (frequency), sorted alphabetically by word. Each word and its frequency occupy one line, with whitespace between the word and the frequency.

  1. Write the map function and the reduce function
  2. Make the scripts executable by changing their permissions
  3. Test and run the code on the local machine
  4. Run it against files stored in HDFS
  5. Upload files to HDFS and download files from it
  6. Submit the job with the Hadoop Streaming command
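
The map and reduce functions from step 1 could look roughly like the sketch below. It is only one possible layout, under several assumptions that are not part of the assignment: both roles live in a single hypothetical file named wordcount.py, the role is chosen by a command-line argument (map or reduce), words are split on whitespace, and the word and its count are separated by a tab.

```python
#!/usr/bin/env python3
"""WordCount sketch for Hadoop Streaming: one file acts as mapper or reducer."""
import sys
from itertools import groupby


def map_lines(lines):
    """Yield a (word, 1) pair for every whitespace-separated word."""
    for line in lines:
        for word in line.split():
            yield word, 1


def parse_pairs(lines):
    """Turn 'word<TAB>count' lines emitted by the mapper back into pairs."""
    for line in lines:
        word, _, count = line.rstrip("\n").rpartition("\t")
        if word:  # skip blank or malformed lines
            yield word, int(count)


def reduce_pairs(pairs):
    """Sum the counts per word.

    The pairs must arrive sorted by word, which Hadoop's shuffle phase
    (or a local `sort`) takes care of.
    """
    for word, group in groupby(pairs, key=lambda pair: pair[0]):
        yield word, sum(count for _, count in group)


def main(argv):
    # Assumption: the role is passed as the first argument.
    role = argv[1] if len(argv) > 1 else ""
    if role == "map":
        pairs = map_lines(sys.stdin)
    elif role == "reduce":
        pairs = reduce_pairs(parse_pairs(sys.stdin))
    else:
        print("usage: wordcount.py map|reduce", file=sys.stderr)
        return
    for word, count in pairs:
        print(f"{word}\t{count}")


if __name__ == "__main__":
    main(sys.argv)
```

For step 2, chmod +x wordcount.py makes the file executable. For step 3, a local pipeline such as cat input.txt | python3 wordcount.py map | sort | python3 wordcount.py reduce imitates the map, shuffle and reduce phases (input.txt is a placeholder name). For step 6, the same two commands would be passed to the Hadoop Streaming jar through its -mapper and -reducer options, together with -files, -input and -output. The location of the jar depends on the installation.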

 

create 'Student','S_No','S_Name','S_Sex','S_Age'
put 'Student','s001','S_No','2015001'
put 'Student','s001','S_Name','Zhangsan'
put 'Student','s001','S_Sex','male'
put 'Student','s001','S_Age','23'
put 'Student','s002','S_No','2015003'
put 'Student','s002','S_Name','Mary'
put 'Student','s002','S_Sex','female'
put 'Student','s002','S_Age','22'
put 'Student','s003','S_No','2015003'
put 'Student','s003','S_Name','Lisi'
put 'Student','s003','S_Sex','male'
put 'Student','s003','S_Age','24'
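
The put commands above follow a regular pattern: one row key per student and one put per non-empty cell. As an illustration only, the following sketch generates the same lines from the relational rows. The helper name to_put_commands and the s001-style row keys are assumptions taken from the commands above, not part of HBase itself.

```python
def to_put_commands(table, rows, key_prefix="s"):
    """Return HBase shell `put` lines, one per non-empty cell.

    Row keys are generated as s001, s002, ... (an assumed convention).
    """
    commands = []
    for index, row in enumerate(rows, start=1):
        row_key = f"{key_prefix}{index:03d}"
        for column, value in row.items():
            if value is None or value == "":
                continue  # HBase stores nothing for empty cells
            commands.append(f"put '{table}','{row_key}','{column}','{value}'")
    return commands


# The three rows of the Student table, without the course column.
students = [
    {"S_No": "2015001", "S_Name": "Zhangsan", "S_Sex": "male", "S_Age": "23"},
    {"S_No": "2015003", "S_Name": "Mary", "S_Sex": "female", "S_Age": "22"},
    {"S_No": "2015003", "S_Name": "Lisi", "S_Sex": "male", "S_Age": "24"},
]

for command in to_put_commands("Student", students):
    print(command)
```

The printed lines can be pasted into the HBase shell once the table has been created.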

  

list

  

scan 'Student'

  

alter 'Student', NAME => 'course'

  

put 'Student','s003','course:Math','85'

  

alter 'Student', 'delete' => 'course'

  

count 'Student'

  

truncate 'Student'

  
