Table of contents
1. Related technical links:
Building a fully distributed environment for Hadoop
Crawling data with Python and writing it into MySQL
2. Operation steps
cd /export/software
Change to the directory that holds the Sqoop installation package.
tar -zxvf sqoop-1.4.7.bin__hadoop-2.6.0.tar.gz -C /export/servers/
Extract the Sqoop installation package into the /export/servers directory.
cd /export/servers/
mv sqoop-1.4.7.bin__hadoop-2.6.0 sqoop
cd ./sqoop/conf
mv sqoop-env-template.sh sqoop-env.sh
vim sqoop-env.sh
export HADOOP_COMMON_HOME=/export/servers/hadoop-2.7.2
export HADOOP_MAPRED_HOME=/export/servers/hadoop-2.7.2
export HIVE_HOME=/export/servers/hive
export HIVE_CONF_DIR=$HIVE_HOME/conf
export ZOOKEEPER_HOME=/export/servers/zookeeper
export ZOOCFGDIR=$ZOOKEEPER_HOME/conf
Add the lines above to sqoop-env.sh, adjusting the paths to match your installation.
# Note: Hive (and ZooKeeper) must be downloaded and installed separately beforehand.
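Before moving on, it can help to confirm that the directories referenced in sqoop-env.sh actually exist. A small sketch (the paths are the ones used in this tutorial; adjust them to your own layout):

```shell
# Sanity-check the home directories referenced in sqoop-env.sh.
check_home() {
  # $1 = variable name, $2 = directory it should point to
  if [ -d "$2" ]; then
    echo "$1 OK: $2"
  else
    echo "$1 MISSING: $2" >&2
  fi
}

check_home HADOOP_COMMON_HOME /export/servers/hadoop-2.7.2
check_home HIVE_HOME          /export/servers/hive
check_home ZOOKEEPER_HOME     /export/servers/zookeeper
```

A MISSING line means Sqoop will fail later with a much less obvious error, so it is worth fixing the path now.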
vim /etc/profile
SQOOP_HOME=/export/servers/sqoop
PATH=$PATH:$SQOOP_HOME/bin
CLASSPATH=$CLASSPATH:$SQOOP_HOME/lib
export SQOOP_HOME PATH CLASSPATH
source /etc/profile
Add the Sqoop environment variables to /etc/profile and reload the file with source.
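After sourcing /etc/profile, a quick check that the shell can now resolve the sqoop command (sqoop version is a standard subcommand; warnings about missing HBase or Accumulo can be ignored):

```shell
# Verify sqoop is on PATH before using it.
if command -v sqoop >/dev/null 2>&1; then
  echo "sqoop found at: $(command -v sqoop)"
  sqoop version    # prints the Sqoop release and build info
else
  echo "sqoop not on PATH - re-check /etc/profile" >&2
fi
```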
mv /home/kie/mysql-connector-java-5.1.31.jar /export/servers/sqoop/lib
Move the MySQL JDBC driver (mysql-connector-java) into Sqoop's lib directory so Sqoop can connect to the database.
sqoop list-databases --connect jdbc:mysql://localhost:3306/ --username root --password root
sqoop list-tables --connect jdbc:mysql://localhost:3306/test --username root --password root
Use Sqoop to list the databases and tables in MySQL, which also verifies that the JDBC connection works.
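Note that --password root leaves the password in your shell history. Sqoop 1.4 also supports prompting with -P or reading the password from a file with --password-file; a sketch (the /tmp path is only an example, keep real password files on a protected path):

```shell
# Option 1: prompt interactively (nothing stored in history):
#   sqoop list-databases --connect jdbc:mysql://localhost:3306/ --username root -P

# Option 2: read the password from a protected file.
# printf '%s' avoids a trailing newline, which would otherwise
# become part of the password.
printf '%s' 'root' > /tmp/mysql.pw
chmod 400 /tmp/mysql.pw
sqoop list-databases \
  --connect jdbc:mysql://localhost:3306/ \
  --username root \
  --password-file file:///tmp/mysql.pw
```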
sqoop import --connect jdbc:mysql://localhost:3306/test --username root --password root --table webs --hive-import --create-hive-table -m 1 --target-dir /webs --hive-table dblab.webs
The import tool copies the webs table from MySQL into HDFS (staging directory /webs, a single map task via -m 1), creates the Hive table dblab.webs, and loads the data into it.
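Once the full import succeeds, later runs only need the rows added since. A sketch using Sqoop's incremental append mode, assuming webs has an auto-increment primary key named id (adjust --check-column to your schema):

```shell
# Incremental follow-up import: only rows with id > --last-value are pulled.
# After each run, Sqoop prints the new last-value to pass in next time.
sqoop import \
  --connect jdbc:mysql://localhost:3306/test \
  --username root --password root \
  --table webs \
  --target-dir /webs_incr \
  --incremental append \
  --check-column id \
  --last-value 0 \
  -m 1
```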
hdfs dfs -cat /webs/part-m-00000 | head -n 20
Use an HDFS shell command to show the first twenty lines of the exported data file part-m-00000 under the /webs directory.
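A quick way to confirm nothing was dropped during the import is to compare the HDFS line count with MySQL's row count. A sketch, assuming the mysql client is available on the same node and uses the credentials from above:

```shell
# Compare exported line count with the source table's row count.
hdfs_rows=$(hdfs dfs -cat /webs/part-m-00000 | wc -l)
mysql_rows=$(mysql -uroot -proot -N -e 'SELECT COUNT(*) FROM test.webs;')
echo "hdfs=$hdfs_rows mysql=$mysql_rows"
[ "$hdfs_rows" -eq "$mysql_rows" ] && echo MATCH || echo MISMATCH
```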
3. Results
Open the HDFS NameNode web UI at http://bigdata111:50070/, go to the Utilities menu and choose Browse Directory: you can browse the directories and files that Sqoop wrote to HDFS, and download them from there as well.
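The same files can be listed and downloaded from the command line, without the web UI:

```shell
# List Sqoop's output directory in HDFS.
hdfs dfs -ls /webs
# Copy the data file to the local filesystem (the CLI equivalent of "download").
hdfs dfs -get /webs/part-m-00000 ./webs_data.txt
head -n 5 ./webs_data.txt
```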