Sqoop installation and configuration

  • Download the Sqoop 1.4.7 binary package
wget -c http://ftp.riken.jp/net/apache/sqoop/1.4.7/sqoop-1.4.7.bin__hadoop-2.6.0.tar.gz
  • Decompress into /usr/java
tar -zxvf sqoop-1.4.7.bin__hadoop-2.6.0.tar.gz -C /usr/java/
  • Configure
    • Configure environment variables in /etc/profile
    export SQOOP_HOME=/usr/java/sqoop-1.4.7.bin__hadoop-2.6.0
    export PATH=$SQOOP_HOME/bin:$PATH
    
    • Make the changes take effect immediately
    source /etc/profile
    
    • Edit sqoop-env.sh
    cp sqoop-env-template.sh sqoop-env.sh
    vim sqoop-env.sh
    
    Add the following configuration
    export HADOOP_COMMON_HOME=/usr/java/hadoop-3.0.0
    export HADOOP_MAPRED_HOME=/usr/java/hadoop-3.0.0
    export HBASE_HOME=/usr/java/hbase-1.4.1
    export HIVE_HOME=/usr/java/apache-hive-2.3.2-bin
    
    • Download the MySQL JDBC driver jar into the lib directory
    http://central.maven.org/maven2/mysql/mysql-connector-java/6.0.6/mysql-connector-java-6.0.6.jar
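
    A quick sketch of fetching the driver and dropping it into Sqoop's lib directory (assumes $SQOOP_HOME is set as in the profile above):
    wget -c http://central.maven.org/maven2/mysql/mysql-connector-java/6.0.6/mysql-connector-java-6.0.6.jar
    # place the driver where Sqoop can find it at runtime
    cp mysql-connector-java-6.0.6.jar $SQOOP_HOME/lib/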
    
    • Comment out the checks for HCAT_HOME, ACCUMULO_HOME, and ZOOKEEPER_HOME in the bin/configure-sqoop file
    ## Moved to be a runtime check in sqoop.
    if [ ! -d "${HBASE_HOME}" ]; then
      echo "Warning: $HBASE_HOME does not exist! HBase imports will fail."
      echo 'Please set $HBASE_HOME to the root of your HBase installation.'
    fi
    ## Moved to be a runtime check in sqoop.
    #if [ ! -d "${HCAT_HOME}" ]; then
    #  echo "Warning: $HCAT_HOME does not exist! HCatalog jobs will fail."
    #  echo 'Please set $HCAT_HOME to the root of your HCatalog installation.'
    #fi
    #if [ ! -d "${ACCUMULO_HOME}" ]; then
    #  echo "Warning: $ACCUMULO_HOME does not exist! Accumulo imports will fail."
    #  echo 'Please set $ACCUMULO_HOME to the root of your Accumulo installation.'
    #fi
    #if [ ! -d "${ZOOKEEPER_HOME}" ]; then
    #  echo "Warning: $ZOOKEEPER_HOME does not exist! Accumulo imports will fail."
    #  echo 'Please set $ZOOKEEPER_HOME to the root of your Zookeeper installation.'
    #fi
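
    To verify the installation after these edits, print the version (the exact output depends on the build):
    sqoop version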
    
  • Import data from MySQL into HDFS
    • Start the Hadoop cluster
    start-all.sh
    
    • Create a new table tb_order in MySQL
    CREATE TABLE mytest.`tb_order` (
      `id` INT(11) NOT NULL AUTO_INCREMENT,
      `name` VARCHAR(50) NOT NULL,
      `amount` DECIMAL(10,2) NOT NULL,
      `create_time` TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
      PRIMARY KEY (`id`)
    ) ENGINE=INNODB DEFAULT CHARSET=utf8;
    
    • Insert a row of test data, for example:
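    -- sample values for illustration; not from the original article
    INSERT INTO mytest.tb_order (name, amount) VALUES ('order-001', 99.99);
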
    • Execute the import command
    sqoop import --connect jdbc:mysql://master:3306/mytest --username root --password wujinlei --table tb_order --target-dir /home/wujinlei/work/mytest/tb_order -m 1
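    (The -m 1 option runs the import with a single map task, which is why a single part-m-00000 file appears below.)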
    
    • View imported files
    hdfs dfs -ls /home/wujinlei/work/mytest/tb_order
    hadoop fs -cat /home/wujinlei/work/mytest/tb_order/part-m-00000
    
  • Import into Hive
    • Create a Hive table (the comma field delimiter matches Sqoop's default output format)
    create table tb_order(id int,name string,amount decimal(10,2),create_time string) row format delimited fields terminated by ','
    lines terminated by '\n';
    
    • Load the imported HDFS data into the Hive table
    load data inpath '/home/wujinlei/work/mytest/tb_order' overwrite into table tb_order;
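    Note: load data inpath moves (rather than copies) the files from the source HDFS path into the Hive warehouse, and overwrite replaces any existing data in the table.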
    
    • Verify the data
    select * from tb_order;
    
