Solving the problem of connecting a local IDEA project to a remote Hive

1: First create a Maven project. I will not go over the steps for creating the project; if you are not sure how, please read my other blog post. Once it is created, add the dependencies to pom.xml. Pay particular attention to hadoop-client here: do not use the default hadoop-client version that spark-core pulls in.

<dependencies>
    <!-- https://mvnrepository.com/artifact/org.apache.spark/spark-sql -->
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-sql_2.11</artifactId>
        <version>1.6.0</version>
    </dependency>
    <!-- https://mvnrepository.com/artifact/org.apache.spark/spark-core -->
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-core_2.11</artifactId>
        <version>1.6.0</version>
        <exclusions>
            <exclusion>
                <!-- The default hadoop-client here is version 1.0.4 and must be excluded -->
                <groupId>org.apache.hadoop</groupId>
                <artifactId>hadoop-client</artifactId>
            </exclusion>
        </exclusions>
    </dependency>
    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-client</artifactId>
        <version>2.6.0</version>
    </dependency>
    <!-- https://mvnrepository.com/artifact/org.apache.spark/spark-hive -->
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-hive_2.11</artifactId>
        <version>1.6.0</version>
    </dependency>

    <!-- mysql-->
    <dependency>
        <groupId>mysql</groupId>
        <artifactId>mysql-connector-java</artifactId>
        <version>5.1.32</version>
    </dependency>
</dependencies>

I am using version 1.6.0 here; if your versions match, you can copy this block directly.
2: The first small step is done. Next, put hive-site.xml into your project's resources directory and write a small demo (both shown below):
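For the hive-site.xml part, a minimal sketch for remote access usually only needs to point at the metastore service; the host name below is a placeholder and 9083 is just the common default port, so substitute your own values:

<configuration>
    <property>
        <!-- Thrift URI of the remote Hive metastore; replace host/port with your own -->
        <name>hive.metastore.uris</name>
        <value>thrift://your-metastore-host:9083</value>
    </property>
</configuration>

And the demo program itself: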

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.sql.hive.HiveContext

    val conf = new SparkConf()
      .setMaster("local").setAppName("test")
    val sc = new SparkContext(conf)
    println("program start!")
    // HiveContext reads hive-site.xml from the classpath (the resources directory)
    val hiveContext = new HiveContext(sc)
    hiveContext.sql(
      """
        |show databases
      """.stripMargin).show()

Right-click to execute, and the error is as follows:

java.io.IOException: Could not locate executable null\bin\winutils.exe in the Hadoop binaries.

(Screenshots: Error 1 and Error 2.)
Two errors are reported here. The first one has been hit by many people: Hadoop is not installed locally. As long as your project touches nothing Hadoop-related this error has no effect on the program itself, but as soon as you need to connect to Hadoop, a local Hadoop installation is required. (The "null" in "null\bin\winutils.exe" is simply HADOOP_HOME / hadoop.home.dir not being set.) The second error is likewise caused by not having a local Hadoop installed, so:

3: Install a local Hadoop.
Download the Hadoop 2.6.0 tar package (or whichever version matches your cluster) and unpack it to a local directory. Because this is a local Hadoop on Windows, some extra plug-in files are needed: after downloading them, unpack them and copy winutils.exe and hadoop.dll into Hadoop's bin directory. (These plug-ins are scarce resources; after searching for a long time I have put all of them at the end of this post. They are for Hadoop 2.4 and above; the hadoop 2.2 build on GitHub applies to Hadoop 2 up to 2.4 and below.)
Then run the demo program above again. You also need to configure the environment variables, or, if this is just for testing, write the Hadoop path directly into the program first (an environment-variable alternative is shown after the code):

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.sql.hive.HiveContext

    object HiveTest {
      def main(args: Array[String]): Unit = {
        // point the program at the local Hadoop directory that contains bin\winutils.exe
        System.setProperty("hadoop.home.dir", "D:\\hadoopHome\\hadoop-2.6.0")
        val conf = new SparkConf()
          .setMaster("local").setAppName("test")
        val sc = new SparkContext(conf)
        println("program start!")
        val hiveContext = new HiveContext(sc)
        hiveContext.sql(
          """
            |show databases
          """.stripMargin).show()
      }
    }
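If you prefer the environment-variable route mentioned above instead of hard-coding hadoop.home.dir, a minimal sketch on Windows (same example directory as above; assumptions: you restart IDEA afterwards so the new variable is picked up, and some setups also add %HADOOP_HOME%\bin to Path by hand):

    :: persist HADOOP_HOME for the current user; only newly started processes see it
    setx HADOOP_HOME "D:\hadoopHome\hadoop-2.6.0"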

However, running the program again raises another error:

Exception in thread "main" java.lang.RuntimeException: java.lang.RuntimeException: The root scratch dir: /tmp/hive on HDFS should be writable. Current permissions are: rwx------

You can see it has no write permission, so grant it. In the Hadoop bin directory, execute:

    winutils.exe chmod 777 \tmp\hive
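To double-check that the change took effect, you can list the directory again; this assumes your winutils build supports the ls subcommand:

    winutils.exe ls \tmp\hive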
Then run the program again. This time it connects to Hive and the result is displayed directly. At the very end there is still an error about failing to delete the temporary log directory, but it does not affect the program; packaged and run on the cluster it throws no errors at all, so it is presumably a Spark bug on Windows, and you can use this method with confidence.
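Once the connection works, real queries go through the same HiveContext. A quick sketch, where the database and table names are placeholders of my own rather than anything from this post:

    // switch database and read a few rows through the remote metastore
    hiveContext.sql("use test_db")
    hiveContext.sql("select * from test_table limit 10").show()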
Hadoop 2.6.0 download address
Local hadoop plug-in Baidu network disk address:
link: https://pan.baidu.com/s/126nHNWLY28hefuzdG52Nog
extraction code: iisf


Origin blog.csdn.net/qq_39719415/article/details/99735803