How to create a Kudu-linked table with Spark SQL

Kudu's official documentation shows how Spark integrates with Kudu. From the demo in that documentation, you can see that a Kudu table can only be created through NoSQL API calls; you cannot create a Kudu table directly with Spark SQL. And after Spark SQL reads a Kudu table, the result must first be registered as a temporary table before it can be queried with SQL statements.

Is there a way to create a linked Kudu table directly by submitting a CREATE TABLE DDL statement in Spark SQL?
The answer is yes. Spark SQL provides the USING ... OPTIONS syntax, which creates tables backed by external data sources, and we can use it to create a Kudu data source table.
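As a sketch, the general shape of this syntax is as follows (the names in angle brackets are placeholders, not real identifiers):

```sql
-- General pattern of a Spark SQL data source table definition:
CREATE TABLE <spark_table_name>(<column definitions>)
USING <data_source_provider_class>
OPTIONS(<key> "<value>", ...);
```

The provider class named in USING tells Spark which data source implementation handles reads and writes, and the OPTIONS key/value pairs are passed to that implementation.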

Suppose we have already created a Kudu table named kudu_test through the API, and we now want to create a linked table spark_kudu_test with Spark SQL, so that we can operate on the Kudu data through Spark SQL. The steps are as follows:

  1. Copy the jar package kudu-spark2_2.11-1.10.0.jar to the /opt directory. The jar can be downloaded from the Maven repository.
  2. Enter the Spark SQL command line: `./spark-sql --jars /opt/kudu-spark2_2.11-1.10.0.jar`
  3. Execute the table creation statement:
CREATE TABLE spark_kudu_test(id1 int, id2 int, id3 string) USING org.apache.kudu.spark.kudu OPTIONS("kudu.master" "node1:7051,node2:7051,node3:7051", "kudu.table" "kudu_test");

  4. The table linked to the Kudu table is now created.
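Once the linked table exists, the Kudu data can be read and written with ordinary SQL statements. A minimal sketch, assuming kudu_test contains rows with the columns defined above (whether INSERT is supported depends on the kudu-spark version in use):

```sql
-- Read the Kudu data through the linked table
SELECT id1, id2, id3 FROM spark_kudu_test WHERE id1 > 0;

-- Write through to the underlying kudu_test table
INSERT INTO spark_kudu_test VALUES (1, 2, 'hello');
```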


Origin blog.csdn.net/qqqq0199181/article/details/104017050