[Translation] Flink Table API and SQL

This is a translation of the official documentation: https://ci.apache.org/projects/flink/flink-docs-release-1.9/dev/table/

I had not used Flink's Table API or SQL before and only recently started using some of their features, so I am translating the corresponding official documentation first to make it easier to work through at my own pace.

 

 

-----------------------------------------------

Apache Flink features two relational APIs - the Table API and SQL - for unified stream and batch processing. The Table API is a language-integrated query API for Scala and Java that allows composing queries from relational operators such as selection, filter, and join in a very intuitive way. Flink's SQL support is based on Apache Calcite, which implements the SQL standard. Queries specified in either interface have the same semantics and produce the same result, regardless of whether the input is a batch input (DataSet) or a stream input (DataStream).
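As a minimal sketch (Java, Flink 1.9) of the two interfaces, the following fragment expresses the same aggregation once with the Table API and once with SQL. It assumes a StreamTableEnvironment named tableEnv in which a table "Orders" with columns product and amount has already been registered; these names are hypothetical.

import org.apache.flink.table.api.Table;

// Table API: relational operators composed in the host language
Table apiResult = tableEnv.scan("Orders")
        .groupBy("product")
        .select("product, amount.sum as total");

// SQL: the same query, with the same semantics and the same result
Table sqlResult = tableEnv.sqlQuery(
        "SELECT product, SUM(amount) AS total FROM Orders GROUP BY product");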

The Table API and the SQL interface are tightly integrated with each other as well as with Flink's DataStream and DataSet APIs. You can easily switch between all APIs and the libraries that build upon them. For example, you can extract patterns from a DataStream using the CEP library and later use the Table API to analyze the patterns, or you can scan, filter, and aggregate a batch table with a SQL query before running a Gelly graph algorithm on the preprocessed data.
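To illustrate that switching, here is a minimal sketch (Java, Flink 1.9) that converts a DataStream into a Table, filters it with the Table API, and converts the result back into a DataStream; the stream contents and field names are made up for the example.

import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.table.api.Table;
import org.apache.flink.table.api.java.StreamTableEnvironment;
import org.apache.flink.types.Row;

StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
StreamTableEnvironment tableEnv = StreamTableEnvironment.create(env);

// A hypothetical stream of (name, amount) tuples
DataStream<Tuple2<String, Long>> payments = env.fromElements(
        Tuple2.of("alice", 10L), Tuple2.of("bob", 25L));

// DataStream -> Table, naming the fields
Table t = tableEnv.fromDataStream(payments, "name, amount");

// Table -> DataStream (append-only stream of Row)
DataStream<Row> back = tableEnv.toAppendStream(t.filter("amount > 15"), Row.class);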

Please note that the Table API and SQL are not yet feature complete and are under active development. Not every operation is supported by every combination of [Table API, SQL] and [stream, batch] input.

Dependency Structure

Starting from Flink 1.9, Flink provides two different planner implementations for evaluating Table & SQL API programs: the Blink planner and the old planner that was available before Flink 1.9. A planner is responsible for translating relational operators into an executable, optimized Flink job. The two planners come with different optimization rules and runtime classes. They may also differ in the set of supported features.

Note: for production use cases, the old planner that was present before Flink 1.9 is recommended.

All Table API and SQL components are bundled in the flink-table or flink-table-blink Maven artifacts.

The following dependencies are relevant for most projects:

  • flink-table-common: A common module for extending the table ecosystem with custom functions, formats, and other plugins.
  • flink-table-api-java: The Table & SQL API for pure table programs using the Java programming language (in early development stage, not recommended!).
  • flink-table-api-scala: The Table & SQL API for pure table programs using the Scala programming language (in early development stage, not recommended!).
  • flink-table-api-java-bridge: The Table & SQL API with DataStream/DataSet API support, using the Java programming language.
  • flink-table-api-scala-bridge: The Table & SQL API with DataStream/DataSet API support, using the Scala programming language.
  • flink-table-planner: The table program planner and runtime. This was the only planner of Flink before version 1.9. It is still the recommended one.
  • flink-table-planner-blink: The new Blink planner.
  • flink-table-runtime-blink: The new Blink runtime.
  • flink-table-uber: Packages the API modules above plus the old planner into a distribution for most Table & SQL API use cases. By default, the uber JAR file flink-table-*.jar is located in the /lib directory of a Flink release.
  • flink-table-uber-blink: Packages the API modules above plus the Blink-specific modules into a distribution for most Table & SQL API use cases. By default, the uber JAR file flink-table-blink-*.jar is located in the /lib directory of a Flink release.

For more information on how to switch between the old and the new Blink planner in table programs, please see the Common API page.
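As a rough sketch of what that switch looks like in code (Java, Flink 1.9), the planner is chosen through EnvironmentSettings when the table environment is created; see the Common API page for the authoritative description.

import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.java.StreamTableEnvironment;

StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

// Old planner (the pre-1.9 planner, recommended for production in 1.9)
EnvironmentSettings oldSettings = EnvironmentSettings.newInstance()
        .useOldPlanner()
        .inStreamingMode()
        .build();
StreamTableEnvironment oldTableEnv = StreamTableEnvironment.create(env, oldSettings);

// New Blink planner
EnvironmentSettings blinkSettings = EnvironmentSettings.newInstance()
        .useBlinkPlanner()
        .inStreamingMode()
        .build();
StreamTableEnvironment blinkTableEnv = StreamTableEnvironment.create(env, blinkSettings);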

Table program dependencies

Depending on the target programming language, you need to add the Java or Scala API to a project in order to use the Table API & SQL for defining pipelines:

<!-- Either... -->
<dependency>
  <groupId>org.apache.flink</groupId>
  <artifactId>flink-table-api-java-bridge_2.11</artifactId>
  <version>1.9.0</version>
  <scope>provided</scope>
</dependency>
<!-- or... -->
<dependency>
  <groupId>org.apache.flink</groupId>
  <artifactId>flink-table-api-scala-bridge_2.11</artifactId>
  <version>1.9.0</version>
  <scope>provided</scope>
</dependency>

Additionally, if you want to run Table API & SQL programs locally within your IDE, you must add one of the following sets of modules, depending on which planner you want to use:

 

<!-- Either... (for the old planner that was available before Flink 1.9) -->
<dependency>
  <groupId>org.apache.flink</groupId>
  <artifactId>flink-table-planner_2.11</artifactId>
  <version>1.9.0</version>
  <scope>provided</scope>
</dependency>
<!-- or... (for the new Blink planner) -->
<dependency>
  <groupId>org.apache.flink</groupId>
  <artifactId>flink-table-planner-blink_2.11</artifactId>
  <version>1.9.0</version>
  <scope>provided</scope>
</dependency>

Internally, parts of the table ecosystem are implemented in Scala. Therefore, please make sure to add the following dependency for both batch and streaming applications:

<dependency>
  <groupId>org.apache.flink</groupId>
  <artifactId>flink-streaming-scala_2.11</artifactId>
  <version>1.9.0</version>
  <scope>provided</scope>
</dependency>

Extension Points

For implementing a custom format for interacting with Kafka, or a set of user-defined functions, the following dependency is sufficient and can also be used for JAR files for the SQL Client:

<dependency>
  <groupId>org.apache.flink</groupId>
  <artifactId>flink-table-common</artifactId>
  <version>1.9.0</version>
  <scope>provided</scope>
</dependency>

Currently, the module includes extension points for the following (a minimal example follows the list):

  • SerializationSchemaFactory
  • DeserializationSchemaFactory
  • ScalarFunction
  • TableFunction
  • AggregateFunction

Where to go next?

Translator's note: feel free to follow the author's WeChat public account, which occasionally publishes posts on Flink technology and development.
