Flink 1.9 Table API & SQL Data Type Support

New features in the Flink 1.9 Table API & SQL include the following:

  • New SQL type system: Table API & SQL 1.9 introduces a new SQL type system. The previous type system reused the runtime's TypeInformation, which ran into many limitations in practice. The new SQL type system aligns better with SQL semantics.
  • Initial DDL support: this release adds preliminary support for DDL. Users can define or drop tables with simple statements such as CREATE TABLE and DROP TABLE.
  • Table API enhancements: the Table API originally offered only relational expressions. As of 1.9 it also provides more flexible operations such as Map and FlatMap.
  • Unified Catalog API: Table API & SQL 1.9 introduces a unified Catalog API that makes it easy to integrate with external catalogs. For example, a pluggable integration with the Hive Metastore lets Flink read and process Hive tables directly.
  • Blink planner: the Table API adds support for the Blink planner. Because the runtime underwent substantial changes, the SQL planner on top of it had to be adapted to the new runtime. To affect existing Table API users as little as possible, the community kept the original Flink planner intact and introduced the new Blink planner alongside it to work with the new runtime design.

This article is mainly a translation of the Data Types page (the new SQL type system) from the official documentation: https://ci.apache.org/projects/flink/flink-docs-release-1.9/zh/dev/table/types.html#data-types-in-the-table-api

For historical reasons, before Flink 1.9 the data types of Flink's Table & SQL API were tightly coupled to Flink's TypeInformation. TypeInformation is used in the DataStream and DataSet API and is sufficient to describe all the information needed to serialize and deserialize JVM-based objects in a distributed setting.

However, TypeInformation was not designed to represent logical types independent of an actual JVM class. In the past, it was difficult to map SQL standard types to this abstraction. In addition, some types were not SQL-compliant and were introduced without a bigger picture in mind.

Starting with Flink 1.9, the Table & SQL API receives a new type system that serves as a long-term solution for API stability and standard compliance.

Reworking the type system is a major effort that touches almost all user-facing interfaces. Its introduction therefore spans multiple releases, and the community aims to finish this effort by Flink 1.10.

Because a new planner for table programs is being added at the same time (see FLINK-11439), not every combination of planner and data type is supported. In addition, planners might not support every data type with the desired precision or parameter.

Note: Before using a data type, see the planner compatibility table and the limitations section.

Data Type

A data type describes the logical type of a value in the table ecosystem. It can be used to declare the input and/or output types of operations.

Flink's data types are similar to the SQL standard's data type terminology, but they also contain information about the nullability of a value so that scalar expressions can be handled efficiently.

Examples of data types are:

  • INT
  • INT NOT NULL
  • INTERVAL DAY TO SECOND(3)
  • ROW<myField ARRAY<BOOLEAN>, myOtherField TIMESTAMP(3)>

Data Types in the Table API

Users of the JVM-based API work with instances of org.apache.flink.table.types.DataType within the Table API or when defining connectors, catalogs, or user-defined functions.

A DataType instance has two responsibilities:

Declaration of a logical type: this does not imply a concrete physical representation for transmission or storage, but defines the boundaries between JVM-based languages and the table ecosystem.

Optional: giving hints about the physical representation of data to the planner, which is useful at the edges to other APIs.

For JVM-based languages, all predefined data types are available in org.apache.flink.table.api.DataTypes.

It is recommended to add a star import ('_' in Scala) to your table programs so that you can use a fluent API:

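import org.apache.flink.table.api.DataTypes._
import org.apache.flink.table.types.DataType

// INTERVAL DAY TO SECOND(3), declared with the fluent API
val t: DataType = INTERVAL(DAY(), SECOND(3))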

Physical Hints

Physical hints are required at the edges of the table ecosystem, where the SQL-based type system ends and programming-specific data types are required. Hints indicate the data format that an implementation expects.

For example, a data source could express that it produces values for logical TIMESTAMPs using the java.sql.Timestamp class instead of the default java.time.LocalDateTime. With this information, the runtime is able to convert the produced class into its internal data format. In return, a data sink can declare the data format it consumes from the runtime.

Here are some examples of how to declare a bridging conversion class:

import org.apache.flink.table.api.DataTypes
import org.apache.flink.table.types.DataType

// tell the runtime to not produce or consume java.time.LocalDateTime instances
// but java.sql.Timestamp
val timestampType: DataType = DataTypes.TIMESTAMP(3).bridgedTo(classOf[java.sql.Timestamp])

// tell the runtime to not produce or consume boxed integer arrays
// but primitive int arrays
val intArrayType: DataType = DataTypes.ARRAY(DataTypes.INT().notNull()).bridgedTo(classOf[Array[Int]])

Note: Physical hints are usually only required when extending the API. Users of predefined sources/sinks/functions do not need to define such hints. Hints within a table program (e.g. field.cast(TIMESTAMP(3).bridgedTo(Timestamp.class))) are ignored.
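To make the idea of a bridging class more concrete, the sketch below shows the value-level conversion between java.time.LocalDateTime (the default class for TIMESTAMP) and java.sql.Timestamp. It uses only JDK classes and is not Flink code; the class and method names are invented for this illustration.

```java
import java.sql.Timestamp;
import java.time.LocalDateTime;

final class TimestampBridgingSketch {
    // Convert the default representation into the legacy JDBC class.
    static Timestamp toSqlTimestamp(LocalDateTime value) {
        return Timestamp.valueOf(value);
    }

    // Convert the legacy JDBC class back into the default representation.
    static LocalDateTime toLocalDateTime(Timestamp value) {
        return value.toLocalDateTime();
    }

    public static void main(String[] args) {
        LocalDateTime original = LocalDateTime.of(2019, 8, 22, 10, 15, 30, 123_000_000);
        Timestamp bridged = toSqlTimestamp(original);
        // Both classes carry the same wall-clock fields, so the round trip is lossless here.
        System.out.println(toLocalDateTime(bridged).equals(original));
    }
}
```

Both classes describe a wall-clock value without a zone, which is presumably why either can back the same logical TIMESTAMP type.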

Planner Compatibility

As mentioned above, reworking the type system spans multiple releases, and the support for each data type depends on the planner in use. This section summarizes the most significant differences.

Flink 1.9 incorporates the Blink code, which means that when developing with the Table API you can choose between two different planners:

One is Flink's original planner, referred to as the Old Planner.

The other is the Blink planner, referred to as the Blink Planner.

Old Planner

Flink's old planner, introduced before Flink 1.9, primarily supports type information and has only limited support for data types. You can declare data types that can be translated into type information so that the old planner understands them.

The following table summarizes the differences between data types and type information. Most simple types and the row type remain the same. Time types, array types, and the decimal type need special attention. Hints other than the ones listed are not allowed.

For the Type Information column, the table omits the prefix org.apache.flink.table.api.Types.

For the Data Type Representation column, the table omits the prefix org.apache.flink.table.api.DataTypes.

Type Information | Java Expression String | Data Type Representation
STRING() | STRING | STRING()
BOOLEAN() | BOOLEAN | BOOLEAN()
BYTE() | BYTE | TINYINT()
SHORT() | SHORT | SMALLINT()
INT() | INT | INT()
LONG() | LONG | BIGINT()
FLOAT() | FLOAT | FLOAT()
DOUBLE() | DOUBLE | DOUBLE()
ROW(...) | ROW<...> | ROW(...)
BIG_DEC() | DECIMAL | [DECIMAL()]
SQL_DATE() | SQL_DATE | DATE().bridgedTo(java.sql.Date.class)
SQL_TIME() | SQL_TIME | TIME(0).bridgedTo(java.sql.Time.class)
SQL_TIMESTAMP() | SQL_TIMESTAMP | TIMESTAMP(3).bridgedTo(java.sql.Timestamp.class)
INTERVAL_MONTHS() | INTERVAL_MONTHS | INTERVAL(MONTH()).bridgedTo(Integer.class)
INTERVAL_MILLIS() | INTERVAL_MILLIS | INTERVAL(DataTypes.SECOND(3)).bridgedTo(Long.class)
PRIMITIVE_ARRAY(...) | PRIMITIVE_ARRAY<...> | ARRAY(DATATYPE.notNull().bridgedTo(PRIMITIVE.class))
PRIMITIVE_ARRAY(BYTE()) | PRIMITIVE_ARRAY<BYTE> | BYTES()
OBJECT_ARRAY(...) | OBJECT_ARRAY<...> | ARRAY(DATATYPE.bridgedTo(OBJECT.class))
MULTISET(...) | | MULTISET(...)
MAP(..., ...) | MAP<...,...> | MAP(...)
other generic types | | ANY(...)

Note: If there is a problem with the new type system, users can fall back at any time to the type information defined in org.apache.flink.table.api.Types.

New Blink Planner

The new Blink planner supports all types of the old planner. This includes, in particular, the Java expression strings and type information listed above.

The following data types are supported:

Data Type | Remarks for Data Type
STRING | CHAR and VARCHAR are not supported yet.
BOOLEAN |
BYTES | BINARY and VARBINARY are not supported yet.
DECIMAL | Supports fixed precision and scale.
TINYINT |
SMALLINT |
INTEGER |
BIGINT |
FLOAT |
DOUBLE |
DATE |
TIME | Supports only a precision of 0.
TIMESTAMP | Supports only a precision of 3.
TIMESTAMP WITH LOCAL TIME ZONE | Supports only a precision of 3.
INTERVAL | Supports only interval of MONTH and SECOND(3).
ARRAY |
MULTISET |
MAP |
ROW |
ANY |

Limitations

Java Expression String: Java expression strings in the Table API, such as table.select("field.cast(STRING)"), have not been updated to the new type system yet. Use the string representations declared in the Old Planner section.

Connector Descriptors and SQL Client: descriptor string representations have not been updated to the new type system yet. Use the string representation declared in the Connect to External Systems section.

User-defined Functions: user-defined functions cannot declare a data type yet.

List of Data Types

This section lists all predefined data types. For the JVM-based Table API, these types are also available in org.apache.flink.table.api.DataTypes.

Character Strings

CHAR

Data type of a fixed-length character string.

SQL

CHAR
CHAR(n)

JAVA/SCALA

DataTypes.CHAR(n)

The type can be declared using CHAR(n), where n is the number of code points. n must have a value between 1 and 2,147,483,647 (both inclusive). If no length is specified, n is equal to 1.


VARCHAR / STRING

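Data type of a variable-length character string.

SQL

VARCHAR
VARCHAR(n)

STRING

JAVA/SCALA

DataTypes.VARCHAR(n)

DataTypes.STRING()

The type can be declared using VARCHAR(n), where n is the maximum number of code points. n must have a value between 1 and 2,147,483,647 (both inclusive). If no length is specified, n is equal to 1.

STRING is a synonym for VARCHAR(2147483647).

Bridging to JVM Types

Java Type | Input | Output | Remarks
java.lang.String | X | X | Default
byte[] | X | X | Assumes UTF-8 encoding.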
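Because n counts code points rather than Java chars or bytes, the three lengths can differ for the same string. The following JDK-only sketch illustrates the difference; the helper names (codePoints, fitsVarchar) are hypothetical and not part of Flink.

```java
import java.nio.charset.StandardCharsets;

final class CharLengthSketch {
    // Number of Unicode code points, the unit in which CHAR(n) and VARCHAR(n) are declared.
    static int codePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    // Hypothetical check: would the value fit into VARCHAR(n)?
    static boolean fitsVarchar(String s, int n) {
        return codePoints(s) <= n;
    }

    public static void main(String[] args) {
        String s = "a\uD83D\uDE00"; // 'a' followed by one emoji stored as a surrogate pair
        System.out.println(s.length());     // UTF-16 code units
        System.out.println(codePoints(s));  // code points
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        System.out.println(utf8.length);    // bytes in the UTF-8 bridging form
        System.out.println(new String(utf8, StandardCharsets.UTF_8).equals(s));
    }
}
```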

Binary Strings

BINARY

Data type of a fixed-length binary string (= a sequence of bytes).

SQL

BINARY
BINARY(n)

JAVA/SCALA

DataTypes.BINARY(n)

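The type can be declared using BINARY(n), where n is the number of bytes. n must have a value between 1 and 2,147,483,647 (both inclusive). If no length is specified, n is equal to 1.

Bridging to JVM Types

Java Type | Input | Output | Remarks
byte[] | X | X | Default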

Exact Numerics

DECIMAL

Data type of a decimal number with fixed precision and scale.

SQL

DECIMAL
DECIMAL(p)
DECIMAL(p, s)

DEC
DEC(p)
DEC(p, s)

NUMERIC
NUMERIC(p)
NUMERIC(p, s)

JAVA/SCALA

DataTypes.DECIMAL(p, s)

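The type can be declared using DECIMAL(p, s), where p is the number of digits in a number (precision) and s is the number of digits to the right of the decimal point (scale). p must have a value between 1 and 38 (both inclusive). s must have a value between 0 and p (both inclusive). The default value for p is 10. The default value for s is 0.

NUMERIC(p, s) and DEC(p, s) are synonyms for this type.

Bridging to JVM Types

Java Type | Input | Output | Remarks
java.math.BigDecimal | X | X | Default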
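The relationship between precision and scale can be checked against java.math.BigDecimal, the default bridging class. The sketch below is an illustration only: fits is a hypothetical helper, not a Flink API, and it assumes that a value fits DECIMAL(p, s) when it needs at most s fractional digits and at most p - s integer digits.

```java
import java.math.BigDecimal;

final class DecimalSketch {
    // Hypothetical helper: can the value be stored in DECIMAL(p, s) without rounding?
    static boolean fits(BigDecimal value, int p, int s) {
        if (p < 1 || p > 38 || s < 0 || s > p) {
            throw new IllegalArgumentException("require 1 <= p <= 38 and 0 <= s <= p");
        }
        if (value.signum() == 0) {
            return true; // zero fits every valid declaration
        }
        BigDecimal v = value.stripTrailingZeros();
        int fractionDigits = Math.max(v.scale(), 0);
        int integerDigits = Math.max(v.precision() - v.scale(), 0);
        return fractionDigits <= s && integerDigits <= p - s;
    }

    public static void main(String[] args) {
        BigDecimal value = new BigDecimal("123.45");
        System.out.println(value.precision() + ", " + value.scale());
        System.out.println(fits(value, 5, 2)); // 3 integer digits, 2 fraction digits
        System.out.println(fits(value, 4, 2)); // only 2 integer digits available
    }
}
```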

TINYINT

Data type of a 1-byte signed integer with values from -128 to 127.

SQL

TINYINT

JAVA/SCALA

DataTypes.TINYINT()

Bridging to JVM Types

Java Type | Input | Output | Remarks
java.lang.Byte | X | X | Default
byte | X | (X) | Output only if type is not nullable.

SMALLINT

Data type of a 2-byte signed integer with values from -32,768 to 32,767.

SQL

SMALLINT

JAVA/SCALA

DataTypes.SMALLINT()

Bridging to JVM Types

Java Type | Input | Output | Remarks
java.lang.Short | X | X | Default
short | X | (X) | Output only if type is not nullable.

INT

Data type of a 4-byte signed integer with values from -2,147,483,648 to 2,147,483,647.

SQL

INT

INTEGER

JAVA/SCALA

DataTypes.INT()

INTEGER is a synonym for this type.

Bridging to JVM Types

Java Type | Input | Output | Remarks
java.lang.Integer | X | X | Default
int | X | (X) | Output only if type is not nullable.

BIGINT

Data type of an 8-byte signed integer with values from -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807.

SQL

BIGINT

JAVA/SCALA

DataTypes.BIGINT()

Bridging to JVM Types

Java Type | Input | Output | Remarks
java.lang.Long | X | X | Default
long | X | (X) | Output only if type is not nullable.
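The remark "Output only if type is not nullable" on the primitive rows follows from plain JVM semantics: a primitive cannot hold null, so a SQL NULL can only be carried by the boxed class. The sketch below demonstrates this with JDK code only; the class and method names are made up for the example.

```java
final class NullabilitySketch {
    // A nullable column value is carried as a boxed Integer; null stands for SQL NULL.
    static int toPrimitiveOrDefault(Integer boxed, int defaultValue) {
        return boxed == null ? defaultValue : boxed;
    }

    // Auto-unboxing a null reference throws, which is why primitive output needs NOT NULL.
    static boolean unboxingFails(Integer boxed) {
        try {
            int ignored = boxed;
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(Byte.MIN_VALUE + " to " + Byte.MAX_VALUE);   // TINYINT range
        System.out.println(Short.MIN_VALUE + " to " + Short.MAX_VALUE); // SMALLINT range
        System.out.println(unboxingFails(null));
        System.out.println(toPrimitiveOrDefault(null, -1));
    }
}
```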

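Approximate Numerics

FLOAT

Data type of a 4-byte single precision floating point number. Compared to the SQL standard, the type does not take parameters.

SQL

FLOAT

JAVA/SCALA

DataTypes.FLOAT()

Bridging to JVM Types

Java Type | Input | Output | Remarks
java.lang.Float | X | X | Default
float | X | (X) | Output only if type is not nullable.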

DOUBLE

Data type of an 8-byte double precision floating point number.

SQL

DOUBLE

DOUBLE PRECISION

JAVA/SCALA

DataTypes.DOUBLE()

DOUBLE PRECISION is a synonym for this type.

Bridging to JVM Types

Java Type | Input | Output | Remarks
java.lang.Double | X | X | Default
double | X | (X) | Output only if type is not nullable.

Date and Time

DATE

Data type of a date consisting of year-month-day with values ranging from 0000-01-01 to 9999-12-31. Compared to the SQL standard, the range starts at year 0000.

SQL

DATE

JAVA/SCALA

DataTypes.DATE()

Bridging to JVM Types

Java Type | Input | Output | Remarks
java.time.LocalDate | X | X | Default
java.sql.Date | X | X |
java.lang.Integer | X | X | Describes the number of days since epoch.
int | X | (X) | Describes the number of days since epoch. Output only if type is not nullable.
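The integer representation ("number of days since epoch") corresponds to what java.time.LocalDate exposes as the epoch day. A small JDK-only sketch of the conversions, with invented helper names:

```java
import java.time.LocalDate;

final class DateBridgingSketch {
    // Days since 1970-01-01, matching the Integer/int representation in the table.
    static int toEpochDays(LocalDate date) {
        return Math.toIntExact(date.toEpochDay());
    }

    static LocalDate fromEpochDays(int days) {
        return LocalDate.ofEpochDay(days);
    }

    public static void main(String[] args) {
        LocalDate date = LocalDate.of(2019, 8, 22);
        int days = toEpochDays(date);
        System.out.println(days);
        System.out.println(fromEpochDays(days).equals(date));
        // java.sql.Date bridges to the same calendar date.
        System.out.println(java.sql.Date.valueOf(date).toLocalDate().equals(date));
    }
}
```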

TIME

Data type of a time without time zone consisting of hour:minute:second[.fractional] with up to nanosecond precision and values ranging from 00:00:00.000000000 to 23:59:59.999999999. Compared to the SQL standard, leap seconds (23:59:60 and 23:59:61) are not supported, as the semantics are closer to java.time.LocalTime. A time with time zone is not provided.

SQL

TIME
TIME(p)

JAVA/SCALA

DataTypes.TIME(p)

Bridging to JVM Types

Java Type | Input | Output | Remarks
java.time.LocalTime | X | X | Default
java.sql.Time | X | X |
java.lang.Integer | X | X | Describes the number of milliseconds of the day.
int | X | (X) | Describes the number of milliseconds of the day. Output only if type is not nullable.
java.lang.Long | X | X | Describes the number of nanoseconds of the day.
long | X | (X) | Describes the number of nanoseconds of the day. Output only if type is not nullable.
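The two numeric representations differ only in their unit: milliseconds of the day fit into an int, while nanoseconds of the day require a long. The following JDK-only sketch shows both; the helper names are illustrative and not part of Flink.

```java
import java.time.LocalTime;

final class TimeBridgingSketch {
    // Milliseconds since midnight; the maximum (86,399,999) fits into an int.
    static int toMillisOfDay(LocalTime time) {
        return (int) (time.toNanoOfDay() / 1_000_000L);
    }

    // Nanoseconds since midnight; requires a long.
    static long toNanosOfDay(LocalTime time) {
        return time.toNanoOfDay();
    }

    static LocalTime fromMillisOfDay(int millis) {
        return LocalTime.ofNanoOfDay(millis * 1_000_000L);
    }

    public static void main(String[] args) {
        LocalTime time = LocalTime.of(10, 15, 30, 123_000_000);
        System.out.println(toMillisOfDay(time));
        System.out.println(toNanosOfDay(time));
        System.out.println(fromMillisOfDay(toMillisOfDay(time)).equals(time));
    }
}
```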

TIMESTAMP

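Data type of a timestamp without time zone consisting of year-month-day hour:minute:second[.fractional] with up to nanosecond precision and values ranging from 0000-01-01 00:00:00.000000000 to 9999-12-31 23:59:59.999999999.

Compared to the SQL standard, leap seconds (23:59:60 and 23:59:61) are not supported, as the semantics are closer to java.time.LocalDateTime.

A conversion from and to BIGINT (a JVM long type) is not supported, as this would imply a time zone. However, this type is time zone free. For more java.time.Instant-like semantics, use TIMESTAMP WITH LOCAL TIME ZONE.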
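The reason a long conversion would imply a time zone can be seen with plain java.time: the same wall-clock value maps to different epoch milliseconds depending on the zone chosen. This is a JDK-only sketch with an invented helper name, not Flink code.

```java
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZoneOffset;

final class TimestampZoneSketch {
    // Turning a zone-free timestamp into epoch milliseconds requires picking a zone.
    static long epochMillis(LocalDateTime timestamp, ZoneId zone) {
        return timestamp.atZone(zone).toInstant().toEpochMilli();
    }

    public static void main(String[] args) {
        LocalDateTime timestamp = LocalDateTime.of(2019, 8, 22, 10, 0);
        long inUtc = epochMillis(timestamp, ZoneOffset.UTC);
        long inPlus8 = epochMillis(timestamp, ZoneOffset.ofHours(8));
        // Same wall-clock value, two different instants, eight hours apart.
        System.out.println(inUtc - inPlus8);
    }
}
```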

SQL

TIMESTAMP
TIMESTAMP(p)

TIMESTAMP WITHOUT TIME ZONE
TIMESTAMP(p) WITHOUT TIME ZONE

JAVA/SCALA

DataTypes.TIMESTAMP(p)

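The type can be declared using TIMESTAMP(p), where p is the number of digits of fractional seconds (precision). p must have a value between 0 and 9 (both inclusive). If no precision is specified, p is equal to 6.

TIMESTAMP(p) WITHOUT TIME ZONE is a synonym for this type.

Bridging to JVM Types

Java Type | Input | Output | Remarks
java.time.LocalDateTime | X | X | Default
java.sql.Timestamp | X | X |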

TIMESTAMP WITH TIME ZONE

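Data type of a timestamp with time zone consisting of year-month-day hour:minute:second[.fractional] zone with up to nanosecond precision and values ranging from 0000-01-01 00:00:00.000000000 +14:59 to 9999-12-31 23:59:59.999999999 -14:59.

Compared to the SQL standard, leap seconds (23:59:60 and 23:59:61) are not supported, as the semantics are closer to java.time.OffsetDateTime.

Compared to TIMESTAMP WITH LOCAL TIME ZONE, the time zone offset information is physically stored in every datum. It is used individually for every computation, visualization, or communication with external systems.

SQL

TIMESTAMP WITH TIME ZONE
TIMESTAMP(p) WITH TIME ZONE

JAVA/SCALA

DataTypes.TIMESTAMP_WITH_TIME_ZONE(p)

The type can be declared using TIMESTAMP(p) WITH TIME ZONE, where p is the number of digits of fractional seconds (precision). p must have a value between 0 and 9 (both inclusive). If no precision is specified, p is equal to 6.

Bridging to JVM Types

Java Type | Input | Output | Remarks
java.time.OffsetDateTime | X | X | Default
java.time.ZonedDateTime | X | | Ignores the zone ID.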
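"Ignores the zone ID" can be pictured as keeping only the resolved offset of a ZonedDateTime. The JDK-only sketch below shows that conversion; dropZoneId is a hypothetical name, and whether Flink performs exactly this call is an assumption.

```java
import java.time.LocalDateTime;
import java.time.OffsetDateTime;
import java.time.ZoneId;
import java.time.ZonedDateTime;

final class OffsetBridgingSketch {
    // Keep the instant and the offset, discard the region ID such as Europe/Berlin.
    static OffsetDateTime dropZoneId(ZonedDateTime value) {
        return value.toOffsetDateTime();
    }

    public static void main(String[] args) {
        ZonedDateTime zoned = ZonedDateTime.of(
                LocalDateTime.of(2019, 8, 22, 10, 0), ZoneId.of("Europe/Berlin"));
        OffsetDateTime offset = dropZoneId(zoned);
        System.out.println(zoned);  // carries offset and region ID
        System.out.println(offset); // carries the offset only
        System.out.println(offset.toInstant().equals(zoned.toInstant()));
    }
}
```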

TIMESTAMP WITH LOCAL TIME ZONE

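Data type of a timestamp with local time zone consisting of year-month-day hour:minute:second[.fractional] zone with up to nanosecond precision and values ranging from 0000-01-01 00:00:00.000000000 +14:59 to 9999-12-31 23:59:59.999999999 -14:59.

Leap seconds (23:59:60 and 23:59:61) are not supported, as the semantics are closer to java.time.OffsetDateTime.

Compared to TIMESTAMP WITH TIME ZONE, the time zone offset information is not physically stored in every datum. Instead, the type assumes java.time.Instant semantics in the UTC time zone at the edges of the table ecosystem. Every datum is interpreted in the local time zone configured in the current session for computation and visualization.

This type fills the gap between time zone free and time zone mandatory timestamp types by allowing UTC timestamps to be interpreted according to the configured session time zone.

SQL

TIMESTAMP WITH LOCAL TIME ZONE
TIMESTAMP(p) WITH LOCAL TIME ZONE

JAVA/SCALA

DataTypes.TIMESTAMP_WITH_LOCAL_TIME_ZONE(p)

The type can be declared using TIMESTAMP(p) WITH LOCAL TIME ZONE, where p is the number of digits of fractional seconds (precision). p must have a value between 0 and 9 (both inclusive). If no precision is specified, p is equal to 6.

Bridging to JVM Types

Java Type | Input | Output | Remarks
java.time.Instant | X | X | Default
java.lang.Integer | X | X | Describes the number of seconds since epoch.
int | X | (X) | Describes the number of seconds since epoch. Output only if type is not nullable.
java.lang.Long | X | X | Describes the number of milliseconds since epoch.
long | X | (X) | Describes the number of milliseconds since epoch. Output only if type is not nullable.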
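The "stored as a UTC instant, interpreted in the session time zone" behaviour can be imitated with java.time.Instant. In this JDK-only sketch the session zone is simply a method parameter; how Flink obtains the configured session zone is outside the scope of the example.

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZoneOffset;

final class LocalZoneSketch {
    // Interpret a stored UTC instant in the given (hypothetical) session time zone.
    static LocalDateTime inSessionZone(Instant stored, ZoneId sessionZone) {
        return LocalDateTime.ofInstant(stored, sessionZone);
    }

    public static void main(String[] args) {
        Instant stored = Instant.ofEpochMilli(0L); // long bridging: milliseconds since epoch
        System.out.println(stored.getEpochSecond()); // int bridging: seconds since epoch
        System.out.println(inSessionZone(stored, ZoneOffset.UTC));
        System.out.println(inSessionZone(stored, ZoneOffset.ofHours(8)));
    }
}
```

The stored value is identical in both calls; only its rendering changes with the zone.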

INTERVAL YEAR TO MONTH

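Data type for a group of year-month interval types.

The type must be parameterized to one of the following resolutions:

  • interval of years
  • interval of years to months
  • or interval of months

An interval of year-month consists of +years-months with values ranging from -9999-11 to +9999-11.

The value representation is the same for all types of resolutions. For example, an interval of months of 50 is always represented in an interval-of-years-to-months format (with default year precision): +04-02.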

SQL

INTERVAL YEAR
INTERVAL YEAR(p)
INTERVAL YEAR(p) TO MONTH
INTERVAL MONTH

JAVA/SCALA

DataTypes.INTERVAL(DataTypes.YEAR())
DataTypes.INTERVAL(DataTypes.YEAR(p))
DataTypes.INTERVAL(DataTypes.YEAR(p), DataTypes.MONTH())
DataTypes.INTERVAL(DataTypes.MONTH())

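The type can be declared using the above combinations, where p is the number of digits of years (year precision). p must have a value between 1 and 4 (both inclusive). If no year precision is specified, p is equal to 2.

Bridging to JVM Types

Java Type | Input | Output | Remarks
java.time.Period | X | X | Ignores the days part. Default
java.lang.Integer | X | X | Describes the number of months.
int | X | (X) | Describes the number of months. Output only if type is not nullable.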
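The +04-02 example for 50 months can be reproduced with a few lines of arithmetic. The sketch below uses only the JDK; format and toMonths are hypothetical helpers that assume the default year precision of 2.

```java
import java.time.Period;
import java.util.Locale;

final class YearMonthIntervalSketch {
    // Render a number of months as +years-months with a 2-digit year field.
    static String format(int totalMonths) {
        String sign = totalMonths < 0 ? "-" : "+";
        int abs = Math.abs(totalMonths);
        return String.format(Locale.ROOT, "%s%02d-%02d", sign, abs / 12, abs % 12);
    }

    // Integer bridging: total number of months; the days part of a Period is ignored.
    static int toMonths(Period period) {
        return Math.toIntExact(period.toTotalMonths());
    }

    public static void main(String[] args) {
        System.out.println(format(50));
        System.out.println(toMonths(Period.of(4, 2, 15)));
    }
}
```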

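INTERVAL DAY TO SECOND

Data type for a group of day-time interval types.

The type must be parameterized to one of the following resolutions, with up to nanosecond precision:

  • interval of days
  • interval of days to hours
  • interval of days to minutes
  • interval of days to seconds
  • interval of hours
  • interval of hours to minutes
  • interval of hours to seconds
  • interval of minutes
  • interval of minutes to seconds
  • or interval of seconds

An interval of day-time consists of +days hours:minutes:seconds.fractional with values ranging from -999999 23:59:59.999999999 to +999999 23:59:59.999999999. The value representation is the same for all types of resolutions. For example, an interval of 70 seconds is always represented in an interval-of-days-to-seconds format (with default precisions): +00 00:01:10.000000.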

SQL

INTERVAL DAY
INTERVAL DAY(p1)
INTERVAL DAY(p1) TO HOUR
INTERVAL DAY(p1) TO MINUTE
INTERVAL DAY(p1) TO SECOND(p2)
INTERVAL HOUR
INTERVAL HOUR TO MINUTE
INTERVAL HOUR TO SECOND(p2)
INTERVAL MINUTE
INTERVAL MINUTE TO SECOND(p2)
INTERVAL SECOND
INTERVAL SECOND(p2)

JAVA/SCALA

DataTypes.INTERVAL(DataTypes.DAY())
DataTypes.INTERVAL(DataTypes.DAY(p1))
DataTypes.INTERVAL(DataTypes.DAY(p1), DataTypes.HOUR())
DataTypes.INTERVAL(DataTypes.DAY(p1), DataTypes.MINUTE())
DataTypes.INTERVAL(DataTypes.DAY(p1), DataTypes.SECOND(p2))
DataTypes.INTERVAL(DataTypes.HOUR())
DataTypes.INTERVAL(DataTypes.HOUR(), DataTypes.MINUTE())
DataTypes.INTERVAL(DataTypes.HOUR(), DataTypes.SECOND(p2))
DataTypes.INTERVAL(DataTypes.MINUTE())
DataTypes.INTERVAL(DataTypes.MINUTE(), DataTypes.SECOND(p2))
DataTypes.INTERVAL(DataTypes.SECOND())
DataTypes.INTERVAL(DataTypes.SECOND(p2))

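The type can be declared using the above combinations, where p1 is the number of digits of days (day precision) and p2 is the number of digits of fractional seconds (fractional precision). p1 must have a value between 1 and 6 (both inclusive). p2 must have a value between 0 and 9 (both inclusive). If p1 is not specified, it is equal to 2 by default. If p2 is not specified, it is equal to 6 by default.

Bridging to JVM Types

Java Type | Input | Output | Remarks
java.time.Duration | X | X | Default
java.lang.Long | X | X | Describes the number of milliseconds.
long | X | (X) | Describes the number of milliseconds. Output only if type is not nullable.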
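The uniform days-to-seconds representation can be reproduced from a java.time.Duration. The sketch below is JDK-only; format is a hypothetical helper that assumes the default precisions (2 day digits, 6 fractional digits).

```java
import java.time.Duration;
import java.util.Locale;

final class DayTimeIntervalSketch {
    // Render a duration as +days hours:minutes:seconds.fractional (microsecond digits).
    static String format(Duration duration) {
        String sign = duration.isNegative() ? "-" : "+";
        Duration abs = duration.abs();
        long days = abs.toDays();
        long hours = abs.toHours() % 24;
        long minutes = abs.toMinutes() % 60;
        long seconds = abs.getSeconds() % 60;
        long micros = abs.getNano() / 1_000;
        return String.format(Locale.ROOT, "%s%02d %02d:%02d:%02d.%06d",
                sign, days, hours, minutes, seconds, micros);
    }

    public static void main(String[] args) {
        Duration seventySeconds = Duration.ofSeconds(70);
        System.out.println(format(seventySeconds));
        System.out.println(seventySeconds.toMillis()); // long bridging: milliseconds
    }
}
```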

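Constructed Data Types

ARRAY

Data type of an array of elements with the same subtype.

Compared to the SQL standard, the maximum cardinality of an array cannot be specified but is fixed at 2,147,483,647. Also, any valid type is supported as a subtype.

SQL

ARRAY<t>
t ARRAY

JAVA/SCALA

DataTypes.ARRAY(t)

The type can be declared using ARRAY<t>, where t is the data type of the contained elements. t ARRAY is a synonym that is closer to the SQL standard. For example, INT ARRAY is equivalent to ARRAY<INT>.

Bridging to JVM Types

Java Type | Input | Output | Remarks
t[] | (X) | (X) | Depends on the subtype. Default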

MULTISET

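Data type of a multiset (= bag). Unlike a set, it allows for multiple instances of each of its elements with a common subtype. Each unique value (including NULL) is mapped to some multiplicity.

There is no restriction on element types; it is the responsibility of the user to ensure uniqueness.

SQL

MULTISET<t>
t MULTISET

JAVA/SCALA

DataTypes.MULTISET(t)

The type can be declared using MULTISET<t>, where t is the data type of the contained elements. t MULTISET is a synonym that is closer to the SQL standard. For example, INT MULTISET is equivalent to MULTISET<INT>.

Bridging to JVM Types

Java Type | Input | Output | Remarks
java.util.Map<t, java.lang.Integer> | X | X | Assigns each value to an integer multiplicity. Default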
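The Map<t, Integer> bridging form is simply a count per distinct element. The following JDK-only sketch builds such a map from a list; toMultiset is an invented helper, and HashMap is chosen here because it also accepts NULL as a key.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class MultisetSketch {
    // Map each distinct element (including null) to its multiplicity.
    static <T> Map<T, Integer> toMultiset(List<T> elements) {
        Map<T, Integer> counts = new HashMap<>();
        for (T element : elements) {
            counts.merge(element, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        Map<String, Integer> bag = toMultiset(Arrays.asList("a", "b", "a", null));
        System.out.println(bag.get("a"));
        System.out.println(bag.get("b"));
        System.out.println(bag.get(null));
    }
}
```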

ROW

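Data type of a sequence of fields.

A field consists of a field name, a field type, and an optional description. The most specific type of a row of a table is a row type. In this case, each column of the row corresponds to the field of the row type that has the same ordinal position as the column.

Compared to the SQL standard, an optional field description simplifies the handling of complex structures.

A row type is similar to the STRUCT type known from other non-standard-compliant frameworks.

SQL

ROW<n0 t0, n1 t1, ...>
ROW<n0 t0 'd0', n1 t1 'd1', ...>

ROW(n0 t0, n1 t1, ...)
ROW(n0 t0 'd0', n1 t1 'd1', ...)

JAVA/SCALA

DataTypes.ROW(DataTypes.FIELD(n0, t0), DataTypes.FIELD(n1, t1), ...)
DataTypes.ROW(DataTypes.FIELD(n0, t0, d0), DataTypes.FIELD(n1, t1, d1), ...)

The type can be declared using ROW<n0 t0 'd0', n1 t1 'd1', ...>, where n is the unique name of a field, t is the logical type of a field, and d is the description of a field.

ROW(...) is a synonym that is closer to the SQL standard. For example, ROW(myField INT, myOtherField BOOLEAN) is equivalent to ROW<myField INT, myOtherField BOOLEAN>.

Bridging to JVM Types

Java Type | Input | Output | Remarks
org.apache.flink.types.Row | X | X | Default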

Other Data Types

BOOLEAN

Data type of a boolean with a (possibly) three-valued logic of TRUE, FALSE, and UNKNOWN.

SQL

BOOLEAN

JAVA/SCALA

DataTypes.BOOLEAN()

Bridging to JVM Types

Java Type | Input | Output | Remarks
java.lang.Boolean | X | X | Default
boolean | X | (X) | Output only if type is not nullable.
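Three-valued logic can be modelled on the JVM with a nullable java.lang.Boolean, where null plays the role of UNKNOWN. The sketch below implements SQL-style AND under that assumption; it is a JDK-only illustration and not how Flink evaluates expressions internally.

```java
final class ThreeValuedLogicSketch {
    // SQL AND over nullable booleans; null stands for UNKNOWN.
    static Boolean and(Boolean left, Boolean right) {
        if (Boolean.FALSE.equals(left) || Boolean.FALSE.equals(right)) {
            return Boolean.FALSE; // FALSE dominates, even against UNKNOWN
        }
        if (left == null || right == null) {
            return null; // TRUE AND UNKNOWN is UNKNOWN
        }
        return Boolean.TRUE;
    }

    public static void main(String[] args) {
        System.out.println(and(true, true));
        System.out.println(and(false, null));
        System.out.println(and(true, null));
    }
}
```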

NULL

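Data type for representing untyped NULL values.

The null type is an extension to the SQL standard. A null type has no other value except NULL, so it can be cast to any nullable type, similar to JVM semantics.

This type helps in representing unknown types in API calls that use a NULL literal, as well as in bridging to formats such as JSON or Avro that define such a type as well.

This type is not very useful in practice and is mentioned here only for completeness.

SQL

NULL

JAVA/SCALA

DataTypes.NULL()

Bridging to JVM Types

Java Type | Input | Output | Remarks
java.lang.Object | X | X | Default
any class | | (X) | Any non-primitive type.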

ANY

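Data type of an arbitrary serialized type. This type is a black box within the table ecosystem and is only deserialized at the edges. The any type is an extension to the SQL standard.

SQL

ANY('class', 'snapshot')

JAVA/SCALA

DataTypes.ANY(class, serializer)

DataTypes.ANY(typeInfo)

The type can be declared using ANY('class', 'snapshot'), where class is the originating class and snapshot is the serialized TypeSerializerSnapshot in Base64 encoding. Usually, the type string is not declared directly but is generated while persisting the type. In the API, the ANY type can be declared either by directly supplying a Class + TypeSerializer or by passing a TypeInformation and letting the framework extract Class + TypeSerializer from there.

Bridging to JVM Types

Java Type | Input | Output | Remarks
class | X | X | Originating class or subclasses (for input) or superclasses (for output). Default
byte[] | | X |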

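Because the translator's English is limited, please leave a comment if you find any translation errors; they will be corrected as soon as possible.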

Origin blog.csdn.net/lp284558195/article/details/104268401