Flink 1.8.0 Released! New Features Behind the Scenes

1. What state cleanup does Flink 1.8.0 introduce?
2. Savepoint compatibility: which versions are no longer compatible?
3. What has changed in the Maven dependencies for the Hadoop convenience binaries?
4. Does Flink still publish binaries bundled with Hadoop?

Flink 1.8.0 has been released. The major changes are as follows:

1. Continuous incremental cleanup of old state with TTL
2. Changes to Hadoop support
3. Deprecation of the static TableEnvironment builder methods
4. Flink 1.8 no longer publishes binaries that bundle Hadoop

More details follow:

These release notes discuss the important aspects that changed between Flink 1.7 and Flink 1.8, such as configuration, features, and dependencies.

State

1. Continuous incremental cleanup of old keyed state with TTL (time-to-live)

We introduced TTL (time-to-live) for keyed state in Flink 1.6 (FLINK-9510). This feature allowed expired keyed state entries to be cleaned up and made inaccessible when they were accessed. State was also cleaned up when a savepoint/checkpoint was written. Flink 1.8 introduces continuous cleanup of old entries for both the RocksDB state backend (FLINK-10471) and the heap state backend (FLINK-10473). This means that old entries (according to the TTL setting) are continuously cleaned up.
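
As a rough sketch, the snippet below shows how continuous cleanup might be enabled through the StateTtlConfig builder; the state name, the 7-day TTL, and the cleanup parameters are illustrative and not taken from the release notes.

    import org.apache.flink.api.common.state.StateTtlConfig;
    import org.apache.flink.api.common.state.ValueStateDescriptor;
    import org.apache.flink.api.common.time.Time;

    // Expire entries 7 days after the last write and clean them up continuously.
    StateTtlConfig ttlConfig = StateTtlConfig
        .newBuilder(Time.days(7))
        .setUpdateType(StateTtlConfig.UpdateType.OnCreateAndWrite)
        .setStateVisibility(StateTtlConfig.StateVisibility.NeverReturnExpired)
        .cleanupIncrementally(10, false)     // continuous cleanup in the heap backend (FLINK-10473)
        .cleanupInRocksdbCompactFilter()     // continuous cleanup in the RocksDB backend (FLINK-10471);
                                             // in 1.8 this may also need state.backend.rocksdb.ttl.compaction.filter.enabled
        .build();

    ValueStateDescriptor<Long> lastLogin = new ValueStateDescriptor<>("lastLogin", Long.class);
    lastLogin.enableTimeToLive(ttlConfig);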

2. New support for schema migration when restoring savepoints

With Flink 1.7.0 we added support for changing the schema of state when using the AvroSerializer (FLINK-10605). With Flink 1.8.0 we made great progress migrating all built-in TypeSerializers to a new serializer snapshot abstraction that in theory allows schema migration. Of the serializers that come with Flink, we now support schema migration for the PojoSerializer (FLINK-11485) and the Java EnumSerializer (FLINK-11334), as well as for Kryo in limited cases (FLINK-11323).
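
As a hedged illustration of what this enables in practice, a keyed-state POJO can now evolve between savepoints, for example by adding a field. The class below is purely illustrative.

    // Illustrative keyed-state POJO, handled by the PojoSerializer.
    public class UserSession {
        public String userId;
        public long lastActive;

        // Field added in a newer job version. With PojoSerializer schema migration
        // (FLINK-11485), state written before this field existed should still be
        // restorable; the new field starts out with its Java default value (null here).
        public String country;

        public UserSession() {} // POJOs require a public no-arg constructor
    }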

3. Savepoint compatibility

Savepoints from Flink 1.2 that contain a Scala TraversableSerializer are no longer compatible with Flink 1.8 because of an update to this serializer (FLINK-11539). You can work around this limitation by first upgrading to a version between Flink 1.3 and Flink 1.7 and then updating to Flink 1.8.

4. RocksDB version bump and switch to FRocksDB (FLINK-10471)

We needed to switch to a custom build of RocksDB called FRocksDB because certain changes in RocksDB are required to support continuous state cleanup with TTL. The FRocksDB build is based on the upgraded RocksDB version 5.17.2. For Mac OS X, RocksDB version 5.17.2 is only supported on OS X >= 10.13.

Maven Dependencies

1. Changes to the bundling of Hadoop libraries with Flink (FLINK-11266)

Convenience binaries that include Hadoop are no longer published.

If a deployment relies on flink-shaded-hadoop2 being included in flink-dist, then you must manually download a pre-packaged Hadoop jar from the optional components section of the download page and copy it into the /lib directory. Alternatively, a Flink distribution that includes Hadoop can be built by packaging flink-dist and activating the include-hadoop Maven profile, as sketched below.
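
Building such a Hadoop-bundled distribution from a Flink source checkout could look roughly like the following; the exact flags depend on your environment and are illustrative only.

    # from the root of a Flink source checkout; activates the include-hadoop profile
    mvn clean package -DskipTests -Pinclude-hadoop -pl flink-dist -am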

Since Hadoop is no longer included in flink-dist by default, specifying -DwithoutHadoop when packaging flink-dist no longer affects the build.

Configuration

1. TaskManager configuration (FLINK-11716)

TaskManagers now bind to the host IP address instead of the hostname by default. This behavior can be controlled with the configuration option taskmanager.network.bind-policy. If your Flink cluster experiences inexplicable connection problems after upgrading, try setting taskmanager.network.bind-policy: name in flink-conf.yaml to return to the pre-1.8 behavior.

Table API

1. Deprecation of direct usage of the Table constructor (FLINK-11447)

Flink 1.8 deprecates direct usage of the constructor of the Table class in the Table API. This constructor was previously used to perform joins with lateral tables. You should now use table.joinLateral() or table.leftOuterJoinLateral() instead. This change is necessary for converting the Table class into an interface, which will make the Table API easier to maintain and cleaner in the future.
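
The sketch below shows the recommended lateral join; the table function Split, the table myTable, and its column line are illustrative names and not part of the release notes.

    import org.apache.flink.table.api.Table;
    import org.apache.flink.table.functions.TableFunction;

    // Illustrative table function that splits a line into words.
    public class Split extends TableFunction<String> {
        public void eval(String line) {
            for (String word : line.split(" ")) {
                collect(word);
            }
        }
    }

    // Assuming a StreamTableEnvironment "tableEnv" and a registered table "myTable"
    // with a String column "line":
    tableEnv.registerFunction("split", new Split());
    Table result = tableEnv.scan("myTable")
        .joinLateral("split(line) as (word)")
        .select("line, word");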

2. New CSV format descriptor (FLINK-9964)

This release introduces a new format descriptor for CSV files that is compliant with RFC 4180. The new descriptor is available as org.apache.flink.table.descriptors.Csv. Currently, it can only be used together with Kafka. The old descriptor remains available as org.apache.flink.table.descriptors.OldCsv for use with file system connectors.
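
A hedged sketch of how the new descriptor might be wired up with Kafka follows; the topic, bootstrap servers, and schema are illustrative, and a StreamTableEnvironment named tableEnv is assumed.

    import org.apache.flink.api.common.typeinfo.Types;
    import org.apache.flink.table.descriptors.Csv;
    import org.apache.flink.table.descriptors.Kafka;
    import org.apache.flink.table.descriptors.Schema;

    tableEnv.connect(
            new Kafka()
                .version("universal")
                .topic("input-topic")
                .property("bootstrap.servers", "localhost:9092"))
        .withFormat(
            new Csv()                 // the new RFC 4180 compliant descriptor
                .fieldDelimiter(',')
                .ignoreParseErrors())
        .withSchema(
            new Schema()
                .field("user", Types.STRING)
                .field("cnt", Types.LONG))
        .inAppendMode()
        .registerTableSource("CsvSource");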

3. Deprecation of static builder methods on TableEnvironment (FLINK-11445)

In order to separate the API from the actual implementation, the static method TableEnvironment.getTableEnvironment() has been deprecated. You should now use Batch/StreamTableEnvironment.create() instead.
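
In Java, the change amounts to the following minimal sketch for the streaming case.

    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
    import org.apache.flink.table.api.java.StreamTableEnvironment;

    StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

    // Deprecated: TableEnvironment.getTableEnvironment(env);
    // Recommended replacement:
    StreamTableEnvironment tableEnv = StreamTableEnvironment.create(env);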

4. Changes to the Table API Maven modules (FLINK-11064)

Users who previously had a flink-table dependency need to update their dependency to flink-table-planner as well as the correct API module, flink-table-api-*, depending on whether they use Java or Scala: flink-table-api-java-bridge or flink-table-api-scala-bridge.

5. Change to the external catalog table builder (FLINK-11522)

ExternalCatalogTable.builder() has been deprecated in favor of ExternalCatalogTableBuilder().

6. Change to the naming of Table API connector jars (FLINK-11026)

The naming scheme of the Kafka/elasticsearch6 sql-jars has changed. In Maven terms, they no longer have the sql-jar qualifier, and the artifactId is now prefixed with flink-sql instead of flink, for example flink-sql-connector-kafka.

7. Change to the way Null literals are specified (FLINK-11785)

Null literals in the Table API now need to be defined with nullOf(type) instead of Null(type). The old approach has been deprecated.
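
For example, in the string-based expression DSL the change might look like this; the table and column names are illustrative.

    // Old, deprecated: table.select("a, Null(STRING) as b");
    // New:
    Table result = table.select("a, nullOf(STRING) as b");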

Connectors

1. New KafkaDeserializationSchema with direct access to the ConsumerRecord (FLINK-8354)

For the FlinkKafkaConsumers, we introduced a new KafkaDeserializationSchema that gives direct access to the Kafka ConsumerRecord. This subsumes the functionality of the KeyedDeserializationSchema, which is deprecated but still usable for now.
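
A minimal sketch of implementing the new interface; the class name and the produced string format are illustrative.

    import org.apache.flink.api.common.typeinfo.TypeInformation;
    import org.apache.flink.api.common.typeinfo.Types;
    import org.apache.flink.streaming.connectors.kafka.KafkaDeserializationSchema;
    import org.apache.kafka.clients.consumer.ConsumerRecord;

    // The new interface exposes the whole ConsumerRecord, so key, value, topic,
    // partition and offset are all accessible during deserialization.
    public class RecordInfoSchema implements KafkaDeserializationSchema<String> {

        @Override
        public boolean isEndOfStream(String nextElement) {
            return false;
        }

        @Override
        public String deserialize(ConsumerRecord<byte[], byte[]> record) {
            return record.topic() + "-" + record.partition() + "@" + record.offset()
                + ": " + new String(record.value());
        }

        @Override
        public TypeInformation<String> getProducedType() {
            return Types.STRING;
        }
    }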

2. FlinkKafkaConsumer now filters restored partitions based on the topic specification (FLINK-10342)

Starting from Flink 1.8.0, the FlinkKafkaConsumer always filters out restored partitions that are no longer associated with the topics specified to be subscribed to in the restored execution. This behavior did not exist in previous versions of the FlinkKafkaConsumer. If you want to retain the previous behavior, use the disableFilterRestoredPartitionsWithSubscribedTopics() configuration method on the FlinkKafkaConsumer.

Consider this example: you have a Kafka consumer that consumes topic A, you take a savepoint, then change your Kafka consumer to consume topic B instead, and restart your job from the savepoint. Before this change, your consumer would now consume both topics A and B, because it was stored in the consumer state that it was consuming topic A. With this change, your consumer will only consume topic B after the restore, because the topics stored in state are filtered against the configured topics.
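
If you need the old behavior, a sketch along these lines keeps the restored partitions of topic A as well; the topic name and properties are illustrative.

    import java.util.Properties;
    import org.apache.flink.api.common.serialization.SimpleStringSchema;
    import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer;

    // Keep the pre-1.8 behavior of also restoring partitions of topics
    // that are no longer subscribed.
    Properties props = new Properties();
    props.setProperty("bootstrap.servers", "localhost:9092");
    props.setProperty("group.id", "my-group");

    FlinkKafkaConsumer<String> consumer =
        new FlinkKafkaConsumer<>("B", new SimpleStringSchema(), props);
    consumer.disableFilterRestoredPartitionsWithSubscribedTopics();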

Other interface changes:

1. The canEqual() method was removed from the TypeSerializer interface (FLINK-9803)

The canEqual() methods are typically used to make proper equality checks across type hierarchies. The TypeSerializer does not actually need this property, so the method has been removed.

2. Removal of the CompositeSerializerSnapshot utility class (FLINK-11073)

The CompositeSerializerSnapshot utility class has been removed. You should now use CompositeTypeSerializerSnapshot for snapshots of composite serializers that delegate serialization to multiple nested serializers.


Origin: blog.csdn.net/u013411339/article/details/90625563