Flink in practice: how do you solve technical problems in real applications?

Five days to go! On April 25-26, the Flink Forward Global Online Conference Chinese Essence Edition, the world's first online conference for an Apache top-level project, goes live.

The Flink Forward Global Online Conference Chinese Essence Edition is broadcast live in Chinese. Its core content consists of the keynotes and the talks voted most interesting by the community. The original English talks are translated and explained by Apache Flink core contributors, and you can watch online for free. This article details the live agenda for the afternoon of April 25.

4/25 Flink Forward Live Afternoon Highlights

1. Keynote: learn about the latest progress in Cloudera's integration of Flink.
2. Practice series: Uber's use of Flink CEP, Netflix's autoscaling, Didi's large-scale use of StreamSQL, and a look at worst practices as cautionary examples.
3. Community ecosystem: hands-on use of PyFlink + Zeppelin, and how to define a production-grade AI workflow with AI Flow and Flink.
4. Flink SQL: an in-depth analysis of Flink SQL and its latest developments in 2020.

■ Talk 1

Round table | Keynote: Apache Flink - Completing Cloudera’s End to End Streaming Platform

In January this year, Cloudera's Hadoop expert Arun announced on Twitter that Cloudera Data Platform had officially integrated Flink as its stream computing product, and Apache Flink PMC Chair Stephan responded: "This move is significant." It means that enterprise users of the CDH distribution around the world will be able to use Flink for stream processing.

So how does Cloudera Data Platform perform now that Flink is integrated? At this Flink Forward, technical experts from Cloudera will share the features and technical details of their end-to-end stream processing platform.

Speakers:

  • Marton Balassi, Apache Flink PMC member and one of the first contributors to the streaming API.
  • Joe Witt, VP of Engineering at Cloudera, focused on Cloudera Data Flow (CDF) products.

Commentator:

Yang Kete (Rooney), Apache Member, Apache Flink PMC member, senior technical expert at Alibaba.

■ Talk 2

Round Table | Flink SQL 2020: Who We Are

Four years ago, the Apache Flink community began adding SQL support to simplify and unify the processing of static and streaming data. Today, Flink runs business-critical batch and streaming SQL queries at Alibaba, Huawei, Lyft, Uber, Yelp, and many other companies. Although the community has made significant progress in the past few years, ambitious goals remain on the roadmap, and development is accelerating.

In the past few months, the community has added important improvements and extensions, including DDL support, a refactored type system and Catalog interface, and integration with Apache Hive. To catch up on all the work done in Flink SQL and its ecosystem, this talk presents Flink SQL in 2020 through a complete end-to-end example. Based on a realistic use case, we will show:

  • How to define tables backed by various storage systems
  • How to use streaming SQL queries to solve common problems
  • How Flink integrates with Hive
  • How to define and use user-defined functions

We will also share upcoming features and the outlook for the future.

Speakers:

  • Fabian Hueske, Apache Flink PMC member.
  • Timo Walther, Apache Flink PMC member.

Commentator:

Wu Chong (Yun Xie), Apache Flink PMC member, technical expert at Alibaba.

■ Talk 3

Round table | Apache Flink: the pain of misuse

Distributed stream processing is evolving from a technology at the edge of big data into a key technology that enables companies to provide highly scalable real-time services to their customers. Ververica, the commercial company behind Apache Flink, and other users in the Flink community have witnessed this development. Working with our users and the wider community, we have seen success stories as well as problems.

In this talk, I will share anecdotes and lessons learned from using distributed stream processing, some specific to Apache Flink and some that apply across frameworks. You will come away knowing how to avoid these failures and how to watch your monitoring dashboards without worry.

Speaker: Konstantin Knauf, product lead for Ververica Platform.

Commentator: Sun Jincheng (Jinzhu), Apache Member, Apache Flink PMC member, senior technical expert at Alibaba.

■ Talk 4

Round table | Flink autoscaling at Netflix

The Keystone data pipeline manages thousands of Flink pipelines with variable workloads. These pipelines are simple data routers that read from Kafka and write to one of three sinks. To reduce operational overhead, we implemented autoscaling for these routers.

Autoscaling reduces our resource usage by 25% to 45% (depending on region and time of day) and greatly reduces the operational burden. This talk will delve into the mathematics, algorithms, and infrastructure behind autoscaling simple pipelines at scale, and discuss future work on autoscaling complex pipelines.
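
The abstract does not spell out Netflix's algorithm, so the following is only a minimal sketch of one common way to size a simple Kafka-to-sink router: derive the parallelism from the observed input rate and an assumed per-subtask capacity, and ignore small changes so that the job does not rescale constantly. The function names, the 70% default utilization target, the parallelism limits, and the 20% dead band are illustrative assumptions, not values taken from Netflix or Flink.

```python
import math


def target_parallelism(input_rate, per_task_capacity, target_utilization=0.7,
                       min_parallelism=1, max_parallelism=128):
    """Return the subtask count needed to absorb input_rate (records/s).

    per_task_capacity is the assumed maximum records/s one subtask can route.
    """
    if per_task_capacity <= 0 or not 0 < target_utilization <= 1:
        raise ValueError("capacity must be > 0 and utilization in (0, 1]")
    # Leave headroom: each subtask is only loaded up to target_utilization.
    needed = math.ceil(input_rate / (per_task_capacity * target_utilization))
    # Clamp to the configured (hypothetical) limits.
    return max(min_parallelism, min(max_parallelism, needed))


def decide_parallelism(current, input_rate, per_task_capacity,
                       dead_band=0.2, **limits):
    """Return the new parallelism, keeping `current` for small changes."""
    target = target_parallelism(input_rate, per_task_capacity, **limits)
    # Dead band: rescaling restarts the job, so skip changes within
    # dead_band (as a fraction of the current parallelism).
    if abs(target - current) <= dead_band * current:
        return current
    return target
```

With a capacity of 10,000 records/s per subtask and a 50% utilization target, an input of 52,000 records/s calls for 11 subtasks, so a job already running 10 stays at 10; at 100,000 records/s the same job is rescaled to 20.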

Speaker: Timothy Farkas, software engineer at Netflix.

Commentator: Lv Wenlong (Long San), technical expert at Alibaba.

■ Talk 5

Round table | Uber: using Flink CEP for geospatial condition detection

Uber operates in a complex physical world. One of the challenges of providing reliable service is detecting geospatial conditions and dynamic situations in real time, such as spatial hot spots and streets where demand and supply are out of balance. Uber's global scale, together with congested streets and traffic, makes this problem hard to solve.

To solve it, Uber engineers built a geospatial condition detection platform powered by Apache Flink and its CEP library. In this talk, they will explain how they use Apache Flink and CEP pattern matching to derive geospatial semantics, and cover the technologies built and adopted on the platform and the challenges involved.
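
To give a feel for what a CEP-style pattern looks like, here is a minimal plain-Python sketch. It is not the Flink CEP API and not Uber's platform. It assumes a hypothetical stream of per-cell readings and flags a cell when demand exceeds a multiple of supply in several consecutive readings within a time window. The `Reading` fields, the thresholds, and the function name are all assumptions made for illustration.

```python
from collections import defaultdict, deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Reading:
    cell: str    # hypothetical geo cell identifier
    ts: int      # event time in seconds
    demand: int  # e.g. open ride requests in the cell
    supply: int  # e.g. available drivers in the cell


def detect_imbalance(readings, ratio=2.0, times=3, within=300):
    """Return (cell, start_ts, end_ts) for each matched pattern.

    Pattern: `times` consecutive readings of one cell with
    demand > ratio * supply, all within `within` seconds.
    """
    runs = defaultdict(deque)  # cell -> timestamps of the current run
    alerts = []
    for r in sorted(readings, key=lambda x: x.ts):
        run = runs[r.cell]
        if r.demand > ratio * max(r.supply, 1):
            run.append(r.ts)
            # Drop matches that have fallen out of the time window.
            while run and r.ts - run[0] > within:
                run.popleft()
            if len(run) >= times:
                alerts.append((r.cell, run[0], r.ts))
                run.clear()  # start looking for a fresh match
        else:
            run.clear()  # a balanced reading breaks the consecutive run
    return alerts
```

In a real deployment the same idea would be expressed as a keyed pattern over an event-time stream, so that state handling, out-of-order events, and scaling are left to the engine rather than to an in-memory loop like this one.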

Speaker: Teng (Niel) Hu, software engineer at Uber.

Commentator: Fu Dian, Apache Flink Committer, technical expert at Alibaba.

■ Talk 6

Presentation | A deep dive into Flink SQL

In the past two major versions (1.9 and 1.10), the Apache Flink community has put a lot of effort into reworking the architecture and making it more unified. One example is that Flink SQL now supports multiple SQL planners behind a single set of APIs. This talk will first discuss the motivation behind these changes and then go deep into Flink SQL to introduce some of its internals.

The talk will introduce the unified stream and batch architecture, and explain how Flink translates queries into relational expressions, optimizes them with Calcite, and then generates efficient runtime code. It will also cover the life cycle of a query in detail, how some common optimizations work, how Flink uses a binary data format as its basic data structure, and how certain operators work. Listeners will come away with a better understanding of the internals of Flink SQL.

Speakers:

  • Yang Kete (Rooney), Apache Member, Apache Flink PMC member, senior technical expert at Alibaba.
  • Wu Chong (Yun Xie), Apache Flink PMC member, technical expert at Alibaba.

■ Talk 7

Speech | Flink's application at Didi

Didi has a rich set of real-time computing scenarios. Flink is widely used there for real-time monitoring, data channels, feature extraction, the real-time data warehouse, online business, and other areas. We have also built StreamSQL on top of the Flink Table API and combined it with a one-stop development platform, which has lowered the cost of use for our users. StreamSQL coverage now exceeds 80%. Didi currently runs more than 7,000 real-time computing jobs, and the amount of data processed daily exceeds 2 trillion.

Speaker: Xue Kang, technical expert at Didi and head of real-time computing. He graduated from Zhejiang University and was previously a senior R&D engineer at Baidu. He has extensive experience building big data ecosystems.

■ Talk 8

Speech | At long last: PyFlink + Zeppelin

Flink has made tremendous progress on its unified batch and stream processing engine, but the barrier to entry for users is still high. It is especially hard for data analysts and data scientists who are only familiar with Python and SQL. For years, users have asked for built-in, comprehensive Python support in Apache Flink so that they can take advantage of Flink's unique features while working in a language they know.

Apache Flink 1.9 added the Python Table API (also known as PyFlink), and 1.10 added support for native Python UDFs, built on Apache Beam's portability framework. We will continue to improve PyFlink. In the next version, we plan to support defining machine learning pipelines in Python, which will let users implement complex machine learning applications entirely in PyFlink. In addition, we have integrated Flink with the Zeppelin notebook and redesigned the outdated Flink interpreter in Zeppelin, making it suitable for the following three main Flink scenarios:

  • Batch ETL and exploratory data analysis, through Flink batch SQL + UDFs + Zeppelin's built-in visualization
  • Streaming ETL and streaming data analysis, through Flink streaming SQL + UDFs + Zeppelin's built-in visualization
  • Machine learning pipelines, written with PyFlink + Alink

Speakers:

  • Sun Jincheng (Jinzhu), Apache Member, Apache Flink PMC member, senior technical expert at Alibaba.
  • Zhang Jianfeng (Jian Feng), Apache Member, Apache Zeppelin PMC member, senior technical expert at Alibaba.

■ Talk 9

Speech | Flink + AI Flow: Making AI easy

Many projects already help users build their own artificial intelligence platforms, such as MLflow, TFX, Metaflow, and SageMaker. Most of them focus on offline training and online inference, and some are only available on specific engines and platforms.

In this talk, we will introduce a new project called AI Flow, which covers both online and offline training and does not depend heavily on any particular engine or platform, so users can easily define an AI workflow in a highly heterogeneous environment. As a unified engine, Flink is one of the few engines that can implement all the semantics defined in AI Flow. We will demonstrate how users can use AI Flow and Flink to define a production-grade AI workflow.

Speaker: Qin Jiangjie, Apache Flink PMC member, senior technical expert at Alibaba.

Flink Forward Global Online Conference Chinese Essence Edition

■ The best way to watch

The live broadcast will be hosted on the official Chinese-language website of the Flink Forward conference. See the link below for details. After registering and logging in, you can reserve a place for the live broadcast, and the community will send you an SMS reminder before it starts.

Reserve the live broadcast on the official conference website:
https://developer.aliyun.com/topic/ffsf2020

After the reservation is successful, the following is displayed:

[Image: reservation confirmation page]

■ Full version of the agenda

The Flink Forward Global Live Essence Edition is divided into four parts: keynotes, Flink best practices, in-depth technology, and community ecosystem. Sessions are broadcast live in turn from Beijing, Shanghai, and Hangzhou. Through practical cases from diverse scenarios, you will learn about Flink's core strengths and future development.

  • Live broadcast dates: April 25-26
  • Speakers:

    • Apache Members and Flink PMC members
    • Apache Flink core contributors
    • Front-line technical experts from major tech companies
  • Detailed agenda:

[Image: detailed conference agenda]

(The final agenda is subject to change.)

Save the date: April 25-26, Flink Forward Global Live Chinese Essence Edition! For more details about the conference, scan the QR code below to join the community group and ask questions.

[Image: QR code for the community group]