FFA 2023 | Seven ByteDance talks selected

 
Flink Forward is the official technical conference of the Apache Flink community, authorized by Apache. As one of the most anticipated annual summits among Apache Flink developers, FFA 2023 will continue to gather industry best practices and the latest Flink technology trends. It is a major event for Flink development in China, and one that developers and users will not want to miss.
 
This year, Flink Forward Asia (hereinafter FFA) returns as an in-person event and will be held on December 8-9 at the Hyatt Regency Beijing Wangjing. Following FFA's established practice, all talks were gathered through an open call and then rated and screened by a professional selection committee. This ensures that the content represents the leading level in the industry, gives developers more high-quality practical material, and offers experience that companies can learn from. Seven ByteDance talks were selected for this conference. In addition to OLAP Serverless and the large-scale adoption of stream-batch unification and automated operations and maintenance, they share practices from the monitoring and early warning, real-time data warehouse, and recommendation platforms behind Douyin, Toutiao, and other businesses.
FFA 2023 official website: https://flink-forward.org.cn/
 

Recommended talks

Flink OLAP: implementing Serverless capabilities at ByteDance

Feng Xiangyu, ByteDance infrastructure engineer
Talk introduction: As job scheduling and job execution optimizations for Flink OLAP under high QPS have gradually rolled out, Flink OLAP's business scale within ByteDance has grown significantly. It has also run into more diverse computing challenges, the most important of which is making its computing capabilities serverless. By developing and deploying features such as resource isolation, elastic scaling, graceful exit, cold start optimization, and multi-policy rate limiting, we completed the Serverless transformation of Flink OLAP and rolled it out successfully to internal businesses. In this talk, we will cover five aspects: the Serverless challenges Flink OLAP encountered, the resource isolation implementation, cloud native capability enhancements, elastic scaling, and business adoption and planning.
 

Large-scale adoption of stream-batch unification at ByteDance

Su Dewei, ByteDance infrastructure engineer
Talk introduction: Flink is the de facto standard for stream computing, but it is not yet widely used in batch computing scenarios. As the Flink engine's stream-batch unification capabilities have improved, ByteDance has migrated more than 22,000 Spark SQL jobs in offline data synchronization scenarios to Flink Batch SQL to drive the adoption of stream-batch unification. Spark SQL jobs come in a rich variety of patterns. By resolving a large number of Spark SQL compatibility issues, verifying data accuracy, and automating the migration, we were able to move daily routine jobs to Flink Batch smoothly and run them stably in production. This talk will introduce the large-scale adoption of stream-batch unification at ByteDance in five parts.
 

ByteDance's unified full and incremental real-time data construction solution

Qin Binglun, ByteDance real-time data engineer & Liu Xiang, ByteDance real-time data engineer
Talk introduction: ByteDance's businesses include many scenarios that combine full and incremental data. Typical examples include user profiles, risk control, and real estate. These businesses need data covering long time ranges and also place high demands on data freshness, so building unified full and incremental real-time data is of great value. This talk mainly introduces the challenges faced while building unified full and incremental stream computing, and the solutions to them.
 

Large-scale adoption of automated Flink operations and maintenance

Chen Zhanghao, ByteDance infrastructure engineer
Talk introduction: Over the past few years, the number of Flink jobs running inside ByteDance has gradually grown to tens of thousands, and business teams with limited staff increasingly struggle to keep up. Traffic changes require manual tuning of resource configurations, and single-machine problems that produce slow nodes require manual migration, both of which create operational pressure. To address these problems, a Flink automated operations and maintenance system was built and deployed internally at large scale. It hosts more than 15,000 tasks, scaling them up and down automatically as traffic changes to avoid consumption backlogs and improve resource utilization. It also migrates more than 1,000 slow nodes automatically every day to eliminate consumption backlogs, effectively reducing the Flink operations and maintenance pressure on business teams. In this talk, we will introduce the practice of Flink automated operations and maintenance from five aspects.
 

ByteDance real-time data warehouse: quality and cost governance platform practice

Zhu Fusheng, ByteDance data engineer
Talk introduction: As business requirements for data timeliness rise and the number of real-time tasks keeps growing, ByteDance now runs tens of thousands of Flink real-time tasks. A range of subjective and objective factors, such as the many components that real-time tasks depend on, the large number of developers, and uneven development habits and experience, lead to frequent problems with task stability and wasted resources. Task governance is therefore imperative, but the governance process still faces three tensions: between business and governance, between staffing and governance, and between problems and their measurability. This talk will introduce how real-time health scores make real-time governance simple, efficient, and sustainable, from four aspects: governance background, the health score system, governance benefits, and health score planning.
 

The evolution of ByteDance recommendation toward a next-generation feature engineering architecture

Liu Shouwei, ByteDance recommendation architecture engineer
Liu Fangqi, ByteDance recommendation architecture engineer
Talk introduction: Over the past few years of ByteDance's development, the recommendation system has built a feature production system that supports throughput of trillions of records, based on big data components such as Flink, Spark, and Hudi. With the rapid growth of live streaming, e-commerce, life services, and other businesses, and with the expanding number of algorithm engineers, the offline components of the recommendation system face further challenges in ease of use, cost, and architecture. Against this background, we proposed and introduced a new generation of feature production and data lake ingestion pipelines, including a recommendation system Planner, a user-oriented Python SDK, and Flink stream-batch unified sample ingestion into the lake. These bring significant benefits in development efficiency, cost, and performance: the development and launch cycle for raw feature production can be reduced from N days to 1 week down to the hour level, and the computing performance of recommendation sample ingestion into the lake has improved to more than 3 times the original.
 

Flink in practice for Douyin real-time monitoring and early warning scenarios

Zhang Hongbo, ByteDance data engineer
Talk introduction: With the development of real-time data warehouses and strong business demand for real-time data, real-time data warehouses support more and more high-value services, and they have also run into new challenges. Our goals have moved from supporting the business quickly at the start to paying closer attention to timeliness and accuracy. Our architecture has improved along with them, and we have kept exploring ways to improve data timeliness and accuracy. After a series of iterations from solutions to tools to a platform, we established a real-time monitoring and early warning system based on Flink SQL, which helps surface data problems promptly and helps the business meet its monitoring requirements. This talk will introduce the implementation of Douyin's Flink-based real-time monitoring and early warning capabilities from two perspectives: the data level and the business level.
 
Live stream reservation & conference registration
On PC: visit the official website of the FFA 2023 conference: https://flink-forward.org.cn/.
On mobile: follow the "Apache Flink" video account to reserve the live stream.
 

Reposted from: my.oschina.net/u/5941630/blog/10310117