Flink is a rising star in big data computing. Big data computing engines have evolved through several generations: MapReduce as the first, Tez (based on directed acyclic graphs) as the second, Spark (based on in-memory computing) as the third, and Flink as the fourth. Because Flink can run on top of Hadoop, it does not replace Hadoop but integrates closely with it.

Flink's main components are the DataStream API, DataSet API, Table API, SQL, Graph API, and FlinkML. Flink now has its own ecosystem, covering offline data processing, real-time data processing, SQL operations, graph computing, and machine learning.

Finally finished learning the Flink introduction and actual combat PDF recommended by Alibaba Cloud big data architect

Main content

This article is divided into 11 chapters; the main content of each chapter is as follows:

Chapter 1: Overview of Flink. This chapter explains the basics of Flink, including its principles and architecture, an introduction to its components, a comparison of stream processing and batch processing in Flink, an analysis of typical Flink application scenarios, and the differences between Flink and other stream computing frameworks.

Chapter 2: Flink Quick Start. Chapter 1 covers the basic principles, architecture, and components of Flink. This chapter implements an introductory Flink example to reinforce that material.
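
The introductory case is not spelled out here. A common first exercise for a stream processor is a word count, so the following is a plain-Python sketch of that computation under that assumption. It does not use the Flink API, and `word_count` is a hypothetical helper; the comments only indicate which stage of a typical split, group, and sum pipeline each step stands in for.

```python
from collections import Counter
from typing import Iterable


def word_count(lines: Iterable[str]) -> Counter:
    """Count words across lines (split -> group by word -> sum)."""
    counts: Counter = Counter()
    for line in lines:
        # Split each line into lowercase words
        for word in line.lower().split():
            # Group by word and keep a running sum per word
            counts[word] += 1
    return counts


if __name__ == "__main__":
    print(word_count(["hello flink", "hello stream"]))
```

In a real streaming job the input would be unbounded and the counts would be emitted continuously; here the input is a finite list so the result can be inspected directly.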

Chapter 3: Installation and Deployment of Flink. With a basic understanding of Flink and of the development steps for a Flink program in place, this chapter shows how to install and deploy a Flink cluster and run Flink programs on it.

Flink installation and deployment falls into two main categories: local mode and cluster mode. Local mode works right after the archive is decompressed, with no parameter changes, and is generally used for simple tests. Cluster mode includes Standalone, Flink on YARN, and other modes; it is suited to production environments and requires changes to the corresponding configuration parameters.

Chapter 4: Common Flink APIs in Detail. This chapter explains the common APIs of Flink DataStream and DataSet, and also covers some common operations in the Flink Table API and Flink SQL.

Chapter 5: Using Flink's Advanced Features. This chapter covers Flink's advanced features, including Broadcast, Accumulator, and DistributedCache.

Chapter 6: Flink State Management and Recovery. This chapter focuses on Flink state, including state management and recovery, as well as Flink's task restart strategies.
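
To make the idea of a restart strategy concrete, here is a minimal plain-Python sketch of a fixed-delay policy: a failed task is retried after a constant pause until a restart budget is used up. This is an illustration of the concept only, not Flink's implementation or API; `run_with_fixed_delay_restart` and its parameters are hypothetical names.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def run_with_fixed_delay_restart(
    task: Callable[[], T],
    max_restarts: int,
    delay_seconds: float,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run task, restarting after a fixed delay until restarts are exhausted."""
    restarts = 0
    while True:
        try:
            return task()
        except Exception:
            if restarts >= max_restarts:
                # Restart budget spent: let the failure propagate
                raise
            restarts += 1
            # Wait a constant interval before the next attempt
            sleep(delay_seconds)
```

The `sleep` parameter is injectable so the policy can be exercised without real waiting. A real engine would also restore operator state from the latest checkpoint before each restart, which this sketch leaves out.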

Chapter 7: Flink Windows in Detail. This chapter focuses on Flink windows, including the common window types that Flink provides and window aggregation operations.
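
As a rough illustration of window aggregation, the plain-Python sketch below assigns timestamped events to fixed-size, non-overlapping (tumbling) windows and sums the values in each. It is not Flink API code; `tumbling_window_sum` is a hypothetical helper, and the choice of a tumbling window with a sum is an assumption made for the example.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def tumbling_window_sum(
    events: Iterable[Tuple[int, int]], window_size_ms: int
) -> Dict[int, int]:
    """Sum values per tumbling window; keys are window start timestamps."""
    sums: Dict[int, int] = defaultdict(int)
    for timestamp_ms, value in events:
        # Each event belongs to exactly one fixed-size window
        window_start = timestamp_ms - (timestamp_ms % window_size_ms)
        sums[window_start] += value
    return dict(sums)


if __name__ == "__main__":
    events = [(0, 1), (999, 2), (1000, 3), (2500, 4)]
    print(tumbling_window_sum(events, 1000))
```

With a 1000 ms window, the events at 0 and 999 fall into the window starting at 0, the event at 1000 into the next window, and the event at 2500 into the window starting at 2000.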

Chapter 8: Flink Time in Detail. This chapter explains Event Time, Ingestion Time, Processing Time, and Watermarks in Flink.
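
The role of a watermark under event time can be sketched in plain Python. The example below assumes a simplified model in which the watermark trails the highest event timestamp seen so far by a fixed allowance, and any event older than the current watermark is treated as late. This is a conceptual simplification, not Flink's exact semantics or API; `split_late_events` and `max_out_of_orderness_ms` are hypothetical names.

```python
from typing import Iterable, List, Optional, Tuple


def split_late_events(
    event_times_ms: Iterable[int], max_out_of_orderness_ms: int
) -> Tuple[List[int], List[int]]:
    """Return (on_time, late) event timestamps under a simple watermark."""
    on_time: List[int] = []
    late: List[int] = []
    max_seen: Optional[int] = None
    for ts in event_times_ms:
        # Watermark before this event: highest timestamp so far minus the allowance
        watermark = None if max_seen is None else max_seen - max_out_of_orderness_ms
        if watermark is not None and ts < watermark:
            late.append(ts)
        else:
            on_time.append(ts)
        max_seen = ts if max_seen is None else max(max_seen, ts)
    return on_time, late
```

For arrival order `[1000, 3000, 2500, 1500, 4000]` with a 1000 ms allowance, the event at 2500 is out of order but still accepted, while the event at 1500 arrives after the watermark has reached 2000 and is classified as late.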

Chapter 9: Flink Parallelism in Detail. This chapter analyzes parallelism in Flink, which can be set at four levels: Operator Level, Execution Environment Level, Client Level, and System Level.
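
The four levels are usually described as a precedence chain in which the more specific setting wins: operator over execution environment, execution environment over client, and client over the system default. The plain-Python sketch below models that resolution under this assumption. It is not Flink code, and `effective_parallelism` is a hypothetical helper.

```python
from typing import Optional


def effective_parallelism(
    operator: Optional[int] = None,
    environment: Optional[int] = None,
    client: Optional[int] = None,
    system_default: int = 1,
) -> int:
    """Pick the most specific parallelism setting that is present."""
    # Check levels from most specific to least specific
    for level in (operator, environment, client):
        if level is not None:
            return level
    # Nothing set explicitly: fall back to the system-wide default
    return system_default


if __name__ == "__main__":
    print(effective_parallelism(operator=4, environment=2))
    print(effective_parallelism(client=8))
```

Passing `None` for a level means it was not configured, so the lookup moves on to the next, less specific level.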

Chapter 10: The Flink Kafka Connector in Detail. Flink provides many connectors, of which the Kafka connector is widely used. This chapter focuses on how the Kafka connector is used in Flink.

Chapter 11: Flink Hands-On Project Development. This chapter analyzes practical application scenarios for Flink, including architecture design and code implementation. Two scenarios are covered: real-time data cleaning, also known as real-time ETL, and real-time data reporting.

This [Flink introduction and actual combat] document has a total of 254 pages.

The spread and continuous iteration of big data technology have accelerated the arrival of an intelligent society, and big data technologies have increasingly become a basic service. Flink's many differences from other big data technologies have attracted growing attention from practitioners. The author has worked in big data for several years, has extensive practical experience, and has a deep understanding of processing frameworks such as MapReduce, Spark, and Storm. The document introduces Flink's key technologies and features in an accessible way and draws on the author's practical experience to help readers get started quickly.

Flink is a current mainstream framework for real-time big data computing. This article explains Flink's design principles and implementation mechanisms in an accessible way, with detailed coverage ranging from API usage and platform operation and maintenance to hands-on case studies. It can serve as an introductory book for Flink application developers or as a handbook for Flink platform operations staff.
