Dinky 0.6.1 released: optimizing the Flink application experience

1. Background

Apache Flink, a new-generation real-time computing framework, is now used across many industries and fields. Whatever the depth of adoption, users run into pain points, the most basic being unfriendly FlinkSQL job submission and the lack of job monitoring and alarming. FlinkSQL has greatly accelerated the adoption of Flink, and this article briefly describes how the open source project Dinky addresses these pain points and improves the FlinkSQL application experience.

https://github.com/DataLinkDC/dlink

https://gitee.com/DataLinkDC/Dinky

2. Introduction

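Dinky is an out-of-the-box, easily extensible, one-stop real-time computing platform based on Apache Flink. It connects to OLAP engines, data lakes, and many other frameworks, and is dedicated to the construction and practice of unified stream-batch processing and lakehouse architecture.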

Its main objectives are as follows:

  • A visual, interactive development platform for FlinkSQL and SQL: auto-completion, syntax highlighting, debug execution, syntax validation, statement formatting, global variables, etc.

  • Supports a comprehensive set of multi-version FlinkSQL job submission modes: Local, Standalone, Yarn Session, Yarn Per-Job, Yarn Application, Kubernetes Session, Kubernetes Application

  • Supports all Apache Flink connectors, UDFs, CDC, etc.

  • Supports FlinkSQL syntax enhancements: compatibility with Apache Flink SQL, table-valued aggregate functions, global variables, CDC multi-source merging, execution environment, statement merging, shared sessions, etc.

  • Supports easily extensible SQL job submission targets: ClickHouse, Doris, Hive, MySQL, Oracle, Phoenix, PostgreSQL, SQL Server, etc.

  • Supports real-time debug preview of Table and ChangeLog data, with chart display

  • Supports syntax and logic checks, job execution plans, field-level lineage analysis, etc.

  • Supports querying and managing Flink metadata and data source metadata

  • Supports real-time task operation and maintenance: bringing jobs online and offline, job information, cluster information, job snapshots, exception information, job logs, data map, ad hoc queries, historical versions, alarm records, etc.

  • Can serve as a multi-version FlinkSQL Server and provides an OpenAPI

  • Supports easily extensible real-time job alarms and alarm groups: DingTalk, WeCom (enterprise WeChat), etc.

  • Supports fully managed SavePoint startup strategies: latest, earliest, specified, etc.

  • Supports management of multiple resource types: cluster instances, cluster configurations, Jars, data sources, alarm groups, alarm instances, documents, users, system configuration, etc.

  • More hidden features are waiting for you to explore
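The SavePoint startup strategies in the list above (latest, earliest, specified) can be pictured with a small sketch. This is only an illustration of the selection logic, not Dinky's implementation: the `SavePoint` record, the `choose_savepoint` function, and the example paths are all hypothetical names introduced here.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class SavePoint:
    """Hypothetical record of one savepoint taken for a job."""
    path: str
    created_at: int  # creation time as a Unix timestamp


def choose_savepoint(
    strategy: str,
    history: Sequence[SavePoint],
    specified_path: Optional[str] = None,
) -> Optional[str]:
    """Return the savepoint path to restore from, or None to start fresh."""
    if strategy == "specified":
        # The user names the savepoint explicitly; the history is ignored.
        if not specified_path:
            raise ValueError("the 'specified' strategy needs an explicit path")
        return specified_path
    if strategy not in ("latest", "earliest"):
        raise ValueError(f"unknown strategy: {strategy}")
    if not history:
        # Nothing recorded yet, so the job starts without restored state.
        return None
    pick = max if strategy == "latest" else min
    return pick(history, key=lambda sp: sp.created_at).path


if __name__ == "__main__":
    recorded = [
        SavePoint("hdfs:///savepoints/a", 100),
        SavePoint("hdfs:///savepoints/b", 300),
    ]
    print(choose_savepoint("latest", recorded))
```

Keeping the strategy as a plain string mirrors how such an option would typically appear in a job configuration; a managed platform would also have to record each savepoint when it is taken so that the history is available at startup.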

3. Principle

 

4. Highlights

FlinkSQL Studio

Real-time debug preview

Syntax and logic checking

Field-level lineage

BI display

Metadata query

Real-time task monitoring

Real-time job information

Data source registration

5. Optimize the Flink experience

Immersive FlinkSQL IDE

Apache Flink provides sql-client, but it is only a beta feature and is hard to use in production.

Dinky provides an immersive FlinkSQL IDE with professional features such as auto-completion, syntax highlighting, statement formatting, syntax validation and logic checking, debug preview of results, and field-level lineage analysis, making FlinkSQL development as comfortable and easy as ordinary SQL development.

Easy-to-use task construction

Building a FlinkSQL job as a Jar usually means dealing with dependency and version maintenance, code writing, and a tedious compile-and-package process.

Dinky simplifies the construction of FlinkSQL tasks. Developers only need to focus on writing FlinkSQL itself, and can check and debug it in real time. Task submission is handled quickly and automatically, so the same FlinkSQL statement can be switched freely between all execution modes and external clusters.

Dinky has two main types of users. The first is platform operation and maintenance staff, who must manually build a stable Dinky environment from the official documentation and their own Flink knowledge, so the threshold is high. The second is data developers, who only need to be familiar with FlinkSQL syntax and common application scenarios to develop and operate FlinkSQL jobs quickly and efficiently. This separation of platform and development is also the division of labor that best fits enterprise production teams.

Non-intrusive deployment mode

Some open source projects or self-built platforms need to be bound to a Flink cluster or modify the Flink source code, which tends to limit Flink's functionality or cause problems during setup and later expansion.

Dinky is completely non-intrusive: it is deployed outside the clusters and can connect to and monitor multiple clusters at the same time. It easily connects to Flink clusters of various versions, including clusters built from a company's internally optimized Flink branch, and is fully compatible with Flink's own connectors, UDFs, CDC, etc.

Enhanced functional experience

Some open source projects and self-built platforms generally focus only on submitting and operating Flink tasks.

Dinky, by contrast, adds enhancements that make Flink's features more comfortable to use, such as table-valued aggregate functions, global variables, CDC multi-source merging, execution environments, statement merging, and shared sessions, and it keeps adding new enhancements to bring Flink closer to the needs of enterprises.
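As an illustration of what a global-variable enhancement does, the sketch below expands variables in a SQL script before the statements are submitted. It assumes a declaration form `name := value;` and a reference form `${name}`; both forms, and the `expand_variables` helper, are assumptions made for this example and may not match Dinky's actual syntax or code.

```python
import re
from typing import Dict, List

# Assumed syntax for this sketch only: "name := value" declares a variable,
# "${name}" references it.
_DECLARATION = re.compile(r"^(\w+)\s*:=\s*(.+)$", re.DOTALL)
_REFERENCE = re.compile(r"\$\{(\w+)\}")


def _substitute(text: str, variables: Dict[str, str]) -> str:
    """Replace every ${name} in text; an undefined name raises KeyError."""
    return _REFERENCE.sub(lambda m: variables[m.group(1)], text)


def expand_variables(script: str) -> List[str]:
    """Split a script into statements and inline the declared variables.

    Splitting on ';' is deliberately naive: it would break on semicolons
    inside string literals, which a real parser has to handle.
    """
    variables: Dict[str, str] = {}
    statements: List[str] = []
    for raw in script.split(";"):
        statement = raw.strip()
        if not statement:
            continue
        declaration = _DECLARATION.match(statement)
        if declaration:
            name, value = declaration.groups()
            # Earlier variables may be used inside later declarations.
            variables[name] = _substitute(value.strip(), variables)
        else:
            statements.append(_substitute(statement, variables))
    return statements


if __name__ == "__main__":
    demo = "src := orders; SELECT * FROM ${src};"
    for sql in expand_variables(demo):
        print(sql)
```

Expanding variables before submission keeps the statements that reach Flink plain SQL, which is one way an enhancement like this can stay compatible with standard Apache Flink SQL.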

Real-time monitoring and alarm

Dinky provides real-time monitoring and alarming. It watches over online streaming and batch tasks and sends alarm notifications in real time when a task stops abnormally or completes successfully. It also records the real-time task information of external clusters, which removes the dependence on the History Server and makes up for the difficulty of querying job information after a job on a deployed cluster fails. Users can trace the execution information and exceptions of historical jobs anytime, anywhere.
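The alarm behavior described above can be sketched as a monitor that reacts to job status transitions and keeps its own history. This is a simplified illustration, not Dinky's code: the `JobMonitor` class and its message format are hypothetical, and it assumes that `FAILED` represents an abnormal stop and `FINISHED` a successful completion, following Flink's job status names.

```python
from typing import Dict, List, Optional, Tuple


class JobMonitor:
    """Hypothetical monitor: alarms on status changes and records history."""

    def __init__(self) -> None:
        self._last_status: Dict[str, str] = {}
        # Every observed transition is kept, so past runs stay queryable
        # even after the cluster that ran the job is gone.
        self.history: List[Tuple[str, str]] = []

    def observe(self, job_name: str, status: str) -> Optional[str]:
        """Record a polled status; return an alarm message if one is due."""
        previous = self._last_status.get(job_name)
        if status == previous:
            return None  # no transition, so no duplicate alarm
        self._last_status[job_name] = status
        self.history.append((job_name, status))
        if status == "FAILED":
            return f"[ALARM] job '{job_name}' stopped abnormally"
        if status == "FINISHED":
            return f"[INFO] job '{job_name}' completed successfully"
        return None


if __name__ == "__main__":
    monitor = JobMonitor()
    for polled in ["RUNNING", "RUNNING", "FAILED"]:
        message = monitor.observe("orders_etl", polled)
        if message:
            print(message)
```

Alarming only on transitions, rather than on every poll, avoids sending the same notification repeatedly while a job remains in a terminal state.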

One-stop development and operations

Dinky provides one-stop development and operations capabilities, from FlinkSQL development and debugging, to bringing jobs online and offline with operations monitoring, to OLAP and general query capabilities over data sources, so that all tasks in data warehouse construction or data governance can be completed on Dinky.

Easily extensible code implementation

Dinky places great emphasis on the extensibility of its code. The source code uses the SPI mechanism so that users can add custom functionality at low cost, such as data sources, alarm methods, and custom syntax.
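The idea behind an SPI-style extension point can be sketched as follows. In Java, SPI is normally provided by `java.util.ServiceLoader` together with provider files under `META-INF/services`; the Python below is only a registry-based analogue of that pattern, and `AlertChannel`, `register`, `load_channel`, and the two channel classes are hypothetical names, not Dinky APIs.

```python
from abc import ABC, abstractmethod
from typing import Dict, Type


class AlertChannel(ABC):
    """The extension point: every alarm plug-in implements this interface."""

    name: str

    @abstractmethod
    def send(self, message: str) -> str:
        """Deliver the message and return a short delivery description."""


_REGISTRY: Dict[str, Type[AlertChannel]] = {}


def register(cls: Type[AlertChannel]) -> Type[AlertChannel]:
    """Class decorator that makes an implementation discoverable by name."""
    _REGISTRY[cls.name] = cls
    return cls


def load_channel(name: str) -> AlertChannel:
    """Instantiate a registered implementation without importing it here."""
    try:
        return _REGISTRY[name]()
    except KeyError:
        raise ValueError(f"no alert channel registered as '{name}'") from None


@register
class DingTalkChannel(AlertChannel):
    name = "dingtalk"

    def send(self, message: str) -> str:
        return f"dingtalk <- {message}"  # a real plug-in would call a webhook


@register
class EmailChannel(AlertChannel):
    name = "email"

    def send(self, message: str) -> str:
        return f"email <- {message}"


if __name__ == "__main__":
    print(load_channel("dingtalk").send("job failed"))
```

The core code depends only on the interface and a name lookup, so adding a new alarm method means adding one new class rather than modifying the platform itself.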

Dinky's functional design also emphasizes extensibility and exposes as much configuration as possible, such as custom prompt and completion syntax, custom data source Flink configuration and generation rules, custom global variables, custom Flink execution environments, custom cluster configuration items, and more.

Dinky's external integration also emphasizes extensibility. The high-cohesion, low-coupling design of its SpringBoot-based code and an OpenAPI offered in multiple specifications make it easy to integrate with third-party ecosystems, microservices, or platforms, such as DolphinScheduler.

Small and beautiful product form

Conventional big data platforms or open source projects are generally very large and costly to maintain.

As the name Dinky suggests, being small and exquisite has always been the primary goal of the project. Small means easy to set up, not bound to any external middleware or file system, with concise and maintainable code; exquisite means immersive pages and carefully polished features.

6. Near-term plans

Multi-tenancy and namespaces

Dinky needs multi-tenancy to separate business data and resource queues, and namespaces to strengthen and constrain how business permissions are implemented and extended.

Global lineage and impact analysis

Dinky needs to store all field-level lineage in order to build global lineage and impact analysis, so that users can trace data problems more easily.
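Once field-level lineage is stored as edges, impact analysis amounts to a graph traversal. The sketch below is a minimal illustration of that idea under assumed inputs: the `build_graph` and `impacted_fields` helpers and the example field names are hypothetical, and each edge is taken to mean "the target field is derived from the source field".

```python
from collections import defaultdict, deque
from typing import Dict, Iterable, List, Set, Tuple

Field = str  # a fully qualified field name such as "table.column"


def build_graph(edges: Iterable[Tuple[Field, Field]]) -> Dict[Field, Set[Field]]:
    """Index stored lineage edges (source field -> derived field)."""
    graph: Dict[Field, Set[Field]] = defaultdict(set)
    for source, target in edges:
        graph[source].add(target)
    return graph


def impacted_fields(graph: Dict[Field, Set[Field]], start: Field) -> List[Field]:
    """Breadth-first search for every field derived, directly or not, from start."""
    seen: Set[Field] = set()
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for derived in graph.get(current, ()):
            if derived != start and derived not in seen:
                seen.add(derived)
                queue.append(derived)
    return sorted(seen)


if __name__ == "__main__":
    lineage = build_graph([
        ("ods_orders.amount", "dwd_orders.amount"),
        ("dwd_orders.amount", "ads_daily.revenue"),
    ])
    print(impacted_fields(lineage, "ods_orders.amount"))
```

Tracing a data problem back to its origin works the same way on the reversed graph, which is why persisting the edges across all jobs is the prerequisite for both directions of analysis.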

Unified metadata management

Dinky needs a unified metadata center to manage the metadata of external data sources, so that it can automatically synchronize structure between the physical model of the database and the logical model of the platform, strengthening the platform's one-stop development capability.

Flink metadata persistence

Dinky needs to persist the Flink Catalog so that CREATE TABLE and similar statements no longer have to be written during job development, and so that the catalog can be turned into a visual metadata management feature.

Multi-version Flink-Client Server

Dinky's multi-version Flink support currently requires launching a separate instance for each version. In the future, the client needs to be separated from the server, with the server implemented separately for each version.

Whole-database synchronization

Synchronizing an entire database is a common scenario. In the future, Dinky will provide a short FlinkSQL statement for building whole-database synchronization tasks.

7. Acknowledgments

Dinky was born standing on the shoulders of giants. We express our heartfelt thanks to all the open source software we use and to its communities. We hope to be not only beneficiaries of open source but also contributors to it, and we hope that others who share the same enthusiasm for and belief in open source will join us and contribute together. Acknowledgments are listed below:

Apache Flink

Apache DolphinScheduler

Ant Design Pro

MyBatis-Plus

Monaco Editor

SpringBoot


Origin www.oschina.net/news/190033/dinky-0-6-1-released