Apache SeaTunnel (Incubating) 2.1.0 released: kernel refactoring and full Flink support

On December 9, 2021, SeaTunnel (formerly known as Waterdrop) joined the Apache Incubator. After entering the incubator, the SeaTunnel community spent a great deal of time sorting out the project's external dependencies to ensure license compliance. After months of hard work, the community officially released its first Apache version on March 18, 2022. The release passed both rounds of the Apache Incubator's rigorous voting review on the first attempt, which ensures the compliance of SeaTunnel's software license to the greatest extent possible. It also means that 2.1.0 is the first official Apache release to be double-checked by both the SeaTunnel community and the Apache Incubator, so enterprises and individual users can use it with confidence.

2.1.0 download:

https://seatunnel.apache.org/download

GitHub Release:

https://github.com/apache/incubator-seatunnel/releases/tag/2.1.0

Note:

A license is a legal contract or instruction that regulates the use or distribution of copyrighted software. A software license is an agreement between a software developer and its users that protects users who stay within the scope of the license. Before choosing open source software, users and developers are strongly advised to check whether its license suits their own products. The Apache License is a very business-friendly license.

01 About this release

New features

1. The core of the microkernel plug-in architecture has been heavily optimized. The core is written mainly in Java, and command-line parameter parsing, plug-in loading, and related parts have been substantially improved. Plug-in extensions can be developed in whichever language users (or contributors) are most comfortable with, which greatly lowers the barrier to plug-in development.

2. Flink is fully supported, while users remain free to choose the underlying engine. This update also brings a large number of Flink plug-ins, and contributions of further plug-ins are welcome.

3. A quick-start environment for local development is provided (the example module). Contributors and users can start it quickly and smoothly without changing any code, which makes local development and debugging fast. This is exciting news for contributors or users who need custom plug-ins. In fact, during pre-release testing, many contributors used this method to test plug-ins quickly.

4. Docker container installation is provided. Users can deploy and install SeaTunnel through Docker very quickly. We will also iterate heavily around Docker and K8s in the future, and discussion is welcome.
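
To make the microkernel plug-in idea from item 1 concrete, here is a minimal, language-neutral sketch in Python. It is not SeaTunnel's code (the real core is Java): the registry, the `register`/`load`/`run_job` helpers, and the `fake`, `add_field`, and `collect` plug-ins are all hypothetical names invented for illustration. It only shows the general pattern of a small core that looks up source, transform, and sink plug-ins by name from a job configuration.

```python
from typing import Callable, Dict, List

# Hypothetical registry: plug-in kind -> plug-in name -> factory.
# This mirrors the general microkernel pattern, not SeaTunnel's actual loader.
_REGISTRY: Dict[str, Dict[str, Callable[..., object]]] = {
    "source": {},
    "transform": {},
    "sink": {},
}


def register(kind: str, name: str):
    """Class decorator that makes a plug-in discoverable by kind and name."""
    def decorator(factory):
        _REGISTRY[kind][name] = factory
        return factory
    return decorator


def load(kind: str, name: str, **options):
    """Instantiate a registered plug-in, passing the remaining options to it."""
    try:
        factory = _REGISTRY[kind][name]
    except KeyError:
        raise LookupError(f"no {kind} plug-in named {name!r}") from None
    return factory(**options)


@register("source", "fake")
class FakeSource:
    """Illustrative source that emits a fixed number of rows."""
    def __init__(self, rows: int = 3):
        self.rows = rows

    def read(self) -> List[dict]:
        return [{"id": i} for i in range(self.rows)]


@register("transform", "add_field")
class AddField:
    """Illustrative transform that adds a constant field to every record."""
    def __init__(self, field: str, value):
        self.field = field
        self.value = value

    def apply(self, records: List[dict]) -> List[dict]:
        return [{**r, self.field: self.value} for r in records]


@register("sink", "collect")
class CollectSink:
    """Illustrative sink that keeps written records in memory."""
    def __init__(self):
        self.written: List[dict] = []

    def write(self, records: List[dict]) -> None:
        self.written.extend(records)


def run_job(config: dict) -> object:
    """The 'kernel': wire plug-ins together purely from configuration."""
    source = load("source", **config["source"])
    transforms = [load("transform", **t) for t in config.get("transform", [])]
    sink = load("sink", **config["sink"])

    records = source.read()
    for transform in transforms:
        records = transform.apply(records)
    sink.write(records)
    return sink
```

A job is then described only by data, for example `run_job({"source": {"name": "fake", "rows": 2}, "transform": [{"name": "add_field", "field": "env", "value": "dev"}], "sink": {"name": "collect"}})`. Adding a new plug-in means registering one more class, without touching the core.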

Detailed changes

  • JCommander now handles command-line parameter parsing, so developers can focus on the logic itself.

  • Flink has been upgraded from 1.9 to 1.13.5 while maintaining compatibility with the old version, paving the way for subsequent CDC support.

  • Connector plug-ins such as Doris, Hudi, Phoenix, and Druid are supported. For the complete list, see [plugins-supported-by-seatunnel].

  • A fast-starting environment supports local development: the example module starts quickly without any code changes, which makes local debugging convenient.

  • SeaTunnel can be installed and tried out via Docker containers.

  • The Sql component supports the SET statement and configuration variables.

  • The Config module has been refactored to make it easier for contributors to understand while ensuring the project's code (license) compliance.

  • The project structure has been readjusted to accommodate the new roadmap.

  • CI&CD support with automated code quality control (more CI&CD work is planned).
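
As an illustration of what SET-statement support in a SQL component can look like, the following Python sketch separates `SET key = value` statements from the remaining SQL in a script. This is an assumption-laden sketch, not SeaTunnel's implementation: the exact SET syntax accepted by the Sql component is not specified here, and the `split_script` helper and the keys `parallelism` and `job.name` are hypothetical.

```python
import re
from typing import Dict, List, Tuple

# Assumed syntax: SET <key> = <value>, with an optionally quoted value.
_SET = re.compile(r"^\s*SET\s+([\w.\-]+)\s*=\s*(.+?)\s*$", re.IGNORECASE)


def split_script(script: str) -> Tuple[Dict[str, str], List[str]]:
    """Split a SQL script into configuration variables and SQL statements.

    Statements are separated naively on ';', so a semicolon inside a string
    literal would break this sketch.
    """
    settings: Dict[str, str] = {}
    statements: List[str] = []
    for raw in script.split(";"):
        stmt = raw.strip()
        if not stmt:
            continue
        match = _SET.match(stmt)
        if match:
            # Strip surrounding quotes so 'demo' and demo are treated alike.
            settings[match.group(1)] = match.group(2).strip("'\"")
        else:
            statements.append(stmt)
    return settings, statements


script = """
SET parallelism = 4;
SET job.name = 'demo';
SELECT id FROM events WHERE id > 10;
"""
settings, statements = split_script(script)
```

Here `settings` holds the two configuration variables and `statements` holds the single SELECT. A real component would then hand the settings to the engine configuration before executing the statements.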

02 Messages from users

I have witnessed SeaTunnel's growth along the way. Since the early Waterdrop days, Huya has used it as a core component for connecting data pipelines. Its plug-in capability greatly simplifies the tedious work of data integration. Recently, SeaTunnel has been deeply optimized in many respects, especially extensibility: it supports both Spark and Flink and can be extended to other engines; its plug-ins cover nearly 20 common data stores and can be developed in multiple languages. Through the continuous efforts of the community, SeaTunnel has made unprecedented progress in documentation, configuration, and the development and testing environment. SeaTunnel has also boldly adjusted its project structure to lay the groundwork for future features such as CDC, CI&CD, and code quality automation. SeaTunnel has a bright future. I hope everyone will keep following China's own leading open source project. Keep it up!

--- Huya Data Architect Huang Qiang

I am glad to see the release of the first Apache version of SeaTunnel. The code structure of the new version is clearer, and it supports a richer set of plug-ins. I will continue to contribute to SeaTunnel and work with the community to make it easier and more efficient to use.

--- OPPO Senior Engineer Fan Weitai

SeaTunnel's unique architecture and its advanced ideas of modularity and plug-ins are worth learning from. Back when SeaTunnel was still Waterdrop, we were already following the project and validating it in various ETL scenarios. We added a graphical interface on top of it so that users can perform ETL operations through simple configuration, and we run it in a large-scale production environment. We hope SeaTunnel keeps getting better!

--- Nie Lei, person in charge of the Big Data Basic Platform of Ideal Auto

Congratulations to SeaTunnel on releasing its first Apache version after joining Apache. 2.1.0 has a clearer code structure and a richer plug-in family. It is excellent and easy to use, which makes it well suited to secondary development and enterprise adoption. In addition, the upgraded and optimized architecture and the improved performance will help enterprises transfer data more efficiently and increase the value of their data.

--- Zhang Zongyao, senior developer of Bilibili

The emergence of Apache SeaTunnel (Incubating) fills a gap in high-concurrency data delivery and cleaning in the open source big data ecosystem. Its plug-in architecture has attracted a large number of contributors who continuously extend and improve it, making multi-source data exchange simpler and more convenient. These highlights are best reflected in the latest version, 2.1.0, which greatly reduces the cost of secondary development for users. As a fan of Apache SeaTunnel (Incubating), I sincerely hope that SeaTunnel keeps getting better. I will also share my personal and company experience with the community to help make SeaTunnel more efficient and easier to use.

--- Kidswant OLAP Platform Architect Yuan Hongjun

Congratulations on the release of the first Apache version of SeaTunnel. When I first came into contact with SeaTunnel, I was attracted by its simplicity and ease of use. The new version not only improves the architecture greatly but also supports a richer set of data sources. The community is also becoming more mature, and we hope that more friends who love open source will join in to make SeaTunnel shine.

--- Wu Di, Big Data Engineer of Shuhai Supply Chain

I am very happy to see that SeaTunnel has released its first version after joining Apache. The new version has made great progress in system architecture, configuration optimization, and performance. If you are still struggling with distributed data ingestion and cleaning, consider joining the SeaTunnel community. Huge surprises await you!

--- CETC Chen Hu

03 Thanks

Thanks to the following contributors (GitHub IDs, in no particular order):

Al-assad, BenJFan, CalvinKirs, JNSimba, JiangTChen, Rianico, TyrantLucifer, Yves-yuan, ZhangchengHu0923, agendazhang, an-shi-chi-fan, asdf2014, bigdataf, chaozwn, choucmei, dailidong, dongzl, felix-thinkingdata, fengyuceNv, garyelephant, kalencaya, kezhenxu94, legendtkl, leo65535, liujinhui1994, mans2singh, marklightning, mosence, nielifeng, ououtt, ruanwenjun, simon824, totalo, wntp, wolfboys, wuchunfu, xbkaishui, xtr1993, yx91490, zhangbutao, zhaomin1423, zhongjiajie, zhuangchong, zixi0825.

We also sincerely thank our mentors for their help throughout this process:

Zhenxu Ke, Willem Jiang, William Guo, LiDong Dai, Ted Liu, Kevin, JB

04 Plans for future versions

  • CDC (Change Data Capture) is a technology for capturing database change data. We will support CDC on both Spark and Flink in the future.

  • A monitoring system covering common metrics such as data read per second, the total amount of input data read by tasks, and data transmission records.

  • UI system support, including editing jobs through a user interface.

  • SDK support and support for running as a service, to make SeaTunnel more user-friendly.

  • More Connector support, and more efficient Sink support (such as ClickHouse), will be available in the next version.

Follow-up features are decided jointly by the community, and we invite everyone to take part in building it. If you care about a particular feature, raise an issue or reply in an existing one. Features that draw the most attention will be implemented first.

05 Community Development

Recent overview

Since entering the Apache Incubator, the number of contributors has grown from 13 to 55 and continues to rise. Weekly commits average 20+. Three contributors from different companies (Lei Xie, HuaJie Wang, Chunfu Wu) have been invited to become Committers in recognition of their contributions to the community.

We have held two meetups, where speakers from Bilibili, OPPO, Vipshop, and other companies shared how they run SeaTunnel in large-scale production. We plan to hold a meetup every month, and SeaTunnel users and contributors are welcome to share their stories.

Users of Apache SeaTunnel (Incubating)

The currently registered users of Apache SeaTunnel (Incubating) are listed in the registration issue. If you are also using Apache SeaTunnel, please register in Who is using SeaTunnel (https://github.com/apache/incubator-seatunnel/issues/686)!

Note: Only registered users are included

06 PPMC Testimonials

Speaking about the release of the first Apache version, Apache SeaTunnel (Incubating) PPMC member LiFeng Nie said: "From the first day we entered the Apache Incubator, we have worked hard to learn the Apache Way and the various Apache policies. Releasing the first version took a lot of time (mainly on compliance), but we think the time was well spent. This is also a very important reason why we chose to enter Apache: we need users to feel at ease, and Apache is undoubtedly the best choice. Its license checks are extremely strict, which helps users avoid compliance problems as far as possible and ensures that the software is distributed reasonably and legally. In addition, practicing the Apache Way, with its public-interest mission, pragmatism, community over code, openness and transparency, consensus decision-making, and meritocracy, can help the SeaTunnel community become more open, transparent, and diverse."

07 Messages from Committers & Contributors

Apache SeaTunnel links data and releases value. I have followed and taken part in the project closely from its entry into the Apache Incubator to the release of the first Apache version, and I am very happy about this release. The new version greatly improves both code structure and coding standards, and the Apache SeaTunnel community is very active. I will continue to contribute, and I welcome more friends to join in and contribute to the development of SeaTunnel.

--- Apache SeaTunnel Committer Wang Huajie

I am very happy to see that SeaTunnel has released its first Apache version. Although it is the first version, SeaTunnel already has strong capabilities in ease of use and data source support, helping users complete data synchronization tasks simply, quickly, and efficiently. The community is also growing vigorously. I hope everyone can contribute to Apache SeaTunnel (Incubating) and to its growth.

--- Apache SeaTunnel Contributor Fan Jia

With the joint efforts of the community members, we are very happy to welcome the first Apache version since entering the Apache Incubator. Compared with the previous non-Apache version, it involved a lot of refactoring at the code level. The Apache SeaTunnel community is very active, and I hope more people will join in and contribute.

--- Apache SeaTunnel Committer Wu Chunfu

08 About SeaTunnel

SeaTunnel (formerly Waterdrop) is an easy-to-use, high-performance distributed data integration platform that supports real-time synchronization of massive data. It can stably and efficiently synchronize hundreds of billions of records every day and has been used in production by nearly 100 companies.

Why we need SeaTunnel

SeaTunnel does its best to solve the problems you may encounter in mass data synchronization:

  • Data loss and duplication

  • Task stacking and delays

  • Low throughput

  • Long lead time to bring applications into production

  • Lack of application health monitoring

SeaTunnel usage scenarios

  • Mass data synchronization

  • Mass data integration

  • ETL for massive data

  • Mass data aggregation

  • Multi-source data processing

Features of SeaTunnel

How to get started with SeaTunnel quickly?

Want to try SeaTunnel quickly? With 2.1.0, you can be up and running in about ten seconds:

https://seatunnel.apache.org/docs/2.1.0/developement/setup

How to contribute?

We sincerely invite everyone interested in taking local open source global to join the SeaTunnel contributor family and build open source together!

Submit questions and suggestions:

https://github.com/apache/incubator-seatunnel/issues

Contribute code:

https://github.com/apache/incubator-seatunnel/pulls

Subscribe to the community development mailing list:

[email protected]

Development mailing list:

[email protected]

Join Slack:

https://join.slack.com/t/apacheseatunnel/shared_invite/zt-10u1eujlc-g4E~ppbinD0oKpGeoo_dAw

Follow on Twitter:  

https://twitter.com/ASFSeaTunnel

We sincerely welcome you to join us!
