[Reprint] 2018 news: A first for China! Tencent leads the release of a new Apache Hadoop version

Recently, Tencent led the release of Apache Hadoop 2.8.4, the latest version, marking an important step for Chinese technology companies in contributing to international open source.

Apache Hadoop was first released in 2006 and became a top-level Apache project in 2008. By then, China Mobile, Baidu, Taobao, and others had already started using Hadoop. Hadoop has since become one of the flagship projects of the Apache Software Foundation. It also gave rise to a series of well-known top-level Apache projects, including HBase, Hive, and ZooKeeper, which started out as subprojects run within the Apache Hadoop community and went on to become familiar names among developers.

The Apache Hadoop 2.8.4 release led by Tencent includes more than 20 features and optimizations, large and small, listed below:
[Image: list of the features and optimizations in this release]

The Release Manager responsible for the overall progress of this version is Junping Du (堵俊平), an expert researcher at Tencent Cloud's Big Data and Artificial Intelligence Product Center, who is also a member of the Apache Hadoop PMC.

The PMC is a respected institution within the Apache Software Foundation. Every open source project has a PMC, or Project Management Committee, which decides the project's technical direction and how its community operates. It must, however, make its information public and report regularly to the Apache Board of Directors, which oversees it.

Becoming a PMC member is no easy feat; you have to "fight monsters and level up" step by step. To make the leap from ordinary developer to PMC member, you need strong organizational skills in the open source community in addition to writing code. Pretty impressive, isn't it?

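Junping Du (堵俊平) is a T4 big data technology expert at Tencent. He was previously a senior R&D engineer at EMC and VMware and led the YARN team at Hortonworks in the United States. He has worked in cloud computing and big data for more than 10 years and is very well known in several communities. He is an Apache Hadoop Committer and PMC member and has led widely used community releases such as Hadoop 2.6 and 2.8. He has also led the development of several projects and products that optimize and extend Hadoop on cloud platforms. At Tencent, he currently leads research and development of Tencent Cloud's big data and artificial intelligence products.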

Our editor interviewed this much-admired expert. Below, he shares some of the finer details of how the new version was released.

Q: Many people may have heard of "open source" but don't know much about it. Could you give a brief introduction?

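A: Open source can be understood as "opening the source code to the public." Big data, which has stayed hot in recent years, is an industry ecosystem driven by open source software. Here I have to mention a milestone open source product, Hadoop: from Google's three papers to Yahoo's Hadoop, it ushered in today's big data era.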

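In the past, system software was driven mainly by closed source software. Although excellent software such as Linux emerged in operating systems, the databases and application servers that came after were still driven almost entirely by closed source products.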

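In the ten years since Hadoop was born, it has remained the core of the big data ecosystem. It changed the earlier form of software and became one of the most mainstream open source projects. Today, essentially every data platform team makes small modifications on top of the Hadoop ecosystem to support its big data business systems, so you could regard it as a standard for open source.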

Q: Compared with the traditional closed source ecosystem, what are the benefits of open source?

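A: First, it avoids "reinventing the wheel": different individuals and groups can create collectively on a public code platform rather than doing duplicate work behind closed doors. Second, users are not held hostage by a particular software platform and can migrate their applications and data at any time. Finally, there is core intellectual property. With the earlier IOE (IBM, Oracle, EMC) stack, for example, the problem was not only the lack of a "Chinese chip"; the application software and system software on top could be cut off by someone else at any time. Open source does not have this problem, because it is fully open and transparent. In addition, open source encourages companies to plan long-term technology investment rather than only short-term gains.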

Q: Tencent led the release of a new Hadoop version in the Apache community this time. Is this a first in China?

A: Yes. Previous releases were all led by foreign big data vendors such as Microsoft, Hortonworks, and Cloudera; this version is the first to be supported entirely by a domestic company. In terms of technical appeal and influence across the open source community, it may encourage domestic developers and companies to contribute more actively to open source projects, to take on greater responsibility, and to give more back to the open source community.

Q: What positive impact does this bring to society as a whole?

A: First, big data is to a large extent foundational software technology, and having Tencent lead the technology of this platform can be regarded as a breakthrough for the country. Second, for developers, the community version is the more reliable and the most popular choice. Finally, ordinary people can benefit as well: as the capabilities of the underlying platform improve, data processing capabilities improve with them, which can make everyone's life more convenient. Before the big data era arrived, there were not nearly as many data-oriented services, such as map services, O2O businesses, and intelligent recommendation systems. Even AI, currently a hot topic, could not develop without progress in big data platforms.

Q: Were there technical difficulties that held this back before?

A: Over the past decade, China's Internet companies have grown rapidly, but we were mainly chasing business growth and did not do enough in technology and open source. This is our weak spot compared with the West. In fact, many domestic companies have tried open source, but they only opened the source code, not the community. That is, they felt a product was doing well and simply put its code out in the open.

Open source code and an open source community are two different things. The difference is whether, once you open the code, other third parties (especially your ecosystem partners) are able to participate.

Today's big data boom has in fact been driven by a handful of core open source software projects. Now that China's large companies have the economic strength, they too have begun to use open source as a means of building a better ecosystem. This may take time, but we have gradually come to realize the importance of this foundational open source software.

Origin www.cnblogs.com/jinanxiaolaohu/p/11096030.html