Ant Financial Chief Architect He Changhua: Open-sourcing SQLFlow is only a first test; a real-time big data system is the cornerstone of the future

Ali Mei's introduction: Open-sourcing SQLFlow gives back to the industry while also showing off a little AI muscle. That was the industry's first reaction after Ant Financial recently open-sourced SQLFlow, a project that applies SQL to an AI engine. The team that developed SQLFlow is led by He Changhua, Ant Financial's Chief Architect of Computing and Storage. Today we invited He Changhua to talk about his recent thinking and explorations.

On May 6, Ant Financial deputy CTO Hu Xi announced the open-sourcing of the machine learning tool SQLFlow: "In the next three years, AI capability will become a basic skill for every technician. We hope that by open-sourcing SQLFlow we can lower the threshold for applying artificial intelligence, so that for technicians calling AI is as simple as writing SQL."

Image: Ant Financial deputy CTO Hu Xi announces the open-sourcing of SQLFlow

Much like JavaScript, Swift and other technologies in their early days, artificial intelligence has many fans but a very high threshold. It is far from "universal", and qualified professionals are scarce. The core of AI today is machine learning, and mastering it in depth requires a rich store of knowledge in mathematics, statistics, probability theory and programming, along with a high level of expertise in other fields. These demanding requirements make it hard for many technical people to become machine learning experts, which in turn holds back the development of the whole AI industry.

SQLFlow, easy to learn and easy to use, was created precisely to solve these problems. Hu Xi said: "Open-sourcing SQLFlow is a revolution that simplifies the technology and puts machine learning capability into the hands of business experts, so that more AI application scenarios can be discovered and created."

SQLFlow combines abstruse AI with simple SQL, greatly lowering the threshold for data engineers to use AI techniques. SQLFlow was developed by Ant Financial's Computing AI Infra team, led by Chief Architect of Computing and Storage He Changhua.

Dr. He Changhua graduated from Stanford and worked at Google headquarters for seven years, where he won the company's highest technology award. He then spent two years at the unicorn Airbnb, responsible for the application architecture of its back-end systems.

In May 2017, he formally joined Ant Financial as Chief Architect of Computing and Storage.

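At Ant Financial, He Changhua's job is to develop a new generation of computing engine and build a financial-grade data intelligence platform.

SQLFlow is one of the fruits of that main line of computing-engine work.

But for He Changhua, the world is changing dramatically, and he also wants to lead his team in exploring things nobody has yet accomplished, such as a fully real-time big data intelligence system.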

The cornerstone of future technology

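The concept of big data first came from the search engine industry, because search engines face the enormous, explosively growing body of data that people leave on the Internet.

In 2010, Google announced that its new-generation search index, "Caffeine", had gone live. The technology was revolutionary because whenever any web page in the world changed, the change could be added to the index in real time and users could find it in real time, which solved the latency problem of traditional search engines.

He Changhua was one of the core technical leads on the Caffeine development team at the time.

He explains: "The most essential capability Caffeine delivered was being real-time."

He Changhua's goal at Ant Financial today is likewise to build a "fully real-time" big data processing system, which could also be called a big data intelligence platform. Because offline, real-life scenarios are so diverse and complex, this is a more challenging task than building real-time search.

He believes this will become the cornerstone of future technology.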

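For a computer, real-time means keeping the delay between issuing a request and receiving a response as small as possible. For a big data processing system, it also means keeping the delay from data production to data consumption as low as possible. All of this implies gains in computing speed and capacity.

MapReduce, the big data computing model in common use before, processes data in "slices", with boundaries between one slice and the next. This batch-processing model inevitably introduces latency.

Take search as an example. If data is batch-processed once a day, a page updated today cannot be searched until tomorrow. Raising the processing frequency partly solves the problem: twice a day, four times a day, once every two hours...

This gradually approaches "near real-time", but the cost also rises sharply.

To achieve true real-time, the batch boundary has to be broken so that data is processed like flowing water: computed as it arrives, with feedback available at any moment.

This also gave rise to the later boom in stream computing engines.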
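
The trade-off between batch frequency and latency can be made concrete with a small sketch. It is an illustration only, not code from any search engine or from Ant Financial: the update times, the function names, and the assumption that a batch run makes data visible exactly at the end of its window are all hypothetical.

```python
import math


def batch_visible_at(event_time: float, interval: float) -> float:
    """Hour at which an update becomes visible when a batch closes every `interval` hours.

    Simplifying assumption: results appear exactly at the end of the window
    the update falls into, with no extra processing time.
    """
    return (math.floor(event_time / interval) + 1) * interval


def mean_batch_delay(event_times, interval):
    """Average wait between an update happening and it becoming visible."""
    delays = [batch_visible_at(t, interval) - t for t in event_times]
    return sum(delays) / len(delays)


def stream_visible_at(event_time: float, processing_time: float = 0.0) -> float:
    """With stream processing, each update is handled as it arrives."""
    return event_time + processing_time


# Hypothetical page-update times, in hours since midnight.
updates = [0.5, 3.0, 7.25, 13.0, 22.5]

for interval in (24, 12, 6, 2):
    runs_per_day = 24 // interval
    delay = mean_batch_delay(updates, interval)
    print(f"every {interval:>2} h: {runs_per_day:>2} runs/day, mean delay {delay:.2f} h")

stream_delay = sum(stream_visible_at(t) - t for t in updates) / len(updates)
print(f"streaming: mean delay {stream_delay:.2f} h")
```

Shortening the interval reduces the wait but multiplies the number of batch runs, and the delay never reaches zero. Handling each update on arrival removes the window boundary altogether.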

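In He Changhua's view, beyond speed, a "real-time system" has two further important meanings.

The first is the convergence of OLTP (online transaction processing) and OLAP (online analytical processing).

Conventionally, OLTP has been seen as demanding real-time behavior, while OLAP has been seen as less time-sensitive.

For example, a payment made with Alipay needs records to be queried, inserted and deleted immediately, and that is handled by OLTP. Analysis of users' behavioral characteristics, by contrast, is handled by OLAP.

But as business scenarios keep changing, OLAP is also expected to be increasingly timely.

In risk control for Internet finance, for instance, risk has to be judged by analyzing the user's feature data within the very short time it takes to complete a transaction. This requires OLAP to respond in real time too, with results that are immediately available for online access.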
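
As a rough illustration of this requirement, the sketch below keeps the transaction log and a per-user aggregate in one place, updating the aggregate incrementally so that an analytical feature is available at the moment a payment is checked. Everything here is hypothetical: the `FusedStore` class, the five-times-average rule and the thresholds are assumptions made for illustration, not Ant Financial's risk-control logic.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class UserStats:
    """Running aggregate for one user (the analytical side)."""
    count: int = 0
    total: float = 0.0

    @property
    def mean(self) -> float:
        return self.total / self.count if self.count else 0.0


class FusedStore:
    """Toy store where writes and analytical features share one update path."""

    def __init__(self, ratio_limit: float = 5.0, min_history: int = 3):
        self.ledger = []                     # transaction records (OLTP-style writes)
        self.stats = defaultdict(UserStats)  # incrementally maintained aggregates (OLAP-style reads)
        self.ratio_limit = ratio_limit       # hypothetical rule: flag amounts above 5x the user's mean
        self.min_history = min_history       # do not judge users with too little history

    def pay(self, user: str, amount: float) -> bool:
        """Record a payment and return True if it was flagged as risky."""
        stats = self.stats[user]
        risky = (
            stats.count >= self.min_history
            and amount > self.ratio_limit * stats.mean
        )
        self.ledger.append((user, amount, risky))
        if not risky:
            # Only accepted payments feed the aggregate, so the very next
            # check already sees them; no nightly batch job is involved.
            stats.count += 1
            stats.total += amount
        return risky


store = FusedStore()
for amount in (10.0, 12.0, 8.0):
    store.pay("alice", amount)

print(store.pay("alice", 500.0))  # far above alice's running mean
print(store.pay("alice", 11.0))   # in line with her history
```

The design point is that the analytical feature (the running mean) is maintained on the write path itself, so the risk check never waits for a separate analysis job to catch up.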

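The second is the convergence of intelligence and data systems.

Artificial intelligence and machine learning are the hottest areas of big data application. Today the vast majority of companies keep the data warehouse and the machine learning platform separate: they pull a batch of data from the warehouse and move it to the machine learning platform to train a model.

As business scenarios grow more complex and varied, this approach is showing its limits, because whether a model can be updated in real time, and whether it can be trained on fresher data, directly affects how well complex scenarios can be handled.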

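"Data flows in in real time, models are trained in real time, and models go online in real time to make decisions and feed data back. If this whole chain can be fully connected, the value to the business will be immeasurable," He Changhua said.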
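
The loop He Changhua describes resembles what is usually called online learning. The sketch below is a minimal, hypothetical example of it, assuming a one-feature logistic model updated by stochastic gradient descent on each arriving event. The class name, the synthetic event stream and the learning rate are illustrative choices, not a description of Ant Financial's engine.

```python
import math


class OnlineLogisticModel:
    """Logistic regression trained one event at a time with SGD."""

    def __init__(self, n_features: int, learning_rate: float = 0.1):
        self.w = [0.0] * n_features
        self.b = 0.0
        self.learning_rate = learning_rate

    def predict_proba(self, x):
        z = self.b + sum(wi * xi for wi, xi in zip(self.w, x))
        return 1.0 / (1.0 + math.exp(-z))

    def update(self, x, y):
        """One gradient step on the log loss for a single labelled event."""
        error = self.predict_proba(x) - y
        self.w = [wi - self.learning_rate * error * xi for wi, xi in zip(self.w, x)]
        self.b -= self.learning_rate * error


def event_stream(rounds: int):
    """Synthetic events: the label is 1 when the feature is positive."""
    for _ in range(rounds):
        for value in (-1.0, 1.0, -0.5, 0.5):
            yield [value], 1 if value > 0 else 0


model = OnlineLogisticModel(n_features=1)
correct = 0
seen = 0
for features, label in event_stream(rounds=50):
    decision = model.predict_proba(features) > 0.5  # the current model decides first
    correct += int(decision == bool(label))
    seen += 1
    model.update(features, label)                   # then the outcome is fed back at once

print(f"online accuracy over {seen} events: {correct / seen:.2f}")
print(f"P(label=1 | x=0.8) = {model.predict_proba([0.8]):.2f}")
```

Because the model is updated after every event, there is no separate step of exporting a batch from a warehouse and retraining offline; each decision is made by a model that has already seen all earlier events.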

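Data, computing and intelligence together make up the "high-efficiency big data chassis" He Changhua envisions: a converged real-time data intelligence platform, or "Big Data Base", just as the database once became the data chassis for countless scenarios.

Today, not only at Ant Financial or Alibaba but in every industry, more and more business is data-driven.

But the threshold for big data development is high. If every line of business had to start from the bottom layer of data development, it would take a great deal of time and effort.

How can people who run the business be freed to focus more of their energy on the business itself?

He Changhua believes this is the mission of the "Big Data Base", and also what "cornerstone" means:

"We want to make this simple: practitioners in every industry and colleagues in every line of business, standing on a solid platform, should be able to develop upper-layer applications easily without knowing the details of the layers below."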

How far from real intelligence?

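Lowering the threshold for data and intelligence is what He Changhua expects of the new engine and the data intelligence platform.

The financial-grade multi-mode converged computing engine his team is developing has already connected stream computing with graph computing, and stream computing with machine learning, bringing it closer and closer to the "grand convergence" he envisions.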

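He Changhua revealed that the team's goal is to make things "minimalist" for the business:

"In the next two to three years, we hope the new engine can take on real-time, online converged computing tasks. Based on this engine, combined with other open source engines, we will be able to build a complete data intelligence system. On that system, a business can very easily go from feature development to product launch, and follow-up work such as attracting traffic and analysis and decision-making can also be done with the help of the platform."

He even sketched a rather science-fiction picture of the future: you write a function and hand it to the engine, the engine decides how many resources to use for the computation, you need not care about the details of the computation, and the result comes back to you in the shortest possible time.

When you conceive of a new kind of business, the data intelligence platform will work out which data is needed, which model to use, how to go live, and how to manage traffic.

All of these steps can be completed automatically and intelligently.

"This is a longer-term goal. We are building the capability to process data so that in the future anyone can use it, truly achieving 'data democratization'."

No company in the world has yet fully developed such a real-time data intelligence platform combining multiple capabilities.

He Changhua looks to the future cautiously but with confidence: "We are exploring too. If we fully achieve what we are exploring, we will truly stand in a world-leading position."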

Uncharted territory

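The world changes in an instant. Data, as a mirror image of the physical world, is in theory inexhaustible; the only question is whether humans have a way to record and collect it.

The spread of the Internet and the mobile Internet has greatly reduced the cost of collecting data on human behavior.

The spread of IoT sensor devices has allowed large amounts of data from industrial production and social life to accumulate as well.

As a result, the total volume of data has grown explosively over the past twenty years.

While the whole world undergoes a dramatic digital transformation, our lives are quietly changing too.

Thanks to the development of data applications, we enjoy conveniences that were unimaginable ten or twenty years ago: e-commerce, O2O, mobile payment, smart homes...

In He Changhua's view, however, digitalization is still at a very early stage, one of moving offline data online.

The real question to consider is what capabilities we will have to process and apply massive amounts of data when a highly data-driven society arrives.

This bears on whether we can do more on the basis of data, give rise to higher intelligence, and in turn push human society toward its next stage.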

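This is the answer he returned to China and joined Ant Financial to look for.

"I came back because I feel that what we are doing here, to put it in big terms, is exploration aimed at the next stage of human social development."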

In this new exploration, dealing with massive amounts of data is a required course. He therefore repeatedly stresses the importance of computing power: big data, artificial intelligence, deep learning... all of them need computing power, and without it exploration cannot move forward.

The trend in artificial intelligence is likewise toward simulating human abilities with larger-scale, more massive computation.

"True artificial intelligence = data + 100 times more computing." At the scale of Google's latest artificial intelligence models, the computation is equivalent to hundreds of GPUs running continuously for a full year.

The next-generation computing engine and data intelligence platform that He Changhua and his team are working to develop are, in effect, efficient data processing capabilities backed by comprehensive and powerful computing capabilities.

The platform was born of Ant Financial's massive business data and scenarios, and its original purpose was to support Ant Financial's business. As the technology matures, however, it can also be general enough for many scenarios.

With the high availability and high security demanded of financial-grade systems, it can be applied widely in other industries, and handling everyday-life service scenarios is even less of a problem.

The significance of this work, put in big terms, lies in promoting social change. That sounds like a grand proposition, but it is not so lofty or remote.

"Every technology must have its foothold. At Ant Financial specifically, these technologies are closely tied to the daily lives of hundreds of millions of people."

Every day, when He Changhua takes out his phone and pays a bill with Alipay, he can directly feel the results of his own work, just as he used search every day when he worked at Google: "Using what you built yourself every day makes the way technology changes life feel very real."

This is how he describes his ideal. On the journey toward it, he stands at the forefront of technology while also living in the most everyday of scenes, and the two are inseparable:

Using technology to improve people's lives and to keep society and humanity moving forward.

Original release time: 2019-06-06
Author: Ali Technology
This article comes from Yunqi community partner "Ali Technology". For more information, follow "Ali Technology".

Origin yq.aliyun.com/articles/704708