Alibaba data scientists explain the data middle platform in one go: the essentials in 15 slides, worth sharing right away!


Text / Technology Leadership Community

Editor / Emma

An Alibaba big data and artificial intelligence scientist and Luo Jinpeng, head of Alibaba's public data platform, have spoken at the Yunqi Conference, Data Tech, and other conferences about Alibaba's "dual middle platform + ET" digital transformation methodology and its results, as well as the practice of building the Alibaba data middle platform products OneData, OneID, OneService, and Dataphin.

This article compiles the essentials of 15 of those slides. Let's follow these technical experts and learn Alibaba's methodology and practice for building a data middle platform: the construction steps, how the organization supports the platform, and how the construction work is divided.

01

Panorama of Alibaba's data middle platform

The architecture of Alibaba's data middle platform has a "four horizontal, three vertical" structure; the underlying infrastructure comes from the Alibaba Cloud platform.

Four horizontal. Reading the architecture diagram from the bottom up, the lowest layer handles data acquisition and access: data from sources such as Taobao, Tmall, and Hema is accessed in the agreed formats and extracted to the computing platform. On top of that, the OneData system uses "business segment + analysis dimension" as the framework for building the "public data center".

Above the public data center, further systems are built according to business needs: the consumer data system, the enterprise data system, the content data system, and others.


(Source: Yunqi Community)

After deep processing, the data can deliver value to products and to the business. Finally, unified data services are provided through the unified data service middleware "OneService".



Three vertical. To keep data access across Alibaba's entire data system fast, efficient, and high-quality, an intelligent data R&D platform is needed. It turns theory into practice through a set of system tools and development processes, ensuring that every team and every BU builds its data under uniform rules. At the same time, once data volumes grow, the most immediate problem is cost, and we also established a unified data quality management platform.


02

Alibaba's "dual middle platforms" jointly support the "large middle platform + small front end" architecture

An Alibaba Cloud big data and artificial intelligence scientist mentioned in an interview that Alibaba's middle platform mainly takes the form of dual middle platforms: the business middle platform and the data middle platform stand side by side and together carry all front-end business.

The business middle platform abstracts, packages, and integrates back-end resources, turning them into reusable, shared core capabilities that are friendly to the front end, so that back-end business resources become capabilities the front end can easily use.



The data middle platform takes in data flowing from the back end and the business middle platform, completes the storage, computation, and productized packaging of massive data, and forms the enterprise's core data capability. It provides strong support both for data-based customization and innovation in the front end and for the continuous, data-feedback-driven evolution of the business middle platform.


The business middle platform and the data middle platform complement and support each other, together forming the battlefield's powerful rear artillery group and radar array.

03

The OneData system of Alibaba's data middle platform

OneData is the core of Alibaba's data middle platform. According to Luo Jinpeng, head of Alibaba's public data platform, the Group built the OneData public data layer to keep data standardized and its definitions unified across design, development, deployment, and use, achieving full-link management of data assets and providing standard data output.

Unifying data standards is a very complex task. For example, for the single metric UV, Alibaba had as many as 10 different definitions internally before unification. Reportedly, the OneData public data layer standardized and unified the definitions of more than 30,000 data metrics in total, reducing them to just over 3,000 after consolidation.
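The kind of consolidation described above can be illustrated with a small sketch. It assumes a hypothetical registry in which each team's ad hoc metric is normalized to a standard definition (measure, aggregation, period); the class `MetricDefinition`, the alias tables, and the sample metric names are invented for illustration and are not OneData's actual model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MetricDefinition:
    """A standardized metric definition (hypothetical structure)."""
    measure: str      # what is counted, e.g. "visitor"
    aggregation: str  # how it is counted, e.g. "count_distinct"
    period: str       # statistical period, e.g. "1d"


def normalize(raw: dict) -> MetricDefinition:
    """Map a team's ad hoc definition onto the standard vocabulary."""
    measure_aliases = {"uv": "visitor", "unique_visitor": "visitor", "visitor": "visitor"}
    period_aliases = {"daily": "1d", "day": "1d", "1d": "1d", "weekly": "7d", "7d": "7d"}
    return MetricDefinition(
        measure=measure_aliases[raw["measure"].lower()],
        aggregation=raw.get("aggregation", "count_distinct").lower(),
        period=period_aliases[raw["period"].lower()],
    )


def consolidate(raw_metrics: list[dict]) -> dict[MetricDefinition, list[str]]:
    """Group ad hoc metric names under the standard definition they resolve to."""
    registry: dict[MetricDefinition, list[str]] = {}
    for raw in raw_metrics:
        registry.setdefault(normalize(raw), []).append(raw["name"])
    return registry


# Four differently named metrics maintained by different (imaginary) teams.
raw_metrics = [
    {"name": "shop_uv_daily", "measure": "UV", "period": "daily"},
    {"name": "visitors_1d", "measure": "unique_visitor", "period": "1d"},
    {"name": "uv_day", "measure": "uv", "period": "day"},
    {"name": "uv_week", "measure": "uv", "period": "weekly"},
]
registry = consolidate(raw_metrics)
for definition, names in registry.items():
    print(definition, "<-", names)
```

In this toy input, three of the four names resolve to the same daily-visitor definition, so only two standard metrics remain. The real difficulty lies in agreeing on the alias tables, which is an organizational task rather than a coding one.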

In the DT era, the data explosion poses a huge challenge to storage and computing costs. According to Luo Jinpeng, without a unified public data layer, Alibaba's internal server demand after five years would reach 100 times the current level. With a unified public data layer, server demand after five years would be about 90% lower by comparison.
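As a quick check of what these figures imply, the sketch below works through the arithmetic. It assumes the 90% saving is measured against the five-year projection without a unified layer, which is one reading of the statement; the function name and the normalized baseline of 1.0 are illustrative.

```python
def project_server_demand(current: float, growth_multiple: float, saving_ratio: float) -> tuple[float, float]:
    """Return projected demand (without, with) a unified public data layer."""
    without_layer = current * growth_multiple
    # Assumption: the saving applies to the projection without the layer.
    with_layer = without_layer * (1 - saving_ratio)
    return without_layer, with_layer


# Normalize today's server count to 1.0 so results read as multiples of today.
without_layer, with_layer = project_server_demand(current=1.0, growth_multiple=100, saving_ratio=0.90)
print(f"without unified layer: {without_layer:.0f}x today")
print(f"with unified layer:    {with_layer:.0f}x today")
```

Under that reading, demand would still grow to roughly 10 times today's level, rather than 100 times.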


OneData was not built in one go; its capabilities evolved through three stages:

Stage one: the fully application-driven era. In this period, data was mainly synchronized into Oracle in the same structure as the source. The data architecture had only two layers, ODS + DSS; strictly speaking it was essentially just an ODS layer, with basically no modeling methodology.

Stage two: As Alibaba's business grew rapidly, data volume grew rapidly too, and performance became a major problem. The aim was to change the chimney-style (siloed) development model, using modeling techniques to remove redundancy and improve data consistency, so Alibaba introduced Greenplum.

Stage three: Distributed compute and storage platforms, represented by Hadoop, were introduced, and the third-generation model architecture (OneData) was established, with the multidimensional CDM model as its core layer. The model takes Kimball's dimensional modeling as its core methodology, upgraded and extended in places, to build the Alibaba Group's data architecture system.
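For readers unfamiliar with dimensional modeling, the following is a minimal sketch of a Kimball-style star schema, using Python's built-in `sqlite3` module. The table names (`fact_order`, `dim_date`, `dim_shop`) and the sample rows are hypothetical and only illustrate the general technique, not Alibaba's CDM layer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension tables describe the "who/what/when" of the business.
CREATE TABLE dim_date (date_key INTEGER PRIMARY KEY, day TEXT, month TEXT);
CREATE TABLE dim_shop (shop_key INTEGER PRIMARY KEY, shop_name TEXT, category TEXT);
-- The fact table stores measurements and points at the dimensions.
CREATE TABLE fact_order (
    order_id INTEGER PRIMARY KEY,
    date_key INTEGER REFERENCES dim_date(date_key),
    shop_key INTEGER REFERENCES dim_shop(shop_key),
    amount   REAL
);
INSERT INTO dim_date VALUES (20190801, '2019-08-01', '2019-08'),
                            (20190802, '2019-08-02', '2019-08');
INSERT INTO dim_shop VALUES (1, 'shop_a', 'books'), (2, 'shop_b', 'food');
INSERT INTO fact_order VALUES (1, 20190801, 1, 30.0),
                              (2, 20190801, 2, 12.5),
                              (3, 20190802, 1, 20.0);
""")

# Analysis joins the fact table to its dimensions and aggregates by their attributes.
rows = conn.execute("""
    SELECT s.category, d.month, SUM(f.amount) AS total_amount
    FROM fact_order AS f
    JOIN dim_date AS d ON d.date_key = f.date_key
    JOIN dim_shop AS s ON s.shop_key = f.shop_key
    GROUP BY s.category, d.month
    ORDER BY s.category
""").fetchall()
print(rows)
```

Because every team aggregates the same fact table along the same shared dimensions, two reports that slice by category and month will agree, which is the consistency benefit the stage-two and stage-three changes were after.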

04

Dataphin, the PaaS layer of the data middle platform


(Source: Yunqi Community)

In the data middle platform model as a whole, the PaaS-layer product Dataphin acts like an engine, covering everything from data warehouse planning at the bottom to the output of themed services at the top.

With Dataphin, all kinds of data problems become easy to solve: data definitions are kept standard, designing the data model automatically drives development, and themed data services are generated on the fly.

It also provides a data asset management portal, which effectively lowers the barrier to building a data warehouse, improves production efficiency, and reduces production costs, so that data can more easily turn from a cost center into a real value center whose value can be presented in quantified form.

05

Quick BI data analysis helps enterprises move to the cloud

Once big data has been built and brought under management, we need Quick BI, the intelligent data visualization component, to show people the value behind the data.

Quick BI changes the old situation in which data analysis depended heavily on specialists. It gives front-line business staff intelligent analysis tools, making data-driven operations a reality so that data generates value.

More and more enterprises are now moving their data to the cloud, while some industries, such as government and finance, keep self-built local databases because of strict security requirements. As a result, enterprise data today is stored in a distributed way. Quick BI can connect to these various data sources, meet the differing needs of local and cloud deployments, and integrate them into unified data sets.



06

Alibaba's big data capability framework



Alibaba's data middle platform model was born to solve problems. Through practice it has formed a unified, global data system, cutting cumulative compute and storage costs by billions of dollars, multiplying the efficiency of responding to the business, and providing a solid guarantee for rapid business innovation.
Global data acquisition and ingestion: guided by demand and by the principle of global, diverse data, acquire and ingest data across all business lines, multiple terminals, and multiple forms.
Standardized data architecture and development: a layered data architecture with a unified base layer, a common middle layer, and a diverse application layer, in which metric definitions are unified through structured, standardized metrics.
Connecting data and extracting deep value: build a system of connections and labels centered on core business objects, and extract deep value from the data.
Unified data asset management: build a metadata center and gain an all-round view of data assets from four angles (asset analysis, application, optimization, and operation), lowering data management costs and tracking data value.
Unified themed services: by building a service metadata center and a data service query engine, give the business a unified data outlet and unified data query logic that shield it from multiple data sources and multiple physical tables.
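The last capability, a unified service that hides data sources and physical tables, can be sketched as a thin routing layer driven by service metadata. Everything here is a hypothetical illustration: the two in-memory "backends", the `SERVICE_METADATA` mapping, and the `get_shop_metrics` function are invented names, not the OneService API.

```python
from typing import Callable

Row = dict[str, object]

# Two hypothetical physical backends holding different slices of the data.
OFFLINE_TABLE: list[Row] = [{"shop_id": 1, "uv_1d": 120}, {"shop_id": 2, "uv_1d": 80}]
REALTIME_STORE: dict[int, Row] = {1: {"shop_id": 1, "uv_now": 7}, 2: {"shop_id": 2, "uv_now": 3}}


def query_offline(shop_id: int) -> Row:
    """Look up a row in the batch-computed table."""
    return next(row for row in OFFLINE_TABLE if row["shop_id"] == shop_id)


def query_realtime(shop_id: int) -> Row:
    """Look up a row in the real-time key-value store."""
    return REALTIME_STORE[shop_id]


# Service metadata: logical field -> (backend query function, physical column).
SERVICE_METADATA: dict[str, tuple[Callable[[int], Row], str]] = {
    "daily_visitors": (query_offline, "uv_1d"),
    "current_visitors": (query_realtime, "uv_now"),
}


def get_shop_metrics(shop_id: int, fields: list[str]) -> Row:
    """Single entry point: callers name logical fields, never tables or stores."""
    result: Row = {"shop_id": shop_id}
    for field in fields:
        backend, column = SERVICE_METADATA[field]
        result[field] = backend(shop_id)[column]
    return result


metrics = get_shop_metrics(1, ["daily_visitors", "current_visitors"])
print(metrics)
```

Because callers only see logical field names, a physical table can be renamed or moved to another engine by editing the metadata mapping, without touching any consumer.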


The data middle platform has greatly enriched and improved Alibaba's big data center. OneData, OneID, and OneService have gradually matured into a methodology system shared by everyone from the CEO down to front-line staff.


07

The four stages in the evolution of Alibaba's data middle platform


Alibaba's data processing has gone through four stages:

1. Database stage: mainly OLTP (online transaction processing) needs;

2. Data warehouse stage: OLAP (online analytical processing) becomes the main need;

3. Data platform stage: mainly solves the technical problems behind BI and reporting needs;

4. Data middle platform stage: systems connect OLTP (transaction processing) and OLAP (reporting and analysis) needs, with an emphasis on data service capability.
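The difference between the OLTP and OLAP workloads named in these stages can be shown with a small, self-contained sketch using Python's built-in `sqlite3` module. The `orders` table and its rows are hypothetical; a real deployment would run the two workloads on separate systems rather than one in-memory database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (order_id INTEGER PRIMARY KEY, buyer TEXT, amount REAL, status TEXT)"
)

# OLTP-style work: short transactions that insert or update individual rows.
with conn:
    conn.execute("INSERT INTO orders VALUES (1, 'alice', 30.0, 'created')")
    conn.execute("INSERT INTO orders VALUES (2, 'bob', 12.5, 'created')")
    conn.execute("INSERT INTO orders VALUES (3, 'alice', 20.0, 'created')")
with conn:
    conn.execute("UPDATE orders SET status = 'paid' WHERE order_id = ?", (1,))

# A typical OLTP read: fetch one row by its key.
order_1 = conn.execute("SELECT status FROM orders WHERE order_id = ?", (1,)).fetchone()

# OLAP-style work: scan many rows and aggregate them for reporting.
report = conn.execute(
    "SELECT buyer, COUNT(*), SUM(amount) FROM orders GROUP BY buyer ORDER BY buyer"
).fetchall()

print(order_1)
print(report)
```

The first half touches one row at a time and must be fast and transactional; the second half reads the whole table to answer an analytical question. The data middle platform stage is about serving both kinds of need through data services.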


08

Steps for building a data middle platform

First, upgrade the organizational structure. In the past, the department or team responsible for data often lacked a voice and tended to accept business requirements passively. That brings every idea behind the data middle platform to nothing, so the data middle platform team needs to be given authority.

Second, change the way of working. Today the main work of many enterprise data teams is project management, requirements management, and the like: when one project finishes they move on to the next, and when one requirement is done they start on the next. This kind of work does train organization and coordination skills, but those skills do not grow linearly with years of service. Experience in requirements and project management increases, yet knowledge and experience in a particular professional field do not accumulate, and over time more and more people lose their initial enthusiasm and creativity. In fact, only when staff study data services, data, and models in depth and practice end to end to build the data middle platform do they create the greatest value and make continuous innovation possible.

Third, change roles. The data middle platform team gradually shifts from its traditional supporting role to an operating role. It must work hard to catch up with business staff in understanding the business as well as the data, and gradually gain a voice with the business: not just accepting requirements, but also making reasonable proposals and bringing the business new growth points, such as data-driven marketing.

Fourth, fit the enterprise's own characteristics. A good middle platform requires a deep understanding of the business, products, systems, and organization: not only where they stand today, but also how they evolved and how they will evolve in the future. Only with that full understanding can a good middle platform architecture be designed.


09

Alibaba's middle platform construction methodology

The foundational agreements of middle platform construction

That is, based on our understanding of the business, we sort out some basic agreements. For example: What is the business? What is a business identity? What are the boundaries of each business domain? What basic services does each domain provide? Guided by these ideas, we go on to establish the business platform's implementation standards and its business management and control standards.

The middle platform's infrastructure: the centralized control unit

This is the operations platform. It mainly consists of the breakdown of the agreement standards, the capability map, the business requirement structure, global business identities, the business panorama, business metrics, and so on. It gives us one place to see the big picture and to control the details.


10

Alibaba's middle platform: organization is the guarantee for the data middle platform

Adam Smith published "The Wealth of Nations" at about the same time that Watt improved the steam engine. The theory of the social division of labor and the Industrial Revolution grew up together and left an indelible mark on the history of human civilization.

The pyramid-shaped bureaucracy became the core organizational logic underlying industrial civilization. In an industrial age that stressed large-scale, efficient production, and that even prized army-like organizations moving as one, the hierarchy (bureaucracy) was the organizational structure that guaranteed top-down commands were executed efficiently and forcefully.

With the arrival of the Internet age, consumer demand has been greatly unleashed. The mass production of the industrial age is being challenged and is shifting toward a "mass customization" mode of production. The traditional bureaucracy was built on mass production, so it too faces a change of mode: a transition to flat, self-organizing structures.

The real difficulty in building a middle platform lies in restructuring the organization, which is often avoided, intentionally or not.

A successful middle platform strategy must match the technical architecture to the organizational structure; this is a hurdle that cannot be bypassed and has to be crossed. From Alibaba's establishment of its shared services division, to Haier's "rendanheyi" model and parallel functions, to Tencent's recent restructuring of its organization, we have noticed the efforts these companies have made in this regard.


11
Summary of key points

1. Panorama of Alibaba's data middle platform. The architecture has a "four horizontal, three vertical" structure; the underlying infrastructure comes from the Alibaba Cloud platform.
2. Alibaba's "dual middle platforms" jointly support the "large middle platform + small front end" architecture. The business middle platform and the data middle platform complement and support each other, together forming the battlefield's powerful rear artillery group and radar array.
3. The OneData system of Alibaba's data middle platform. The Group built the OneData public data layer to keep data standardized and its definitions unified across design, development, deployment, and use, achieving full-link management of data assets and providing standard data output.
4. Dataphin, the PaaS layer of the data middle platform. The PaaS-layer product Dataphin acts like an engine, covering everything from data warehouse planning at the bottom to the output of themed services at the top.
5. Quick BI data analysis helps enterprises move to the cloud. It gives front-line business staff intelligent analysis tools, making data-driven operations a reality so that data generates value.
6. Alibaba's big data capability framework. The data middle platform has greatly enriched and improved Alibaba's big data center; OneData, OneID, and OneService have gradually matured into a methodology system shared by everyone from the CEO down to front-line staff.
7. The four stages in the evolution of Alibaba's data middle platform. Database, data warehouse, data platform, data middle platform.
8. Steps for building a data middle platform. Upgrade the organizational structure, change the way of working, change roles, and fit the enterprise's own characteristics.
9. Alibaba's middle platform construction methodology. The foundational agreements of middle platform construction and the centralized control unit.
10. Alibaba's middle platform: organization is the guarantee for the data middle platform. Alibaba's three-pillar HR model and the organizational upgrade of its shared services division.


Origin blog.csdn.net/yellowzf3/article/details/100082435