Digital transformation: what data people should pay attention to

Follow the WeChat public account "Xiaoyang's data station" for more valuable content.

|0x00 The ongoing digital transformation

This is a digital age. Whatever your position and whatever your industry, we are all involved. The slogan of "digital transformation" has been chanted for a long time, but at the end of 2020 two things happened that suddenly accelerated it.

One is "passive digital transformation" and the other is "active digital transformation".

"Passive Digital Transformation"

A mature two-sided market is a prerequisite for business prosperity. A place that efficiently brings consumers and producers together can multiply its commercial value through economies of scale. Over the past two decades, the spread of the Internet in China has produced a keyword: "digital platform". Platforms accelerate the standardization of an industry by continuously moving offline products online, and the more standardized the transaction methods become, the more the industry can prosper. This is true of e-commerce, and equally of advertising.

So what kind of industry can give birth to a "digital platform"? Clearly, the more an industry can standardize its goods or services, the easier it is to carry out "digital transformation", which in turn gives rise to platforms of enormous scale. Typical examples are travel and food delivery.

Because they are hard to standardize, some industries had not previously attracted the giants' attention. With overall growth in the Internet industry now weak, the giants urgently need new room to grow, so "community group buying" became a target. "Community group buying" standardizes the product supply chain, logistics, warehousing, and order distribution on top of the giants' mature e-commerce platforms. With the "group leader" acting as a human intermediary, grocery shopping, which young people used to find an extremely poor experience, was quickly standardized. This removes the tedious steps of bargaining, walking to the market, and picking vegetables, efficiently brings consumers and direct producers together, and improves the efficiency of fresh-produce distribution. As a result, a traditional industry has once again been hit by a "dimensionality-reduction strike".

Despite official criticism, who would easily give up such an important traffic entry point?

"Active Digital Transformation"

Coca-Cola recently launched a mini program on WeChat. The launch made few waves, but it is significant, because this is Coca-Cola's first online platform in China. Even the traditional giants can no longer sit still. If you are interested, search for "Coca-Cola+" in WeChat; note that the "+" sign must be included.

Contrary to the impression that it just "sells Coke", Coca-Cola is not selling drinks this time. Instead it is working with other brands to sell peripheral products such as cultural and creative goods, home furnishings, luggage, and jewelry. Over the years, Coca-Cola has treated social communication as an important strategy, connecting with consumers by reaching into more scenarios.

Three changes are worth noting this time:

  1. It offers customization of IP products;
  2. It cooperates with multiple brands to create an IP image;
  3. Live streaming has become an important traffic entry point.

By bringing this year's two hottest labels, IP-ization and live streaming, online, Coca-Cola shows how sensitive it is to commercial change. In the past, Coca-Cola lived in advertisements; now, it lives on the "digital platform".

So whether the transformation is "active" or "passive", everyone is desperately "digitizing" and "standardizing".

|0x01 Data standardization is becoming more and more important

Going back to the technology itself, when we look at "digitalization" we should really focus more on "standardization". Data people often describe their role as "decision support", but what actually counts as "support"? Producing a few reports? Running a few models?

Decisions are made by management, and what is tested is a person's decision-making ability. That ability is made up of the decision-maker's vision, resources, connections, leadership, business sense, and so on. Frontline employees need a long period of seasoning before they can make decisions. Data practitioners therefore cannot skip stages: they need to progress through description, analysis, and decision support before they can grow into decision-makers.

Therefore, before "supporting decisions", the first goal of data people is to be able to describe things clearly, which means "standardizing" the data. Before chasing business value, ask yourself whether the data in this field is standardized, and if not, what should be done about it. Basic as this is, it is the core of "digital transformation".

In recent years we have developed very mature methods for processing structured data, including synchronization of business databases and event tracking (literally "data burying points"), which is used to process logs. Unlike the data in business databases, which is already standardized, tracking data has to be standardized by design, so how to design a standardized event tracking system is very important.
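As a rough illustration of what "standardized" can mean for tracking data, the sketch below defines one shared event record and rejects events that do not conform. The field names, the `TrackingEvent` class, and the list of allowed event types are all assumptions made for this example, not part of any particular tracking SDK.

```python
from dataclasses import dataclass, field, asdict
import time

# Hypothetical list of event types; a real team would agree on its own.
ALLOWED_EVENT_TYPES = {"page_view", "click", "exposure"}


@dataclass(frozen=True)
class TrackingEvent:
    """One tracking event in a shared, agreed-upon shape (illustrative only)."""
    event_type: str
    page: str
    element: str
    user_id: str
    source: str  # "frontend" or "backend"
    timestamp_ms: int = field(default_factory=lambda: int(time.time() * 1000))

    def __post_init__(self):
        # Reject anything outside the agreed vocabulary so that downstream
        # statistics never have to guess what an event means.
        if self.event_type not in ALLOWED_EVENT_TYPES:
            raise ValueError(f"unknown event_type: {self.event_type!r}")
        if self.source not in ("frontend", "backend"):
            raise ValueError(f"unknown source: {self.source!r}")
        if not self.page:
            raise ValueError("page must not be empty")


def to_record(event):
    """Convert an event into a plain dict, ready to serialize or store."""
    return asdict(event)


event = TrackingEvent("click", "home", "buy_button", "u42", "frontend")
print(to_record(event))
```

Validating at the point of creation is one possible design choice; it moves the cost of non-standard data to the producer instead of the analyst.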

By classification, event tracking is mainly divided into front-end tracking and back-end tracking.

Front-end tracking embeds data collection code in the client, such as an app or a web page. Well-known examples include the Umeng (Youmeng) statistics SDK and Taobao's SPM code. Front-end tracking can collect page-visit information and makes it easy to capture user behavior on the interface, such as which button was clicked or how long the user stayed on a page. Its advantage is that the collected data is more comprehensive and richer. Its disadvantages are that the larger data volume increases traffic consumption on the device and storage load on the server, and that it cannot respond quickly to changing requirements.

Back-end tracking records a log on the server. When a user accesses a module of an online product, the server records information about the visit. Back-end tracking is designed to address the shortcomings of front-end tracking and is mainly used to improve data timeliness and the speed of responding to requirement changes.
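A minimal sketch of back-end tracking, assuming the server simply writes one JSON line per visit through Python's standard `logging` module. The function names, the logger name, and the log fields are hypothetical; a production system would write to a file or a log collector rather than an in-memory buffer.

```python
import io
import json
import logging
import time


def make_tracking_logger(stream):
    """Build a logger that writes one bare message per line to `stream`."""
    logger = logging.getLogger("tracking_sketch")
    logger.setLevel(logging.INFO)
    logger.handlers.clear()   # avoid duplicate handlers if called twice
    logger.propagate = False  # keep tracking lines out of the root logger
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("%(message)s"))
    logger.addHandler(handler)
    return logger


def record_visit(logger, user_id, module, now_ms=None):
    """Record that `user_id` accessed `module`, as a single JSON log line."""
    entry = {
        "event_type": "module_visit",
        "user_id": user_id,
        "module": module,
        "timestamp_ms": now_ms if now_ms is not None else int(time.time() * 1000),
    }
    logger.info(json.dumps(entry, sort_keys=True))
    return entry


# Usage: an in-memory buffer stands in for the server's log file.
buffer = io.StringIO()
log = make_tracking_logger(buffer)
record_visit(log, "u42", "order_detail")
print(buffer.getvalue().strip())
```

Because the record is produced on the server, changing what is logged only needs a server deployment, not a new client release, which is the responsiveness benefit described above.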

Event tracking is not one person's development work. It requires collaboration across the upstream and downstream of the business: from reviewing the tracking requirements, to drawing up the tracking plan, to developing the tracking in the application, and finally to computing statistics on the tracked data. PD, data development, BI, front-end, back-end, and test colleagues all take part. Tracking therefore needs to be considered in the early stages of building a product. If it is developed only after the product is online, data from the early versions will not be collected, and the work will be more intrusive to the business.

People who master data standardization will find their place in more and more "passive digital transformations" in the future.

For the specific design of event tracking, see "Seven Days of Data Burial Site" (a series on data event tracking) by Ju Shi Mudong.

|0x02 Unstructured data is seeking a breakthrough

Beyond the structured data we can see, about 80% of data is unstructured. Its characteristics are large volume, diverse formats, complex processing, and a high degree of non-standardization. It includes office documents, pictures, audio, video, machine logs, and other information.

From a data point of view, unstructured data has three very significant characteristics:

The first is the lack of unified management. Structured data is very friendly to data development, data analysis, and similar roles, but when we need to dig deeper, for example to profile a consumer's behavior, we often need a lot of unstructured data as support. Yet unstructured data has no unified management view, so it ends up scattered across many places. In that state it cannot become an asset that generates value.

The second is high development cost. Unstructured data usually requires algorithm engineers to get involved, and custom development is needed for the particular characteristics of each kind of data. No systematic set of technical capabilities has formed, so it is hard to get started, and few people can take part in this kind of data development.

The third is that the value of unstructured data has not been fully exploited. Unstructured data mostly offers a new perspective that supplements structured data and provides incremental services to the existing business. Until we recognize what unstructured data can do, its value is hard to discover.

Standardizing data through algorithms will affect existing modeling theory and development practices. For data developers, learning some algorithms may become a career requirement.
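To make "standardizing unstructured data" concrete, here is a deliberately simple sketch that turns free-form machine log lines into structured records. It uses a fixed regular expression as a stand-in for the algorithms mentioned above, and the log format itself is an assumption made for this example.

```python
import re

# Assumed log format: "YYYY-MM-DD HH:MM:SS [LEVEL] service-name: message"
LOG_PATTERN = re.compile(
    r"^(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) "
    r"\[(?P<level>[A-Z]+)\] "
    r"(?P<service>[\w-]+): "
    r"(?P<message>.*)$"
)


def parse_line(line):
    """Return a dict of named fields, or None if the line does not fit."""
    match = LOG_PATTERN.match(line.strip())
    return match.groupdict() if match else None


def standardize(lines):
    """Split raw lines into structured records and rejected leftovers."""
    records, rejected = [], []
    for line in lines:
        parsed = parse_line(line)
        if parsed is None:
            rejected.append(line)
        else:
            records.append(parsed)
    return records, rejected


raw = [
    "2020-12-31 23:59:58 [ERROR] order-service: payment timeout",
    "something a human typed into the console",
]
records, rejected = standardize(raw)
print(records)
print(rejected)
```

Keeping the rejected lines instead of dropping them is intentional: they show how much of the data the current rules still cannot standardize.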

In October 2016, Gartner released the "Magic Quadrant for Distributed File Systems & Object Storage" report. In it, Gartner put forward a view: file storage and object storage are converging, and from the customer's perspective they form a single storage market for unstructured data. (The markets for distributed file systems and object storage are merging. That is the reason Gartner is publishing a single Magic Quadrant on the combined segments; it will eventually be one market. The distinctions between the two segments are slowly blurring, but the buyers are already treating it as one market.)

Massive unstructured data means massive storage, complex management and compliance requirements, and a need for stronger big data analysis capabilities. At present AWS, Azure, and Alibaba Cloud mainly provide tools and algorithms for processing unstructured data, but not solutions for the data itself.

Therefore, as competition in market segments intensifies, the standardization of unstructured data will become a popular direction.

|0xFF Standardization of business capabilities

Beyond trends in the data itself, understanding domain models is also a requirement for future data professionals.

Recall the first time you used UML to design a system: you learned the tool with confidence, but when it was time to show your skills, you agonized over what you actually wanted to build and did not know how to turn the ideas in your head into a design.

The underlying reason is an inadequate understanding of your own field and of its abstract concepts. Many people try to understand the domain model through the data models we use for dimensional modeling. In doing so they carry technical concepts into the business, which skews their understanding.

The domain model focuses not on technical qualities such as scalability and functionality, but on expressing business semantics clearly by making the model explicit. In other words, being understandable at a glance is the first goal; how to implement it is a secondary consideration.

Technical people often struggle when they have to give presentations. This suggests that they cannot build a clear domain model of what they do, and are not clear about where they stand or what value they bring to the business.

According to Robert C. Martin's view in "Clean Architecture", the domain model is the core and the data model is a technical detail. The two are easily confused because both emphasize entities and relationships, and this leads to confusion in design.

Admittedly, a good data model should be easy to extend. After all, changing the database or modifying a business process in a large system involves a great deal of work. But whatever the case, the domain model is domain-oriented: it should be as specific and as clear as possible. Expressing business semantics explicitly is its primary task, and scalability comes second. The data model is oriented toward data storage and should be as extensible as possible.
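The contrast between the two models can be sketched in a few lines. The example below is illustrative only: `GroupBuyOrder`, `OrderStatus`, and the generic column names in `to_row` are invented for this sketch, borrowing the community group buying scenario from earlier in the article.

```python
from dataclasses import dataclass
from enum import Enum


class OrderStatus(Enum):
    PLACED = "placed"
    PICKED_UP = "picked_up"


@dataclass
class GroupBuyOrder:
    """Domain model: names and rules read the way the business talks."""
    order_id: str
    group_leader: str
    status: OrderStatus = OrderStatus.PLACED

    def confirm_pickup(self):
        # The business rule is stated explicitly in the model itself.
        if self.status is not OrderStatus.PLACED:
            raise ValueError("only a placed order can be picked up")
        self.status = OrderStatus.PICKED_UP


def to_row(order):
    """Data model: a flat, generic row that is easy to store and extend,
    but says little about the business on its own."""
    return {
        "id": order.order_id,
        "attr_1": order.group_leader,
        "state": order.status.value,
    }


order = GroupBuyOrder("o-1001", "leader_a")
order.confirm_pickup()
print(to_row(order))
```

Reading `confirm_pickup` tells you something about the business; reading `attr_1` does not. That difference is the point of keeping the two models separate.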

In the past, the dimensional model was enough to run Internet businesses. But traditional industries, manufacturing in particular, have more diverse and more individual business complexity, and simply getting the business clear is not easy. This is why the "domain model" has been brought up again in recent years while "dimensional modeling" has lost popularity. The times have changed and "digital transformation" has arrived, so we are required to transform more traditional industries rather than stay on our own small plot and keep digging deeper into the original business.

Most people will not stay in one position forever. Sooner or later they switch to other industries to look for opportunities. Once digital tools are mature and cloud facilities have drastically lowered the barriers to development, the ability to understand and abstract business knowledge becomes the most important criterion for distinguishing the capabilities of data people.

Going back to the beginning of this article, the giants are looking for one industry after another that can be standardized, in order to push forward the construction of "digital platforms". Even Pinduoduo has been forced to move into the grocery shopping business. Think about it carefully: if we do not master a methodology for abstracting the business and cannot cope with being "passively transformed", will we, like programmers of the earlier software era, bear the consequences of being "optimized"?

In any case, learning to standardize structured data, exploring how to develop non-standardized data, and mastering methods for abstracting business capabilities are all things that data people should pay attention to in the era of "digital transformation".


Origin blog.csdn.net/gaixiaoyang123/article/details/112250799