Knowledge Graph Helps New Infrastructure Construction

Introduction: Good morning, everyone. This talk is about how knowledge graphs help new infrastructure construction, that is, building a new generation of data intelligence infrastructure on top of knowledge graphs. It covers the following topics:

  • Introduction to New Infrastructure
  • Knowledge Graph Helps New Infrastructure Construction
  • Intelligent Data Governance Based on Knowledge Graph
  • Cognitive intelligence middle platform based on knowledge graph
  • Industrialization practice

▌Introduction to New Infrastructure

  1. New Infrastructure: Proposal and Development

  2. New Infrastructure: Construction Content

What does the new infrastructure mentioned above mainly consist of?

It covers seven areas: 5G infrastructure, big data centers, artificial intelligence, the industrial Internet, high-speed railways and intercity rail transit, ultra-high voltage (UHV) power transmission, and charging piles for new energy vehicles. Today we focus on two of them: big data centers and artificial intelligence.

  3. New Infrastructure: Big Data Center

In 2019, there were about 74,000 data centers in China, accounting for about 23% of the global total. However, the number of large and super-large data centers is small, and their regional distribution is very uneven. To avoid blind construction and duplicated investment, all parties tend to build at scale, so large and super-large data centers will become the general trend. It is predicted that by 2030 the global data-native industry will account for 15% of total economic output, while China's total data volume will exceed 4 YB, accounting for 30% of the global data volume. Driven by the new infrastructure policy, China's big data industry will usher in better development opportunities, and big data centers will become a new dimension of national competitiveness in the new round of global competition.

  4. New Infrastructure: Artificial Intelligence

Artificial intelligence is the core driving force of a new round of industrial transformation and will promote the transformation and upgrading of trillion-scale digital economy industries. The State Council's "New Generation Artificial Intelligence Development Plan" states that by 2025 the scale of China's core artificial intelligence industry will exceed 400 billion yuan, and the scale of related industries will exceed 5 trillion yuan. Artificial intelligence is the commanding height of a new round of technological competition and is crucial to economic growth and national security.

Artificial intelligence can be analyzed in layers. The basic layer includes hardware, algorithms, data and knowledge. The technical layer includes vision, speech, natural language processing, big data governance, and so on. Above that is the platform and system layer, including basic AI frameworks, open technology platforms, AI middle platforms, etc. The top layer is the application layer, including applications in finance, medical care, education and other industries. As shown below:

With the advancement of new infrastructure, the country also treats data, technology, and knowledge as increasingly important strategic resources, and has elevated them to factors of production that share in distribution according to their contribution, just like labor, capital, and land. In March this year, the Central Committee of the Communist Party of China and the State Council put forward the "Opinions on Building a More Perfect Factor Market Allocation System and Mechanism", which call for the value of technology, knowledge, management, data and other factors to be fully reflected, including for scientific research, technical, and management talent. This provides better policy support for the development of the big data and artificial intelligence industries.

▌Knowledge Graph Helps New Infrastructure Construction

Next, let's take a look at how knowledge graphs can help new infrastructure.

  1. DIKW model

First, let's look at the DIKW model, which should be familiar to everyone. It is a pyramid built from the bottom up: data, then information, then knowledge, and finally wisdom, as shown in the following figure:

The DIKW model can be understood from several perspectives. First, the model itself shows how data is put to use: data -> knowledge -> wisdom. Second, from a technical perspective, each level can be associated with a keyword: data corresponds to big data, knowledge corresponds to knowledge graphs, and wisdom corresponds to artificial intelligence. Finally, from the perspective of new infrastructure, the levels correspond to the construction of big data centers, the construction of large-scale intelligent data centers (that is, large-scale knowledge graphs), and the construction of artificial intelligence infrastructure. Each perspective describes the same refinement from data to knowledge to intelligent applications, as shown in the following figure:

  2. AI is evolving towards "cognitive intelligence"

Like the DIKW model introduced earlier, artificial intelligence can also be described as a pyramid, rising from computing intelligence at the bottom -> perceptual intelligence -> cognitive intelligence -> general intelligence. The artificial intelligence we see today is still at the perceptual level. Put simply, the ability to listen, speak, see and recognize has basically been achieved, but AI still lacks the human abilities of understanding, thinking and explaining. This is why many experts point out that current artificial intelligence is still flawed and must develop further into an artificial intelligence that can think and understand. The next stage that artificial intelligence is entering, and must enter, is cognitive intelligence. For example, Academician Zhang Bo of Tsinghua University once said: "Our current basic methods of artificial intelligence are flawed. We must move towards AI with understanding; this is the real artificial intelligence. Human intelligence cannot be learned through simple big data learning. So what should we do? It is very simple: add knowledge, and give it the ability to reason and make decisions."

The State Council's "New Generation Artificial Intelligence Development Plan" clearly sets out the key task of establishing a new generation of key common artificial intelligence technology systems, with special emphasis on research into cross-media unified representation, associative understanding and knowledge mining, knowledge graph construction and learning, and knowledge evolution and reasoning. Keywords related to "cognition" and "knowledge graph" were mentioned repeatedly in this year's major project application guidelines, and the construction of industry knowledge graphs in fields such as information platforms, finance, customer service, education, industry, and medical care was encouraged.

  3. The Foundations of Cognitive Intelligence: Combining Symbolism and Connectionism

Given how much importance is attached to cognitive intelligence at the national level, how should we build its foundation? There are two core technologies, symbolism and connectionism, which are also the two main schools of thought in the development of artificial intelligence. Put simply, the main form of symbolism at this stage is the knowledge graph, and the main form of connectionism at this stage is the deep neural network. On the one hand, each of them is needed on its own to achieve cognitive intelligence. On the other hand, they also need to be combined, which is the combination of symbolism and connectionism we often hear about now, such as graph embedding, graph neural networks, and representation learning based on knowledge graphs.

  4. Knowledge Graph: The Cornerstone of Realizing Cognitive Intelligence

Having briefly summarized the foundations of cognitive intelligence, this section focuses on its cornerstone: the knowledge graph. If knowledge is the ladder of human progress, the knowledge graph is the ladder of AI progress. This is the core significance of knowledge graphs for AI. With a knowledge graph, machines can better understand data and better explain phenomena. Since the knowledge graph was proposed in 2012, it has been widely used in search and in industry applications.

  5. Knowledge Graph Helps Artificial Intelligence Applications

Which artificial intelligence applications do knowledge graphs power?

Besides the search mentioned earlier, they include chatbots, decision support, personal assistants, smart hardware, smart homes, and more.

  6. Knowledge Graph Helps New Infrastructure Construction

How does the knowledge graph help new infrastructure?

Put simply, in two ways. First, it helps build the infrastructure of a new generation of intelligent data centers. Second, it supports cognitive intelligence in building artificial intelligence infrastructure, on which upper-layer AI applications are built.

The above is a general introduction to how knowledge graphs help new infrastructure. Next, we look at these two aspects in more detail: how to build the infrastructure of a new generation of intelligent data centers, and how to support cognitive intelligence in building artificial intelligence infrastructure and upper-layer AI applications.

▌Intelligent Data Center Based on Knowledge Graph

  1. Big Data Center Construction - Data Governance

First, let me introduce the most important part of big data center construction: big data governance. Big data governance has developed for more than ten years since big data first emerged, and it encompasses many technologies and much systematic engineering guidance. Specifically, it includes the following categories: metadata management, master data management, data quality, business processes, data architecture, data standards, data life cycle, data security, etc. Many standards have also emerged, such as the data governance framework of the national standard GB/T 34960, which covers top-level design, the data governance environment, data governance domains, and the data governance process. These are the technologies and the framework of data governance. For details, please refer to the following figure:

  2. Pain points in data governance that still need improvement

Although the previous section shows that data governance already has a complete set of technologies and frameworks, it still faces pain points that need improvement. The details are as follows:

  1) Low utilization of unstructured data

Unstructured data is rarely considered in data governance, yet unstructured and semi-structured data account for a growing proportion of all data.

  2) Different types of multimodal data are difficult to fuse

Multimodal data has not yet been deeply fused.

  3) Correlation information between data is not effectively utilized
  4) Lack of a business-oriented flexible model
  5) Insufficient support for intelligent applications

  3. Data Governance Based on Knowledge Graph

The previous section summarized the pain points in current big data governance and the areas that need improvement. In response, we propose a data governance solution based on knowledge graphs. In general terms, knowledge graph construction refines the data in depth and extracts knowledge from it, and intelligent applications are then built on top to further enhance the value of the data. So how is this achieved?

First, on the basis of the classic big data governance framework, define a unified knowledge representation model, including concepts, entities, attributes, relationships, events, business rules, and links to multimodal data, to represent data uniformly.

Second, with this unified representation and storage model, perform further knowledge extraction on structured and semi-structured data so that the computer can understand it better, including entity recognition, attribute extraction, event extraction, etc.

Third, with entities as the basic unit of knowledge organization, extract the relationships between entities and establish associations between data and knowledge.

Fourth, use related technologies such as ontology mapping, entity matching, and knowledge linking to integrate knowledge at a higher level and form a unified knowledge graph.

Fifth, once the unified knowledge graph has been stored, build intelligent applications on it, including semantic retrieval, intelligent question answering, graph association analysis, decision analysis, etc.

Knowledge Graph Data Governance

Above we covered the overall framework of knowledge graph governance. Next, we expand on it in five parts: unified representation and modeling, knowledge extraction, multi-strategy information extraction, deep semantic fusion, and polymorphic storage.

1) Unified representation and modeling

The unified representation model covers concepts, entities, attributes, relationships, events, and business rules, together with multimodal data that is associated with elements of the knowledge graph through links. Here is an example. Trump is an entity that belongs to the concept of person. He is the current president of the United States, and his nationality is the United States, which is a relationship. An event, such as the first acknowledgment of the deterioration of the epidemic on July 25, 2020, includes the time, place, people involved, and so on. A business rule might be: if the United States closes a Chinese consulate in the United States, then China will take reciprocal countermeasures.
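The elements of the representation model can be sketched as plain Python data classes. This is only an illustration of the idea, not the platform's actual schema: the class names, field names, and the `neighbors` helper are all assumptions made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    name: str
    concept: str  # the concept (class) the entity belongs to
    attributes: dict = field(default_factory=dict)


@dataclass
class Relation:
    subject: str
    predicate: str
    obj: str


@dataclass
class Event:
    description: str
    time: str
    participants: list = field(default_factory=list)


@dataclass
class BusinessRule:
    condition: str
    consequence: str


@dataclass
class KnowledgeGraph:
    """Hypothetical container holding every element type in one place."""
    entities: dict = field(default_factory=dict)
    relations: list = field(default_factory=list)
    events: list = field(default_factory=list)
    rules: list = field(default_factory=list)

    def add_entity(self, entity):
        self.entities[entity.name] = entity

    def neighbors(self, name):
        """Return (predicate, object) pairs for relations starting at `name`."""
        return [(r.predicate, r.obj) for r in self.relations if r.subject == name]


# The example from the text, expressed in this sketch.
kg = KnowledgeGraph()
kg.add_entity(Entity("Trump", concept="Person",
                     attributes={"position": "President of the United States"}))
kg.add_entity(Entity("United States", concept="Country"))
kg.relations.append(Relation("Trump", "nationality", "United States"))
kg.events.append(Event("First acknowledgment that the epidemic is worsening",
                       time="2020-07-25", participants=["Trump"]))
kg.rules.append(BusinessRule("US closes a Chinese consulate in the US",
                             "China takes reciprocal countermeasures"))
```

Keeping events and business rules as first-class elements next to entities and relations is what separates this model from a plain triple store.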

2) Knowledge extraction

Knowledge extraction from structured and semi-structured data includes the conversion and graph mapping of structured data, as well as information extraction from plain text data.
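Graph mapping of structured data can be pictured as turning table rows into triples. The sketch below assumes a hypothetical company table and a hand-written column-to-predicate mapping; real mappings are usually configured per data source.

```python
def rows_to_triples(rows, key_column, column_to_predicate):
    """Map each row to (subject, predicate, object) triples.

    `key_column` names the column that identifies the entity, and
    `column_to_predicate` maps the other columns to graph predicates.
    Empty values are skipped rather than stored as empty objects.
    """
    triples = []
    for row in rows:
        subject = row[key_column]
        for column, predicate in column_to_predicate.items():
            value = row.get(column)
            if value not in (None, ""):
                triples.append((subject, predicate, value))
    return triples


# Hypothetical records, for illustration only.
companies = [
    {"name": "Acme Ltd", "legal_rep": "Li Lei", "city": "Shanghai"},
    {"name": "Beta Inc", "legal_rep": "", "city": "Beijing"},
]
mapping = {"legal_rep": "legal_representative", "city": "located_in"}
triples = rows_to_triples(companies, "name", mapping)
```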

3) Multi-strategy information extraction

A multi-strategy extraction model is used to extract information from semi-structured and unstructured data. Processing unstructured data can be divided into named entity recognition, relation extraction, attribute extraction, event extraction, anaphora resolution, and rule mining. So what strategy does the multi-strategy extraction method adopt? First, a training corpus is generated automatically from structured information or existing knowledge bases through distant supervision. Then a model for unstructured data is trained on that corpus, which finally enables large-scale data extraction. A later speaker will cover this topic, so I will not go into detail here.
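The corpus-generation step can be illustrated with a toy version of distant supervision: any sentence that mentions both ends of a known triple is labeled with that triple's relation. The function name, the sample triples, and the sentences are invented for this sketch, and the simple substring match stands in for real entity recognition.

```python
def distant_label(sentences, known_triples):
    """Create weakly labeled training examples from known triples.

    A sentence containing both the subject and the object of a triple
    is assumed to express that triple's predicate. The assumption is
    noisy, which is why the resulting labels are only weak labels.
    """
    labeled = []
    for sentence in sentences:
        for subject, predicate, obj in known_triples:
            if subject in sentence and obj in sentence:
                labeled.append({
                    "text": sentence,
                    "subject": subject,
                    "object": obj,
                    "label": predicate,
                })
    return labeled


known_triples = [("Acme Ltd", "located_in", "Shanghai")]
sentences = [
    "Acme Ltd opened its new headquarters in Shanghai last year.",
    "Beta Inc reported quarterly earnings.",
]
training_data = distant_label(sentences, known_triples)
```

The labeled examples would then be used to train an extraction model for unstructured text. That training step is outside this sketch.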

4) Deep semantic fusion

After extraction from semi-structured and unstructured data, we perform deep semantic fusion of the knowledge or data. This has four aspects:

First, ontology alignment: mapping and aligning the concepts and relationships of different source ontologies to achieve fusion at the schema level.

Second, entity alignment: for each entity in the knowledge bases of heterogeneous data sources, finding the records that refer to the same real-world entity. For example, two entities may describe the same entity, or two events may describe the same event.

Third, relationship discovery: discovering the relationships between entities and establishing associations between data and knowledge, to unlock the deeper value of the data.

Finally, entity linking: semantically associating multimodal data with knowledge in the knowledge graph to form a multimodal knowledge graph.

Through the above steps, we can achieve deep semantic fusion.
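Of the four aspects above, entity alignment is the easiest to show in a few lines. The sketch below matches entity names from two sources by string similarity using Python's standard `difflib`. The normalization rules and the 0.85 threshold are assumptions for this example; production systems typically also compare attributes and neighboring relations.

```python
from difflib import SequenceMatcher


def normalize(name):
    """Lowercase the name, drop commas and periods, collapse whitespace."""
    cleaned = name.lower().replace(",", " ").replace(".", " ")
    return " ".join(cleaned.split())


def align_entities(source_a, source_b, threshold=0.85):
    """Pair each name in source_a with its most similar name in source_b.

    A pair is kept only if the similarity ratio reaches `threshold`.
    """
    matches = []
    for a in source_a:
        best, best_score = None, 0.0
        for b in source_b:
            score = SequenceMatcher(None, normalize(a), normalize(b)).ratio()
            if score > best_score:
                best, best_score = b, score
        if best is not None and best_score >= threshold:
            matches.append((a, best, round(best_score, 2)))
    return matches


pairs = align_entities(["Acme Ltd.", "Beta Inc"], ["ACME LTD", "Gamma LLC"])
```

Here "Acme Ltd." and "ACME LTD" normalize to the same string and are aligned, while "Beta Inc" has no counterpart above the threshold and stays unmatched.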

5) Polymorphic storage

After the representation, extraction, and fusion described above, a unified knowledge graph is formed, and the acquired knowledge is stored through a polymorphic knowledge graph storage engine. Specifically, we take a graph database as the core and combine multiple storage engines to store the various types of data, including the record data, document data, multimodal media data, and index data mentioned above. Storing through a variety of storage mechanisms forms a polymorphic knowledge graph storage engine with the graph database at its core.
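One way to picture polymorphic storage is a thin router that sends each kind of data to a different backend. In the sketch below every backend is a stand-in in-memory list, and the class names and the kind-to-backend table are assumptions. A real deployment would put a graph database, a relational store, a document store, an object store, and a search index behind the same interface.

```python
class InMemoryStore:
    """Stand-in for a real storage engine, for illustration only."""

    def __init__(self, name):
        self.name = name
        self.items = []

    def put(self, item):
        self.items.append(item)


class PolymorphicStore:
    """Route each kind of data to the backend suited to it."""

    def __init__(self):
        self.backends = {
            "graph": InMemoryStore("graph database"),
            "record": InMemoryStore("relational store"),
            "document": InMemoryStore("document store"),
            "media": InMemoryStore("object store"),
            "index": InMemoryStore("search index"),
        }

    def put(self, kind, item):
        if kind not in self.backends:
            raise ValueError(f"unknown data kind: {kind}")
        self.backends[kind].put(item)
        return self.backends[kind].name


store = PolymorphicStore()
graph_target = store.put("graph", ("Acme Ltd", "located_in", "Shanghai"))
media_target = store.put("media", {"entity": "Acme Ltd", "uri": "logo.png"})
```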

  4. Intelligent data governance platform based on knowledge graph

The previous steps together form an intelligent data governance platform based on knowledge graphs. To address the data governance challenges mentioned above, knowledge is represented and extracted in a unified way and then fused, forming a unified knowledge graph with polymorphic knowledge storage. This provides unified knowledge consumption to the upper layer, which in turn enables top-level intelligent applications such as intelligent question answering, intelligent retrieval, and intelligent recommendation.

  5. Federated Intelligent Data Center Based on Blockchain and Secure Multi-Party Computation

Data is often scattered across different places, and even within one company it is distributed across departments or branches. To address this, we propose a federated intelligent data center based on blockchain and secure multi-party computation. Its core technologies, which are also among the most popular today, include blockchain, secure multi-party computation, and federated learning. They are used to achieve consensus, sharing, encryption, collaborative computing, and traceability for data held in different locations.

▌Cognitive Intelligence Middle Platform Based on Knowledge Graph

  1. Turning knowledge graph cognitive intelligence into a middle platform

As mentioned above, we use the knowledge graph to govern data in depth and thereby enable intelligent applications at the upper layer. We have put this idea into practice in many fields. Introducing the knowledge graph is not meant to replace classic big data governance. On the contrary, it enhances classic big data governance to further raise the value of data, so most of the big data governance problems mentioned earlier can still arise. We simply use knowledge graph technology to mine more value from the data. In practice we also found remaining problems. From the application side, big data governance takes a long time to build, graphs are difficult to construct, and the reuse rate is low. From the user side, however, what people want is something that works out of the box, in other words a way to build quickly to meet business requirements. Drawing on our hands-on experience in multiple industries and on the concept of the data middle platform, we proposed a cognitive intelligence middle platform based on knowledge graphs. As we all know, one of the main purposes of a middle platform is to support the rapid construction of business data.

  2. Approach to building the knowledge graph cognitive intelligence middle platform

The cognitive intelligence middle platform based on knowledge graphs is expected to make both data and application development agile. The approach has three parts. First, microservice components, which improve reusability through a high level of abstraction. Second, pre-building, with the goal of working out of the box. Third, business orchestration, which lets users build applications by themselves.

1) Microservices for middle platform components

On the one hand, turning middle platform components into microservices covers the componentization of big data governance, and also covers the capabilities of the knowledge graph itself, such as ontology modeling, knowledge acquisition, knowledge fusion, and knowledge storage, which facilitates unified data management. On the other hand, on top of data governance, we provide intelligent application components, including unified retrieval, intelligent recommendation, etc.

2) Middle platform pre-building

Pre-built models can be used directly, and they are also a deeper source of inspiration.

Direct use

      Pre-built models are used directly for NLP and NLU tasks during knowledge graph construction and application.

Deeper inspiration

      BERT uses a large amount of data (weakly annotated) and a complex model to reduce the dependence on high-quality corpora, forming a general language model.

      PlantData: Starting from the structured data in multi-source heterogeneous data, it uses the idea of distant supervision to automatically generate training data, and iteratively produces models for industry applications.

What else can pre-building cover? We can pre-build data schemas, knowledge bases, business applications, models and algorithms, and business scenarios. This is the main content of middle platform pre-building.

3) Middle platform business orchestration

With the microservice components described above and a large amount of pre-built data and knowledge, business logic can be orchestrated on top of them, allowing users to quickly assemble applications.
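Business orchestration can be thought of as composing registered components into a flow by name. The sketch below is a deliberately tiny illustration: the component registry, the three toy components, and the `orchestrate` function are all hypothetical, and a real orchestration engine would call microservices rather than local functions.

```python
# Hypothetical registry of reusable components, keyed by name.
COMPONENTS = {
    "tokenize": lambda text: text.lower().split(),
    "drop_stopwords": lambda tokens: [t for t in tokens if t not in {"the", "of", "a"}],
    "count": lambda tokens: len(tokens),
}


def orchestrate(step_names, registry=COMPONENTS):
    """Build a callable flow that runs the named components in order."""
    steps = [registry[name] for name in step_names]

    def run(value):
        for step in steps:
            value = step(value)
        return value

    return run


# A user assembles an application by listing component names.
flow = orchestrate(["tokenize", "drop_stopwords", "count"])
result = flow("The risk profile of a supplier")
```

Because a flow is just a list of names, the same components can be recombined for a new scenario without changing their code, which is the reuse the middle platform aims for.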

  3. Cognitive intelligence middle platform architecture based on knowledge graph

Starting from the data governance platform based on knowledge graphs, and applying the middle platform thinking described above, including microservice components, middle platform pre-building, and middle platform business orchestration, we arrive at the overall architecture of the knowledge-graph-based cognitive intelligence middle platform. Please refer to the following figure for details:

▌Industrialization practice

Based on the data governance platform and the cognitive intelligence middle platform, we have put into practice a new paradigm of application construction: a large middle platform with a small front end. The focus is on building the intelligent middle platform, which places data management, data collation, knowledge mining, and middle platform construction at the center of industrialization practice. Much of the work can be done before we face a user scenario, including work at the technical level, accumulation at the data level, and accumulation at the model and application level. Once the user's specific scenario arrives, a customer business application can be built quickly on this basis.

  1. Practical Steps of Cognitive Intelligence Middle Platform Industrialization

  2. Financial risk control middle platform based on graph mining and complex reasoning

In the financial field, the knowledge-graph-based paradigm for building cognitive intelligence lets us pre-build data and knowledge, carrying out data governance in advance on public or third-party data. On this basis, we can construct knowledge graphs, such as enterprise-oriented, patent-oriented, and industrial-chain-oriented knowledge graphs. With these, we can also pre-build models and applications for the financial industry, including risk profiling models, genealogy analysis models, and supply chain risk transmission models. When implementing an application, we only need to bring in the customer's data and scenarios, build the corresponding business models, fine-tune the algorithms for that data and those scenarios, and finally build the specific application quickly with the business orchestration engine.
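To give a feel for what a supply chain risk transmission model does, here is a toy propagation over a supplier-to-customer graph. It is not the model used on the platform: the decay factor, the cut-off floor, and the company names are assumptions chosen only to make the mechanics visible.

```python
from collections import deque


def propagate_risk(supply_edges, source, initial_risk=1.0, decay=0.5, floor=0.1):
    """Spread risk from `source` downstream along supplier -> customer edges.

    Each hop multiplies the risk by `decay`. Propagation stops once the
    passed-on value drops below `floor`. A node keeps the highest risk
    that reaches it, so cycles in the graph cannot loop forever.
    """
    risk = {source: initial_risk}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        passed = risk[node] * decay
        if passed < floor:
            continue
        for customer in supply_edges.get(node, []):
            if passed > risk.get(customer, 0.0):
                risk[customer] = passed
                queue.append(customer)
    return risk


# Hypothetical industrial chain: each key supplies the companies in its list.
edges = {
    "ChipMaker": ["BoardMaker"],
    "BoardMaker": ["PhoneBrand"],
    "PhoneBrand": ["Retailer"],
}
risk = propagate_risk(edges, "ChipMaker")
```

In this example a risk event at "ChipMaker" reaches "Retailer" at one eighth of its original strength after three hops.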

  3. Intelligence analysis platform for event analysis and complex reasoning

We have built a similar platform in the field of intelligence analysis. Here too, data, graphs, models and applications are pre-built, and end users can build intelligent applications through our business orchestration workbench. On the customer's site, we can quickly bring in the customer's data and quickly deliver the customer's application.

  4. Other application scenarios

The same middle platform can be applied to other scenarios, such as insurance consulting and product recommendation robots in the insurance industry, and semantic search in e-commerce. In some complex fields, such as discipline inspection, implicit relationships can be inferred by reasoning over the accumulated data and knowledge. Likewise, some of these applications can be assembled through the business orchestration engine. The application of other scenarios is as follows:

This article is compiled from the slides shared by Mr. Hu Fanghuai in the DATAFUN community.

Origin blog.csdn.net/jinhao_2008/article/details/108142886