The most comprehensive knowledge graph explanation on the entire Internet!

What is a knowledge graph

Definition from the Knowledge Graph Standardization White Paper: a knowledge graph describes concepts, entities, and their relationships in the objective world in a structured form, expresses Internet information in a form closer to how humans understand the world, and provides a better way to organize, manage, and understand the vast amount of information on the Internet.

Simply put, a knowledge graph is composed of nodes (vertices) and edges. Each node represents an entity, which can be a person, event, or thing in the objective world. Each edge represents a relationship, which expresses a connection between entities. In essence, a knowledge graph can be understood as a semantic network stored in a graph structure.


Background of the birth of knowledge graph

Knowledge graphs originated in the 1950s, and their development can be roughly divided into three stages. The first stage (1950-1977) was the formative period: the symbolic logic of document indexing was proposed and gradually became a common method for studying the development of contemporary science. The second stage (1977-2012) was the growth period: semantic networks developed rapidly, and ontology research became an important field of computer science. Large manually built knowledge bases such as WordNet, Cyc, and HowNet appeared during this period, making it easier to exchange knowledge between computers and between computers and people. The third stage (2012-present) is the period of prosperity. In 2012, Google took the lead in proposing the concept of the knowledge graph (KG) and used the technology to improve search engine performance and the user search experience, opening the modern chapter of knowledge graphs.

Currently, with the advent of the big data era, data volumes are growing exponentially, and knowledge graphs are moving out of academia and becoming general-purpose, large-scale knowledge graphs suited to modern enterprises. With the vigorous development of artificial intelligence technology, key technical difficulties such as underlying graph database storage and large-scale deployment of computing power have been solved to a certain extent. Beyond search engines, knowledge graph technology has become a popular technology in e-commerce, healthcare, finance, energy, and other fields, where it addresses core pain points in production.

Representation of knowledge graphs

As mentioned above, a knowledge graph is essentially a semantic network in which nodes represent entities and edges represent semantic relationships between entities. Its basic logical structure is divided into a schema layer (also translated as pattern layer) and a data layer. The schema layer sits above the data layer and is the core of the knowledge graph. It stores the refined knowledge data model, including the hierarchy of entities, relationships, and attributes. The data layer consists mainly of factual data, that is, real information about the real world, and usually uses "entity-relationship-entity" or "entity-attribute-attribute value" triples as its basic form of expression.
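
The two triple forms used in the data layer can be sketched with plain Python tuples. This is only an illustration: the `Triple` type, the predicate names, and the helper `facts_about` are hypothetical, and the entities reuse the Wu Jing / Changjin Lake example that appears later in this article.

```python
from typing import List, NamedTuple


class Triple(NamedTuple):
    """One fact in the data layer: (subject, predicate, object)."""
    subject: str
    predicate: str
    obj: str


facts = [
    # "entity-relationship-entity" triple
    Triple("Wu Jing", "starred_in", "Changjin Lake"),
    # "entity-attribute-attribute value" triple (value is illustrative)
    Triple("Changjin Lake", "release_time", "2021"),
]


def facts_about(entity: str, triples: List[Triple]) -> List[Triple]:
    """Return every triple whose subject is the given entity."""
    return [t for t in triples if t.subject == entity]
```

Calling `facts_about("Wu Jing", facts)` returns only the relationship triple, while the attribute triple belongs to the movie entity.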

Currently, the two main graph data models for representing knowledge graphs are RDF graphs and attribute graphs (more commonly called property graphs). The following explains how each model represents knowledge, how they differ, and where their limitations lie.

RDF stands for Resource Description Framework. It was originally designed in the context of the Semantic Web and is a data model that describes resources as subject-predicate-object triples. The figure below is an example of an RDF graph. To represent a knowledge graph with the RDF graph model, you first build a data dictionary that defines the metadata items used for data modeling. There are two main types of metadata item: classes and properties. A class is a collection of object instances. A property is one of two subtypes: one describes an attribute of a class, and the other describes a relationship between classes. For example, to describe a book in RDF, the data dictionary must define the author, title, number of pages, publication date, language, and so on. Once the definitions are complete, the data for specific books is mapped into them. The RDF data dictionary is therefore itself an RDF graph schema, and a complete schema makes it convenient for users to map real-world knowledge into the graph.
[Figure: example of an RDF graph]
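
The book example can be sketched as follows: a small data dictionary defines the class and its properties, and a book record is then mapped into subject-predicate-object triples. This is a plain-Python illustration that does not use an RDF library. The `ex:` names, the `SCHEMA` layout, and the `map_book` helper are all assumptions made for this sketch.

```python
# Hypothetical data dictionary (schema): classes plus their properties.
SCHEMA = {
    "classes": {"ex:Book", "ex:Person"},
    "properties": {
        # property -> (class it belongs to, kind of value)
        "ex:title": ("ex:Book", "literal"),       # attribute of a class
        "ex:pages": ("ex:Book", "literal"),
        "ex:published": ("ex:Book", "literal"),
        "ex:language": ("ex:Book", "literal"),
        "ex:author": ("ex:Book", "ex:Person"),    # relationship between classes
    },
}


def map_book(book_id, record):
    """Map one book record to subject-predicate-object triples.

    Raises KeyError if a field is not defined in the data dictionary.
    """
    triples = [(book_id, "rdf:type", "ex:Book")]
    for field_name, value in record.items():
        predicate = "ex:" + field_name
        if predicate not in SCHEMA["properties"]:
            raise KeyError(f"{predicate} is not defined in the schema")
        triples.append((book_id, predicate, value))
    return triples


# Illustrative record; the values are placeholders.
triples = map_book(
    "ex:book1",
    {"title": "Example Title", "pages": 300, "author": "ex:person1"},
)
```

A field that the data dictionary does not define is rejected, which mirrors the point that the schema must exist before data can be mapped into the graph.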

In an attribute graph, vertices represent entities and edges represent relationships between entities. Attributes are key-value pairs, and both vertices and edges can carry them. The following figure is an example of an attribute graph that expresses the same content as the RDF graph above. The person "Wu Jing" and the movie "Changjin Lake" are vertices, and the "starring in" relationship is the edge. The person vertex carries "gender" and "age" attributes, and the movie vertex carries "release time" and "box office" attributes. To represent a knowledge graph with the attribute graph model, you first build the graph model, defining its vertex-edge structure and attribute information, and then map the data into it. When requirements change, business personnel only need to adjust edges and attributes rather than rewrite the structure of the graph model.

[Figure: example of an attribute graph]
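
The attribute graph described above can be sketched with a few Python dataclasses. The class and method names (`Vertex`, `Edge`, `PropertyGraph`, `add_vertex`, `add_edge`) are hypothetical and do not correspond to any particular graph database API. Attribute values are left as `None` because the text names only the attribute keys.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Vertex:
    label: str
    properties: Dict[str, object] = field(default_factory=dict)


@dataclass
class Edge:
    source: str
    target: str
    label: str
    properties: Dict[str, object] = field(default_factory=dict)


@dataclass
class PropertyGraph:
    vertices: Dict[str, Vertex] = field(default_factory=dict)
    edges: List[Edge] = field(default_factory=list)

    def add_vertex(self, vid, label, **props):
        self.vertices[vid] = Vertex(label, dict(props))

    def add_edge(self, source, target, label, **props):
        # Both endpoints must already exist in the graph.
        if source not in self.vertices or target not in self.vertices:
            raise KeyError("both endpoints must exist before adding an edge")
        edge = Edge(source, target, label, dict(props))
        self.edges.append(edge)
        return edge


g = PropertyGraph()
# Attribute keys come from the example; values are placeholders.
g.add_vertex("wu_jing", "Person", gender=None, age=None)
g.add_vertex("changjin_lake", "Movie", release_time=None, box_office=None)
starring = g.add_edge("wu_jing", "changjin_lake", "STARRED_IN")
```

Both vertices and the edge carry their own key-value dictionary, which is what allows attributes to be adjusted later without changing the vertex-edge structure.
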
In academia, the data stored is often static with a fixed structure, and standardized interfaces are provided. An RDF schema can be reused to share data openly and avoid duplicated work. In other industries, however, RDF graphs have limitations. Without a reusable data dictionary, developing a new RDF data set is very expensive. RDF vertices also have no concept of a label or type, and attributes are expressed through additional subject-predicate-object triples. When the business needs to add an attribute, the RDF graph must modify its edge structure, so the graph model changes. This can easily break existing query statements and is highly intrusive to the business. In an attribute graph, by contrast, the model can be modified directly without affecting normal business use. For example, to add a "role" attribute to the "starring in" edge, simply add the attribute to that edge of the attribute graph, as shown in the figure below.
[Figure: attribute graph with a "role" attribute added to the "starring in" edge]
Because RDF graphs do not support attributes on edges, all edges of the same type are identical and the same predicate is reused. If we simply attached the role to the "starring in" predicate between "Wu Jing" and "Changjin Lake", the same attribute would be added to every "starring in" predicate. The conventional method in RDF is to create a new vertex "ex:xxx" that represents the statement, as shown in the figure below.

[Figure: RDF graph with a new vertex "ex:xxx" representing the statement]
As this shows, adding attributes to an RDF graph changes the original structure of the graph model: a query that could be completed in one hop now needs two or more hops. Industry knowledge graphs are moving toward large data volumes, frequent real-time changes, and complex business models. Knowledge graphs built on the RDF graph model therefore face development bottlenecks and high operation and maintenance costs after deployment, while knowledge graphs expressed with the attribute graph model are gradually gaining recognition from customers.
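
The difference in query cost can be sketched by storing the same "role" fact in both forms. The statement vertex name "ex:xxx" comes from the text. The predicates `ex:actor`, `ex:movie`, and `ex:role`, the placeholder role value, and both helper functions are assumptions for illustration. Each triple-pattern match is counted as one step, as a rough stand-in for a hop.

```python
# Attribute graph form: the role is stored on the edge itself.
edge = {
    "source": "Wu Jing",
    "label": "starred_in",
    "target": "Changjin Lake",
    "properties": {"role": "<role name>"},
}

# RDF form: a statement vertex "ex:xxx" carries the role.
triples = [
    ("ex:xxx", "ex:actor", "Wu Jing"),
    ("ex:xxx", "ex:movie", "Changjin Lake"),
    ("ex:xxx", "ex:role", "<role name>"),
]


def role_from_edge(edge):
    """Read the attribute directly off the edge: one step."""
    return edge["properties"]["role"], 1


def role_from_triples(triples, actor, movie):
    """Find the statement vertex for (actor, movie), then read its role."""
    steps = 0
    # Step 1: statement vertices that mention the actor.
    statements = {s for s, p, o in triples if p == "ex:actor" and o == actor}
    steps += 1
    # Step 2: keep those that also mention the movie.
    statements = {
        s for s, p, o in triples
        if s in statements and p == "ex:movie" and o == movie
    }
    steps += 1
    # Step 3: read the role attached to the statement vertex.
    for s, p, o in triples:
        if s in statements and p == "ex:role":
            steps += 1
            return o, steps
    return None, steps
```

In this sketch the edge lookup takes one step while the statement-vertex lookup takes three, and any query written against the original direct "starring in" triple would have to be rewritten.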

Wide application of knowledge graphs

As mentioned above, Google uses knowledge graph technology to optimize search engine performance and greatly improve user search accuracy. In addition, large-scale knowledge graph technology has been widely used in various industries.

In the financial field, knowledge graphs provide functions such as extraction, fusion, analysis, inference, and decision-making over financial knowledge, and connect isolated multi-source data. Through technologies such as data extraction, information extraction, semantic disambiguation, knowledge fusion, and knowledge processing, a financial knowledge graph is built to support credit card anti-fraud, risk prediction, intelligent marketing, and other smart finance applications. For example, a credit card anti-fraud relationship graph can be built from the main fraud-related elements, such as mobile phone numbers, contact numbers, IP addresses, devices, and application documents. To keep the cost of crime down, fraud gangs may share IP addresses, mobile phone numbers, and devices. Rules based on these shared elements are used to detect fraud and to identify potentially fraudulent users for early warning.
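
One simple way to express a "shared element" rule is to link applications that share any identifier and flag groups above a size threshold. The sketch below uses a union-find structure for this. The application IDs, identifier values, the `fraud_groups` function, and the `min_size` threshold are all hypothetical, and real anti-fraud rules would be considerably richer.

```python
from collections import defaultdict


def fraud_groups(applications, min_size=3):
    """Group applications that share any identifier value.

    applications: dict of application id -> dict of identifier -> value.
    Returns the groups (sets of application ids) with at least min_size members.
    """
    parent = {app: app for app in applications}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(a, b):
        parent[find(a)] = find(b)

    # Link each application to the first one seen with the same (key, value).
    first_seen = {}
    for app, attrs in applications.items():
        for item in attrs.items():
            if item in first_seen:
                union(app, first_seen[item])
            else:
                first_seen[item] = app

    groups = defaultdict(set)
    for app in applications:
        groups[find(app)].add(app)
    return [members for members in groups.values() if len(members) >= min_size]


apps = {
    "A1": {"phone": "p1", "ip": "ip1", "device": "d1"},
    "A2": {"phone": "p2", "ip": "ip1", "device": "d2"},  # shares an IP with A1
    "A3": {"phone": "p3", "ip": "ip3", "device": "d2"},  # shares a device with A2
    "A4": {"phone": "p4", "ip": "ip4", "device": "d4"},  # unrelated
}
suspicious = fraud_groups(apps)
```

Here A1, A2, and A3 end up in one group even though A1 and A3 share nothing directly, which is the kind of indirect association a relationship graph makes visible.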

In the industrial field, with the advent of the big data era, more and more traditional industries are undergoing digital transformation. Through in-depth analysis of relevant parameters in the production process, the knowledge graph identifies the factors most strongly related to product yield, builds a curve model of the results based on these influencing factors, and applies the optimal solution to final production. Knowledge graphs are also used in industry to optimize supply chains, improve production processes, and reduce equipment failure rates.

In the energy field, the modern power grid is a smart grid built on the physical grid and combined with advanced sensor, information, data analysis, and computer control technologies. It should meet regional electricity demand, optimize power allocation, ensure a flexible and stable power supply, and keep users' electricity consumption safe, reliable, and economical. A knowledge graph integrates information such as the power transmission relationships between substations within the dispatch scope, the equipment wiring within substations, and the equipment wiring within power plants. Combined with the real-time operating status of the grid, this forms a digital twin graph of the power grid that supports functions such as globally optimal emergency power restoration strategies, cross-business data connection, equipment defect early warning, and impact scope analysis.

In the social field, social networks have been the fastest-growing Internet application since they first appeared. Most of us have received plenty of public opinion information in daily life, and may even have played the keyboard warrior ourselves. In a social environment, users are not only recipients of information but also its producers, processors, and disseminators. By following one another, social users form huge user relationship networks, such as Twitter-2010. Knowledge graphs use the massive amount of information in social networks to build association graphs that support social information analysis, recommendation of users of interest, and early warning of online public opinion. For example, a knowledge graph can build an interest graph from a user's search, consumption, and entertainment habits and accurately segment people or organizations with specific interests, so as to recommend people, events, and things the user may be interested in. In short-video and streaming apps, we keep finding videos that interest us, while low-relevance content rarely appears. This is the knowledge graph making recommendations based on user preferences, which increases user stickiness.
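
A very small version of interest-based recommendation can be sketched over a user-interest graph: a user is recommended interests held by other users who overlap with them. The user IDs, interest names, and the `recommend` function are hypothetical, and production recommenders use far more signals than this.

```python
from collections import Counter


def recommend(user, likes, top_n=2):
    """Recommend interests held by users who share an interest with `user`.

    likes: dict of user -> set of interests (edges of a user-interest graph).
    Candidates are ranked by how many overlapping users hold them.
    """
    own = likes[user]
    scores = Counter()
    for other, items in likes.items():
        if other == user or not (own & items):
            continue  # skip the user themselves and users with no overlap
        for item in items - own:
            scores[item] += 1
    # Sort by score (descending), then by name so the order is stable.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:top_n]]


likes = {
    "u1": {"war films", "history"},
    "u2": {"war films", "documentaries"},
    "u3": {"history", "documentaries", "cooking"},
    "u4": {"gardening"},
}
```

With this data, `recommend("u1", likes)` ranks "documentaries" first because two overlapping users hold it, while "gardening" never appears because u4 shares nothing with u1.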

In the retail field, unlike the seller's market of the past, today's e-commerce is a buyer's market. How an e-commerce platform selects, from a huge catalog, the few dozen products a user is interested in, so as to meet personalized shopping needs, has become the product recommendation problem of the retail field. The e-commerce knowledge graph starts from user needs, integrates data such as browsing habits, purchase history, and social behavior, and analyzes the potential user groups for each category of goods. This enables intelligent recommendation and precision marketing, gives buyers a good shopping experience, and maximizes the interests of merchants.

The above introduces the broad application scenarios of knowledge graphs. The official website of Chuanglin Technology provides graph demos such as credit card application anti-fraud and intelligent power grid dispatching, which interested readers can explore on their own. Knowledge graphs are of course also widely used in fields such as healthcare, government affairs, education, and public security. We will explain application cases in detail, based on specific implementation scenarios, from the perspectives of entity modeling, data mapping, visual display, and business analysis.

Current status of the knowledge graph industry

With the continued development of the digital economy and the maturing of deep learning and NLP technology, the industrialization of knowledge graphs has become a focus of the market. According to the "2022 China Knowledge Graph Industry Research Report" released by iResearch, the core knowledge graph market was expected to reach 10.7 billion yuan in 2021 and to exceed 29.6 billion yuan by 2026, a compound annual growth rate of 22.5% over 2021-2026. Finance and public security, the two major industries using knowledge graphs, are the main drivers of market size, and the industry is growing rapidly. As digital government advances and the industry matures, government knowledge graphs will also become an important driver of the market.
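
The quoted growth rate can be sanity-checked with the standard compound annual growth rate formula, (end / start)^(1 / years) - 1. The short sketch below applies it to the two market sizes quoted from the report. The function name `cagr` is just a label chosen for this example.

```python
def cagr(start_value, end_value, years):
    """Compound annual growth rate between two values `years` apart."""
    return (end_value / start_value) ** (1 / years) - 1


# Figures quoted from the report, in billions of yuan.
rate = cagr(10.7, 29.6, 2026 - 2021)
print(f"{rate:.2%}")
```

The result comes out slightly above 22.5%, which is consistent with the report's figure given that the quoted market sizes are rounded.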

According to current research reports, the main difficulties in building knowledge graphs lie in data governance, the supply of industry experts, underlying graph database storage, algorithm production processes and performance that need improvement, customer awareness that still needs to be cultivated, and product packaging that needs optimization. Overcoming these difficulties will help ensure the authenticity and reliability of data at the source while cultivating well-rounded talent for the industry. Upgrading underlying graph database storage, improving algorithm performance, and optimizing product usability will also help the knowledge graph industry grow.

On upgrading underlying graph database storage: graph technology has now entered the Graph3.0 period, in which native graph databases are characterized by fast computing, high scalability, and intelligence. Because such a database uses native graph storage, data is stored directly as a graph structure at the bottom layer, and queries over graph-structured data are optimized at the algorithm layer, which yields low data expansion and high algorithm performance. Domestic graph database products representative of Graph3.0 include Galaxybase, which adopts a native graph storage architecture and to a certain extent solves the problem of underlying graph database storage in knowledge graph construction.

Knowledge graph development trends

The future is the era of cognitive intelligence. If perceptual intelligence is like the limbs, cognitive intelligence is like the brain, which extracts knowledge from information and performs reasoning and analysis on business scenarios, improving AI's ability to understand and analyze. Here the knowledge graph will play a key breakthrough role, giving cognitive intelligence insight into implicit relationships and logic and empowering business decisions. As an underlying technology of the cognitive intelligence era, the knowledge graph will also see rapid development.

We believe that as data volumes grow exponentially and change rapidly, the knowledge graphs of the future must be not only larger but also faster, so that enterprises can seize opportunities and create value. As the foundation of the knowledge graph, graph databases should continue to optimize storage and computing performance to prepare for upcoming demands.

Of course, no technology is perfect, and technology integration is also a trend in the industry's future development. Combining the strengths of different technologies to offset their weaknesses will better serve knowledge graph applications, allowing them to keep improving through refinement and to replicate successful experience in more solutions.

Origin blog.csdn.net/qq_41604676/article/details/133135168