Practical insights: case studies in AI-empowered pharmaceutical industry development

Editor's note: The pharmaceutical industry tends to have long R&D cycles, low success rates, and high R&D costs, problems that have long plagued pharmaceutical companies. The rapid development of AI technology has brought change to many industries, and the pharmaceutical industry benefits as well: AI helps address industry pain points and improve development efficiency. In this article, the author presents cases of AI empowering the development of the pharmaceutical industry.

This past weekend, I was invited to give a talk titled "AI Empowers the Development of the Pharmaceutical Industry".

In the talk, I explained in detail two cases we have worked on. To my surprise, far more of the audience was interested in medicine than I had expected.

From the perspective of the Industrial Internet, the future market belongs to B-end (business-facing) products, and this is also an important moment for traditional enterprises to transform through informatization. We need to analyze the future product layout in depth in light of where the industry is heading, and get the top-level architecture right, in order to embrace the coming changes in the product landscape.

I am a product manager in the pharmaceutical industry, so the talk focused on directions within this industry. It covered the definition of the pharmaceutical industry's language and symbolic OCR technology, as well as the application of knowledge graphs in the pharmaceutical and drug information fields.

There is still a lot of work to be done in the pharmaceutical industry, which I also outlined in the talk: early drug discovery, upstream processes, downstream processes, production quality control, and other areas all remain to be explored.

Here I will walk through the main content of the talk in detail.

I'll skip the slide with my photo and bio. So far, there are no signs of hair loss.

The core of the Industrial Internet rests on three cornerstones: data collection, data connection, and data computation and processing, commonly known as "terminal, pipe, and cloud". "Terminal" refers mainly to device interconnection, "pipe" to internal and external connectivity, and "cloud" to cloud computing and edge computing.

Many people think of drug development as something unfathomable. In fact, it is simply far removed from the public's everyday experience. The rise of the Industrial Internet is meant to break through that perception and bring technology into the industry.

The theme here is R&D cases in the pharmaceutical industry, so a fair amount of pharmaceutical domain knowledge is involved. To see how AI can accelerate specific links in the pharmaceutical industry, we must first understand where acceleration is possible. The following figure describes in detail some of the content in each application scenario.

For readers without a medical background this may be a little harder to follow, but you can still see how specific industrial links are decomposed.

The first case is about industrial language.

Industrial language is like the language we ordinarily speak, except that it serves as the means of communication within an industry. This form of language has three basic properties: it is scientific, industry-specific, and universal.

Molecular image recognition can be understood as a technology for extracting industrial language. With AI, the molecular formulas in a document can be extracted and recognized in a single pass.

This technology has many application scenarios. One is molecular structure search. For documents with legal effect, such as patents, our technology can obtain the protection boundary of all the compounds a patent protects in one pass, which greatly reduces the labor cost of patent analysts and improves efficiency.

The algorithm of this project is divided into three stages: molecule localization, using Mask R-CNN object detection; atom and bond identification, using OpenCV; and atom and bond representation, using probabilistic graphical models from statistical relational learning.

The recognized chemical structure is expressed in the MOL file format.
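To make the output format concrete, here is a minimal sketch of how recognized atoms and bonds might be serialized into a MOL block. It assumes the common V2000 fixed-width layout and leaves most optional fields at zero; the function name `to_molblock`, the input tuples, and the ethanol coordinates are illustrative assumptions, not the project's actual code.

```python
def to_molblock(atoms, bonds, name="recognized_molecule"):
    """Serialize atoms and bonds into a simplified V2000 MOL block.

    atoms: list of (symbol, x, y) tuples (hypothetical recognition output)
    bonds: list of (begin, end, order) tuples using 1-based atom indices
    """
    # Three header lines: title, program/timestamp line, comment line
    lines = [name, "  sketch", ""]
    # Counts line: atom count, bond count, unused fields, version tag
    lines.append(f"{len(atoms):3d}{len(bonds):3d}" + "  0" * 8 + "999 V2000")
    # Atom block: x, y, z coordinates, element symbol, then zeroed fields
    for symbol, x, y in atoms:
        lines.append(f"{x:10.4f}{y:10.4f}{0.0:10.4f} {symbol:<3s} 0" + "  0" * 11)
    # Bond block: the two atom indices, bond order, then zeroed fields
    for begin, end, order in bonds:
        lines.append(f"{begin:3d}{end:3d}{order:3d}" + "  0" * 4)
    lines.append("M  END")
    return "\n".join(lines)


# Heavy atoms of ethanol (C-C-O) with made-up 2D coordinates
ethanol = to_molblock(
    atoms=[("C", 0.0, 0.0), ("C", 1.5, 0.0), ("O", 2.25, 1.3)],
    bonds=[(1, 2, 1), (2, 3, 1)],
)
print(ethanol)
```

A plain-text connection table like this is what makes structure search possible downstream, since cheminformatics tools can read it without any knowledge of the original image.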

A Markov logic network softens logical inference formulas by attaching weights to them, so that a rule holds with some probability rather than absolutely. The method is essentially a form of inference: as the number of inferred nodes grows, they form a network, which is a probabilistic graph.
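The idea of a softened rule can be illustrated with a toy example. The sketch below enumerates every possible world for two ground atoms and scores each world by the exponential of the summed weights of the formulas it satisfies. The atom names, the rule, and the weights are hypothetical and chosen only for illustration; a real system would learn the weights and use approximate inference rather than enumeration.

```python
import itertools
import math

# Hypothetical ground atoms describing one detected stroke in a molecule image
ATOMS = ["TouchesTwoAtoms", "IsBond"]

# (weight, formula) pairs; each formula maps a world to True or False.
# The weights are illustrative assumptions, not learned values.
FORMULAS = [
    (1.5, lambda w: (not w["TouchesTwoAtoms"]) or w["IsBond"]),  # TouchesTwoAtoms => IsBond
    (0.5, lambda w: w["TouchesTwoAtoms"]),
]


def world_score(world, formulas):
    """Unnormalized score: exp of the summed weights of satisfied formulas."""
    return math.exp(sum(weight for weight, formula in formulas if formula(world)))


def conditional(query, evidence, formulas=FORMULAS):
    """P(query is True | evidence), computed by enumerating all worlds."""
    numerator = denominator = 0.0
    for values in itertools.product([False, True], repeat=len(ATOMS)):
        world = dict(zip(ATOMS, values))
        if any(world[atom] != value for atom, value in evidence.items()):
            continue  # skip worlds that contradict the evidence
        score = world_score(world, formulas)
        denominator += score
        if world[query]:
            numerator += score
    return numerator / denominator


p_soft = conditional("IsBond", {"TouchesTwoAtoms": True})
print(round(p_soft, 4))
```

With the rule weight at 1.5, the conditional probability equals the logistic function of that weight, so the rule is likely but not certain. Raising the weight pushes the probability toward 1, which recovers the behavior of a hard logical rule.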

The second case is our industrial knowledge graph project. The most important thing about an industrial knowledge graph is to structure the knowledge in industrial production and to infer the nodes that make up the graph.
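As a rough illustration of structuring knowledge and inferring over it, the sketch below stores facts as subject-predicate-object triples and follows a two-hop path from drugs to diseases. All entity names, relation names, and the inference rule are placeholders assumed for this example; they are not real findings and not the project's actual schema.

```python
from collections import defaultdict

# Hypothetical triples; entities and relations are placeholders
TRIPLES = [
    ("drug_A", "inhibits", "protein_X"),
    ("drug_B", "inhibits", "protein_Y"),
    ("protein_X", "involved_in", "disease_1"),
    ("protein_Y", "involved_in", "disease_2"),
    ("drug_A", "treats", "disease_3"),
]


def build_index(triples):
    """Index objects by (subject, predicate) for fast lookups."""
    index = defaultdict(set)
    for subject, predicate, obj in triples:
        index[(subject, predicate)].add(obj)
    return index


def candidate_drugs(triples, disease):
    """Infer drugs that inhibit a protein involved in the given disease."""
    index = build_index(triples)
    drugs = {s for s, p, _ in triples if p == "inhibits"}
    found = set()
    for drug in drugs:
        for protein in index[(drug, "inhibits")]:
            if disease in index[(protein, "involved_in")]:
                found.add(drug)
    return sorted(found)


print(candidate_drugs(TRIPLES, "disease_1"))  # ['drug_A']
```

Note that `drug_A` is linked to `disease_1` only indirectly, through `protein_X`. Surfacing such indirect links as candidates for expert review is the kind of inference a knowledge graph makes cheap.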

Knowledge graphs are currently a popular family of AI technologies, but to use them in industry you must be clear about the specific usage scenarios and the specific links where the technology applies.

As shown in the figure below, knowledge graph technologies can meet needs such as medical knowledge retrieval, investment target identification, drug repositioning, and clinical pathway evaluation.

The graph construction process is roughly divided into the steps shown in the following figure:

Generally speaking, an enterprise has three types of data sources that should participate in the process of constructing the graph.

  1. Internal enterprise data: During production, the enterprise accumulates a large amount of experience-based data, which is closely tied to the business and can serve as a data source for building the graph.
  2. Public external data: This type of data may take the form of knowledge bases and is widely available on the Internet, such as industry news and open-source databases.
  3. Paid external data: If the company has the budget to purchase paid data, this data usually has a very regular structure and can be incorporated into the graph.
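One way to combine the three kinds of sources is to normalize each fact and record which sources assert it. The following is a minimal sketch under that assumption; the source labels, the triples, and the simple lowercase normalization are all hypothetical, and real entity alignment is considerably more involved.

```python
def normalize(term):
    """Lowercase and collapse whitespace so trivially different spellings match."""
    return " ".join(term.strip().lower().split())


def merge_sources(sources):
    """Merge triples from several sources, keeping provenance.

    sources: dict mapping a source name to a list of (subject, predicate, object)
    Returns a dict mapping each normalized triple to the sorted list of
    source names that assert it.
    """
    merged = {}
    for source_name, triples in sources.items():
        for subject, predicate, obj in triples:
            key = (normalize(subject), normalize(predicate), normalize(obj))
            merged.setdefault(key, set()).add(source_name)
    return {triple: sorted(names) for triple, names in merged.items()}


# Hypothetical data from the three source types
sources = {
    "internal": [("Compound_17", "produced_by", "process_A")],
    "public": [("compound_17", "targets", "protein_X")],
    "paid": [
        ("compound_17 ", "targets", "Protein_X"),
        ("compound_17", "produced_by", "process_a"),
    ],
}
graph = merge_sources(sources)
for triple, provenance in sorted(graph.items()):
    print(triple, provenance)
```

Keeping provenance on every edge lets later steps weigh a fact confirmed by several sources differently from one that appears in a single source.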

A drug knowledge graph can be used to screen which drugs currently on the market may be effective against the novel coronavirus. Application scenarios like this are of far-reaching significance to the entire pharmaceutical industry and to public health. Thank you!

Origin blog.csdn.net/weixin_42137700/article/details/114072603