The technical principles and extended applications of artificial intelligence, machine learning and deep learning

In recent years, a wave of artificial intelligence has swept across every industry and is changing the way we live. It sometimes seems that slapping the "AI" label on a problem makes even the hardest one solvable. So what exactly is AI? How does it work? And how should we think about and apply it?

AI, or artificial intelligence, is technology that simulates human intelligence. It can process large amounts of data and can learn, reason, and solve problems. AI is already applied in fields such as self-driving cars, speech recognition, and image recognition, and it is easy to imagine it playing an important role in many more, such as medical diagnosis, financial risk assessment, and robotics.


Artificial intelligence

The full name of AI is Artificial Intelligence, a technology that humans have dreamed about for a long time. Its origins date back to 1950, when the scientist Alan Turing first posed the question of whether machines can think in his paper "Computing Machinery and Intelligence". This marked the birth of a new field, artificial intelligence, and sparked people's boundless imagination about AI.

Today AI affects all walks of life: it can process large amounts of data and can learn, reason, and solve problems. It is widely used in areas such as self-driving cars, speech recognition, and image recognition, and has become an indispensable part of human life and an important force driving human progress.


According to Turing's proposal, to judge whether a machine can think, it must pass a classic imitation game, which became known as the Turing Test. In this test, a questioner C simultaneously asks questions of machine A and human B, who sit in separate rooms. If C cannot tell which of A and B is the computer and which is the human, we can claim that the machine in that room is a thinking machine. The Turing Test remains a classic benchmark for evaluating the progress and performance of artificial intelligence.

Human beings have been working hard to develop machines or algorithms that can pass the Turing test.

In 1997, IBM's Deep Blue computer defeated the world chess champion Garry Kasparov. Although it looked very powerful, it essentially searched through an enormous number of possible moves and selected the best one, much like a GPS navigation system searching for the best route.

However, in the complex real world, this brute-force approach cannot be applied to most situations. To bring AI into everyday life, more efficient methods are needed, and the way humans accumulate wisdom is a good reference. The past few decades have seen rapid development in artificial intelligence, which has become an important tool in many fields, from speech recognition to image recognition to natural language processing. AI is widely used in our daily life, from smartphones to smart homes, and it is changing the way we live.

Machine learning

The core of human intelligence is experience: through continuous learning and by recording what we observe, we adjust our understanding of the outside world. When we encounter similar situations again, we can easily use past experience to predict and deal with the unknown future. To reduce the amount of information that needs to be remembered and processed, humans are also good at classifying and labeling similar things. Can machines use these experiences, that is, historical data, to automatically discover the characteristics of events and build a model relating features to outcomes, so as to predict future values or automatically classify and make decisions? This is the idea behind machine learning.


Predicting numerical values is a common problem and can often be solved by finding a linear mathematical relationship between the features of an event and its outcome. For example, suppose that in a certain location a 10-square-meter house sold for 1 million yuan and a 20-square-meter house sold for 2 million yuan. From this we can deduce that the transaction price is roughly proportional to the floor area, at about 100,000 yuan per square meter.

Using this information, house prices can be predicted with linear regression. As more and more transaction records accumulate, techniques such as gradient descent (Gradient Descent) can be used to find a regression line that best fits all the data, giving a model that predicts house prices from floor area. This is the so-called linear regression method (Linear Regression).
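
As a concrete illustration, here is a minimal sketch in Python of fitting that price-per-square-meter relationship by gradient descent. The transaction data is made up for illustration (prices in units of 10,000 yuan), and the learning rate and iteration count are arbitrary choices.

```python
import numpy as np

# Hypothetical transactions: floor area (square meters) -> price (10k yuan)
areas = np.array([10.0, 20.0, 30.0, 40.0])
prices = np.array([100.0, 200.0, 310.0, 390.0])

w, b = 0.0, 0.0   # slope (price per square meter, in 10k yuan) and intercept
lr = 0.0005       # learning rate

for _ in range(10000):
    pred = w * areas + b
    error = pred - prices
    # Gradients of the mean squared error with respect to w and b
    grad_w = 2 * np.mean(error * areas)
    grad_b = 2 * np.mean(error)
    w -= lr * grad_w
    b -= lr * grad_b

# Should land near 10, i.e. roughly 100,000 yuan per square meter
print(f"learned price per square meter: {w:.1f} (10k yuan)")
```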

Automatic classification is a widely studied problem and many different algorithms are available. One of them is logistic regression. When the boundary between classes is not obvious, we can map the relationship between features and outcomes onto a logistic curve whose values lie between 0 and 1. Using this approach, we obtain a model that maps arbitrary input values to the appropriate category. This is the logistic regression method (Logistic Regression).
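
A short sketch of logistic regression using scikit-learn (one common implementation; the features, labels, and the "sells within a month" scenario are invented for illustration):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical data: [floor area, building age] -> 1 if the house sold quickly, else 0
X = np.array([[10, 30], [20, 25], [35, 5], [50, 2], [15, 40], [60, 1]])
y = np.array([0, 0, 1, 1, 0, 1])

clf = LogisticRegression().fit(X, y)
print(clf.predict([[40, 3]]))        # predicted class for a new house
print(clf.predict_proba([[40, 3]]))  # probabilities taken from the logistic curve
```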

A decision tree (Decision Tree) is a machine learning algorithm that uses the relationship between features and classification outcomes to build a tree structure from historical data. This tree is full of "if this, then that" decision paths that map different feature values to the appropriate category. The random forest algorithm improves on decision trees: it builds many decision trees, each from a randomly selected subset of features and data, and uses voting to reach the final result. In this way, no single feature can be over-amplified and bias the outcome. This is the random forest algorithm (Random Forest).
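
The following sketch contrasts a single decision tree with a random forest on the classic Iris dataset, using scikit-learn (an illustrative choice of library and dataset, not one prescribed by the article):

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

tree = DecisionTreeClassifier().fit(X_train, y_train)            # one tree of if/then splits
forest = RandomForestClassifier(n_estimators=100).fit(X_train, y_train)  # 100 trees that vote

print("single tree accuracy  :", tree.score(X_test, y_test))
print("random forest accuracy:", forest.score(X_test, y_test))
```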

Taking a similar idea a step further, boosted tree algorithms build multiple decision trees strategically rather than randomly, so that the features and examples that matter most receive higher weight, which generally yields higher accuracy. This algorithm is known as GBDT (Gradient Boosting Decision Tree).
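
A minimal scikit-learn sketch of a gradient-boosted decision tree; the dataset and hyperparameters here are arbitrary illustrative choices:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Trees are built one after another, each correcting the errors of the previous ones
gbdt = GradientBoostingClassifier(n_estimators=200, learning_rate=0.1)
gbdt.fit(X_train, y_train)
print("GBDT accuracy:", gbdt.score(X_test, y_test))
```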

K-Nearest Neighbors, or KNN for short, directly compares the feature similarity between new data and existing historical data and lets the k closest records determine the classification of the new data. The algorithm is very simple and easy to understand and is commonly used for both classification and regression. Its advantage is that it needs no explicit training step: predictions are made directly from the stored historical data.
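
A small sketch of KNN classification on toy two-dimensional data (the points and the choice of k=3 are made up for illustration):

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Two obvious groups of points in the plane
X = np.array([[1, 1], [1, 2], [2, 1], [8, 8], [8, 9], [9, 8]])
y = np.array([0, 0, 0, 1, 1, 1])

# Each new point is classified by majority vote among its 3 nearest neighbors
knn = KNeighborsClassifier(n_neighbors=3).fit(X, y)
print(knn.predict([[2, 2], [7, 9]]))  # -> [0 1]
```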

The naive Bayes classifier (Naive Bayes Classifier) is a classification algorithm based on Bayes' theorem and the assumption that features are conditionally independent. It predicts the class of a new data point by calculating the probability of each class given the observed features. The algorithm is simple to implement and performs well on large amounts of text data, although the independence assumption means it may be less accurate than other algorithms. When the data is sparse, Laplace smoothing can be used to improve its accuracy.
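
A sketch of a naive Bayes text classifier on a tiny invented spam/not-spam example; `alpha=1.0` corresponds to the Laplace smoothing mentioned above:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

texts = ["free prize now", "win money free", "meeting at noon", "project review meeting"]
labels = [1, 1, 0, 0]  # 1 = spam, 0 = normal (made-up labels)

vec = CountVectorizer()          # bag-of-words word counts
X = vec.fit_transform(texts)

nb = MultinomialNB(alpha=1.0).fit(X, labels)   # alpha=1.0 is Laplace smoothing
print(nb.predict(vec.transform(["free meeting prize"])))  # -> [1], classified as spam
```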

A Support Vector Machine, or SVM for short, is a statistical learning method for classification and regression. Its basic idea is to find an optimal hyperplane in the training data that separates the data into two classes, choosing the hyperplane with the largest margin so that the classification generalizes better. SVMs perform well in high-dimensional spaces and, with kernel functions, can also handle data that is not linearly separable.
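
A sketch of an SVM with an RBF kernel on a synthetic, non-linearly-separable dataset (the "two moons" toy data and the hyperparameters are arbitrary illustrative choices):

```python
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Two interleaved half-circles: not separable by a straight line
X, y = make_moons(n_samples=200, noise=0.2, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

svm = SVC(kernel="rbf", C=1.0).fit(X_train, y_train)  # RBF kernel handles the curved boundary
print("SVM accuracy:", svm.score(X_test, y_test))
```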

But what if the data we have has never been labeled with categories? Is there still a way to group it automatically?

K-means clustering is an unsupervised learning algorithm used to group unlabeled data. It starts by selecting K random center points and then iteratively reassigns data points and updates the centers so that the points within each group become more and more similar. In the end we obtain K clusters, each containing similar data points. The algorithm is often used in data mining and data analysis to help discover the underlying structure of data.
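
A minimal K-means sketch on a handful of made-up 2-D points, grouped into K=2 clusters with scikit-learn:

```python
import numpy as np
from sklearn.cluster import KMeans

# Unlabeled points that visibly form two groups
points = np.array([[1, 1], [1.5, 2], [2, 1], [8, 8], [8, 9], [9, 8.5]])

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(points)
print("cluster labels :", kmeans.labels_)          # which cluster each point ended up in
print("cluster centers:", kmeans.cluster_centers_) # the K iteratively adjusted centers
```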

Deep learning

All of the algorithms above construct models from historical data. So what if there is no historical data?


Reinforcement Learning, or RL for short, is a machine learning method whose goal is to let an agent learn, through continuous trial and error, how to obtain the greatest reward in an environment. Its core idea is to learn the optimal policy through continuous interaction: through repeated attempts and feedback, the agent's policy is adjusted so that it can obtain higher rewards in the future. Reinforcement learning achieves very good results in many application scenarios, such as robot control, game AI, and financial trading.
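
To make the trial-and-feedback loop concrete, here is a minimal sketch of tabular Q-learning (one basic reinforcement-learning algorithm, chosen here for illustration) on a hypothetical five-cell corridor where the agent is rewarded only for reaching the rightmost cell:

```python
import numpy as np

n_states, n_actions = 5, 2            # actions: 0 = step left, 1 = step right
Q = np.zeros((n_states, n_actions))   # the agent's learned value table
alpha, gamma, epsilon = 0.1, 0.9, 0.2

for episode in range(500):
    state = 0
    while state != n_states - 1:
        # epsilon-greedy: mostly exploit what was learned, sometimes explore
        if np.random.rand() < epsilon:
            action = np.random.randint(n_actions)
        else:
            action = int(np.argmax(Q[state]))
        next_state = max(0, state - 1) if action == 0 else min(n_states - 1, state + 1)
        reward = 1.0 if next_state == n_states - 1 else 0.0
        # Q-learning update: nudge Q toward reward + discounted best future value
        Q[state, action] += alpha * (reward + gamma * np.max(Q[next_state]) - Q[state, action])
        state = next_state

print(Q)  # the learned values come to favor "step right" in every state
```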

When choosing an algorithm, we need to consider whether the historical data comes with standard answers (labels), as well as the characteristics and assumptions of each algorithm. In supervised learning we can use linear regression for prediction or logistic regression for classification, while in unsupervised learning we can use the K-means algorithm for automatic clustering. When there is no historical data at all, we can use reinforcement learning to train a model. Most importantly, we need to choose the algorithm that best fits the specific situation.

Machine learning is not limited to a few specific types of problems or application scenarios. When choosing an algorithm, many factors need to be considered, including data volume, model performance, and accuracy. Some practitioners have even developed standard operating procedures (Standard Operating Procedures, SOP) to simplify algorithm selection. However, in higher-level and more complex application scenarios, more advanced methods are needed to improve model performance. The development of machine learning is therefore still being explored and improved.

Humans have long explored how to make machines simulate the intelligence of the human brain. With the development of machine learning, we have come to understand that the brain's intelligence arises mainly from the interconnections among its tens of billions of neurons. Can we use a similar idea to make machines simulate the intelligence of the human brain? This is the question driving current research.

This line of thinking led to the birth of the artificial neural network, a theory that later evolved into deep learning (Deep Learning). It draws on the operating mechanism of neurons in the human brain and simulates it with digital logic in a unit we call the perceptron (Perceptron). A perceptron takes multiple inputs X, multiplies them by weights W, sums them together with a bias (Bias), and passes the result through an activation function that simulates the way brain neurons generate electrical potentials. The output, the degree to which this node is activated, is then passed to the next layer of perceptrons. Since most real-world problems do not have simple linear solutions, we usually choose nonlinear activation functions such as Sigmoid, Tanh, or ReLU.
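
A minimal sketch of a single perceptron's forward computation, assuming a sigmoid activation and made-up input values and weights:

```python
import numpy as np

def sigmoid(z):
    # Nonlinear activation squashing the weighted sum into (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

x = np.array([0.5, -1.2, 3.0])   # inputs X
w = np.array([0.8, 0.1, -0.4])   # weights W
b = 0.2                          # bias

# Weighted sum plus bias, then the activation function
activation = sigmoid(np.dot(w, x) + b)
print(activation)  # how strongly this node "fires"; passed on to the next layer
```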

By connecting many perceptrons together, a deep learning model architecture is formed. To train such a model, we feed the data into the model, perform forward propagation to compute the model's output, and plug the standard answer into a loss function (Loss Function) to measure the difference between the two. We then use an optimization method such as gradient descent to perform backward propagation (backpropagation), adjusting the weights in each perceptron with the goal of reducing that difference. When the amount of data is large enough and the difference between the model's output and the standard answer is small enough, the model has been trained successfully and can be put to use.
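
A sketch of this training loop using PyTorch (one common framework; the article does not prescribe a specific one), with toy XOR data standing in for real training data:

```python
import torch
import torch.nn as nn

# A tiny network of connected perceptrons: 2 inputs -> 16 hidden units -> 1 output
model = nn.Sequential(nn.Linear(2, 16), nn.ReLU(), nn.Linear(16, 1), nn.Sigmoid())
loss_fn = nn.BCELoss()                                     # loss function
optimizer = torch.optim.SGD(model.parameters(), lr=0.5)    # gradient descent

# Toy data: learn XOR; y holds the "standard answers"
X = torch.tensor([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
y = torch.tensor([[0.], [1.], [1.], [0.]])

for epoch in range(5000):
    pred = model(X)              # forward propagation
    loss = loss_fn(pred, y)      # difference from the standard answer
    optimizer.zero_grad()
    loss.backward()              # backward propagation
    optimizer.step()             # adjust the weights in each perceptron

# Should approximate [0, 1, 1, 0] after training (may vary with initialization)
print(model(X).detach().round())
```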

The concept of deep learning sounds simple, but realizing it requires a great deal of data, a great deal of computing power, and easy-to-use software.

It was therefore not until after 2012, when these three conditions were finally met, that deep learning began its explosive growth. In computer vision, a Convolutional Neural Network (CNN) uses small filters to extract features such as the edges and shapes in an image, and then passes these meaningful features to the deep learning model so that it can effectively identify the objects in the image. For imitating images or artistic styles, a Generative Adversarial Network (GAN) pits two deep learning models against each other: a generator (Generator) produces fake data, while a discriminator (Discriminator) judges whether the data is real. When the fake data produced by the generator can no longer be distinguished from the real thing by the discriminator, training has succeeded. This approach achieves excellent results in imitating images and artistic styles.
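
A sketch of a small CNN in PyTorch, assuming 28x28 grayscale inputs (e.g. MNIST-like images) and ten output classes; the layer sizes are arbitrary illustrative choices:

```python
import torch
import torch.nn as nn

cnn = nn.Sequential(
    nn.Conv2d(1, 16, kernel_size=3, padding=1),  # small filters scan the image for edges
    nn.ReLU(),
    nn.MaxPool2d(2),                             # 28x28 -> 14x14
    nn.Conv2d(16, 32, kernel_size=3, padding=1), # deeper filters combine edges into shapes
    nn.ReLU(),
    nn.MaxPool2d(2),                             # 14x14 -> 7x7
    nn.Flatten(),
    nn.Linear(32 * 7 * 7, 10),                   # map the extracted features to 10 classes
)

print(cnn(torch.randn(1, 1, 28, 28)).shape)  # -> torch.Size([1, 10])
```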


Face-swapping apps and AI-generated image transformations are typical applications of these generative techniques. In natural language processing (Natural Language Processing, NLP), recurrent neural networks (Recurrent Neural Network, RNN) are usually used to process sequential data, giving the model a kind of short-term memory over the sequence. The more advanced Long Short-Term Memory (LSTM) network improves on the RNN's ability to retain information over longer spans. The Transformer is a newer and more efficient architecture that uses an attention mechanism (Attention) to let the model focus directly on the key parts of the input. This mechanism is used not only in natural language processing but also delivers good results in computer vision.
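
A sketch of the scaled dot-product attention at the core of the Transformer, written in PyTorch with made-up tensor sizes:

```python
import torch
import torch.nn.functional as F

def attention(q, k, v):
    # Each position scores its similarity to every other position...
    d_k = q.size(-1)
    scores = q @ k.transpose(-2, -1) / d_k ** 0.5
    # ...turns the scores into weights that sum to 1...
    weights = F.softmax(scores, dim=-1)
    # ...and returns a weighted mix of the values, focusing on the key parts
    return weights @ v

seq_len, d_model = 5, 16
x = torch.randn(1, seq_len, d_model)   # a toy sequence of 5 token embeddings
print(attention(x, x, x).shape)        # self-attention keeps the same shape
```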

In 2020, GPT-3, with 175 billion model parameters, was already able to automatically generate articles and program code and answer questions, sometimes at a quality no lower than a human's.

In the future, as the number of model parameters continues to grow exponentially, the practical effect of this type of model will be even more exciting. Beyond the two major fields of computer vision and natural language processing described above, deep learning has also produced amazing results in many other fields.

In 2017, AlphaGo, relying on deep learning and reinforcement learning, defeated the world's top Go player Ke Jie. It is a classic example of artificial intelligence surpassing human ability in a specific domain.

In 2020, DeepMind, the research team behind AlphaGo, made a breakthrough on the long-standing protein-folding problem in biology, which will help us better understand disease mechanisms, accelerate the development of new drugs, support agricultural production, and improve the earth's ecological environment.

Self-driving technology also continues to advance, with accident rates in some settings reported to be lower than those of human drivers. In medicine, AI has reached or exceeded human-level accuracy on certain diagnostic tasks. Applications such as unmanned stores and China's "Skynet" surveillance system are no longer new topics.

Summary

Looking back at Turing's 1950 question, can machines think? We may still be unable to give a definite answer. However, humanity now has far more technological achievements than it did back then and is closer to that dream. Today's artificial intelligence is like a child who keeps learning and growing: it can see, hear, and speak, and on specific problems it can make judgments accurate enough to surpass what humans could do in the past.


However, machines still lag far behind when dealing with philosophical, emotional, ethical, and moral questions. Humans and machines each have their own strengths. Humans are good at thinking and innovating, but our stamina is limited and we sometimes make mistakes. Machines are good at memory and calculation, can answer specific questions stably and efficiently, and can run around the clock without rest. In this wave of artificial intelligence, the ideal strategy is therefore for humans and machines to cooperate fully and play to their respective strengths: humans can hand low-level, repetitive, trivial tasks over to machines, freeing up time and energy to pursue dreams, think about the meaning of life, and focus on solving the important problems, thereby raising the overall level of humanity.
