An introduction to AI: basic concepts and resources

Most of what we call artificial intelligence today actually refers to machine learning. Below I introduce some core concepts and a learning route for machine learning.

1. Artificial intelligence concept

Artificial intelligence, machine learning and deep learning

1. Artificial intelligence
There are many different takes on artificial intelligence online. If you want a deeper introduction, you can read the relevant chapters of an "Introduction to Computer Science" textbook, search Baidu, or look up some AI survey papers. I won't go into depth here; after all, this blog is aimed at beginners.

2. Machine learning
Machine learning is an interdisciplinary field that draws on probability theory, statistics, approximation theory, convex analysis, algorithmic complexity theory, and more. It studies how computers can simulate or implement human learning behavior in order to acquire new knowledge or skills, and how they can reorganize existing knowledge structures to continuously improve their own performance. It is the core of artificial intelligence and the fundamental way to make computers intelligent; its applications span every branch of AI, such as expert systems, automated reasoning, natural language understanding, pattern recognition, computer vision, and intelligent robotics. A classic example is the knowledge-acquisition bottleneck in expert systems, which people have long tried to overcome with machine learning methods.

3. Deep learning
Deep learning refers to multi-layer artificial neural networks and the methods used to train them. Each layer of a neural network takes a large matrix of numbers as input, applies weights followed by a nonlinear activation function, and produces another array of data as output. This resembles the working mechanism of a biological brain: with appropriately sized matrices, many layers are linked together to form a neural-network "brain" that performs precise, complex processing, much as people recognize objects and label pictures.
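To make the "weights plus nonlinear activation" idea concrete, here is a minimal sketch of a single layer's forward pass. The sizes (4 input features, 3 units) and the random values are arbitrary illustration choices, not anything specific to a real network.

```python
import numpy as np

# One neural-network layer: multiply the input by a weight matrix,
# add a bias, then apply a nonlinear activation.
rng = np.random.default_rng(0)

x = rng.normal(size=(1, 4))   # one input sample with 4 features
W = rng.normal(size=(4, 3))   # weights connecting 4 inputs to 3 units
b = np.zeros(3)               # one bias per unit

def relu(z):
    """Nonlinear activation: keep positives, zero out negatives."""
    return np.maximum(0.0, z)

h = relu(x @ W + b)           # the layer's output, shape (1, 3)
print(h.shape)
```

Stacking many such layers, with the output of one becoming the input of the next, is what makes the network "deep".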

Deep learning is a newer field that grew out of artificial neural networks within machine learning. Originally, "deep" simply meant a neural network with more than one hidden layer. With the field's rapid development, however, its scope has grown beyond traditional multi-layer neural networks, and even beyond machine learning, moving quickly toward artificial intelligence in general.

As I said above, this post is meant for beginners, so apologies that the tone shifted so abruptly there.

There is also a highly upvoted article on the differences and connections between the three. It is really well written; if you are interested, check it out: https://www.zhihu.com/question/57770020

Pattern recognition

1. What is pattern recognition?
Pattern recognition is a basic human intelligence; people perform "pattern recognition" constantly in daily life. With the advent of computers in the 1940s and the rise of artificial intelligence in the 1950s, people naturally hoped to use computers to replace or extend part of the human brain's work. (Computer-based) pattern recognition developed rapidly in the early 1960s and became a new discipline.

Pattern recognition refers to the process of processing and analyzing the various forms of information (numerical, textual, and logical) that characterize things or phenomena, in order to describe, identify, classify, and explain them. It is an important part of information science and artificial intelligence.

Pattern recognition studies how computers can automatically process and interpret patterns using mathematical techniques; the environment and objects are collectively called "patterns". With the development of computer technology, it has become possible to study complex information-processing processes, an important form of which is the recognition of the environment and objects by living organisms.

The main research directions of pattern recognition are image processing and computer vision, speech and language information processing, brain networks, and brain-inspired intelligence, studying both the mechanisms of human pattern recognition and effective computational methods for it.

I also recommend a blog here that introduces pattern recognition very vividly: https://blog.csdn.net/eternity1118_/article/details/51105659

2. Common pattern recognition algorithms
I won't introduce them here; after all, this blog only covers basic knowledge.
Another blog is recommended instead: https://blog.csdn.net/scyscyao/article/details/5987581?utm_medium=distribute.pc_relevant.none-task-blog-BlogCommendFromMachineLearnPai2-1.nonecase&depth_1-utm_source=distribute.pc_relevant.none-task-blog-BlogCommendFromMachineLearnPai2-1.nonecase

Prediction task

In layman's terms, a prediction task obtains unknown information from known information. For example, a bank can use a customer's personal credit information to evaluate the risk of a personal loan.

In a prediction task, we want the model to be as accurate as possible. As a consequence, the prediction model f may well be a black-box model: one whose internals cannot be well explained or are unclear, where we care mainly about the inputs and outputs rather than examining the internal structure. As long as prediction accuracy improves, the goal is met. Neural network models are generally considered black-box models. For example, a self-learning neural network system developed by the Google X lab some years ago could find photos of kittens among 10 million pictures: the input is those millions of pictures, and the output is the recognition results for them.
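The black-box idea can be shown with a toy version of the credit example above. The function below is entirely made up (the feature names, weights, and threshold are invented for illustration); the point is that a caller only sees inputs and outputs, never the internal rule.

```python
# A stand-in "black box" for a prediction task: known inputs (income,
# existing debt) map to an unknown output (approve / reject).
# The weights 0.7, 1.2 and the threshold 30 are invented for illustration.

def credit_model(income, debt):
    """We only care about what goes in and what comes out."""
    score = 0.7 * income - 1.2 * debt
    return "approve" if score > 30 else "reject"

print(credit_model(income=80, debt=10))  # approve
print(credit_model(income=20, debt=15))  # reject
```

A real model would learn its internal rule from data instead of having it hard-coded, but the interface, input features in and a prediction out, is the same.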

There is also the related notion of a control task, which I won't introduce here. This Zhihu answer covers it well: https://www.zhihu.com/question/45536799

Recommendation algorithm

https://blog.csdn.net/App_12062011/article/details/85414969

Distributed computing

Distributed computing is a research direction in computer science. It studies how to divide a problem that requires enormous computing power into many small parts, distribute those parts to multiple computers for processing, and finally combine the partial results into the final answer. Distributed network storage stores data across multiple independent machines and devices. A distributed storage system adopts a scalable architecture, uses multiple storage servers to share the storage load, and uses location servers to locate stored information. This not only removes the single-storage-server bottleneck of a traditional centralized system, but also improves the system's reliability, availability, and scalability.

Put simply, the data is spread across multiple servers for training, so that the combined computing power is enough for the job.

Fault tolerance

Fault tolerance is the ability of a system to keep operating normally when some of its components (one or more) fail.
Any individual server may suffer accidental data loss, yet the system as a whole can still keep running.

Collaborative filtering

Simply put, collaborative filtering uses the preferences of a group with similar interests and shared experience to recommend information a user will find interesting. Individuals respond to information through a cooperative mechanism (for example by giving a rating), and these responses are recorded to filter information and help others filter it in turn. The responses need not be limited to what is especially interesting; records of what is especially uninteresting are also very important.
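A minimal user-based collaborative-filtering sketch: find the user most similar to the target (by cosine similarity over their ratings), then recommend items that neighbor liked but the target has not rated. The users, items, and ratings below are all made up for illustration (0 means "not rated").

```python
import math

ratings = {
    "alice": [5, 3, 0, 1],
    "bob":   [4, 0, 0, 1],
    "carol": [1, 1, 5, 4],
}
items = ["item_a", "item_b", "item_c", "item_d"]

def cosine(u, v):
    """Cosine similarity between two rating vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

target = "bob"
# The most similar other user is treated as the "group" whose taste we borrow.
neighbor = max((u for u in ratings if u != target),
               key=lambda u: cosine(ratings[target], ratings[u]))

# Recommend items the neighbor rated highly that the target has not rated.
recs = [items[i] for i, r in enumerate(ratings[neighbor])
        if r >= 3 and ratings[target][i] == 0]
print(neighbor, recs)
```

Real systems use many neighbors, weighting, and far sparser matrices, but the core idea, similar users predict each other's preferences, is exactly this.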

2. Data acquisition and feature engineering

Data sets

Training set: used to fit the model; the classifier's parameters are learned from it.
Validation set: used to tune model hyperparameters and select the best model.
Test set: used only for the performance evaluation of the final trained model.
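The three-way split above can be sketched as follows. The 70/15/15 proportions are a common convention, not a fixed rule, and the "data" here is just a list of indices standing in for labeled samples.

```python
import random

data = list(range(100))          # stand-in for 100 labeled samples
random.seed(42)                  # fixed seed so the split is reproducible
random.shuffle(data)             # shuffle before splitting

n = len(data)
train = data[: int(0.7 * n)]             # fit model parameters here
val = data[int(0.7 * n): int(0.85 * n)]  # tune hyperparameters here
test = data[int(0.85 * n):]              # final evaluation only
print(len(train), len(val), len(test))   # 70 15 15
```

The key discipline is that the test set is touched exactly once, at the end; reusing it to pick hyperparameters quietly turns it into a second validation set.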

Discrete data, continuous data

Discrete variables are variables whose values can only be counted in natural-number or integer units, for example the number of companies, employees, or machines. Such values come in whole measurement units and are generally obtained by counting.

Variables that can take any value within some interval are called continuous variables. Their values are continuous: the gap between any two adjacent values can be divided infinitely, yielding infinitely many possible values. For example, the dimensions of a manufactured part, and a person's height, weight, or chest circumference are continuous variables; their values can only be obtained by measurement.

Features, feature vectors, samples, labels

A feature is the abstract result of a characteristic of an object or set of objects. Features are used to describe concepts: any object or group of objects has many characteristics, and people abstract a concept from the characteristics the objects share; those shared characteristics become the concept's features. (In mathematics, "feature" also generalizes the classical characteristic function on a local domain.)
Simply put: a feature is an input variable, i.e. the x variable in simple linear regression.

Feature vector:
https://blog.csdn.net/woainishifu/article/details/76418176

A label is what we want to predict: the y variable in simple linear regression. A label could be the future price of wheat, the animal species shown in a picture, the meaning of an audio clip, or almost anything else.

Sample: a sample is the subset of individuals actually observed or surveyed, while the population is the entirety of the research objects.
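Tying these terms together with a concrete shape: each row of a feature matrix X is one sample, each column is one feature, and y holds one label per sample. The numbers below are invented for illustration.

```python
X = [
    [5.1, 3.5],   # sample 0: two feature values
    [4.9, 3.0],   # sample 1
    [6.2, 3.4],   # sample 2
]
y = [0, 0, 1]     # one label per sample

n_samples = len(X)       # rows = samples
n_features = len(X[0])   # columns = features
print(n_samples, n_features)  # 3 2
```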

Recursion, iteration, parallelism

Recursion refers to a function repeatedly calling itself, for example a program that outputs the famous Fibonacci sequence. (Recursion comes in forms such as linear recursion and tail recursion.)

Iteration refers to visiting each item of a collection one by one in some order, for example with a for statement.
Iteration applies only to collections, lists, arrays, and the like; you cannot iterate over executable code itself.

Parallelism: executing work simultaneously, for example with multiple threads.

PCA, dimensionality reduction

PCA (principal component analysis) is a common data-analysis method, often used to reduce the dimensionality of high-dimensional data; it extracts the main feature components of the data.

https://blog.csdn.net/zouxiaolv/article/details/100590725
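A minimal PCA sketch, following the standard recipe: center the data, take the eigenvectors of the covariance matrix, and project onto the top components. Reducing random 3-dimensional data to 2 dimensions is an arbitrary illustration choice.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))                # 50 samples, 3 features

X_centered = X - X.mean(axis=0)             # PCA assumes centered data
cov = np.cov(X_centered, rowvar=False)      # 3x3 covariance matrix

eigvals, eigvecs = np.linalg.eigh(cov)      # eigh: covariance is symmetric
order = np.argsort(eigvals)[::-1]           # sort components by variance
components = eigvecs[:, order[:2]]          # keep the top-2 principal axes

X_reduced = X_centered @ components         # project down to 2 dimensions
print(X_reduced.shape)  # (50, 2)
```

In practice you would use a library implementation (e.g. scikit-learn's PCA), which also handles choosing the number of components by explained variance.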

Distribution, topic distribution, tail distribution

LDA for modeling the topic distribution of documents:
https://www.jianshu.com/p/67ec0762e07a

Long-tailed distribution: loosely, a distribution whose tail decays much more slowly than a normal distribution's, so that a large share of occurrences lies far from the "head". (The shorthand that "everything except the normal distribution is long-tailed" is an oversimplification.)

Feature selection

The main purposes of feature selection: dimensionality reduction, reducing the difficulty of the learning task, and improving the efficiency of the model.

https://blog.csdn.net/hren_ron/article/details/80914491

Common method:
https://blog.csdn.net/qq_33876194/article/details/88403394
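One of the simplest feature-selection methods is a variance threshold: drop features that barely vary, since they carry little information for distinguishing samples. The threshold of 0.1 and the toy data below are illustration choices.

```python
def variance(values):
    """Population variance of a list of numbers."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

X = [
    [1.0, 0.0, 3.1],
    [1.0, 0.1, 2.9],
    [1.0, 0.0, 7.5],
    [1.0, 0.1, 4.2],
]

threshold = 0.1
columns = list(zip(*X))          # transpose: one tuple per feature
keep = [i for i, col in enumerate(columns) if variance(col) > threshold]
print(keep)  # [2] -- only the third feature varies enough to keep
```

More sophisticated methods (filter, wrapper, and embedded approaches) are covered in the links above; the variance threshold is just the cheapest first pass.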

There are also vectors, scalars, polynomials, normalization, positive and negative samples, category sets, items to be classified, sequences, vector sequences, independent and dependent variables, word segmentation, feature extraction, and so on, which I won't introduce one by one here.

3. Model training

Gradient descent

Gradient descent is an iterative method that can be used to solve least-squares problems (both linear and nonlinear). When solving for the model parameters of a machine learning algorithm, i.e. an unconstrained optimization problem, gradient descent is one of the most commonly used approaches; another common one is the least-squares method. To minimize a loss function, gradient descent solves step by step, arriving at the minimized loss and the corresponding parameter values. Conversely, to maximize a function, gradient ascent is used instead. In machine learning, two variants have been developed on top of basic gradient descent: stochastic gradient descent and batch gradient descent.
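The step-by-step idea can be shown on the simplest possible case: fitting y = w*x + b by repeatedly moving the parameters opposite the gradient of the mean squared error. The data below is generated exactly from w=2, b=1, and the learning rate and iteration count are illustration choices.

```python
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]      # y = 2*x + 1 exactly

w, b = 0.0, 0.0                      # start from arbitrary parameters
lr = 0.02                            # learning rate (step size)

for _ in range(5000):
    n = len(xs)
    # Gradients of the mean squared error with respect to w and b
    grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
    grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
    w -= lr * grad_w                 # step against the gradient
    b -= lr * grad_b

print(round(w, 3), round(b, 3))  # converges close to 2.0 and 1.0
```

This is batch gradient descent (every step uses the whole dataset); stochastic gradient descent instead updates on one sample (or a small mini-batch) at a time, which scales much better to large datasets.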

Representation learning

Representation learning converts raw data into a form that machine learning can work with effectively. It avoids the trouble of manually engineering features, letting the computer learn how to use features while also learning how to extract them: learning how to learn.

Supervised and unsupervised learning

https://blog.csdn.net/GodDavide/article/details/102677973

BP neural network

Back-propagation neural network
https://blog.csdn.net/robert_chen1988/article/details/99237827

Overfitting, underfitting

Overfitting: high accuracy on the training set but markedly low accuracy on the test set. Underfitting: poor performance on both the training set and the test set.

This post has only introduced some of the more common basic terms in machine learning.

Writing it served both as a summary and as a plan for my own learning route.

to be continued…


Originally published at: blog.csdn.net/weixin_45696161/article/details/106560612