[Study Notes, Interview Prep] Key points of the Machine Learning "Xigua book" with reference answers to the after-class exercises - Chapter 1


Chapter 1 Introduction

1.1 Introduction

Machine learning: using computational means to improve the system's own performance by making use of experience (data).

Learning algorithm: An algorithm that generates a "model" from data on a computer.

Model/Learner: the result learned from the data; an instantiation of a learning algorithm given data and a parameter space.

1.2 Basic terminology

Data set: A collection of data records.

Sample/instance: a data record.

Attribute/Feature: an item that reflects an event's or object's performance or nature in some respect.

Attribute value: The value of the attribute.

Attribute space/sample space/input space: the space spanned by the attributes.

A concrete example of "spanning": two linearly independent vectors in the plane span the entire plane. Suppose we have two vectors $\boldsymbol{v_1}=[1,2]^\top$ and $\boldsymbol{v_2}=[3,4]^\top$; we can obtain a vector in the plane by linearly combining them:

$$\boldsymbol{v_3}=a\boldsymbol{v_1}+b\boldsymbol{v_2}=a\begin{bmatrix} 1 \\ 2 \end{bmatrix}+b\begin{bmatrix} 3 \\ 4 \end{bmatrix}$$

where $a,b$ are real coefficients. Here $\boldsymbol{v_1}$ and $\boldsymbol{v_2}$ span a plane, and $\boldsymbol{v_3}$, being a linear combination of the two, also lies in this plane. By choosing suitable $a$ and $b$, $\boldsymbol{v_3}$ can be placed at any point of the plane, thereby "spanning" the entire plane.
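As a quick numerical check, here is a minimal NumPy sketch (the target point is an arbitrary choice) that solves for the coefficients $a,b$ reaching a given point:

```python
import numpy as np

# Columns are the spanning vectors v1 = [1, 2] and v2 = [3, 4].
V = np.array([[1.0, 3.0],
              [2.0, 4.0]])

target = np.array([5.0, -7.0])  # an arbitrary point in the plane

# v1 and v2 are linearly independent, so V is invertible and the
# coefficients [a, b] are unique.
a, b = np.linalg.solve(V, target)
print(a, b)                        # -> -20.5 8.5
print(a * V[:, 0] + b * V[:, 1])   # reconstructs the target: [ 5. -7.]
```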

Dimensionality: the number of attributes that describe each example.

Learning/training: The process of learning a model from data.

Training data: Data used in the training process.

Training sample: A sample of training data.

Training set: A collection of training samples.

Hypothesis: the learned model, since it corresponds to some underlying rule about the data.

For example, if we use a linear regression model to predict house prices, our hypothesis might be a linear relationship between the price and various attributes of the house (e.g., size, floor, location).
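As an illustration, a minimal sketch of such a hypothesis (synthetic data; the feature names and coefficients are made up, and NumPy least squares stands in for a full learning algorithm):

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic houses: columns are [size_m2, floor] (hypothetical features).
X = rng.uniform([40, 1], [160, 20], size=(100, 2))

# An assumed "ground truth" rule, used only to generate toy labels.
true_w, true_b = np.array([3.0, 0.5]), 10.0
y = X @ true_w + true_b + rng.normal(0, 2.0, size=100)  # noisy prices

# Fit the linear hypothesis h(x) = w1*size + w2*floor + b by least squares.
A = np.hstack([X, np.ones((100, 1))])   # append a bias column
w1, w2, b = np.linalg.lstsq(A, y, rcond=None)[0]
print(f"learned hypothesis: price ~ {w1:.2f}*size + {w2:.2f}*floor + {b:.2f}")
```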

Ground-truth: the true underlying rule of the data.

Prediction: the result the model infers for a sample.

(Results for training samples count as well.)

Label: information about the outcome of a sample.

Example: a sample together with its label information.

(The word "example" here carries the sense of "demonstrating" the outcome.)

Label space/output space: the space formed by labels.

Classification: Predicting discrete values.

Regression: Predicting continuous values.

Binary classification: classification involving exactly two classes.

Positive class: one of the two classes in binary classification.

Negative class: the other class in binary classification.

Multi-class classification: classification involving more than two classes.

Testing: Use the learned model to predict.

Testing sample: The sample to be predicted.

Clustering: Divide the training set into several groups.

Cluster: one of the groups obtained by clustering.
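For a concrete picture, a minimal clustering sketch (scikit-learn's KMeans on synthetic 2-D data; the blobs and the choice of k=2 are illustrative assumptions):

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

# Two synthetic blobs of unlabeled samples.
X = np.vstack([rng.normal([0, 0], 0.5, size=(50, 2)),
               rng.normal([3, 3], 0.5, size=(50, 2))])

# Divide the training set into k=2 groups; no label information is used.
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print(km.labels_[:5], km.labels_[-5:])  # cluster index of each sample
print(km.cluster_centers_)              # centers near (0, 0) and (3, 3)
```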

Supervised learning: learning from training data that carry label information.

Unsupervised learning: learning from training data without label information.

Generalization: The ability of a learned model to apply to new samples.

Distribution: the probabilities with which a random variable takes each of its possible values.

Independent and identically distributed (i.i.d.): every sample is drawn independently from the same distribution.

(Each sample is obtained in a single draw, without reference to any other sample.)
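A minimal sketch of i.i.d. sampling (the Gaussian is an arbitrary stand-in for the unknown data distribution):

```python
import numpy as np

rng = np.random.default_rng(0)

# Every sample comes from the same distribution N(0, 1), and each
# draw is independent of all the others.
samples = rng.normal(loc=0.0, scale=1.0, size=1000)

# With i.i.d. samples, sample statistics approach the distribution's
# parameters as the sample size grows.
print(samples.mean(), samples.std())  # close to 0 and 1
```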

1.3 Hypothesis space

Generalization: From the specific to the general.

Induction: The process of generalization.

Specialization: from the general to the specific.

Deduction: The process of specialization.

Inductive learning: learning by induction; in the broad sense it roughly amounts to "learning from examples".

Concept: inductive learning in the narrow sense requires learning a concept, i.e., a rule, from the training data.

Example: a rule of the form "a good melon is one that XXX".

Fitting: the process of matching the model to the training data.

Hypothesis space: the set of all hypotheses.

It can simply be whatever set of hypotheses people choose to consider.

Version space: The set of all hypotheses that are consistent with the training data.

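To make the version space concrete, a minimal sketch (a toy watermelon-style data set; the attributes, values, and the "*" wildcard hypotheses loosely follow the book's setup and are otherwise made up):

```python
from itertools import product

# Hypothetical attributes; '*' is a wildcard matching any value.
values = {
    "color": ["green", "black", "*"],
    "root":  ["curly", "straight", "*"],
    "sound": ["dull", "crisp", "*"],
}

# Toy training set: (sample, is_good_melon).
train = [
    (("green", "curly", "dull"), True),
    (("black", "straight", "crisp"), False),
]

def predicts_good(h, x):
    """A hypothesis marks x as good iff every non-wildcard field matches."""
    return all(hv in ("*", xv) for hv, xv in zip(h, x))

# Hypothesis space: every combination of attribute values and wildcards.
space = list(product(values["color"], values["root"], values["sound"]))

# Version space: the hypotheses consistent with all training samples.
version_space = [h for h in space
                 if all(predicts_good(h, x) == y for x, y in train)]
print(len(space), "hypotheses;", len(version_space), "in the version space")
for h in version_space:
    print(h)
```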

1.4 Inductive bias

Inductive bias: The preference of a machine learning algorithm for certain types of hypotheses during the learning process.


Occam's razor principle: If there are multiple hypotheses that are consistent with observations, choose the simplest one.

No Free Lunch (NFL) theorem: averaged over all possible problems (target functions), the expected performance of every learning algorithm is the same.
(Hence the inductive bias of a learning algorithm must match the concrete problem at hand.)


Formula (1.1), the off-training-set error of algorithm $\mathfrak{L}_a$ given training data $X$ and true target function $f$, where $\mathcal{X}$ is the sample space and $P(h|X,\mathfrak{L}_a)$ is the probability that $\mathfrak{L}_a$ produces hypothesis $h$ from $X$:

$$E_{ote}(\mathfrak{L}_a|X,f)=\sum_h\sum_{\boldsymbol{x}\in\mathcal{X}-X}P(\boldsymbol{x})\,\mathbb{I}(h(\boldsymbol{x})\neq f(\boldsymbol{x}))\,P(h|X,\mathfrak{L}_a)$$

Understanding: Refer to this article.
Intuitively, Eq. (1.1) computes an expectation: it averages the error indicator $\mathbb{I}(h(\boldsymbol{x})\neq f(\boldsymbol{x}))$ over the off-training-set samples $\boldsymbol{x}$ and over the hypotheses $h$ the algorithm may produce.

Formula (1.2): for binary classification with the true target function drawn uniformly from all functions $\mathcal{X}\mapsto\{0,1\}$, summing Eq. (1.1) over all $f$ gives

$$\sum_f E_{ote}(\mathfrak{L}_a|X,f)=2^{|\mathcal{X}|-1}\sum_{\boldsymbol{x}\in\mathcal{X}-X}P(\boldsymbol{x})\cdot 1$$

which is independent of the algorithm $\mathfrak{L}_a$.

Understanding: Refer to the Pumpkin Book.
Intuitively, when the true target function is uniformly random, any algorithm amounts to guessing: on each off-training-set sample, exactly half of the possible target functions disagree with whatever the hypothesis predicts.
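This intuition is easy to simulate from scratch (a toy sketch: over all $2^{|\mathcal{X}|}$ binary target functions on a small sample space, any fixed hypothesis is wrong half the time on average):

```python
from itertools import product

# A tiny discrete sample space of 4 off-training-set points.
n = 4
targets = list(product([0, 1], repeat=n))  # all 2^n target functions f

# Any learned hypothesis h is, in the end, a fixed 0/1 labeling.
for h in [(0, 0, 0, 0), (1, 0, 1, 0), (1, 1, 1, 1)]:
    # Average per-sample error of h over every possible target function.
    err = sum(sum(hx != fx for hx, fx in zip(h, f)) / n for f in targets)
    print(h, "->", err / len(targets))     # always 0.5, regardless of h
```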

1.5 Development History

1.6 Application status

1.7 Reading materials

Exercises

For reference answers, see here.


Source: blog.csdn.net/qq_51945248/article/details/129860339