CHANG machine learning: first task check-in

What is machine learning:
Machine learning is a data analysis technique that teaches computers to do what comes naturally to humans and animals. At the same time, machine learning is a means of achieving artificial intelligence, and it can be approximated as looking for a function from data.
Machine learning is divided into three steps:
Step 1: Define a set of functions
Step 2: Evaluate the goodness of each function
Step 3: Pick the best function
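The three steps can be sketched in a few lines of Python. This is only a minimal illustration under assumptions: the toy data, the candidate slopes, and the helper names `make_model` and `loss` are made up for the example and do not come from the original notes.

```python
# Hypothetical toy data, roughly following y = 2x (an assumption for illustration).
data = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2), (4.0, 7.8)]

# Step 1: define a set of functions. Here: f_w(x) = w * x for a few candidate slopes w.
candidate_slopes = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0]

def make_model(w):
    """Return the function f_w(x) = w * x."""
    return lambda x: w * x

# Step 2: goodness of a function, measured here as mean squared error on the data.
def loss(model, points):
    return sum((model(x) - y) ** 2 for x, y in points) / len(points)

# Step 3: pick the best function, i.e. the candidate with the smallest loss.
best_slope = min(candidate_slopes, key=lambda w: loss(make_model(w), data))
print("best slope:", best_slope)
```

In practice the set of functions is far larger than six candidates, which is why a search procedure such as gradient descent is normally used for Step 3.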
Central limit theorem
The central limit theorem studies the conditions under which the limiting distribution of a sum of random variables is a normal distribution.

Central limit theorem: given a population with mean μ and variance σ^2, draw any sample of size n from it. When n is sufficiently large, the sampling distribution of the sample mean is approximately a normal distribution with mean μ and variance σ^2/n.
The central limit theorem can be summarized in the following two sentences:

1) The mean of any one sample will be approximately equal to the mean of the population it is drawn from.

2) No matter what distribution the population follows, the sample means will cluster around the population mean and be approximately normally distributed.
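Both statements can be checked with a small simulation. The sketch below assumes an exponential population with rate 1 (so μ = 1 and σ^2 = 1), chosen only because it is clearly not normal; the sample size and number of trials are also arbitrary choices for the example.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

# Assumed population: exponential with rate 1, so mu = 1 and sigma^2 = 1.
mu, sigma2 = 1.0, 1.0
n = 50          # size of each sample
trials = 5000   # number of samples drawn

# Mean of each of the `trials` samples.
sample_means = [
    statistics.fmean([random.expovariate(1.0) for _ in range(n)])
    for _ in range(trials)
]

mean_of_means = statistics.fmean(sample_means)
var_of_means = statistics.pvariance(sample_means)

# For a normal distribution, about 68% of values lie within one standard deviation.
sd = var_of_means ** 0.5
within_one_sd = sum(abs(m - mean_of_means) <= sd for m in sample_means) / trials

print("mean of sample means:", mean_of_means, "(theory:", mu, ")")
print("variance of sample means:", var_of_means, "(theory:", sigma2 / n, ")")
print("fraction within one sd:", within_one_sd)
```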
The role of the central limit theorem

1) When there is no way to obtain all the data of a population, we can use a sample to estimate the population
If we know the mean and standard deviation of a properly drawn sample, we can use them to estimate the mean and standard deviation of the population.
For example, suppose you are an official in Beijing's Xicheng District and want to assess the teaching quality of the schools in the district.
You do not trust each school's own exam results, so you test a sample from each school: 100 randomly selected students per school sit a common exam. As the official in charge of education, do you think it is feasible to judge the teaching quality of a whole school from the performance of only 100 students?
The answer is yes. The central limit theorem tells us that a properly drawn sample does not differ much from the population it represents. That is, the sample result (the test scores of 100 randomly selected students) reflects the whole population well (the test performance of all students at the school).
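The school example can be imitated with made-up numbers. In the sketch below, the school size of 2000 and the score distribution are assumptions for illustration, not data from any real school.

```python
import random
import statistics

random.seed(1)  # fixed seed so the run is repeatable

# Hypothetical school: 2000 students with made-up scores clipped to [0, 100].
population = [min(100.0, max(0.0, random.gauss(72, 12))) for _ in range(2000)]

# Randomly select 100 students, as in the example above.
sample = random.sample(population, 100)

population_mean = statistics.fmean(population)
sample_mean = statistics.fmean(sample)

# Standard error of the sample mean, estimated from the sample itself.
standard_error = statistics.stdev(sample) / len(sample) ** 0.5

print("population mean:", round(population_mean, 2))
print("sample mean:", round(sample_mean, 2), "+/-", round(2 * standard_error, 2))
```

The standard error shrinks with the square root of the sample size, so the estimate tightens only slowly as more students are tested.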
2) Using the population mean and standard deviation to judge whether a sample belongs to the population
If we have specific information about a population, as well as data from a sample, we can infer whether the sample was drawn from that population.
Using the normal distribution given by the central limit theorem, we can calculate how probable it is that the sample belongs to the population. If the probability is very low, we can say with confidence that the sample does not belong to that population. This is also the principle behind statistical hypothesis testing.
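A minimal sketch of this calculation, assuming the population mean and standard deviation are known and the sample is large enough for the normal approximation to hold. The function names and the numbers are hypothetical.

```python
import math

def normal_cdf(z):
    """Cumulative distribution function of the standard normal N(0, 1)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def two_sided_p_value(sample_mean, n, population_mean, population_sd):
    """Probability of seeing a sample mean at least this far from the population mean."""
    standard_error = population_sd / math.sqrt(n)
    z = (sample_mean - population_mean) / standard_error
    return 2.0 * (1.0 - normal_cdf(abs(z)))

# Hypothetical numbers: population mean 70, standard deviation 10, sample size 100.
p_close = two_sided_p_value(71.0, 100, 70.0, 10.0)  # one standard error away
p_far = two_sided_p_value(75.0, 100, 70.0, 10.0)    # five standard errors away

print("p-value for sample mean 71:", p_close)
print("p-value for sample mean 75:", p_far)
```

A sample mean of 71 is unremarkable for this population, while a sample mean of 75 is so improbable that the sample most likely comes from a different population.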
Theorem expression
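For independent, identically distributed random variables X_1, ..., X_n with mean μ and variance σ^2, and Φ the standard normal distribution function:
$$\lim_{n \to \infty} P\left( \frac{\sum_{i=1}^{n} X_i - n\mu}{\sigma \sqrt{n}} \le x \right) = \Phi(x)$$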
Normal distribution

Normal:
The normal distribution is also known as the Gaussian distribution. When a random variable X follows a Gaussian distribution with mathematical expectation μ and variance σ^2, this is written N(μ, σ^2). The expected value μ determines the location of the probability density function, and the standard deviation σ determines how spread out the distribution is.
When μ = 0 and σ = 1, the normal distribution becomes the standard normal distribution N(0,1). The probability density function is:
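$$f(x) = \frac{1}{\sigma \sqrt{2\pi}} \exp\left( -\frac{(x - \mu)^2}{2\sigma^2} \right)$$
which for μ = 0, σ = 1 reduces to
$$f(x) = \frac{1}{\sqrt{2\pi}} \exp\left( -\frac{x^2}{2} \right)$$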

The characteristics of the normal density function are: it is symmetric about μ, reaches its maximum at μ, tends to 0 at positive and negative infinity, and has inflection points at μ ± σ. It is high in the middle and low on both sides, so its graph is a bell-shaped curve above the x-axis.
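These properties can be checked numerically. The sketch below implements the standard normal density formula; the parameter values and the helper `second_derivative` are arbitrary choices for the example.

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of N(mu, sigma^2) at x."""
    coeff = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    return coeff * math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2))

def second_derivative(f, x, h=1e-3):
    """Central finite-difference estimate of f''(x)."""
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / h ** 2

mu, sigma = 2.0, 1.5  # arbitrary example parameters

def density(x):
    return normal_pdf(x, mu, sigma)

# Symmetric about mu: equal heights at equal distances on either side.
print(density(mu - 1.0), density(mu + 1.0))
# Maximum at mu: any other point is lower.
print(density(mu) > density(mu + 0.1))
# Tends to 0 far away from mu.
print(density(mu + 10 * sigma))
# Inflection point at mu + sigma: the curvature changes sign there.
print(second_derivative(density, mu + sigma - 0.1) < 0 < second_derivative(density, mu + sigma + 0.1))
```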

Maximum likelihood estimation
Maximum likelihood estimation (MLE) was proposed by Gauss in 1821 for the normal distribution.
The general procedure for solving a maximum likelihood estimation problem:

(1) Write down the likelihood function;

(2) Take the logarithm of the likelihood function and simplify;

(3) Take the derivative;

(4) Solve the likelihood equation.
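As a worked instance of these four steps, consider a sample from a normal distribution. Setting the derivatives of the log-likelihood to zero gives the well-known closed-form estimates: the sample mean for μ, and the mean squared deviation (dividing by n) for σ^2. The sketch below uses synthetic data; the true parameters 5 and 2 are assumptions of the example.

```python
import math
import random

random.seed(2)  # fixed seed so the run is repeatable

# Synthetic data drawn from N(5, 2^2); these true parameters are assumed for the example.
data = [random.gauss(5.0, 2.0) for _ in range(1000)]

def log_likelihood(mu, sigma2, xs):
    """Steps 1 and 2: logarithm of the product of normal densities over the sample."""
    n = len(xs)
    return (-0.5 * n * math.log(2.0 * math.pi * sigma2)
            - sum((x - mu) ** 2 for x in xs) / (2.0 * sigma2))

# Steps 3 and 4: setting the partial derivatives to zero gives closed-form solutions.
n = len(data)
mu_hat = sum(data) / n
sigma2_hat = sum((x - mu_hat) ** 2 for x in data) / n  # divides by n, not n - 1

print("mu_hat:", mu_hat, "sigma2_hat:", sigma2_hat)

# Sanity check: nearby parameter values give a lower log-likelihood.
best = log_likelihood(mu_hat, sigma2_hat, data)
print(best > log_likelihood(mu_hat + 0.1, sigma2_hat, data))
print(best > log_likelihood(mu_hat, sigma2_hat * 1.1, data))
```

Because the variance estimate divides by n rather than n - 1, it is slightly biased for small samples, although the bias vanishes as n grows.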
Features of maximum likelihood estimation:

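    1. It is simpler than other estimation methods;

    2. Convergence: it is unbiased or asymptotically unbiased, and its convergence improves as the number of samples increases;

    3. If the assumed class-conditional probability model is correct, it usually gives good results. If the assumed model is wrong, however, the estimates can be very poor.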

Local optimum and global optimum:
A local optimum is the best solution only within some neighbourhood. It is not necessarily the best solution overall and may carry a large error, whereas the global optimum is usually the solution we actually want.
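The difference is easy to see on a function with two valleys. The sketch below scans a grid to find every local minimum and then picks the lowest one; the function `f` and the interval [-2, 2] are chosen only for illustration.

```python
def f(x):
    """Example function with two valleys (chosen only for illustration)."""
    return x ** 4 - 3 * x ** 2 + x

# Evaluate f on a grid over [-2, 2] with step 0.01.
xs = [i / 100 for i in range(-200, 201)]
ys = [f(x) for x in xs]

# A grid point is a local minimum if it is lower than both of its neighbours.
local_minima = [
    (xs[i], ys[i])
    for i in range(1, len(xs) - 1)
    if ys[i] < ys[i - 1] and ys[i] < ys[i + 1]
]

# The global minimum is the lowest of the local minima.
global_minimum = min(local_minima, key=lambda point: point[1])

print("local minima:", local_minima)
print("global minimum:", global_minimum)
```

A search that only moves downhill from a starting point on the right-hand side would settle in the shallower valley and miss the global minimum on the left.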
Gradient descent
https://blog.csdn.net/pengchengliu/article/details/80932232
Code Example:
https://blog.csdn.net/huahuazhu/article/details/73385362
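For readers who do not follow the links, here is a minimal gradient descent sketch. It is not taken from the linked posts: the objective f(x) = (x - 3)^2, the learning rate, and the step count are all assumptions made for the example.

```python
def gradient_descent(grad, x0, learning_rate=0.1, steps=100):
    """Repeatedly step against the gradient, starting from x0."""
    x = x0
    for _ in range(steps):
        x = x - learning_rate * grad(x)
    return x

def grad_f(x):
    """Gradient of the example objective f(x) = (x - 3)^2."""
    return 2.0 * (x - 3.0)

x_min = gradient_descent(grad_f, x0=0.0)
print("estimated minimiser:", x_min)
```

On this convex objective every starting point leads to the same minimum near x = 3. On a function with several valleys, the result depends on the starting point and may be only a local optimum.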
Reference documents:
1. Central limit theorem, https://www.zhihu.com/question/22913867/answer/250046834 .
2. Maximum likelihood estimation, https://blog.csdn.net/zengxiantao1994/article/details/72787849 .
3. Loss function, https://blog.csdn.net/wjlucc/article/details/71095206 .

Origin blog.csdn.net/JiangLongShen/article/details/90106303