Linear models are among the simplest and most basic models in machine learning, and are commonly used for classification, regression, and other learning tasks.

The difference between regression and classification:

- Regression: the predicted value is a continuous real number;
- Classification: the predicted value is a discrete class label.

1. Linear regression: a linear model applied to a regression task. The usual loss function is the mean squared error (MSE), and the goal is to minimize it. For $m$ training samples $(x_i, y_i)$ and the model $f(x) = w^\top x + b$, the mean squared error is:

$$E(w, b) = \frac{1}{m}\sum_{i=1}^{m}\bigl(f(x_i) - y_i\bigr)^2$$

**Solving for the model by minimizing the mean squared error is called the least squares method.**

The idea of least squares: find a hyperplane such that the sum of squared distances from all training samples to the hyperplane is minimal, where each distance is the residual between the predicted value and the true value.
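The least squares idea can be sketched in a few lines of NumPy. This is a minimal illustration, not a reference implementation: the helper names `fit_least_squares` and `mean_squared_error` and the toy data are assumptions made for the example.

```python
import numpy as np

def fit_least_squares(X, y):
    """Return (w, b) minimizing the mean squared error of X @ w + b."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Append a column of ones so the intercept b is learned as an extra weight.
    A = np.hstack([X, np.ones((X.shape[0], 1))])
    # lstsq solves min ||A @ theta - y||^2 without forming the normal
    # equations explicitly, which is numerically safer.
    theta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return theta[:-1], theta[-1]

def mean_squared_error(X, y, w, b):
    """Average squared residual of the linear model on (X, y)."""
    residuals = np.asarray(X, dtype=float) @ w + b - np.asarray(y, dtype=float)
    return float(np.mean(residuals ** 2))

# Hypothetical toy data generated from y = 2x + 1 with no noise.
X_toy = np.array([[0.0], [1.0], [2.0], [3.0]])
y_toy = 2.0 * X_toy[:, 0] + 1.0
w_toy, b_toy = fit_least_squares(X_toy, y_toy)
```

Because the toy data is exactly linear, the fitted slope and intercept recover the generating values and the training MSE is zero up to floating-point error.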

## Summary

Shortcomings and improvements: linear regression fits all of the training data with a single hyperplane, so if the data does not follow a linear relationship, the resulting linear model underfits (note: underfitting typically occurs when there are too few features to learn from). There are two ways to address underfitting:

The first method: mine more features, for example combinations of existing features. This makes the model more complex, however, and choosing good features is not a simple matter.
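As a small illustration of the first method, the sketch below fits the same non-linear target twice: once with the raw feature and once with an added squared term (the simplest kind of feature combination). The data and the helper `fit_and_score` are hypothetical and exist only for this example.

```python
import numpy as np

def fit_and_score(features, y):
    """Least-squares fit with an intercept; returns the training MSE."""
    A = np.hstack([features, np.ones((features.shape[0], 1))])
    theta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return float(np.mean((A @ theta - y) ** 2))

x = np.linspace(-3.0, 3.0, 25)
y = x ** 2  # a clearly non-linear target

# Raw feature only: a straight line cannot follow the parabola.
mse_linear = fit_and_score(x[:, None], y)
# Raw feature plus the combined feature x * x: the model is still linear
# in its parameters, but can now represent the curve.
mse_expanded = fit_and_score(np.column_stack([x, x ** 2]), y)
```

The expanded model drives the training error to essentially zero, while the plain linear model underfits. The cost is one more parameter per added feature, which is the extra complexity mentioned above.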

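The second method: modify linear regression itself. One such variant is "locally weighted linear regression" (LWR), which gives similar results without adding new features. The method simply modifies the loss function so that each training sample is weighted by how close it is to the query point $x$:

$$E_{\text{LWR}}(w, b) = \sum_{i=1}^{m} \alpha_i \bigl(f(x_i) - y_i\bigr)^2$$

where $f(x) = w^\top x + b$ is the linear model and $\alpha_i$ is the weight of sample $i$. A common choice is the Gaussian kernel $\alpha_i = \exp\bigl(-\lVert x_i - x\rVert^2 / (2\tau^2)\bigr)$, with bandwidth $\tau$ controlling how quickly the weight decays with distance.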

However, LWR has drawbacks too. The biggest is its space overhead. With ordinary linear regression, once training has produced the optimal parameters, they alone are enough to predict the output for new data. LWR must additionally keep the entire training set, because every new input requires computing a weight for each training sample.
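The following is a minimal sketch of LWR prediction, assuming a Gaussian kernel with bandwidth `tau` (a common choice, not something the text above specifies). The function name `lwr_predict` and the sine-shaped toy data are hypothetical. Note that the function takes the full training set on every call, which is exactly the space overhead described above.

```python
import numpy as np

def lwr_predict(x_query, X_train, y_train, tau=0.5):
    """Predict y at x_query by fitting a weighted linear model around it."""
    X_train = np.asarray(X_train, dtype=float)
    y_train = np.asarray(y_train, dtype=float)
    x_query = np.asarray(x_query, dtype=float)
    # Gaussian kernel: samples near the query get weights close to 1,
    # distant samples get weights close to 0.
    sq_dist = np.sum((X_train - x_query) ** 2, axis=1)
    weights = np.exp(-sq_dist / (2.0 * tau ** 2))
    # Weighted least squares: scaling each row by sqrt(weight) turns the
    # weighted problem into an ordinary least-squares problem.
    A = np.hstack([X_train, np.ones((X_train.shape[0], 1))])
    sw = np.sqrt(weights)
    theta, *_ = np.linalg.lstsq(A * sw[:, None], y_train * sw, rcond=None)
    return float(np.append(x_query, 1.0) @ theta)

# Hypothetical non-linear data: y = sin(x) on [0, 6].
X_train = np.linspace(0.0, 6.0, 61)[:, None]
y_train = np.sin(X_train[:, 0])
pred = lwr_predict(np.array([1.5]), X_train, y_train, tau=0.3)
```

A smaller `tau` follows the data more closely but is more sensitive to noise; a very large `tau` makes all weights nearly equal, which reduces LWR to ordinary linear regression.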

2. Logistic regression: a linear model applied to classification. The basic idea is to construct a suitable hyperplane that divides the space into two regions, with each class lying on one side of the hyperplane.

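The algorithm generally uses the sigmoid function, $\sigma(z) = 1 / (1 + e^{-z})$, which squashes its input into the range 0 to 1. The result is not a binary output but a probability, from which you can read off how likely the input is to belong to class 0 or to class 1.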
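A minimal sketch of how the sigmoid turns a linear score into a probability and then into a class label. The function names, the 0.5 threshold, and the example weights are assumptions for illustration; no trained model is implied.

```python
import math

def sigmoid(z):
    """Map any real z into the range 0 to 1 without overflowing for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def predict_proba(x, w, b):
    """Probability that x belongs to class 1 under weights w and bias b."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return sigmoid(z)

def predict_class(x, w, b, threshold=0.5):
    """Turn the probability into a 0/1 label using a decision threshold."""
    return 1 if predict_proba(x, w, b) >= threshold else 0

# Hypothetical parameters: the decision boundary is the line x1 = x2.
w_demo, b_demo = [1.0, -1.0], 0.0
p_demo = predict_proba([2.0, 0.0], w_demo, b_demo)
```

Points with a positive linear score fall on the class 1 side of the hyperplane and get a probability above 0.5; points exactly on the hyperplane get 0.5.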

In particular, both of the linear models above are special cases of the generalized linear model.
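One way to see this is through the standard generalized linear model form. The link function $g$ and the conditional mean $\mu$ are notation introduced here, not taken from the text above:

$$g(\mu) = w^\top x + b, \qquad \mu = \mathbb{E}[y \mid x]$$

$$\text{linear regression: } g(\mu) = \mu \qquad\qquad \text{logistic regression: } g(\mu) = \ln\frac{\mu}{1 - \mu}$$

With the identity link, the model predicts the mean directly. With the logit link, solving for $\mu$ gives $\mu = 1 / (1 + e^{-(w^\top x + b)})$, which is the sigmoid applied to the linear score.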