Linear Regression with multiple variables - Features and polynomial regression

Abstract: This article is the transcript of Lesson 32, "Features and Polynomial Regression", from Chapter 5, "Linear Regression with Multiple Variables", of Andrew Ng's Machine Learning course. I transcribed it while working through the videos and lightly edited it to make it more concise and easier to read, so that it can be referred to later. I am sharing it here in the hope that it helps others with their studies. If you spot any mistakes, corrections are very welcome and sincerely appreciated!

You now know linear regression with multiple variables. In this video (article), I want to tell you a bit about the choice of features that you have, and how you can get different learning algorithms, sometimes very powerful ones, by choosing appropriate features. In particular, I also want to tell you about polynomial regression, which allows you to use the machinery of linear regression to fit very complicated, even very non-linear, functions.

Let's take the example of predicting the price of a house. Suppose you have two features, the frontage of the house and the depth of the house. So, here's the picture of the house we're trying to sell. The frontage is defined as this distance; it's basically the width of your lot, of this plot of land that you own, and the depth is how deep your property is. So this is the frontage, and that is the depth, and you have two features called frontage and depth. You might build a linear regression model like this, where frontage is your first feature x_{1} and depth is your second feature x_{2}. But when you're applying linear regression, you don't necessarily have to use just the features x_{1} and x_{2} you're given. What you can do is actually create new features by yourself. So, if I want to predict the price of a house, what I might do instead is decide that what really determines the size of the house is the area of the land that I own. So, I might create a new feature, which I'm just going to call x, equal to frontage*depth (this is a multiplication symbol: frontage times depth), because this is the land area that I own, and the area of a rectangle is the product of the lengths of its sides. I might then select my hypothesis to use just this one feature, my land area. So, depending on what insight you might have into a particular problem, rather than just taking the features frontage and depth that we happened to start off with, sometimes by defining new features you might actually get a better model.
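As a minimal sketch of this idea (the numbers below are made up for illustration, and the normal-equation solution is just one convenient way to fit the model, not something from this lesson), you could construct the area feature and fit a single-feature hypothesis like this:

```python
import numpy as np

# Hypothetical training data (made up for illustration):
# frontage and depth in feet, price in thousands of dollars.
frontage = np.array([50.0, 60.0, 45.0, 80.0, 70.0])
depth    = np.array([100.0, 120.0, 90.0, 150.0, 130.0])
price    = np.array([200.0, 280.0, 165.0, 430.0, 360.0])

# Instead of keeping frontage and depth as two separate features,
# define a single new feature: the land area of the lot.
area = frontage * depth

# Fit h_theta(x) = theta_0 + theta_1 * area with the normal equation.
X = np.column_stack([np.ones_like(area), area])
theta = np.linalg.pinv(X.T @ X) @ X.T @ price
print(theta)  # [theta_0, theta_1]
```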

Closely related to the idea of choosing your features is this idea called polynomial regression. Let's say you have a housing price data set that looks like this. Then there are a few different models you might fit to it. One thing you could do is fit a quadratic model, since it doesn't look like a straight line fits this data very well. So maybe you want to fit a quadratic model, where you think the price is a quadratic function of the size, and maybe that'll give you a fit to the data that looks like that. But then you might decide that the quadratic model doesn't make sense, because a quadratic function eventually comes back down, and we don't think housing prices should go down when the size goes up too high. So then we might choose a different polynomial model and instead use a cubic function, where we now have a third-order term, and when we fit that, maybe we get this sort of model, and maybe the green line is a somewhat better fit to the data because it doesn't eventually come back down.

So how do we actually fit a model like this to our data? Using the machinery of multivariate linear regression, we can do this with a pretty simple modification to our algorithm. The form of the hypothesis we know how to fit looks like this: h_{\theta }(x)=\theta _{0}+\theta _{1}x_{1}+\theta _{2}x_{2}+\theta _{3}x_{3}. And if we want to fit the cubic model that I have boxed in green, what we're saying is that, to predict the price of a house, it's \theta _{0} plus \theta _{1} times the size of the house, plus \theta _{2} times the size of the house squared (so this term is equal to that term), and then plus \theta _{3} times the cube of the size of the house (that's the third term). In order to map these two definitions to each other, the natural way to do that is to set the first feature x_{1} to be the size of the house, set the second feature x_{2} to be the square of the size of the house, and set the third feature x_{3} to be the cube of the size of the house. Just by choosing my three features this way and applying the machinery of linear regression, I can fit this model and end up with a cubic fit to my data.

I just want to point out one more thing: if you choose your features like this, then feature scaling becomes increasingly important. If the size of the house ranges from one to a thousand, say from one to a thousand square feet, then the size squared of the house will range from one to one million (the square of a thousand), and your third feature x_{3}, the size cubed of the house, will range from one to 10^{9}. So these three features take on very different ranges of values, and it's important to apply feature scaling, if you're using gradient descent, to get them into comparable ranges of values.
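To make the cubic example concrete, here is a small sketch (with made-up sizes and prices) that builds the features x_{1}=size, x_{2}=size^{2}, x_{3}=size^{3}, applies mean normalization so the feature ranges become comparable, and then runs batch gradient descent:

```python
import numpy as np

# Hypothetical data (illustrative only): sizes in square feet, prices in $1000s.
size  = np.array([1000.0, 1500.0, 2000.0, 2500.0, 3000.0])
price = np.array([200.0, 320.0, 390.0, 430.0, 460.0])

# Cubic model features: x1 = size, x2 = size^2, x3 = size^3.
# Their ranges differ enormously (roughly 10^3, 10^6, 10^9), so scale them.
features = np.column_stack([size, size**2, size**3])
mu, sigma = features.mean(axis=0), features.std(axis=0)
features_scaled = (features - mu) / sigma

# Add the intercept column and run batch gradient descent.
X = np.column_stack([np.ones(len(size)), features_scaled])
theta = np.zeros(X.shape[1])
alpha = 0.1                      # learning rate
for _ in range(2000):
    gradient = X.T @ (X @ theta - price) / len(price)
    theta -= alpha * gradient

print(theta)  # parameters for the scaled features
```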

Finally, here's one last example of how you really have broad choices in the features you use. Earlier we talked about how a quadratic model like this might not be ideal because, you know, maybe a quadratic model fits the data okay, but the quadratic function eventually goes back down, and we really don't want to predict housing prices that go down as the size of the house increases. But rather than going to a cubic model, you have other choices of features, and there are many possible choices. Just to give you another example, another reasonable choice might be to say that the price of a house is \theta _{0}+\theta _{1}(size)+\theta _{2} \sqrt{(size)}. The square root function is this sort of function, and maybe there will be some values of \theta _{1} and \theta _{2} that will let you take this model and fit a curve that looks like that: one that goes up, but sort of flattens out and doesn't ever come back down. And so, by having insight, in this case into the shape of a square root function and into the shape of the data, and by choosing different features, you can sometimes get better models.
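And a corresponding sketch for the square-root model (again with made-up data, fitted here via the normal equation for convenience); the resulting curve keeps rising but flattens out rather than turning back down:

```python
import numpy as np

# Hypothetical data: sizes in square feet, prices in $1000s.
size  = np.array([1000.0, 1500.0, 2000.0, 2500.0, 3000.0])
price = np.array([200.0, 320.0, 390.0, 430.0, 460.0])

# Features for h_theta(x) = theta_0 + theta_1*size + theta_2*sqrt(size).
X = np.column_stack([np.ones(len(size)), size, np.sqrt(size)])
theta = np.linalg.pinv(X.T @ X) @ X.T @ price
print(theta)  # [theta_0, theta_1, theta_2]
```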

In this video (article), we talked about polynomial regression, that is, how to fit a polynomial, like a quadratic function or a cubic function, to your data. We also talked about the idea that you have a choice in what features to use; for example, instead of using the frontage and the depth of the house, you can multiply them together to get a single feature that captures the land area of the house.

In case this seems a little bit bewildering, with all these different feature choices, how do you decide which features to use? Later in this class, we'll talk about some algorithms for automatically choosing what features to use, so you can have an algorithm look at the data and automatically choose for you whether to fit a quadratic function, a cubic function, or something else. But until we get to those algorithms, for now I just want you to be aware that you have a choice in what features to use, and by designing different features you can fit more complex functions to your data than just a straight line; in particular, you can fit polynomial functions, and sometimes, with appropriate insight into the features, you can get a much better model for your data.

<end>
