Machine Learning Day 09 (choosing the learning rate α, feature engineering, polynomial regression)

1. Learning-curve graphs for the two most common errors

  • If the learning curve (cost J plotted against the number of iterations) is wavy or trends upward, something has gone wrong with gradient descent
  • This is usually caused either by a learning rate α that is too large, or by a bug in the code

2. Common debugging method:

  • Choose a very, very small learning rate α and check whether the learning curve is still wrong, i.e. whether the cost still increases after some number of iterations (wavy or upward-trending)
  • If the learning curve now behaves normally, the previous α was too large; keep decreasing α from there. If the learning curve is still wrong even with a tiny α, the code has a bug
  • Note: setting α to a very, very small value is only a debugging trick; it does not mean that this α is an effective learning rate for gradient descent, because with too small a learning rate the algorithm may need a huge number of iterations to converge
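The debugging check above can be sketched as follows: run batch gradient descent for linear regression, record the cost at every iteration, and verify that with a tiny α the cost decreases monotonically. The function and toy data here are illustrative, not from the original notes.

```python
import numpy as np

def gradient_descent(X, y, alpha, num_iters):
    """Batch gradient descent for linear regression.

    Returns the parameters and the cost recorded at every iteration,
    so the learning curve can be inspected for errors.
    """
    m, n = X.shape
    w = np.zeros(n)
    costs = []
    for _ in range(num_iters):
        err = X @ w - y                 # prediction error at current w
        w -= alpha * (X.T @ err) / m    # gradient of (1/2m) * sum(err^2)
        costs.append(float(err @ err) / (2 * m))
    return w, costs

# Toy data: y = 1 + 2*x, with a bias column of ones.
X = np.c_[np.ones(50), np.linspace(0, 1, 50)]
y = X @ np.array([1.0, 2.0])

# With a tiny alpha the cost must decrease on every iteration;
# if it still increases, the gradient code itself is buggy.
_, costs = gradient_descent(X, y, alpha=1e-4, num_iters=100)
print(all(c2 <= c1 for c1, c2 in zip(costs, costs[1:])))  # True
```

If this check fails with such a small α, the problem is in the update rule (a common bug is using `+=` instead of `-=`), not in the choice of learning rate.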

3. To sum up, try values of the learning rate α in the following sequence

  • Start from 0.001 and multiply by roughly 3 each time (0.001, 0.003, 0.01, 0.03, 0.1, ...). Find a value small enough that gradient descent converges, even if it takes many iterations, and a value large enough that it fails to converge (the learning curve goes wrong). Finally choose the largest α that still converges, or a value slightly smaller than it. An α chosen this way is usually an appropriate learning rate
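A minimal sketch of this search, under the same toy linear-regression setup as above (the helper and candidate list are assumptions, not from the original notes): run gradient descent once per candidate α and keep the largest one whose cost actually went down.

```python
import numpy as np

def run_gd(X, y, alpha, num_iters=50):
    """Run batch gradient descent and return the cost at each iteration."""
    m, n = X.shape
    w = np.zeros(n)
    costs = []
    for _ in range(num_iters):
        err = X @ w - y
        w -= alpha * (X.T @ err) / m
        costs.append(float(err @ err) / (2 * m))
    return costs

X = np.c_[np.ones(50), np.linspace(0, 1, 50)]
y = X @ np.array([1.0, 2.0])

# Candidate rates: start at 0.001 and roughly triple each time.
candidates = [0.001, 0.003, 0.01, 0.03, 0.1, 0.3, 1.0, 3.0]
results = {a: run_gd(X, y, a) for a in candidates}

# Usable: the cost stays finite and ends lower than it started.
good = [a for a, c in results.items()
        if np.isfinite(c[-1]) and c[-1] < c[0]]
print("largest usable alpha:", max(good))
```

On this data the largest rates eventually diverge (the cost blows up), so the search settles on the biggest value for which the learning curve still decreases.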

4. Feature Engineering

  • Use domain knowledge or intuition to design new features, usually by transforming or combining the original features, so that the learning algorithm can make accurate predictions more easily
  • Defining new features based on insight into the application, rather than only using the features you happen to have, can lead to better models: the model can then fit not just straight lines to the data, but also curves and nonlinear functions
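As a small illustration of combining original features into a new one (the house-price scenario and numbers here are hypothetical, chosen only for the example): from a lot's frontage and depth we can build an area feature, which is often more directly related to price than either original feature alone.

```python
import numpy as np

# Hypothetical raw features for house-price prediction:
# lot frontage and lot depth, both in meters.
frontage = np.array([20.0, 15.0, 30.0])
depth    = np.array([40.0, 30.0, 50.0])

# Engineered feature: combine the two originals into lot area.
area = frontage * depth

# The model can now use all three features.
X = np.c_[frontage, depth, area]
print(X.shape)  # (3, 3)
```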

5. Polynomial regression

  • Linear regression does not suit every dataset; sometimes we need to fit a curve to the data
  • Combining the ideas of multiple linear regression and feature engineering gives a polynomial regression algorithm, which can fit curves and nonlinear functions
  • If we use x, x², x³ as features and train with gradient descent, then feature scaling is necessary, because the ranges of these features differ enormously
  • We can also use x and √x as features; in that case feature scaling may be unnecessary, since the ranges of the features are similar
  • To sum up: with feature engineering and multiple linear regression we can build a wide variety of models; how to choose among them will be covered later
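The polynomial-regression idea above can be sketched as follows, on synthetic data (the data-generating function and the least-squares solver are assumptions for the sketch; the notes use gradient descent, which would work the same way on the scaled features):

```python
import numpy as np

# Synthetic nonlinear data: a cubic curve plus noise.
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, 100)
y = 0.5 * x**3 - 2 * x**2 + x + rng.normal(0, 1, 100)

# Feature engineering: expand a single input x into [x, x^2, x^3].
X_poly = np.c_[x, x**2, x**3]

# Z-score feature scaling: x^3 spans a far wider range than x,
# so scaling matters when training with gradient descent.
mu, sigma = X_poly.mean(axis=0), X_poly.std(axis=0)
X_scaled = (X_poly - mu) / sigma

# Fit multiple linear regression on the scaled polynomial features
# (solved here in closed form for brevity).
A = np.c_[np.ones(len(x)), X_scaled]
w = np.linalg.lstsq(A, y, rcond=None)[0]
pred = A @ w
print(np.corrcoef(pred, y)[0, 1] > 0.99)  # True: the curve fits well
```

The model is still linear in its parameters; only the features are nonlinear in x, which is exactly why multiple linear regression machinery applies unchanged.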

Origin blog.csdn.net/u011453680/article/details/130344229