【week2】Linear Regression with Multiple Variables

Configuring Octave in Different Environments

  • Installing Octave on GNU/Linux

We recommend using your system package manager to install Octave.

On Ubuntu, you can use:

sudo apt-get update && sudo apt-get install octave

On Fedora, you can use:

sudo yum install octave-forge

Please consult the Octave maintainer’s instructions for other GNU/Linux systems.

Warning: Do not install Octave 4.0.0. See the “Installation Issues” section
under the “Resources” menu for details.

  1. Octave Resources
    At the Octave command line, typing help followed by a function name displays documentation for a built-in function. For example, help plot will bring up help information for plotting. Further documentation can be found at the Octave documentation pages.

  2. MATLAB Resources
    At the MATLAB command line, typing help followed by a function name displays documentation for a built-in function. For example, help plot will bring up help information for plotting. Further documentation can be found at the MATLAB documentation pages.

一、Multivariate Linear Regression

1.Multiple Features

Linear regression with multiple variables is also known as “multivariate linear regression”.
We now introduce notation for equations where we can have any number of input variables.
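As a sketch of the notation this refers to, assuming the conventions commonly used for this course ($m$ training examples, $n$ features, and an added constant feature $x_0 = 1$):

```latex
\begin{align*}
x_j^{(i)} &= \text{value of feature } j \text{ in the } i\text{-th training example} \\
x^{(i)}   &= \text{the input features of the } i\text{-th training example} \\
m         &= \text{number of training examples} \\
n         &= \text{number of features} \\
h_\theta(x) &= \theta_0 x_0 + \theta_1 x_1 + \dots + \theta_n x_n = \theta^T x,
\qquad x_0 = 1
\end{align*}
```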

2.Gradient Descent for Multiple Variables

[Figure: gradient descent with one variable compared with gradient descent with multiple variables]
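A minimal Octave sketch of gradient descent with multiple variables, assuming the usual vectorized update θ := θ − (α/m)·Xᵀ(Xθ − y). The data, `alpha`, and `num_iters` below are made up for illustration and are not taken from the course material.

```octave
% Toy data (hypothetical): y = 1 + 2*x1 + 3*x2 exactly.
X_raw = [1 2; 2 1; 3 4; 4 3; 5 5];
y = 1 + 2*X_raw(:,1) + 3*X_raw(:,2);

m = size(X_raw, 1);              % number of training examples
X = [ones(m, 1), X_raw];         % prepend the constant feature x0 = 1
theta = zeros(size(X, 2), 1);    % one parameter per column of X

alpha = 0.05;                    % learning rate (assumed value)
num_iters = 5000;                % number of iterations (assumed value)

for iter = 1:num_iters
  % Gradient of J(theta) for all parameters at once.
  grad = (1/m) * X' * (X*theta - y);
  % Simultaneous update of every theta_j.
  theta = theta - alpha * grad;
end

disp(theta');                    % should be close to [1 2 3]
```

With one variable, `X` has two columns and the same loop applies unchanged, which is why the vectorized form is convenient.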

3.Gradient Descent in Practice

  • I - Feature Scaling

We can speed up gradient descent by having each of our input values in roughly the same range. This is because θ will descend quickly on small ranges and slowly on large ranges, and so will oscillate inefficiently down to the optimum when the variables are very uneven.

The way to prevent this is to modify the ranges of our input variables so that they are all roughly the same. Ideally:
$-1 \le x_i \le 1$ or $-0.5 \le x_i \le 0.5$
These aren’t exact requirements; we are only trying to speed things up. The goal is to get all input variables into roughly one of these ranges, give or take a few.

Two techniques to help with this are feature scaling and mean normalization. Feature scaling involves dividing the input values by the range (i.e. the maximum value minus the minimum value) of the input variable, resulting in a new range of just 1. Mean normalization involves subtracting the average value for an input variable from the values for that input variable resulting in a new average value for the input variable of just zero. To implement both of these techniques, adjust your input values as shown in this formula:
$x_i := \dfrac{x_i - \mu_i}{s_i}$

Where $\mu_i$ is the average of all the values for feature (i)
and $s_i$ is the range of values (max - min), or $s_i$
is the standard deviation.

Note that dividing by the range and dividing by the standard
deviation give different results. The quizzes in this course use
the range, while the programming exercises use the standard deviation.
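The two choices of s_i can be sketched in Octave as follows. The matrix `X` is made-up data (house size and number of bedrooms), and the variable names are assumptions for illustration only.

```octave
% Hypothetical data: column 1 = size, column 2 = number of bedrooms.
X = [2104 3; 1600 3; 2400 4; 1416 2; 3000 4];

mu      = mean(X);               % per-feature average (1 x n row vector)
s_range = max(X) - min(X);       % s_i as the range (max - min)
s_std   = std(X);                % s_i as the standard deviation

% Mean normalization plus scaling; broadcasting applies mu and s per column.
X_by_range = (X - mu) ./ s_range;
X_by_std   = (X - mu) ./ s_std;

disp(max(X_by_range) - min(X_by_range));   % each feature now spans a range of 1
disp(std(X_by_std));                       % each feature now has std 1
```

Both versions give every feature a mean of zero; they differ only in the final spread, which is why the two conventions produce different numbers.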

  • II - Learning Rate

Debugging gradient descent. Make a plot with number of iterations on the x-axis. Now plot the cost function, J(θ) over the number of iterations of gradient descent. If J(θ) ever increases, then you probably need to decrease α.

Automatic convergence test. Declare convergence if J(θ) decreases by less than E in one iteration, where E is some small value such as $10^{-3}$. However, in practice it’s difficult to choose this threshold value.
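Both checks can be sketched together in Octave: record J(θ) at every iteration for plotting, flag any increase, and stop once the decrease falls below a threshold. The data, `alpha`, and `epsilon` are assumed values chosen for illustration.

```octave
% Hypothetical, already-scaled data: y = 3 + 2*x.
X = [ones(5, 1), [-1; -0.5; 0; 0.5; 1]];
y = [1; 2; 3; 4; 5];
m = length(y);

theta     = zeros(2, 1);
alpha     = 0.1;                 % learning rate (assumed)
epsilon   = 1e-3;                % convergence threshold E (assumed)
max_iters = 1000;

J_history = zeros(max_iters, 1);
J_prev    = Inf;
increased = false;

for iter = 1:max_iters
  theta = theta - (alpha/m) * X' * (X*theta - y);
  J = (1/(2*m)) * sum((X*theta - y).^2);
  J_history(iter) = J;

  if J > J_prev
    increased = true;            % J went up: alpha is probably too large
    break;
  elseif J_prev - J < epsilon
    break;                       % automatic convergence test passed
  end
  J_prev = J;
end

J_history = J_history(1:iter);   % keep only the iterations actually run
% To inspect the curve: plot(1:iter, J_history); xlabel('iterations'); ylabel('J');
```

If `increased` ends up true, rerunning with a smaller `alpha` is the usual next step.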
To summarize:

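If α is too small: slow convergence (curve B in the figure below).

If α is too large: J(θ) may not decrease on every iteration and thus
may not converge (curve C in the figure below).

[Figure: curves B and C]
E.g.: if the cost rises instead of falling, α is too large. Curve B falls rather slowly, so its α is set too small; the correct choice is therefore B.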

4. Features and Polynomial Regression

We can improve our features and the form of our hypothesis function in a couple of different ways.

We can combine multiple features into one. For example, we can combine x1 and x2 into a new feature x3 by taking x3 = x1*x2.


  • Polynomial Regression

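Remember to apply feature scaling here: powers of a feature span very different ranges. For example, if x1 ranges from 1 to 1000, then x1^2 ranges from 1 to 1,000,000.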
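A small Octave sketch of why scaling matters for polynomial features. The feature `x` is made up, and the cubic model is only an assumed example of a polynomial hypothesis.

```octave
% Hypothetical single feature taking the values 1..10.
x = (1:10)';

% Polynomial features x, x^2, x^3 built from the one original feature.
X_poly = [x, x.^2, x.^3];
disp(max(X_poly) - min(X_poly));     % ranges 9, 99, 999: very uneven

% Mean-normalize and scale each column (standard deviation used as s_i).
mu     = mean(X_poly);
sigma  = std(X_poly);
X_norm = (X_poly - mu) ./ sigma;

% Design matrix ready for gradient descent, with x0 = 1 prepended.
X = [ones(length(x), 1), X_norm];
disp(max(X_norm) - min(X_norm));     % ranges are now comparable
```

The same `mu` and `sigma` would have to be applied to any new input before making a prediction.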

二、Computing Parameters Analytically

1.Normal Equation

Gradient descent gives one way of minimizing J. Let’s discuss a second way of doing so, this time performing the minimization explicitly and without resorting to an iterative algorithm. In the “Normal Equation” method, we will minimize J by explicitly taking its derivatives with respect to the θ_j’s, and setting them to zero. This allows us to find the optimum theta without iteration. The normal equation formula is given below:
$\theta = (X^T X)^{-1} X^T y$
There is no need to do feature scaling with the normal equation.
[Figure: comparison of gradient descent and the normal equation]
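In Octave the normal equation is a one-liner. The sketch below uses made-up data and `pinv` (the pseudoinverse), which is an assumption on my part; it also returns a usable result when XᵀX happens to be non-invertible.

```octave
% Toy data (hypothetical): y = 1 + 2*x1 + 3*x2 exactly, features left unscaled.
X_raw = [1 2; 2 1; 3 4; 4 3; 5 5];
y = 1 + 2*X_raw(:,1) + 3*X_raw(:,2);

m = size(X_raw, 1);
X = [ones(m, 1), X_raw];             % prepend the constant feature x0 = 1

% Normal equation: theta = (X'X)^(-1) X'y, computed with the pseudoinverse.
theta = pinv(X' * X) * X' * y;

disp(theta');                        % close to [1 2 3], no iteration or alpha needed
```

No learning rate and no loop appear here, which is the practical difference from the gradient descent approach.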

2.Normal Equation Noninvertibility


三、Using Octave

1.eg1 Cosine and Sine Functions

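>> t = [0:0.01:0.98];
>> y1 = sin(2*pi*4*t);
>> y2 = cos(2*pi*4*t);
>> plot(t,y1)             % draw the sine curve
>> hold on;               % keep the current plot so the next curve is overlaid
>> plot(t,y2,'r')         % overlay the cosine curve in red
>> xlabel('time')
>> ylabel('value')
>> legend('sin','cos')    % legend entries follow the plotting order
>> title('my plot')
>> print -dpng 'myPlot.png'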

Some format options for the plot function:

 Format arguments:

     linestyle

          '-'  Use solid lines (default).
          '--' Use dashed lines.
          ':'  Use dotted lines.
          '-.' Use dash-dotted lines.

     marker

          '+'  crosshair
          'o'  circle
          '*'  star
          '.'  point
          'x'  cross
          's'  square
          'd'  diamond
          '^'  upward-facing triangle
          'v'  downward-facing triangle
          '>'  right-facing triangle
          '<'  left-facing triangle
          'p'  pentagram
          'h'  hexagram

     color

          'k'  blacK
          'r'  Red
          'g'  Green
          'b'  Blue
          'y'  Yellow
          'm'  Magenta
          'c'  Cyan
          'w'  White

2.eg2 Plotting at Specified Positions

>> figure(1);plot(t,y1)
>> figure(2);plot(t,y2)
>> subplot(2,2,1)
>> plot(t,y2);
>> subplot(2,2,4)
>> plot(t,y1);
>> axis([0 0.5 -1.2 1.2])		% set the axis ranges: x from 0 to 0.5, y from -1.2 to 1.2
>> print -dpng 'q.png'
>> clf;


3.eg3 Visualizing a Matrix

>> A = magic(5)
A =

   17   24    1    8   15
   23    5    7   14   16
    4    6   13   20   22
   10   12   19   21    3
   11   18   25    2    9

>> imagesc(A);


>> colorbar


>> colormap gray;


>> imagesc(magic(15)),colorbar,colormap gray


4.eg4 Control Statements

  • for statement
>> v = ones(1,10)
v =
   1   1   1   1   1   1   1   1   1   1
>> for i = 1:10,v(i) = v(i) + i,end
% indices = 1:10
% "for i = 1:10" is equivalent to "for i = indices"
v =
    2    3    4    5    6    7    8    9   10   11
  • while + break + if/elseif/else statements
>> i = 1;
>> while true,
		v(i) = 9999;
		i = i+1;
		if i == 6
			break;
		elseif i == 5
			disp('The value is one');
		else
			disp('The value is zero');
		end
	end
>> v
v =
   9999   9999   9999   9999   9999      7      8      9     10     11

5.eg5 Calling Functions (multiple return values are allowed)

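Note: the file name must match the function name. The function I use below is named ‘mytestfunction’.
[Figure: definition of mytestfunction]
Verified successfully: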

>> addpath('E:\Octave\workspace') % add a directory to the search path
>> cd 'E:\Octave'                 % change the current directory
>> [a,b,c] = mytestfunction(3,4)
a =  9
b =  16
c =  144
>>
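The body of mytestfunction is not shown in the text. One hypothetical definition that reproduces the output above (squares of the two inputs and their product) could look like the sketch below; the actual function may well differ.

```octave
1;  % marks this file as a script, so the function can be defined inline

% Hypothetical body: only the name and the three outputs come from the session above.
function [a, b, c] = mytestfunction(x, y)
  a = x^2;       % first return value
  b = y^2;       % second return value
  c = a * b;     % third return value
end

[a, b, c] = mytestfunction(3, 4)
```

When saved as its own file, the function would go into mytestfunction.m without the leading `1;` and without the call at the end.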

Note: definition of the cost function
[Figure: definition of the cost function]
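A sketch of a squared-error cost function in Octave, assuming the usual definition J(θ) = (1/2m)·Σ(h_θ(x^(i)) − y^(i))². The name `costFunctionJ` and the tiny data set are assumptions for illustration.

```octave
1;  % marks this file as a script, so the function can be defined inline

% X: design matrix (with the x0 = 1 column), y: targets, theta: parameters.
function J = costFunctionJ(X, y, theta)
  m = size(X, 1);                        % number of training examples
  predictions = X * theta;               % hypothesis for every example
  sqrErrors = (predictions - y) .^ 2;    % squared errors
  J = 1 / (2*m) * sum(sqrErrors);
end

X = [1 1; 1 2; 1 3];
y = [1; 2; 3];

J_fit  = costFunctionJ(X, y, [0; 1])     % perfect fit, so the cost is 0
J_zero = costFunctionJ(X, y, [0; 0])     % (1 + 4 + 9) / 6 = 2.3333
```

As with mytestfunction, the function alone would be saved as costFunctionJ.m so that the file name matches the function name.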


Reposted from blog.csdn.net/weixin_44751294/article/details/109240994