Configuring Octave in Different Environments
- Installing Octave on GNU/Linux
We recommend using your system package manager to install Octave.
On Ubuntu, you can use:
    sudo apt-get update && sudo apt-get install octave
On Fedora, you can use:
    sudo yum install octave-forge
Please consult the Octave maintainer's instructions for other GNU/Linux systems.
"Warning: Do not install Octave 4.0.0"; check the "Installation Issues" section of the "Resources" menu.
- Octave Resources
At the Octave command line, typing help followed by a function name displays documentation for a built-in function. For example, help plot will bring up help information for plotting. Further documentation can be found at the Octave documentation pages.
- MATLAB Resources
At the MATLAB command line, typing help followed by a function name displays documentation for a built-in function. For example, help plot will bring up help information for plotting. Further documentation can be found at the MATLAB documentation pages.
I. Multivariate Linear Regression
1.Multiple Features
Linear regression with multiple variables is also known as “multivariate linear regression”.
We now introduce notation for equations where we can have any number of input variables.
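For reference, the notation usually used in this setting is summarized here. This is the standard convention for multivariate linear regression, written out as a reminder rather than copied from the original figure; the convention $x_0 = 1$ is assumed so that the hypothesis can be written as a single inner product.

```latex
\begin{aligned}
x_j^{(i)} &= \text{value of feature } j \text{ in the } i\text{-th training example} \\
x^{(i)}   &= \text{the input (all features) of the } i\text{-th training example} \\
m         &= \text{the number of training examples} \\
n         &= \text{the number of features} \\
h_\theta(x) &= \theta_0 + \theta_1 x_1 + \theta_2 x_2 + \cdots + \theta_n x_n
             = \theta^T x \qquad (x_0 = 1)
\end{aligned}
```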
2.Gradient Descent for Multiple Variables
The following image compares gradient descent with one variable to gradient descent with multiple variables:
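The comparison can also be written as formulas. The update rules below are the standard ones for the squared-error cost, with α the learning rate and m the number of training examples; they are a restatement under those assumptions, not a transcription of the figure. In both cases all θ values are updated simultaneously, and the iteration is repeated until convergence.

```latex
\text{One variable } (n = 1):\quad
\theta_0 := \theta_0 - \alpha \frac{1}{m}\sum_{i=1}^{m}\bigl(h_\theta(x^{(i)}) - y^{(i)}\bigr),
\qquad
\theta_1 := \theta_1 - \alpha \frac{1}{m}\sum_{i=1}^{m}\bigl(h_\theta(x^{(i)}) - y^{(i)}\bigr)\,x^{(i)}

\text{Multiple variables } (n \ge 1):\quad
\theta_j := \theta_j - \alpha \frac{1}{m}\sum_{i=1}^{m}\bigl(h_\theta(x^{(i)}) - y^{(i)}\bigr)\,x_j^{(i)}
\qquad \text{for } j = 0, \dots, n
```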
3.Gradient Descent in Practice
We can speed up gradient descent by having each of our input values in roughly the same range. This is because θ will descend quickly on small ranges and slowly on large ranges, and so will oscillate inefficiently down to the optimum when the variables are very uneven.
The way to prevent this is to modify the ranges of our input variables so that they are all roughly the same. Ideally, $-1 \le x_i \le 1$ or $-0.5 \le x_i \le 0.5$.
These aren’t exact requirements; we are only trying to speed things up. The goal is to get all input variables into roughly one of these ranges, give or take a few.
Two techniques to help with this are feature scaling and mean normalization. Feature scaling involves dividing the input values by the range (i.e. the maximum value minus the minimum value) of the input variable, resulting in a new range of just 1. Mean normalization involves subtracting the average value for an input variable from the values for that input variable, resulting in a new average value for the input variable of just zero. To implement both of these techniques, adjust your input values as shown in this formula:
$x_i := \dfrac{x_i - \mu_i}{s_i}$
where $\mu_i$ is the average of all the values for feature (i) and $s_i$ is either the range of values (max - min) or the standard deviation.
Note that dividing by the range and dividing by the standard deviation give different results. The quizzes in this course use the range; the programming exercises use the standard deviation.
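The two techniques can be sketched in a few lines of Octave. This is a minimal illustration with made-up data; the variable names (X, mu, sigma, X_norm) are assumptions chosen for the example, and it scales by the standard deviation, with the range variant shown as a comment.

```octave
% Mean normalization plus scaling by the standard deviation (a sketch).
% X is an m-by-n matrix of raw features WITHOUT the column of ones.
X = [2104 3; 1600 3; 2400 3; 1416 2; 3000 4];   % made-up example data

mu    = mean(X);            % 1-by-n row of per-feature means
sigma = std(X);             % 1-by-n row of per-feature standard deviations
% sigma = max(X) - min(X);  % alternative: divide by the range instead

X_norm = (X - mu) ./ sigma; % broadcasting applies mu and sigma to every row

disp(mean(X_norm));         % each column mean is now zero up to rounding
disp(std(X_norm));          % each column standard deviation is now one
```

When predicting for a new input later, the same mu and sigma computed from the training data have to be applied to that input before it is multiplied by θ.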
Debugging gradient descent. Make a plot with number of iterations on the x-axis. Now plot the cost function, J(θ) over the number of iterations of gradient descent. If J(θ) ever increases, then you probably need to decrease α.
Automatic convergence test. Declare convergence if J(θ) decreases by less than E in one iteration, where E is some small value such as $10^{-3}$. However, in practice it's difficult to choose this threshold value.
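Both ideas (recording J(θ) per iteration and stopping once the decrease is tiny) can be combined in one loop. The sketch below uses a made-up data set generated from y = 1 + 2x; alpha, epsilon, max_iter, and J_history are illustrative assumptions, not values from the course.

```octave
% Batch gradient descent for linear regression with a convergence check.
x = [1; 2; 3; 4; 5];
y = [3; 5; 7; 9; 11];               % generated from y = 1 + 2*x
m = length(y);
X = [ones(m, 1), x];                % prepend the intercept column x0 = 1

alpha     = 0.05;                   % learning rate (assumed value)
epsilon   = 1e-9;                   % threshold for "J barely decreased"
max_iter  = 20000;
theta     = zeros(2, 1);
J_history = zeros(max_iter, 1);

for iter = 1:max_iter
  err   = X * theta - y;                      % residuals for the current theta
  theta = theta - (alpha / m) * (X' * err);   % simultaneous update of all theta_j
  J_history(iter) = sum((X * theta - y) .^ 2) / (2 * m);
  % Automatic convergence test. Note that a rise in J would also trip
  % this check; in that case alpha should be reduced instead.
  if iter > 1 && J_history(iter - 1) - J_history(iter) < epsilon
    break;
  end
end
J_history = J_history(1:iter);      % keep only the iterations actually run

% To inspect convergence visually, uncomment:
% plot(1:iter, J_history); xlabel('iterations'); ylabel('J(theta)');
```

If the plotted curve ever goes up, that is the signal to try a smaller alpha.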
To summarize:
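If α is too small: slow convergence (curve B in the figure below).
If α is too large: J(θ) may not decrease on every iteration and thus may not converge (curve C in the figure below).
E.g.: when the function rises instead of falling, α is too large. B descends rather slowly, so its α is set too small; the correct choice is therefore B.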
4. Features and Polynomial Regression
We can improve our features and the form of our hypothesis function in a couple different ways.
We can combine multiple features into one. For example, we can combine x1 and x2 into a new feature x3 by taking x1*x2.
Take care to apply feature scaling when creating features this way.
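A short sketch shows why scaling matters here: powers of a combined feature quickly differ by orders of magnitude. The data and the names x3, F, and F_scaled are made up for illustration, and the cubic terms are an assumed choice of polynomial.

```octave
% Building product and polynomial features, then scaling them (a sketch).
x1 = [10; 20; 30; 40];              % made-up values
x2 = [ 5; 10; 15; 20];              % made-up values

x3 = x1 .* x2;                      % combined feature: element-wise product
F  = [x3, x3 .^ 2, x3 .^ 3];        % polynomial terms of the new feature

disp(max(F) - min(F));              % the three ranges differ enormously

% Mean-normalize and divide by the range so every column spans exactly 1.
F_scaled = (F - mean(F)) ./ (max(F) - min(F));
disp(max(F_scaled) - min(F_scaled));
```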
II. Computing Parameters Analytically
1.Normal Equation
Gradient descent gives one way of minimizing J. Let’s discuss a second way of doing so, this time performing the minimization explicitly and without resorting to an iterative algorithm. In the “Normal Equation” method, we will minimize J by explicitly taking its derivatives with respect to the θj ’s, and setting them to zero. This allows us to find the optimum theta without iteration. The normal equation formula is given below:
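In its usual form, with X the design matrix (one row per training example, including the column of ones) and y the vector of targets, the normal equation reads as follows. This is the standard closed-form least-squares solution, stated here under those assumptions.

```latex
\theta = (X^T X)^{-1} X^T y
```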
There is no need to do feature scaling with the normal equation.
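As a quick check, the closed-form solution θ = (XᵀX)⁻¹Xᵀy can be evaluated in one line of Octave. The data set below is made up (y = 1 + 2x exactly), and pinv is used rather than inv as a defensive choice, since it still returns a result when XᵀX is singular.

```octave
% Normal equation on a tiny made-up data set (y = 1 + 2*x exactly).
x = [1; 2; 3; 4; 5];
y = [3; 5; 7; 9; 11];
m = length(y);
X = [ones(m, 1), x];                % design matrix with intercept column

theta = pinv(X' * X) * X' * y;      % closed form: no alpha, no iterations
disp(theta);
```

No feature scaling was applied to x, in line with the remark above.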
The following is a comparison of gradient descent and the normal equation:
2.Normal Equation Noninvertibility
III. Using Octave
1.eg1: Cosine and sine functions
>> t = [0:0.01:0.98];
>> y1 = sin(2*pi*4*t);
>> y2 = cos(2*pi*4*t);
>> plot(t,y1,'b') % sine curve in blue
>> hold on; % keep the current curve when plotting the next one
>> plot(t,y2,'r') % cosine curve in red
>> xlabel('time')
>> ylabel('value')
>> legend('sin','cos') % labels follow the plotting order
>> title('my plot')
>> print -dpng 'myPlot.png'
Some format options for the plot function:
Format arguments:
linestyle
'-' Use solid lines (default).
'--' Use dashed lines.
':' Use dotted lines.
'-.' Use dash-dotted lines.
marker
'+' crosshair
'o' circle
'*' star
'.' point
'x' cross
's' square
'd' diamond
'^' upward-facing triangle
'v' downward-facing triangle
'>' right-facing triangle
'<' left-facing triangle
'p' pentagram
'h' hexagram
color
'k' blacK
'r' Red
'g' Green
'b' Blue
'y' Yellow
'm' Magenta
'c' Cyan
'w' White
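The three kinds of format argument can be combined in a single string. The sketch below is a small illustration with made-up data; it asks for a red ('r'), dashed ('--') line with circle markers ('o') and then reads the properties back from the returned handle h.

```octave
% Combining color, linestyle, and marker in one format string.
t = 0:0.05:1;
y = sin(2*pi*t);
h = plot(t, y, 'r--o');             % red, dashed line, circle markers

disp(get(h, 'linestyle'));          % line style stored on the plotted line
disp(get(h, 'marker'));             % marker stored on the plotted line
```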
2.eg2: Plotting at specified positions
>> figure(1);plot(t,y1)
>> figure(2);plot(t,y2)
>> subplot(2,2,1)
>> plot(t,y2);
>> subplot(2,2,4)
>> plot(t,y1);
>> axis([0 0.5 -1.2 1.2]) % set the axis ranges: x from 0 to 0.5, y from -1.2 to 1.2
>> print -dpng 'q.png'
>> clf;
3.eg3: Visualizing a matrix
>> A = magic(5)
A =
17 24 1 8 15
23 5 7 14 16
4 6 13 20 22
10 12 19 21 3
11 18 25 2 9
>> imagesc(A);
>> colorbar
>> colormap gray;
>> imagesc(magic(15)),colorbar,colormap gray
4.eg4: Control statements
- for statement
>> v = ones(1,10)
v =
1 1 1 1 1 1 1 1 1 1
>> % indices = 1:10;
>> % "for i = 1:10" is equivalent to "for i = indices"
>> for i = 1:10, v(i) = v(i) + i; end
>> v
v =
2 3 4 5 6 7 8 9 10 11
- while + break + if/elseif/else statements
>> i = 1;
>> while true,
v(i) = 9999;
i = i+1;
if i == 6
break;
elseif i == 5
disp('The value is one');
else
disp('The value is zero');
end
end
>> v
v =
9999 9999 9999 9999 9999 7 8 9 10 11
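5.eg5: Calling a function (a function can have multiple return values)
PS: the file name must match the function name; the function I use below is named 'mytestfunction'.
Verified to work:
>> addpath('E:\Octave\workspace') % add a directory to the search path
>> cd 'E:\Octave' % change the current directory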
>> [a,b,c] = mytestfunction(3,4)
a = 9
b = 16
c = 144
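The body of mytestfunction is not shown in these notes. One definition that would reproduce the output above is sketched here; it is a hypothetical reconstruction inferred only from the call mytestfunction(3,4) returning 9, 16, and 144, and the original file may well differ. It would be saved as mytestfunction.m in a directory on the search path.

```octave
% File: mytestfunction.m  (hypothetical reconstruction, see note above)
function [a, b, c] = mytestfunction(x, y)
  a = x ^ 2;        % square of the first input
  b = y ^ 2;        % square of the second input
  c = a * b;        % product of the two squares
end
```

The list in square brackets after the function keyword is what lets a single call return several values at once.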
PS: definition of the cost function
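A minimal sketch of such a function, assuming the squared-error cost J(θ) = 1/(2m) · Σ (h_θ(x^(i)) − y^(i))² for linear regression. The file name costFunctionJ.m and the argument names are assumptions made for this example.

```octave
% File: costFunctionJ.m  (hypothetical name)
function J = costFunctionJ(X, y, theta)
  % X: m-by-(n+1) design matrix, y: m-by-1 targets, theta: (n+1)-by-1
  m = size(X, 1);                     % number of training examples
  predictions = X * theta;            % hypothesis evaluated on every example
  sqrErrors = (predictions - y) .^ 2; % squared error per example
  J = 1 / (2 * m) * sum(sqrErrors);   % squared-error cost
end
```

For example, with X = [1 1; 1 2; 1 3], y = [1; 2; 3], and theta = [0; 1], the predictions match y exactly, so the cost is 0.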