Octave configuration in different environments
- Installing Octave on GNU/Linux
We recommend using your system package manager to install Octave.
On Ubuntu, you can use:
sudo apt-get update && sudo apt-get install octave
On Fedora, you can use:
sudo yum install octave-forge
Please consult the Octave maintainer’s instructions for other GNU/Linux systems.
“Warning: Do not install Octave 4.0.0”; check out the “Installation Issues” section of the “Resources” menu.
- Octave Resources
At the Octave command line, typing help followed by a function name displays documentation for a built-in function. For example, help plot will bring up help information for plotting. Further documentation can be found at the Octave documentation pages.
- MATLAB Resources
At the MATLAB command line, typing help followed by a function name displays documentation for a built-in function. For example, help plot will bring up help information for plotting. Further documentation can be found at the MATLAB documentation pages.
One, Multivariate Linear Regression
1. Multiple Features
Linear regression with multiple variables is also known as “multivariate linear regression”.
We now introduce notation for equations where we can have any number of input variables.
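As a rough sketch of that notation in Octave: with m training examples and n features, the inputs can be stored in an m-by-(n+1) matrix X whose first column is all ones (the usual x_0 = 1 convention), so that the hypothesis h_θ(x) = θ^T x becomes a single matrix product. The data and parameter values below are made up purely for illustration.

```octave
% Illustrative data (assumed): 3 examples, 2 features (size, bedrooms).
data = [2104 3; 1600 3; 2400 4];
m = size(data, 1);          % number of training examples
n = size(data, 2);          % number of features

X = [ones(m, 1), data];     % prepend x_0 = 1 to every example
theta = [10; 0.1; 5];       % one parameter per column of X (arbitrary values)

h = X * theta;              % h(i) = theta' * X(i,:)' for every example i
disp(h)
```

Each row of X is one training example, so X * theta evaluates the hypothesis for all examples at once.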
2. Gradient Descent for Multiple Variables
The following image compares gradient descent with one variable to gradient descent with multiple variables:
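For multiple variables, every parameter θ_j is updated simultaneously by θ_j := θ_j − α · (1/m) · Σ (h_θ(x^(i)) − y^(i)) · x_j^(i). A minimal vectorized sketch of that rule, assuming a toy data set, a learning rate and an iteration count chosen only for illustration:

```octave
% Toy data (assumed): y = 1 + 2*x1 + 3*x2 exactly.
X = [1 0 0; 1 1 0; 1 0 1; 1 1 1];   % first column is x_0 = 1
y = [1; 3; 4; 6];
m = length(y);

alpha = 0.1;                 % learning rate (assumed value)
theta = zeros(3, 1);         % start from all zeros

for iter = 1:5000
  grad = (X' * (X * theta - y)) / m;   % partial derivatives for every theta_j
  theta = theta - alpha * grad;        % simultaneous update of all theta_j
end
disp(theta')
```

Because the gradient is computed from the old theta before the assignment, all components are updated at the same time, as the rule requires.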
3. Gradient Descent in Practice
We can speed up gradient descent by having each of our input values in roughly the same range. This is because θ will descend quickly on small ranges and slowly on large ranges, and so will oscillate inefficiently down to the optimum when the variables are very uneven.
The way to prevent this is to modify the ranges of our input variables so that they are all roughly the same. Ideally, $-1 \le x_i \le 1$ or $-0.5 \le x_i \le 0.5$.
These aren’t exact requirements; we are only trying to speed things up. The goal is to get all input variables into roughly one of these ranges, give or take a few.
Two techniques to help with this are feature scaling and mean normalization. Feature scaling involves dividing the input values by the range (i.e. the maximum value minus the minimum value) of the input variable, resulting in a new range of just 1. Mean normalization involves subtracting the average value for an input variable from the values for that input variable resulting in a new average value for the input variable of just zero. To implement both of these techniques, adjust your input values as shown in this formula:
$x_i := \dfrac{x_i - \mu_i}{s_i}$
where $\mu_i$ is the average of all the values for feature $i$, and $s_i$ is either the range of values (max - min) or the standard deviation. Note that dividing by the range and dividing by the standard deviation give different results. The quizzes in this course use the range; the programming exercises use the standard deviation.
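A small sketch of both variants in Octave, assuming a made-up feature matrix with one example per row and one feature per column (no x_0 column yet):

```octave
% Illustrative feature matrix (assumed values): columns are size and bedrooms.
X = [2104 3; 1600 3; 2400 4; 1416 2];

mu = mean(X);                    % 1-by-n row of column means
s_range = max(X) - min(X);       % range of each column
s_std = std(X);                  % standard deviation of each column

% Broadcasting applies each column's own mu and s to that column.
X_range = (X - mu) ./ s_range;   % each column now spans a range of exactly 1
X_std = (X - mu) ./ s_std;       % each column now has standard deviation 1

disp(mean(X_range))              % approximately zero for both columns
```

The same mu and s must be reused later to scale any new input before making a prediction.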
Debugging gradient descent. Make a plot with number of iterations on the x-axis. Now plot the cost function, J(θ) over the number of iterations of gradient descent. If J(θ) ever increases, then you probably need to decrease α.
Automatic convergence test. Declare convergence if J(θ) decreases by less than E in one iteration, where E is some small value such as $10^{-3}$. However, in practice it is difficult to choose this threshold value.
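Both ideas can be sketched together: record J(θ) after every iteration so it can be plotted, and stop once the decrease falls below the threshold. The toy data, α and the name J_history are assumptions made for this illustration.

```octave
X = [1 0 0; 1 1 0; 1 0 1; 1 1 1];   % toy data (assumed), x_0 = 1 in column 1
y = [1; 3; 4; 6];
m = length(y);

alpha = 0.1;                 % learning rate (assumed value)
epsilon = 1e-3;              % the threshold E from the text
theta = zeros(3, 1);
J_history = [];              % cost after each iteration

for iter = 1:10000
  theta = theta - alpha * (X' * (X * theta - y)) / m;
  J_history(end + 1) = sum((X * theta - y) .^ 2) / (2 * m);
  % Stop when J decreased by less than epsilon in this iteration.
  % Note that this test also fires if J increased, so inspect the curve too.
  if iter > 1 && J_history(end - 1) - J_history(end) < epsilon
    break;
  end
end
printf('stopped after %d iterations\n', numel(J_history));
```

Plotting the recorded values with plot(1:numel(J_history), J_history) gives the debugging curve described above.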
To summarize:
If α is too small: slow convergence, as shown in figure B below.
If α is too large: J(θ) may not decrease on every iteration and thus may not converge, as shown in figure C below.
E.g.: in figure C the cost function rises instead of falling, indicating that α is too large. In figure B it falls slowly, indicating that α is set too small, so the correct choice is B.
4. Features and Polynomial Regression
We can improve our features and the form of our hypothesis function in a couple different ways.
We can combine multiple features into one. For example, we can combine x1 and x2 into a new feature x3 by taking x1*x2.
Pay attention to feature scaling: a combined feature such as x1*x2 can have a much larger range than x1 or x2.
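A short sketch of building such a combined feature and rescaling it, assuming made-up values where x1 and x2 are the frontage and depth of a plot of land:

```octave
% Illustrative data (assumed): x1 = frontage, x2 = depth.
x1 = [10; 20; 30; 40];
x2 = [50; 40; 60; 80];

x3 = x1 .* x2;               % combined feature, e.g. area = frontage * depth

% The product lives on a much larger scale than its factors,
% so apply mean normalization and range scaling before gradient descent.
x3_scaled = (x3 - mean(x3)) / (max(x3) - min(x3));

disp([x3, x3_scaled])
```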
Two, Computing Parameters Analytically
1. Normal Equation
Gradient descent gives one way of minimizing J. Let’s discuss a second way of doing so, this time performing the minimization explicitly and without resorting to an iterative algorithm. In the “Normal Equation” method, we will minimize J by explicitly taking its derivatives with respect to the θ_j’s and setting them to zero. This allows us to find the optimum θ without iteration. The normal equation formula is:
$\theta = (X^T X)^{-1} X^T y$
There is no need to do feature scaling with the normal equation.
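In Octave the normal equation, θ = (X^T X)^(-1) X^T y, is a one-line computation. The sketch below assumes a small toy data set and uses pinv rather than inv, which is one common way to still get a result when X'X is not invertible:

```octave
X = [1 0 0; 1 1 0; 1 0 1; 1 1 1];   % toy data (assumed), x_0 = 1 in column 1
y = [1; 3; 4; 6];                   % generated from y = 1 + 2*x1 + 3*x2

% Normal equation. pinv (pseudo-inverse) is used instead of inv so the
% expression still returns a value if X'X happens to be singular.
theta = pinv(X' * X) * X' * y;
disp(theta')
```

No learning rate, iteration count or feature scaling is involved; the parameters come out directly.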
The following is a comparison of gradient descent and the normal equation:
2. Normal Equation Noninvertibility
Three, Using Octave
1. eg1: sine and cosine functions
>> t = [0:0.01:0.98];
>> y1 = sin(2*pi*4*t);
>> y2 = cos(2*pi*4*t);
>> plot(t,y1,'b')          % sine in blue
>> hold on;                % keep the current curve when plotting the next one
>> plot(t,y2,'r')          % cosine in red
>> xlabel('time')
>> ylabel('value')
>> legend('sin','cos')     % labels follow the order in which the curves were plotted
>> title('my plot')
>> print -dpng 'myPlot.png'
Some formats of the plot function:
Format arguments:
linestyle
'-' Use solid lines (default).
'--' Use dashed lines.
':' Use dotted lines.
'-.' Use dash-dotted lines.
marker
'+' crosshair
'o' circle
'*' star
'.' point
'x' cross
's' square
'd' diamond
'^' upward-facing triangle
'v' downward-facing triangle
'>' right-facing triangle
'<' left-facing triangle
'p' pentagram
'h' hexagram
color
'k' blacK
'r' Red
'g' Green
'b' Blue
'y' Yellow
'm' Magenta
'c' Cyan
'w' White
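2. eg2: plotting at specified positions (figure and subplot)
>> figure(1); plot(t,y1)
>> figure(2); plot(t,y2)
>> subplot(2,2,1)          % 2x2 grid, select the first cell
>> plot(t,y2);
>> subplot(2,2,4)          % select the fourth cell
>> plot(t,y1);
>> axis([0 0.5 -1.2 1.2])  % change the axis ranges: x from 0 to 0.5, y from -1.2 to 1.2
>> print -dpng 'q.png'
>> clf;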
3. eg3: visualizing a matrix
>> A = magic(5)
A =
17 24 1 8 15
23 5 7 14 16
4 6 13 20 22
10 12 19 21 3
11 18 25 2 9
>> imagesc(A);
>> colorbar
>> colormap gray;
>> imagesc(magic(15)),colorbar,colormap gray
4. eg4: control statements
- for statement
>> v = ones(1,10)
v =
1 1 1 1 1 1 1 1 1 1
>> for i = 1:10, v(i) = v(i) + i; end
% with indices = 1:10, "for i = indices" is equivalent to "for i = 1:10"
>> v
v =
2 3 4 5 6 7 8 9 10 11
- while + break + if/else statements
>> i = 1;
>> while true,
v(i) = 9999;
i = i+1;
if i == 6
break;
elseif i == 5
disp('The value is one');
else
disp('The value is zero');
end
end
>> v
v =
9999 9999 9999 9999 9999 7 8 9 10 11
5. eg5: calling a function (functions can have multiple return values)
PS: the file name must match the function name. The function used below is named 'mytestfunction' (so the file is mytestfunction.m), and the call works as expected:
>> addpath('E:\Octave\workspace') % add a directory to the search path
>> cd 'E:\Octave' % change the current directory
>> [a,b,c] = mytestfunction(3,4)
a = 9
b = 16
c = 144
>>
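The body of mytestfunction is not shown in these notes. A hypothetical definition that would reproduce the output above (9, 16 and 144 for the inputs 3 and 4) could look like the following; the argument names and the three formulas are guesses, not the original file.

```octave
% Hypothetical contents of mytestfunction.m (assumed, not the original file).
% Returns three values: the square of each input and the product of the squares.
function [a, b, c] = mytestfunction(x, y)
  a = x^2;       % 3 -> 9
  b = y^2;       % 4 -> 16
  c = a * b;     % 9 * 16 -> 144
end
```

Requesting fewer outputs, for example a = mytestfunction(3,4), simply discards the remaining return values.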
ps: The definition of the cost function
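The definition itself is not reproduced here. As a sketch, a squared-error cost function for linear regression is commonly written as below, assuming X already contains the x_0 = 1 column and that the file is named costFunctionJ.m (the name is an assumption):

```octave
% Sketch of a squared-error cost function (assumed file: costFunctionJ.m).
% X: m-by-(n+1) design matrix, y: m-by-1 targets, theta: (n+1)-by-1 parameters.
function J = costFunctionJ(X, y, theta)
  m = size(X, 1);                        % number of training examples
  predictions = X * theta;               % hypothesis for every example
  sqrErrors = (predictions - y) .^ 2;    % squared error per example
  J = 1 / (2 * m) * sum(sqrErrors);      % J(theta) = 1/(2m) * sum of squared errors
end
```

With X = [1 1; 1 2; 1 3] and y = [1; 2; 3], theta = [0; 1] fits the data exactly and gives J = 0, while theta = [0; 0] gives J = 14/6.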