Programming Exercise 1: Linear Regression

1. Warm-up exercise

Requirement: generate a 5×5 identity matrix.

warmUpExercise function:

function A = warmUpExercise()   % define the function warmUpExercise
A = eye(5);                     % eye(n) generates an n-by-n identity matrix, here 5x5
end

Call:

fprintf('Running warmUpExercise ... \n');
fprintf('5x5 Identity Matrix: \n');
warmUpExercise()    % call the function

fprintf('Program paused. Press enter to continue.\n');
pause;

Output: the 5×5 identity matrix is printed.

2. Plotting

Requirement: using the given data (the first column is the population of a city, the second column is the profit of a restaurant in that city; a negative value indicates a loss), draw a scatter plot to help choose a city for a new restaurant.

plotData function:

function plotData(X, y)
plot(X, y, 'rx', 'MarkerSize', 10);        % plot the data as red crosses
ylabel('Profit in $10,000s');              % set the y-axis label
xlabel('Population of City in 10,000s');   % set the x-axis label
end

* 'MarkerSize': marker size; the default is 6

* 'rx': red cross markers

 

Call:

fprintf('Plotting Data ...\n')
data = load('ex1data1.txt');     % load the data file
X = data(:, 1); y = data(:, 2);  % first column goes into X, second column into y
m = length(y);                   % m is the number of training examples

plotData(X, y);                  % call plotData with X and y

fprintf('Program paused. Press enter to continue.\n');
pause;

Output: a scatter plot of profit versus city population.

3. Cost function and gradient descent

The goal of linear regression is to minimize the cost function J(θ):

J(\theta) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2

This is the squared-error cost: the smaller the error, the better the fit.

The hypothesis h_θ(x) is given by the linear model:

h_\theta(x) = \theta^T x = \theta_0 + \theta_1 x_1

The parameters of the model are θ. Gradient descent minimizes the cost function by adjusting the values of θ; in this algorithm, θ is updated on every iteration:

\theta_j := \theta_j - \alpha \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_j^{(i)}

With each step of gradient descent, the parameters θ move closer to the optimal values that minimize the cost J(θ).

* Note that := is different from =: it denotes a simultaneous update. Each new θ is first computed into a temporary variable, and only after all of them have been computed are they assigned back together. For example (a fuller sketch for this exercise follows after the snippet):

temp1 = theta1 - alpha * dJ_dtheta1;   % dJ_dtheta1 stands for the partial derivative of J w.r.t. theta1
temp2 = theta2 - alpha * dJ_dtheta2;
theta1 = temp1;                        % assign back only after all updates are computed
theta2 = temp2;
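As a concrete illustration, here is a minimal sketch of one non-vectorized gradient-descent step for this exercise's two-parameter case; it assumes alpha, m, X (with a leading column of ones), y, and theta are already defined as in the rest of the script:

% One non-vectorized gradient descent step (illustration only; the assignment
% itself uses the vectorized update shown later in gradientDescent.m).
grad0 = 0; grad1 = 0;
for i = 1:m
    h_i = theta(1)*X(i,1) + theta(2)*X(i,2);   % hypothesis for training example i
    grad0 = grad0 + (h_i - y(i)) * X(i,1);     % accumulate gradient term for theta_0
    grad1 = grad1 + (h_i - y(i)) * X(i,2);     % accumulate gradient term for theta_1
end
temp0 = theta(1) - alpha * grad0 / m;          % compute both updates first...
temp1 = theta(2) - alpha * grad1 / m;
theta(1) = temp0;                              % ...then assign them back together
theta(2) = temp1;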

 

Add a column of ones to the data, because the coefficient of θ0 in the hypothesis is 1. The parameters θ are initialized to 0 and the learning rate α to 0.01.

In code (a short sketch after this block shows why the column of ones works):

X = [ones(m, 1), data(:, 1)];  % add a column of ones as the first column of X
theta = zeros(2, 1);           % initialize the fitting parameters

iterations = 1500;             % number of iterations
alpha = 0.01;                  % learning rate, set to 0.01
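To see why the extra column of ones is needed: with it, row i of X is [1, x_i], so a single matrix product evaluates the hypothesis for every training example at once (a minimal sketch, assuming X and theta as above):

% With X = [ones(m,1), data(:,1)], X*theta computes theta(1) + theta(2)*x_i
% for every example in one product; for theta = zeros(2,1) this is all zeros.
h = X * theta;    % m-by-1 vector of hypothesis values h_theta(x^(i))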

computeCost.m -- compute the cost function J (used to check convergence):

function J = computeCost(X, y, theta)
m = length(y);
J = 0;
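% vectorized form of J = (1/(2m)) * sum over i of (h_theta(x^(i)) - y^(i))^2: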
J = sum((X*theta-y).^2)/(2*m);
end
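For comparison, an equivalent loop-based version (a sketch for illustration only; the name computeCostLoop is hypothetical and not part of the assignment):

function J = computeCostLoop(X, y, theta)
m = length(y);
J = 0;
for i = 1:m
    h = X(i,:) * theta;        % hypothesis for training example i
    J = J + (h - y(i))^2;      % accumulate the squared error
end
J = J / (2*m);                 % same value as the vectorized computeCost above
end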

Call the cost function:

fprintf('\nTesting the cost function ...\n')
% compute and display initial cost 
J = computeCost(X, y, theta);  % call the cost function with theta = [0; 0]
fprintf('With theta = [0 ; 0]\nCost computed = %f\n', J);
fprintf('Expected cost value (approx) 32.07\n');

% further testing of the cost function
J = computeCost(X, y, [-1 ; 2]);  % call the cost function with theta = [-1; 2]
fprintf('\nWith theta = [-1 ; 2]\nCost computed = %f\n', J);
fprintf('Expected cost value (approx) 54.24\n');
fprintf('Program paused. Press enter to continue.\n');
pause;

Output:

It can be seen that the cost with theta = [0; 0] (about 32.07) is smaller than the cost with theta = [-1; 2] (about 54.24), so [0; 0] is the better of the two.

 

gradientDescent.m -- run gradient descent:

Note: a good way to verify that gradient descent is working is to inspect the value of J and check that it decreases with each step. The gradientDescent.m code calls computeCost on every iteration and records the value of J. If gradient descent and computeCost are implemented correctly, J should never increase, and it should converge to a stable value by the end of the algorithm.
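One simple way to perform this check (a sketch, not part of the provided script) is to plot the recorded cost history after running gradient descent:

% Plot the cost history returned by gradientDescent to confirm that J decreases.
[theta, J_history] = gradientDescent(X, y, theta, alpha, iterations);
figure;
plot(1:numel(J_history), J_history, '-b', 'LineWidth', 2);
xlabel('Iteration'); ylabel('Cost J');   % J should decrease and flatten out as it converges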

In gradient descent, each iteration performs the following update (simultaneously for j = 0 and j = 1):

\theta_j := \theta_j - \alpha \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_j^{(i)}

Gradient descent function:

function [theta, J_history] = gradientDescent(X, y, theta, alpha, num_iters)
%GRADIENTDESCENT Performs gradient descent to learn theta
%   theta = GRADIENTDESCENT(X, y, theta, alpha, num_iters) updates theta by 
%   taking num_iters gradient steps with learning rate alpha

% Initialize some useful values
m = length(y); % number of training examples
J_history = zeros(num_iters, 1);

for iter = 1:num_iters

    % ====================== YOUR CODE HERE ======================
    % Instructions: Perform a single gradient step on the parameter vector
    %               theta. 
    %
    % Hint: While debugging, it can be useful to print out the values
    %       of the cost function (computeCost) and gradient here.
    %

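    % Vectorized batch update: (X*theta - y) is the vector of errors, and
    % X' * (X*theta - y) sums error * feature over all m examples for each theta_j.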
    theta = theta-alpha*(1/m)*X'*(X*theta-y);

    % ============================================================

    % Save the cost J in every iteration    
    J_history(iter) = computeCost(X, y, theta);

end

end

Call the gradient descent function:

fprintf('\nRunning Gradient Descent ...\n')
% run gradient descent
theta = gradientDescent(X, y, theta, alpha, iterations);

% print theta to screen
fprintf('Theta found by gradient descent:\n');   % theta computed by gradient descent
fprintf('%f\n', theta);
fprintf('Expected theta values (approx)\n');     % the expected theta
fprintf(' -3.6303\n  1.1664\n\n');

Output: theta converges to approximately -3.6303 and 1.1664.

Finally, plot the fitted line using the resulting parameters and predict the profit for populations of 35,000 and 70,000:

% Plot the linear fit
hold on;                                      % keep the previous plot visible, i.e. keep the sample points on the figure
plot(X(:, 2), X*theta, '-')
legend('Training data', 'Linear regression')  % create the legend labels
hold off                                      % don't overlay any more plots on this figure

% Predict values for population sizes of 35,000 and 70,000
predict1 = [1, 3.5] * theta;
fprintf('For population = 35,000, we predict a profit of %f\n',...
    predict1*10000);
predict2 = [1, 7] * theta;
fprintf('For population = 70,000, we predict a profit of %f\n',...
    predict2*10000);

fprintf('Program paused. Press enter to continue.\n');
pause;

Output: the predicted profits for populations of 35,000 and 70,000 are printed.
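Since both the population feature and the profit are expressed in units of 10,000, the prediction for an arbitrary population count can be wrapped in a small helper (a hypothetical sketch, saved e.g. as predictProfit.m; not part of the assignment):

function profit = predictProfit(theta, population)
% Hypothetical helper: predict profit in dollars for a raw population count,
% assuming theta was learned on the population feature scaled to units of 10,000.
x = [1, population / 10000];    % feature row [1, x] in units of 10,000
profit = (x * theta) * 10000;   % hypothesis value, converted back to dollars
end

For example, predictProfit(theta, 35000) reproduces predict1*10000 above.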

Note the difference between A*B and A.*B: the former is matrix multiplication, the latter is element-wise multiplication.
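A quick illustration of the difference:

A = [1 2; 3 4];
B = [5 6; 7 8];
A * B     % matrix product:       [19 22; 43 50]
A .* B    % element-wise product: [ 5 12; 21 32]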

 

4. Visualizing the cost function J

This code is already provided:

fprintf('Visualizing J(theta_0, theta_1) ...\n')

% Grid over which we will calculate J
theta0_vals = linspace(-10, 10, 100);
theta1_vals = linspace(-1, 4, 100);

% initialize J_vals to a matrix of 0's
J_vals = zeros(length(theta0_vals), length(theta1_vals));

% Fill out J_vals
for i = 1:length(theta0_vals)
    for j = 1:length(theta1_vals)
      t = [theta0_vals(i); theta1_vals(j)];
      J_vals(i,j) = computeCost(X, y, t);
    end
end


% Because of the way meshgrids work in the surf command, we need to
% transpose J_vals before calling surf, or else the axes will be flipped
J_vals = J_vals';
% Surface plot
figure;
surf(theta0_vals, theta1_vals, J_vals)
xlabel('\theta_0'); ylabel('\theta_1');

% Contour plot
figure;
% Plot J_vals as 20 contours spaced logarithmically between 0.01 and 1000
contour(theta0_vals, theta1_vals, J_vals, logspace(-2, 3, 20))
xlabel('\theta_0'); ylabel('\theta_1');
hold on;
plot(theta(1), theta(2), 'rx', 'MarkerSize', 10, 'LineWidth', 2);

Output: a surface plot and a contour plot of J(θ0, θ1), with the computed θ marked by a red cross on the contour plot.

Source: www.cnblogs.com/vzyk/p/11528304.html