Andrew Ng's Coursera Machine Learning Coding Hw 2

Author: Yu-Shih Chen
December 21, 2018 4:17AM

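Intro:
I am currently a sophomore at a university in California with a strong interest in artificial intelligence and data science, so alongside my regular coursework I also like to take online courses. My main goal in sharing notes on this platform is to help myself learn and review more effectively, so these notes may be a bit messy, and I will not repeat material that I feel I do not need to review again. Of course, if you happen to be taking this course too, come across my notes, and find them somewhat useful, that would make me very happy, haha! If you have any questions or suggestions, or simply want to chat or make friends, feel free to add me on WeChat: y802088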

Week 3 Coding Assignment

Outline:

Sigmoid
Compute cost (without regularization) and Gradient (without reg)
Predict Function
Compute cost (with reg) and Gradients (with reg)

Sigmoid

This section is about writing a function that computes the sigmoid.


function g = sigmoid(z)
%SIGMOID Compute sigmoid function
%   g = SIGMOID(z) computes the sigmoid of z.

% You need to return the following variables correctly 
g = zeros(size(z));

% ====================== YOUR CODE HERE ======================
% Instructions: Compute the sigmoid of each value of z (z can be a matrix,
%               vector or scalar).

g = 1./(1 + exp(-z));

% =============================================================

end

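There are many ways to write this. Just be careful not to change the size of the input matrix (use the element-wise '.' operators where appropriate).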
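To make the point about the '.' operators concrete, here is a minimal sketch that applies the same formula to a made-up 2 x 3 matrix and confirms the output keeps the input's size. The name sigmoid_demo and the sample values are assumptions for illustration only; they are not part of the assignment files.

```octave
% Hypothetical standalone check; sigmoid_demo is an assumed name used only here.
sigmoid_demo = @(z) 1 ./ (1 + exp(-z));   % './' divides element by element

Z = [-2 0 2; 0.5 -0.5 10];                % made-up 2 x 3 input
G = sigmoid_demo(Z);                      % one sigmoid value per entry of Z

disp(size(G));                            % same size as Z
disp(sigmoid_demo(0));                    % the sigmoid of 0 is 0.5
```

Without the dot, '/' is treated as matrix division rather than element-wise division, so for a matrix or vector input the result is generally not the element-wise sigmoid.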

Compute cost (without regularization) and Gradient (without reg)

function [J, grad] = costFunction(theta, X, y)
%COSTFUNCTION Compute cost and gradient for logistic regression
%   J = COSTFUNCTION(theta, X, y) computes the cost of using theta as the
%   parameter for logistic regression and the gradient of the cost
%   w.r.t. to the parameters.

% Initialize some useful values
m = length(y); % number of training examples

% You need to return the following variables correctly 
J = 0;
grad = zeros(size(theta));

% ====================== YOUR CODE HERE ======================
% Instructions: Compute the cost of a particular choice of theta.
%               You should set J to the cost.
%               Compute the partial derivatives and set grad to the partial
%               derivatives of the cost w.r.t. each parameter in theta
%
% Note: grad should have the same dimensions as theta
%
% Note: X: 100 x 3, y: 100 x 1

z = X * theta; % 100 x 1
h_x = sigmoid(z); % 100 x 1
J = (-y' * log(h_x) - (1-y)' * log(1-h_x))/m; % 1 x 1 (semicolon keeps fminunc from printing J on every call)

% calculate gradient
grad = X' * (h_x - y) / m; % 3 x 1, same dimensions as theta
% =============================================================
end

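This function is written for use with fminunc later on. We need to compute two things: J (the cost) and the gradient (the partial derivatives, i.e. the slope). As before, pay attention to the matrix dimensions and transposes, and understand what the original formula is asking for. The rest is just plugging into the formulas.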
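One way to check the matrix shapes and transposes is to compare the vectorized expressions against a plain loop over the training examples. The sketch below does that on a tiny made-up dataset; the data, theta values, and the name sigmoid_demo are all assumptions for illustration, not part of the assignment.

```octave
% Hypothetical toy check: vectorized cost/gradient versus an explicit loop.
sigmoid_demo = @(z) 1 ./ (1 + exp(-z));

X = [1 0.5; 1 -1.0; 1 2.0; 1 0.0];   % made-up data, 4 x 2 (intercept column + 1 feature)
y = [1; 0; 1; 0];                    % 4 x 1
theta = [0.1; -0.3];                 % 2 x 1
m = length(y);

h = sigmoid_demo(X * theta);         % 4 x 1

% Vectorized form, same expressions as in costFunction.
J_vec = (-y' * log(h) - (1 - y)' * log(1 - h)) / m;   % (1 x 4) * (4 x 1) -> 1 x 1
grad_vec = X' * (h - y) / m;                          % (2 x 4) * (4 x 1) -> 2 x 1

% Explicit loop: sum the per-example terms one at a time.
J_loop = 0;
grad_loop = zeros(size(theta));
for i = 1:m
    J_loop = J_loop - (y(i) * log(h(i)) + (1 - y(i)) * log(1 - h(i))) / m;
    grad_loop = grad_loop + (h(i) - y(i)) * X(i, :)' / m;
end

disp(abs(J_vec - J_loop));               % should be at rounding-error level
disp(max(abs(grad_vec - grad_loop)));    % should be at rounding-error level
```

The transposes are what turn the per-example sums into single matrix products, which is why y' and X' appear in the vectorized version.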

Predict Function

function p = predict(theta, X)
%PREDICT Predict whether the label is 0 or 1 using learned logistic 
%regression parameters theta
%   p = PREDICT(theta, X) computes the predictions for X using a 
%   threshold at 0.5 (i.e., if sigmoid(theta'*x) >= 0.5, predict 1)

m = size(X, 1); % Number of training examples

% You need to return the following variables correctly
p = zeros(m, 1);

% ====================== YOUR CODE HERE ======================
% Instructions: Complete the following code to make predictions using
%               your learned logistic regression parameters. 
%               You should set p to a vector of 0's and 1's
%
predictions = sigmoid(X * theta);

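% Apply the 0.5 threshold to each prediction
% (semicolons keep p from being printed on every iteration)
for i = 1 : m
    if predictions(i) >= 0.5
        p(i) = 1;
    else
        p(i) = 0;
    end
end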
% =========================================================================
end

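This section uses the learned theta (the instructor has already run fminunc for us with the function we wrote in the previous section) to predict new values. The threshold is 0.5, so we put the predicted values into a vector, set every entry greater than or equal to 0.5 to 1, and set the rest to 0.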
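The same thresholding can also be written without a loop by comparing the whole vector at once. This is a sketch of an alternative, not the version used above; theta, X, and the name sigmoid_demo are made-up values for illustration.

```octave
% Hypothetical vectorized alternative to the thresholding loop.
sigmoid_demo = @(z) 1 ./ (1 + exp(-z));

theta = [-1; 2];                      % made-up "learned" parameters
X = [1 0.0; 1 0.5; 1 1.0; 1 2.0];     % made-up inputs, 4 x 2 with intercept column

probs = sigmoid_demo(X * theta);      % X * theta is [-1; 0; 1; 3]
p = double(probs >= 0.5);             % logical comparison, converted to 0/1 doubles

disp(p');                             % 0 1 1 1
```

The second example sits exactly on the threshold (the sigmoid of 0 is 0.5), so with the >= comparison it is predicted as 1.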

Compute cost (with reg) and Gradient (with reg)

function [J, grad] = costFunctionReg(theta, X, y, lambda)
%COSTFUNCTIONREG Compute cost and gradient for logistic regression with regularization
%   J = COSTFUNCTIONREG(theta, X, y, lambda) computes the cost of using
%   theta as the parameter for regularized logistic regression and the
%   gradient of the cost w.r.t. to the parameters. 

% Initialize some useful values
m = length(y); % number of training examples

% You need to return the following variables correctly 
J = 0;
grad = zeros(size(theta));

% ====================== YOUR CODE HERE ======================
% Instructions: Compute the cost of a particular choice of theta.
%               You should set J to the cost.
%               Compute the partial derivatives and set grad to the partial
%               derivatives of the cost w.r.t. each parameter in theta

% X is 118 x 28, y is 118 x 1, theta 28 x 1
z = X * theta; % 118 x 1
h_x = sigmoid(z); % 118 x 1

J_unreg = (-y' * log(h_x) - (1-y)' * (log(1 - h_x)))/m;
J = J_unreg + (lambda * sum(theta(2:end,:).^2))/(2*m);
grad(1) = (X(:,1)' * (h_x - y))./m; % 1 x 1
grad(2:end) =  (X(:,2:end)' * (h_x - y)./m) + (lambda .* theta(2:end)./m); % 27 x 1

% =============================================================
end

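If you fully understand how the 'Compute cost (without regularization) and Gradient (without reg)' section was written, this section is easy. Regularizing J just means adding one extra term at the end. For grad, the first component (theta0) uses the same formula as before, and the remaining components add the regularization term. Alternatively, you can set theta0 to 0 (in a copy of theta used only for the regularization term) and write everything in a single expression instead of splitting it into two as I did.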
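The single-expression variant mentioned above could look roughly like the following. It is a sketch on made-up data, and theta_reg and sigmoid_demo are assumed names; the two-part version is repeated only so the two results can be compared.

```octave
% Hypothetical comparison: two-part regularized gradient versus a single expression.
sigmoid_demo = @(z) 1 ./ (1 + exp(-z));

X = [1 0.5 -1.2; 1 -1.0 0.3; 1 2.0 0.8; 1 0.0 -0.5];   % made-up data, 4 x 3
y = [1; 0; 1; 0];
theta = [0.2; -0.4; 0.7];
lambda = 1;
m = length(y);

h = sigmoid_demo(X * theta);          % hypothesis uses the full theta, including theta0

% Two-part version, as in costFunctionReg.
grad_two = zeros(size(theta));
grad_two(1) = X(:, 1)' * (h - y) / m;
grad_two(2:end) = X(:, 2:end)' * (h - y) / m + lambda * theta(2:end) / m;

% Single-expression version: zero out theta0 in a copy so it is not penalized.
theta_reg = theta;
theta_reg(1) = 0;
J = (-y' * log(h) - (1 - y)' * log(1 - h)) / m + lambda * sum(theta_reg .^ 2) / (2 * m);
grad_one = X' * (h - y) / m + lambda * theta_reg / m;

disp(max(abs(grad_one - grad_two)));  % should be at rounding-error level
```

Note that only the copy has its first entry zeroed; the hypothesis h is still computed from the original theta.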

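Summary: this assignment is about learning how to train a logistic regression model. Understanding how fminunc works is important (even though you do not have to write that part here, since the instructor wrote it for you); otherwise you will not know what to do when you implement it yourself. Beyond that, understand the formulas! That is about it.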
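For reference, here is a minimal sketch of what a call to fminunc looks like on a tiny made-up dataset. It is simplified relative to the course script: the cost is passed as a single-output anonymous function, so fminunc approximates the gradient numerically. The course's own call passes costFunction and sets 'GradObj' to 'on' so that the returned grad is used instead. The data and the names sigmoid_demo and cost are assumptions for illustration.

```octave
% Hypothetical fminunc sketch on made-up, non-separable toy data.
sigmoid_demo = @(z) 1 ./ (1 + exp(-z));

X = [1 -2; 1 -1; 1 0; 1 1; 1 2; 1 3];   % intercept column + 1 feature
y = [0; 0; 1; 0; 1; 1];

% Unregularized logistic regression cost as a function of theta only.
cost = @(t) (-y' * log(sigmoid_demo(X * t)) ...
             - (1 - y)' * log(1 - sigmoid_demo(X * t))) / length(y);

initial_theta = zeros(2, 1);
options = optimset('MaxIter', 400);      % no 'GradObj' here, so the gradient is estimated
[theta_opt, J_opt] = fminunc(cost, initial_theta, options);

disp(cost(initial_theta));               % log(2), about 0.6931, at theta = 0
disp(J_opt);                             % cost at the theta returned by fminunc
```

The key idea is that fminunc only needs a function of theta; X and y are captured by the anonymous function, which is the same trick the course script uses when it wraps costFunction.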