1. Logistic regression
1. Background: Use logistic regression to predict whether a student will be admitted to a university.
2. First, visualize the data; the code is as follows:
pos = find(y == 1);   % indices of admitted students
neg = find(y == 0);   % indices of students who were not admitted
plot(X(pos, 1), X(pos, 2), 'k+', 'LineWidth', 2, 'MarkerSize', 7);   % plot admitted students with +
hold on;
plot(X(neg, 1), X(neg, 2), 'ko', 'MarkerFaceColor', 'y', 'MarkerSize', 7);   % plot non-admitted students with o
% Labels and legend
xlabel('Exam 1 score')
ylabel('Exam 2 score')
% Specified in plot order
legend('Admitted', 'Not admitted')
hold off;
3. Implement the sigmoid function; the code is as follows:
function g = sigmoid(z)   % file name: sigmoid.m
%SIGMOID Compute sigmoid function
%   g = SIGMOID(z) computes the sigmoid of z.
g = zeros(size(z));
g = 1 ./ (1 + exp(-z));   % element-wise: g = 1 / (1 + e^(-z))
end
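As a cross-check for readers working outside Octave, the same formula can be sketched in plain Python (an illustrative translation, not part of the original exercise):

```python
import math

def sigmoid(z):
    """Element-wise sigmoid: g(z) = 1 / (1 + e^(-z))."""
    if isinstance(z, list):
        return [sigmoid(v) for v in z]
    return 1.0 / (1.0 + math.exp(-z))

print(sigmoid(0))        # 0.5
print(sigmoid([-2, 2]))  # two values that sum to 1 (sigmoid is symmetric)
```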
4. The cost function is implemented as follows:
function [J, grad] = costFunction(theta, X, y)   % file name: costFunction.m
m = length(y);   % number of training examples
% Compute the cost
J = 1/m * (-(y') * log(sigmoid(X*theta)) - (1 - y)' * log(1 - sigmoid(X*theta)));
% Compute the gradient
grad = 1/m * X' * (sigmoid(X*theta) - y);
end
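In mathematical form, the cost and gradient computed above are the standard logistic-regression formulas, with h_theta(x) = sigmoid(theta' * x):

```latex
J(\theta) = \frac{1}{m}\sum_{i=1}^{m}\left[-y^{(i)}\log\left(h_\theta(x^{(i)})\right) - \left(1-y^{(i)}\right)\log\left(1-h_\theta(x^{(i)})\right)\right]
```

```latex
\frac{\partial J(\theta)}{\partial \theta_j} = \frac{1}{m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)x_j^{(i)}
```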
5. Use the built-in optimizer fminunc() instead of hand-written gradient descent; the code is as follows:
% Setting the parameter GradObj to 'on' tells fminunc() that our cost
% function costFunction() returns both the cost value and the gradient,
% so fminunc() can use the gradient directly in its calculations.
options = optimset('GradObj', 'on', 'MaxIter', 400);
% Run fminunc to obtain the optimal theta.
% This function returns theta and the cost.
[theta, cost] = ...
    fminunc(@(t)(costFunction(t, X, y)), initial_theta, options);
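To illustrate what fminunc() automates, here is a rough Python sketch of the plain gradient-descent alternative on a hypothetical toy dataset (the data, learning rate, and iteration count are arbitrary choices for illustration, not from the exercise):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def cost_and_grad(theta, X, y):
    """Unregularized logistic-regression cost and gradient (plain lists)."""
    m = len(y)
    J = 0.0
    grad = [0.0] * len(theta)
    for xi, yi in zip(X, y):
        h = sigmoid(sum(t * x for t, x in zip(theta, xi)))
        J += -(yi * math.log(h) + (1 - yi) * math.log(1 - h)) / m
        for j, xj in enumerate(xi):
            grad[j] += (h - yi) * xj / m
    return J, grad

# Hypothetical data: each row is [bias, exam score]; labels threshold the score.
X = [[1, 0.0], [1, 1.0], [1, 2.0], [1, 3.0]]
y = [0, 0, 1, 1]
theta = [0.0, 0.0]
alpha = 0.5
for _ in range(200):                      # fixed-step gradient descent
    J, grad = cost_and_grad(theta, X, y)
    theta = [t - alpha * g for t, g in zip(theta, grad)]
# The cost should have dropped well below its starting value of log(2).
print(cost_and_grad(theta, X, y)[0] < math.log(2))  # True
```

fminunc() performs the same job but picks step sizes automatically, which is why the exercise prefers it over tuning alpha by hand.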
6. Use the computed θ values for prediction; the prediction function is as follows:
function p = predict(theta, X)
m = size(X, 1);   % number of training examples
p = zeros(m, 1);
% floor() rounds down, so doubling the sigmoid output maps
% probabilities in [0.5, 1) to 1 and probabilities in [0, 0.5) to 0
p = floor(sigmoid(X*theta) .* 2);
end
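The floor() trick above can be mirrored in a short Python sketch (the weights are hypothetical values chosen for illustration) to confirm that it thresholds the probability at 0.5:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(theta, X):
    """Mirror of the Octave predict(): floor(2 * sigmoid(x . theta))."""
    preds = []
    for xi in X:
        h = sigmoid(sum(t * x for t, x in zip(theta, xi)))
        preds.append(math.floor(2 * h))  # 1 when h >= 0.5, else 0
    return preds

theta = [-4.0, 2.0]          # hypothetical weights: bias and one score
X = [[1, 1.0], [1, 3.0]]     # rows of [bias, score]
print(predict(theta, X))     # [0, 1]
```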
2. Regularized logistic regression
1. Feature mapping: combine the two features (x1, x2) into additional polynomial features such as x1*x2, x1^2, x2^2, and so on. The code is as follows:
function out = mapFeature(X1, X2)
degree = 6;
out = ones(size(X1(:, 1)));   % bias column
for i = 1:degree
    for j = 0:i
        out(:, end+1) = (X1.^(i-j)) .* (X2.^j);
    end
end
% The loops generate 27 polynomial terms in total:
% x1, x2, x1^2, x1*x2, x2^2, x1^3, x1^2*x2, x1*x2^2, x2^3,
% x1^4, x1^3*x2, x1^2*x2^2, x1*x2^3, x2^4,
% x1^5, x1^4*x2, x1^3*x2^2, x1^2*x2^3, x1*x2^4, x2^5,
% x1^6, x1^5*x2, x1^4*x2^2, x1^3*x2^3, x1^2*x2^4, x1*x2^5, x2^6
end
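A Python sketch of the same mapping (written for a single sample rather than a column vector) confirms the column count: 1 bias term plus 27 polynomial terms, 28 in all:

```python
def map_feature(x1, x2, degree=6):
    """Polynomial feature mapping for one sample; bias term comes first."""
    out = [1.0]
    for i in range(1, degree + 1):
        for j in range(i + 1):
            out.append((x1 ** (i - j)) * (x2 ** j))
    return out

feats = map_feature(2.0, 3.0)
print(len(feats))        # 28 = 1 bias + 27 polynomial terms
print(feats[1], feats[2])  # first two terms are x1 and x2 themselves
```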
2. Compute the regularized cost function and gradient for logistic regression:
function [J, grad] = costFunctionReg(theta, X, y, lambda)   % file name: costFunctionReg.m
m = length(y);   % number of training examples
% Regularized cost; theta(1), the bias term, is not regularized
J = 1/m * (-(y') * log(sigmoid(X*theta)) - (1 - y)' * log(1 - sigmoid(X*theta))) ...
    + (1/(2*m)) * lambda * (sum(theta .^ 2) - theta(1)^2);
grad = zeros(size(theta));
% Regularized gradient, then remove the regularization term for theta(1)
grad = 1/m * X' * (sigmoid(X*theta) - y) + lambda*theta/m;
grad(1) = grad(1) - lambda*theta(1)/m;
end
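In mathematical form, the regularized cost computed above is:

```latex
J(\theta) = \frac{1}{m}\sum_{i=1}^{m}\left[-y^{(i)}\log\left(h_\theta(x^{(i)})\right) - \left(1-y^{(i)}\right)\log\left(1-h_\theta(x^{(i)})\right)\right] + \frac{\lambda}{2m}\sum_{j=1}^{n}\theta_j^2
```

and the gradient, with the bias parameter θ0 (theta(1) in Octave's 1-based indexing) left unregularized, is:

```latex
\frac{\partial J(\theta)}{\partial \theta_0} = \frac{1}{m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)x_0^{(i)},\qquad
\frac{\partial J(\theta)}{\partial \theta_j} = \frac{1}{m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)x_j^{(i)} + \frac{\lambda}{m}\theta_j \quad (j \ge 1)
```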