Logistic Regression: Yes or No

Logistic regression solves classification problems: the model has to answer YES or NO. A typical example is spam filtering: when a new message arrives, you need to decide whether it is spam or not.

A simple example

Let's borrow Andrew Ng's example of exam scores and college admissions to explain the Logistic Regression algorithm. Suppose you have the historical exam scores of past applicants together with whether each of them was admitted, and you need to predict whether a new batch of students will be admitted. Part of the data looks like this:

exam1 exam2 Admitted (0: no; 1: yes)
34.62365962451697 78.0246928153624 0
30.28671076822607 43.89499752400101 0
35.84740876993872 72.90219802708364 0
60.18259938620976 86.30855209546826 1
79.0327360507101 75.3443764369103 1
45.08327747668339 56.3163717815305 0
61.10666453684766 96.51142588489624 1

Here the exam1 and exam2 scores are the input to the model, i.e. X. Whether the student was admitted is the output of the model, i.e. Y, where Y \in \{0, 1\}.

Hypothesis function

We define a hypothesis function h_{\theta}(x) to predict the probability that a student will be admitted, so we want the range of h_{\theta}(x) to be [0, 1]. The sigmoid function matches this requirement very well: it maps any real number into (0, 1) along an S-shaped curve.

We can implement this function in python:

import numpy as np


def sigmoid(z):
    # Squash any real-valued input (scalar or numpy array) into (0, 1).
    g = 1 / (1 + np.exp(-z))
    return g

We first define the linear combination z, where x_1 and x_2 are the exam1 and exam2 scores:

z = \theta_0 + \theta_1 * x_1 + \theta_2 * x_2 = \theta^T x

Feeding z into the sigmoid function gives us h_{\theta}(x):

h_{\theta}(x) = sigmoid(\theta^T x) = 1 / (1 + e^{-\theta^T x})
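
As an illustrative sketch (the function name hypothesis is my own choice, not something from the original code), this is h_{\theta}(x) computed with the sigmoid above for a whole matrix of examples at once; X is assumed to carry a leading column of ones so that \theta_0 acts as the intercept:

import numpy as np
from sigmoid import *


def hypothesis(theta, X):
    # h_theta(x) = sigmoid(theta^T x), evaluated for every row of X at once.
    return sigmoid(X.dot(theta))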

Cost function J(\theta)

With h_{\theta}(x) defined, we can define J(\theta) as:

J(\theta) = 1 / m * \sum_{i=1}^m Cost(h_{\theta}(x^i), y^i)

For gradient descent to find the global optimum, J(\theta) must be a convex function. So we define Cost as follows:

Cost(h_{\theta}(x), y) = -log(h_{\theta}(x)) if y = 1
Cost(h_{\theta}(x), y) = -log(1 - h_{\theta}(x)) if y = 0
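
The two branches can be folded into a single expression, which is exactly what the implementation below computes:

Cost(h_{\theta}(x), y) = -y * log(h_{\theta}(x)) - (1 - y) * log(1 - h_{\theta}(x))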

Gradient descent needs the partial derivative of J(\theta), which works out to:

\frac{\partial}{\partial \theta_j}J(\theta) = 1 / m * \sum_{i=1}^m(h_{\theta}(x^{i}) - y^{i}) * x_{j}^{i}

The python implementation is as follows:

import numpy as np
from sigmoid import *


def cost_function(theta, X, y):
    # Returns the cost J(theta) and its gradient for the current theta.
    m = y.size

    # Cost: -y*log(h) - (1-y)*log(1-h), averaged over all m examples.
    item1 = -y * np.log(sigmoid(X.dot(theta)))
    item2 = (1 - y) * np.log(1 - sigmoid(X.dot(theta)))
    cost = (1 / m) * np.sum(item1 - item2)

    # Gradient: (1/m) * sum((h - y) * x_j) for every parameter theta_j.
    grad = (1 / m) * ((sigmoid(X.dot(theta)) - y).dot(X))

    return cost, grad
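
Before optimizing, it helps to check the cost function on the data shown earlier. This is only a sketch under assumptions: the file name ex2data1.txt is a guess at where the comma-separated scores are stored, and the cost_function module mirrors the way sigmoid is imported above. With an all-zero \theta every prediction is 0.5, so the initial cost should come out to about -log(0.5) ≈ 0.693 whatever the data:

import numpy as np
from cost_function import *

# Assumed data file: one "exam1,exam2,admitted" line per student.
data = np.loadtxt('ex2data1.txt', delimiter=',')
X = data[:, 0:2]
y = data[:, 2]

# Prepend a column of ones so theta_0 acts as the intercept term.
X = np.hstack((np.ones((X.shape[0], 1)), X))
initial_theta = np.zeros(X.shape[1])

cost, grad = cost_function(initial_theta, X, y)
print(cost)   # about 0.693, i.e. -log(0.5)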

Instead of hand-rolling gradient descent, we can use scipy to minimize J(\theta) for us:

import scipy.optimize as opt

# cost_function, X, y and initial_theta come from the earlier snippets.


def cost_func(t):
    # Cost only: the objective that fmin_bfgs minimizes.
    return cost_function(t, X, y)[0]


def grad_func(t):
    # Gradient only: passed to fmin_bfgs as fprime.
    return cost_function(t, X, y)[1]


# BFGS finds the theta that minimizes J(theta); full_output also returns the final cost.
theta, cost, *unused = opt.fmin_bfgs(f=cost_func, fprime=grad_func,
                                     x0=initial_theta, maxiter=400, full_output=True, disp=False)

This gives us the \theta the model needs, and the model is now trained.

Visualization

To make the pattern in the data easier to see, we can visualize it: plot the two classes of students and draw the decision boundary. The boundary is the line where \theta^T x = 0, i.e. x_2 = -(\theta_0 + \theta_1 * x_1) / \theta_2, which is what plot_decision_boundary computes below.


import numpy as np
import matplotlib.pyplot as plt


def plot_data(X, y):
    # Scatter the two classes: admitted students as '+', the others as '.'.
    x1 = X[y == 1]
    x2 = X[y == 0]

    plt.scatter(x1[:, 0], x1[:, 1], marker='+', label='Admitted')
    plt.scatter(x2[:, 0], x2[:, 1], marker='.', label='Not admitted')
    plt.legend()


def plot_decision_boundary(theta, X, y):
    # X includes the intercept column, so the two exam scores are columns 1 and 2.
    plot_data(X[:, 1:3], y)

    # Only need two points to define a line, so choose two endpoints
    plot_x = np.array([np.min(X[:, 1]) - 2, np.max(X[:, 1]) + 2])

    # Decision boundary: theta^T x = 0, i.e. x2 = -(theta0 + theta1 * x1) / theta2
    plot_y = (-1 / theta[2]) * (theta[1] * plot_x + theta[0])

    plt.plot(plot_x, plot_y, label='Decision Boundary')

    plt.legend(loc=1)
    plt.axis([30, 100, 30, 100])
    plt.show()
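
A one-line usage sketch, assuming theta, X and y from the earlier snippets are in scope:

plot_decision_boundary(theta, X, y)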

The resulting figure shows the two classes of points with the straight decision boundary separating them.

Prediction

Once \theta has been computed we can of course use it to make predictions, so we write a predict function:

import numpy as np
from sigmoid import *


def predict(theta, X):
    # Predict 1 (admitted) when the model's probability exceeds 0.5, otherwise 0.
    prob = sigmoid(X.dot(theta))
    p = (prob > 0.5).astype(int)
    return p

For the data in X, whenever the predicted probability is greater than 0.5 we predict that the student will be admitted.
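
A minimal usage sketch, again assuming theta, X, y, predict and sigmoid from the earlier snippets are in scope; the scores 45 and 85 for the new student are made-up values for illustration:

import numpy as np

# Accuracy on the training set.
p = predict(theta, X)
print('Train accuracy: {}%'.format(np.mean(p == y) * 100))

# Probability of admission for a hypothetical new student, with a leading 1 for the intercept.
new_student = np.array([1, 45, 85])
print(sigmoid(new_student.dot(theta)))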

That is the basic implementation of the logistic regression algorithm. Logistic regression is a fundamental and very powerful algorithm in ML, and I hope this article is helpful to you.

Reproduced from: https://juejin.im/post/5d05def26fb9a07efb69842c
