Machine Learning Experiment 1

1. Problem (Pattern Classification, Duda et al., 2nd edition, Chinese translation, page 64)

(a) Assuming that the prior probabilities of the first two classes are equal (P(ω1) = P(ω2) = 1/2, P(ω3) = 0), design a classifier that uses only the feature x1 to discriminate between these two categories.
(b) Determine the empirical training error of the sample, that is, the percentage of misclassified points.  
(d) Now repeat the above steps using the two features x1 and x2.
(e) Repeat the above steps using all three features.
(f) Discuss the conclusions reached. In particular, for a limited dataset, is it possible for the empirical error to increase at higher data dimensionality?

2. Experimental process

First write the function CH1_b, which implements formula (69) in the book with the (d/2)ln(2π) term removed as the problem requires; that term is the same constant for both classes, so it does not affect the comparison.
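For reference, the discriminant function being implemented (formula (69) in the Chinese edition; the equation number may differ in other printings) is, for class ωi:

gi(x) = -1/2 (x - μi)^T Σi^{-1} (x - μi) - (d/2) ln 2π - (1/2) ln |Σi| + ln P(ωi)

CH1_b returns this value with the (d/2) ln 2π term dropped.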

One thing to note is that NumPy offers two kinds of multiplication: np.dot() and the * operator.

np.dot(A, B): for two-dimensional arrays this computes the true matrix product, as defined in linear algebra; for one-dimensional arrays it computes the inner product of the two.

Element-wise multiplication of corresponding entries can be written either as np.multiply() or as * (for plain ndarrays; on np.matrix objects, * is the matrix product instead).

Therefore CH1_b uses a different multiplication depending on the value of d.
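A minimal sketch of the difference (A and B are just illustrative values):

import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])

print np.dot(A, B)       # true matrix product: [[19 22], [43 50]]
print A * B              # element-wise product: [[5 12], [21 32]]
print np.multiply(A, B)  # same element-wise product

# On np.matrix objects (as used below), * is the matrix product:
M, N = np.matrix(A), np.matrix(B)
print M * N              # identical to np.dot(A, B)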

Once this function is ready, enter the sample data from the book:

Printing w1 and w2 shows the two 3×10 data matrices (three features, ten samples per class).

Then write the function CH2:

Here d is the number of features used (1, 2, or 3), and w1 and w2 are mat1 and mat2 sliced to the first d rows and transposed, so that each row is a sample.

u1 and u2 are the means of w1 and w2, and sigma1 and sigma2 are their covariance matrices. wrong1 and wrong2 count the misclassified samples of each class.

Substitute these into the CH1_b function, compare the discriminant values for class 1 and class 2 on every sample, and compute the error rates.
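Concretely, the comparison implements the two-class Bayes decision rule (the equal priors enter through the ln P term):

decide ω1 if g1(x) > g2(x), otherwise decide ω2

Each training point that falls on the wrong side of this rule increments wrong1 or wrong2.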

The results for d = 1, 2, and 3 are printed by the three CH2 calls at the end of the program: the misclassification count for each class, followed by the overall empirical error rate.

Code (python2.7):

import math
import numpy as np


def CH1_b(x, u, sigma, P, d):
	# Formula (69) without the constant (d/2)*ln(2*pi) term:
	# g(x) = -0.5*(x-u)^T * sigma^-1 * (x-u) - 0.5*ln|sigma| + ln P
	if d == 1:
		# 1x1 np.matrix objects: * is the matrix product here
		return -0.5*(x-u).T*sigma.I*(x-u) - 0.5*math.log(np.linalg.det(sigma)) + math.log(P)
	else:
		# d > 1: use np.dot for the quadratic form
		return np.dot(x-u, np.dot(sigma.I, -0.5*(x-u).T)) - 0.5*math.log(np.linalg.det(sigma)) + math.log(P)
w1 = np.matrix('-5.01,-5.43,1.08,0.86,-2.67,4.94,-2.51,-2.25,5.56,1.03;\
				-8.12,-3.48,-5.52,-3.78,0.63,3.29,2.09,-2.13,2.86,-3.33;\
				-3.68,-3.54,1.66,-4.11,7.39,2.08,-2.59,-6.94,-2.26,4.33')
w2 = np.matrix('-0.91,1.30,-7.75,-5.47,6.14,3.60,5.37,7.18,-7.39,-7.50;\
				-0.18,-2.06,-4.54,0.50,5.72,1.26,-4.63,1.46,1.17,-6.32;\
				-0.05,-3.53,-0.95,3.92,-4.85,4.36,-3.65,-6.66,6.30,-0.31')
print w1
print w2

def CH2(mat1, mat2, d):
	# Keep the first d features and transpose so each row is a sample
	w1 = mat1[0:d].T
	w2 = mat2[0:d].T
	# Per-class sample means (1 x d)
	u1 = np.mean(w1, axis=0)
	u2 = np.mean(w2, axis=0)
	# Per-class covariance matrices (d x d)
	sigma1 = np.matrix(np.cov(w1.T))
	sigma2 = np.matrix(np.cov(w2.T))
	wrong1 = 0  # class-1 samples misclassified as class 2
	wrong2 = 0  # class-2 samples misclassified as class 1
	for i in range(10):
		if CH1_b(w1[i], u1, sigma1, 0.5, d) < CH1_b(w1[i], u2, sigma2, 0.5, d):
			wrong1 = wrong1 + 1
		if CH1_b(w2[i], u1, sigma1, 0.5, d) > CH1_b(w2[i], u2, sigma2, 0.5, d):
			wrong2 = wrong2 + 1
	print '1:', wrong1
	print '2:', wrong2
	print (wrong1 + wrong2) / 20.0  # empirical training error rate
	return
print 'x1:'
CH2(w1,w2,1)
print 'x1&x2:'
CH2(w1,w2,2)
print 'x1&x2&x3:'
CH2(w1,w2,3)
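
As a quick sanity check (not part of the original post; the names x, u, and s below are ad hoc), CH1_b can also be evaluated on a single sample, here the first x1 value of class 1:

# Hypothetical sanity check: d = 1 discriminant of class 1
# for its first training sample (a larger value means a better fit)
x = np.matrix(w1[0, 0])        # first x1 value of class 1
u = np.mean(w1[0].T, axis=0)   # mean of feature x1 over class 1
s = np.matrix(np.cov(w1[0]))   # variance of feature x1 over class 1
print CH1_b(x, u, s, 0.5, 1)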



