To sum up:
- The naive Bayes method is essentially a probability estimate.
- Because it adds the strong assumption that the input variables are conditionally independent given the class, the number of distribution parameters to estimate is greatly reduced; at the same time, accuracy is reduced.
- A rather counter-intuitive problem in probability theory is the three-door (Monty Hall) problem: since the host is constrained to open a door hiding a goat, that constraint is a given condition, so the corresponding probabilities should change (I have not derived the exact formula here). This problem is related to the prior probability used in the naive Bayes method.
- There are two ways to estimate the probability distributions:
  - Maximum likelihood estimation: uses the raw frequency counts directly.
  - Bayesian estimation: since there is no guarantee that every case appears in the training set, a bias (smoothing) term is added when computing the conditional probabilities, so that on a finite training set no conditional probability is zero.
- Next come the derivations for maximum likelihood and Bayesian estimation (exercise content). The exercises may be a bit long-winded and dry.
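The Monty Hall point above can be checked numerically. Below is a quick simulation sketch (my own helper names, not from the original post): switching doors wins about 2/3 of the time, confirming that the host's constrained reveal really does change the probabilities.

```python
import random

def monty_hall_trial(switch):
    """One round of the three-door problem; returns True on a win."""
    doors = [0, 1, 2]
    car = random.choice(doors)   # prize placed uniformly at random
    pick = random.choice(doors)  # contestant's first pick
    # Host opens a goat door that is neither the pick nor the car.
    opened = random.choice([d for d in doors if d != pick and d != car])
    if switch:
        # Switch to the one remaining closed door.
        pick = next(d for d in doors if d != pick and d != opened)
    return pick == car

random.seed(0)
n = 100_000
stay = sum(monty_hall_trial(switch=False) for _ in range(n)) / n
swap = sum(monty_hall_trial(switch=True) for _ in range(n)) / n
print(f"stay: {stay:.3f}, switch: {swap:.3f}")  # roughly 0.333 vs 0.667
```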
- Here is the code implementation; the comments in the code are fairly clear.
```python
# encoding=utf-8
import time

import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split


def binaryzation(array):
    # Binarize: pixels > 50 become 1, pixels <= 50 become 0,
    # cutting each feature from 256 possible values down to 2.
    array[array[:, :] <= 50] = 0
    array[array[:, :] > 50] = 1
    return array


def read_data(path):
    raw_data = pd.read_csv(path, header=0)
    data = raw_data.values
    imgs = data[:, 1:]
    labels = data[:, 0]
    imgs = binaryzation(imgs)
    # Use 2/3 of the data as the training set, 1/3 as the test set.
    tra_x, test_x, tra_y, test_y = train_test_split(
        imgs, labels, test_size=0.33, random_state=2019526)
    return tra_x, test_x, tra_y, test_y


def naive_bayes(x, y, class_num=10, feature_len=784, point_len=2, lamba=1):
    """
    :param x: training data
    :param y: training labels
    :param class_num: number of classes of y
    :param feature_len: number of dimensions of each feature vector
    :param point_len: number of values each feature can take
    :return: prior probability and conditional probability
    """
    # Here we use Bayesian estimation (Laplace smoothing).
    prior_probability = np.zeros(class_num)
    conditional_probability = np.zeros((class_num, feature_len, point_len))
    # Count the occurrences of each class for the prior probability.
    for i in range(class_num):
        prior_probability[i] += len(y[y[:] == i])
    # Compute the conditional probabilities.
    for i in range(class_num):
        # Take the data points whose label is i.
        data_ith = x[y == i]
        for j in range(feature_len):
            for k in range(point_len):
                # Count how often feature j takes value k, plus the smoothing term.
                conditional_probability[i, j, k] += len(data_ith[data_ith[:, j] == k]) + lamba
        conditional_probability[i, :, :] /= (prior_probability[i] + point_len * lamba)
    prior_probability += lamba
    prior_probability /= (len(y) + class_num * lamba)
    return prior_probability, conditional_probability


def predict(pp, cp, data, class_num=10, feature_len=784):
    """
    :param pp: prior_probability
    :param cp: conditional_probability
    :param data: input data
    :return: predicted labels
    """
    pre_label = np.zeros(len(data))
    for i in range(len(data)):
        max_possibility = 0
        max_label = 0
        for j in range(class_num):
            # Posterior (up to a constant): prior times the product of conditionals.
            tmp = pp[j]
            for k in range(feature_len):
                tmp *= cp[j, k, data[i, k]]
            if tmp > max_possibility:
                max_possibility = tmp
                max_label = j
        pre_label[i] = max_label
    return pre_label


if __name__ == '__main__':
    time_1 = time.time()
    tra_x, test_x, tra_y, test_y = read_data('data/Mnist/mnist_train.csv')
    time_2 = time.time()
    prior_probability, conditional_probability = naive_bayes(tra_x, tra_y, 10, 784, 2, 1)
    time_3 = time.time()
    pre_label = predict(prior_probability, conditional_probability, test_x, 10, 784)
    cous = 0
    for i in range(len(test_y)):
        if pre_label[i] == test_y[i]:
            cous += 1
    print(cous / len(pre_label))
    time_4 = time.time()
    print("time of reading data: ", int(time_2 - time_1))
    print("time of getting model: ", int(time_3 - time_2))
    print("time of predicting: ", int(time_4 - time_3))
```
These are its accuracy and the time consumed:
Reflection:
In fact, this low accuracy can be improved by making point_len larger, because the binarization in my code is too coarse for each pixel: greater than 50 becomes 1 and at most 50 becomes 0. The pixel range could instead be divided into three or more intervals.
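A minimal sketch of that idea (my own `quantize` helper with equal-width bins; the thresholds are an assumption, not the post's code): map pixel values 0..255 into `point_len` levels instead of the single >50 threshold, and pass the same `point_len` to the Bayes routine.

```python
import numpy as np

def quantize(array, point_len=4):
    # Map pixel values 0..255 into point_len equal-width bins (0 .. point_len-1),
    # replacing the single >50 binarization threshold used above.
    bins = np.linspace(0, 256, point_len + 1)[1:-1]  # interior bin edges
    return np.digitize(array, bins)

pixels = np.array([[0, 40, 120, 255]])
print(quantize(pixels, point_len=4))  # [[0 0 1 3]]
```

Each feature then takes `point_len` values, so the conditional-probability table simply grows to `(class_num, feature_len, point_len)` with no other change.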