Machine Learning Experiment 4: Classifying the Wine Dataset with Naive Bayes

1. Define the Naive Bayes class in NaiveBayes.py.

2. Define the methods of the class
(1) Data preprocessing
Here data_list is the sample set, ratio is the fraction of the sample set used as the training set, and random is the random seed.
Preprocessing splits the sample set into a training set and a test set and converts each string value to float or int.
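The preprocessing step described above can be sketched roughly as follows. This is not the original code, only a minimal illustration using the standard library. The function names to_number and split_dataset are hypothetical, the seed parameter stands in for the article's random argument (renamed so it does not shadow Python's random module), and the sample rows are made up.

```python
import random


def to_number(text):
    """Convert a string to int when possible, otherwise to float."""
    try:
        return int(text)
    except ValueError:
        return float(text)


def split_dataset(data_list, ratio, seed):
    """Convert every value to a number, shuffle with a fixed seed, then split.

    ratio is the fraction of rows that goes into the training set.
    """
    rows = [[to_number(value) for value in row] for row in data_list]
    rng = random.Random(seed)  # local generator, the global random state is untouched
    rng.shuffle(rows)
    cut = int(len(rows) * ratio)
    return rows[:cut], rows[cut:]


# Hypothetical rows (all values read as strings): class label, then two features.
raw = [["1", "14.23", "1.71"], ["2", "12.37", "0.94"],
       ["3", "12.86", "1.35"], ["1", "13.20", "1.78"]]
train, test = split_dataset(raw, ratio=0.75, seed=0)
print(len(train), len(test))  # 3 1
```

Fixing the seed makes the split reproducible, so the reported accuracy does not change from run to run.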

(2) Train the Naive Bayes model
Here x_train holds the features of the training samples, y_train holds their categories, and class_num is the number of categories.
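A possible shape for the training step is sketched below. The article only mentions a "probability density function", so modelling each feature with a Gaussian per class is an assumption, as are the function name fit, the choice to store a (mean, std) pair per feature in prob_num_dict, and the labelling of classes as integers 1..class_num.

```python
import math
from collections import defaultdict


def fit(x_train, y_train, class_num):
    """Estimate P(C) and per-class Gaussian parameters for every feature.

    Assumes labels are the integers 1..class_num and that every class
    appears at least once in y_train.
    """
    grouped = defaultdict(list)
    for features, label in zip(x_train, y_train):
        grouped[label].append(features)

    prob_dict = {}      # prior P(C) for each class
    prob_num_dict = {}  # list of (mean, std) per feature, for each class
    for label in range(1, class_num + 1):
        rows = grouped[label]
        prob_dict[label] = len(rows) / len(x_train)
        stats = []
        for column in zip(*rows):  # iterate over features
            mean = sum(column) / len(column)
            var = sum((v - mean) ** 2 for v in column) / len(column)
            stats.append((mean, math.sqrt(var) + 1e-9))  # small offset avoids a zero std
        prob_num_dict[label] = stats
    return prob_dict, prob_num_dict


# Made-up training data: two features, two classes.
x_train = [[1.0, 10.0], [3.0, 14.0], [6.0, 20.0]]
y_train = [1, 1, 2]
prob_dict, prob_num_dict = fit(x_train, y_train, class_num=2)
print(round(prob_dict[1], 3))  # 0.667
```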

(3) Classify the data to be tested (multidimensional data) using the probability density functions obtained from training
Here prob_num_dict is P(Xi|C), prob_dict is P(C), data is the data to be tested, and class_num is the number of categories. This method returns the predicted category of the data.
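As an illustration of this step, the sketch below picks, for every sample, the class with the largest log P(C) + sum of log P(Xi|C). It assumes Gaussian densities with a (mean, std) pair per feature in prob_num_dict and integer labels 1..class_num. These choices, the function names, and the parameter values are hypothetical and not taken from the original code.

```python
import math


def gaussian_log_pdf(x, mean, std):
    """Log of the normal density N(x; mean, std^2)."""
    return -0.5 * math.log(2 * math.pi * std ** 2) - (x - mean) ** 2 / (2 * std ** 2)


def classify(prob_num_dict, prob_dict, data, class_num):
    """Return the predicted class for every sample (row) in data."""
    predictions = []
    for sample in data:
        best_label, best_score = None, -math.inf
        for label in range(1, class_num + 1):
            # log P(C) + sum_i log P(x_i | C)
            score = math.log(prob_dict[label])
            for x, (mean, std) in zip(sample, prob_num_dict[label]):
                score += gaussian_log_pdf(x, mean, std)
            if score > best_score:
                best_label, best_score = label, score
        predictions.append(best_label)
    return predictions


# Hypothetical parameters for two classes and two features.
prob_dict = {1: 0.5, 2: 0.5}
prob_num_dict = {1: [(0.0, 1.0), (0.0, 1.0)], 2: [(5.0, 1.0), (5.0, 1.0)]}
print(classify(prob_num_dict, prob_dict, [[0.2, -0.1], [4.8, 5.3]], class_num=2))  # [1, 2]
```

Summing logarithms instead of multiplying densities avoids numerical underflow when there are many features.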

(4) Compute the accuracy
Here y_true is the true category and y_predict is the predicted category. This method returns the accuracy, that is, the fraction of correct predictions.
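The accuracy computation is simple enough to sketch in a few lines. The function name accuracy and the input checks are illustrative choices, not necessarily what the original code does.

```python
def accuracy(y_true, y_predict):
    """Fraction of positions where the predicted label equals the true label."""
    if len(y_true) != len(y_predict):
        raise ValueError("y_true and y_predict must have the same length")
    if not y_true:
        raise ValueError("at least one label is required")
    correct = sum(1 for t, p in zip(y_true, y_predict) if t == p)
    return correct / len(y_true)


print(accuracy([1, 2, 3, 3], [1, 2, 3, 1]))  # 0.75
```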

(5) Classify the data to be tested (one-dimensional data) using the probability density functions obtained from training
Here prob_num_dict is P(Xi|C), prob_dict is P(C), data is the data to be tested, and class_num is the number of categories. This method returns the predicted category of the data.
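One reading of "one-dimensional" is that data is a single sample passed as a flat list of feature values rather than a list of samples. Under that assumption, and again assuming Gaussian densities with (mean, std) pairs in prob_num_dict, a single-sample variant might look like this. The names log_posterior and classify_one and all numbers are hypothetical.

```python
import math


def log_posterior(sample, prior, stats):
    """Unnormalised log posterior: log P(C) + sum_i log N(x_i; mean_i, std_i^2)."""
    total = math.log(prior)
    for x, (mean, std) in zip(sample, stats):
        total += -0.5 * math.log(2 * math.pi * std ** 2) - (x - mean) ** 2 / (2 * std ** 2)
    return total


def classify_one(prob_num_dict, prob_dict, data, class_num):
    """Return the most probable class for one sample given as a flat list."""
    labels = range(1, class_num + 1)
    return max(labels, key=lambda c: log_posterior(data, prob_dict[c], prob_num_dict[c]))


# Made-up parameters: three classes, one feature.
prob_dict = {1: 0.3, 2: 0.3, 3: 0.4}
prob_num_dict = {1: [(13.7, 0.5)], 2: [(12.3, 0.5)], 3: [(13.2, 0.5)]}
print(classify_one(prob_num_dict, prob_dict, [13.1], class_num=3))  # 3
```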

3. test.py
(1) First import numpy and the class defined above

(2) Define a method to load the data
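A loader could be as small as the sketch below. It assumes a comma-separated text file with one sample per line and returns the values as strings, leaving numeric conversion to the preprocessing step. The function name load_data, the file layout, and the demonstration rows are assumptions, and the original code may well use numpy here instead of the csv module.

```python
import csv
import os
import tempfile


def load_data(path):
    """Read a comma-separated file and return its non-empty rows as lists of strings."""
    with open(path, newline="") as handle:
        return [row for row in csv.reader(handle) if row]


# Demonstration on a temporary file with made-up rows.
with tempfile.NamedTemporaryFile("w", suffix=".data", delete=False, newline="") as tmp:
    tmp.write("1,14.23,1.71\n2,12.37,0.94\n")
rows = load_data(tmp.name)
os.remove(tmp.name)  # clean up the temporary file
print(rows)  # [['1', '14.23', '1.71'], ['2', '12.37', '0.94']]
```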

(3) Define a method that outputs the test results

(4) Define the main function, which calls the functions defined above and outputs the accuracy as well as the predicted category of the test data.

4. Running results
The trained model reaches an accuracy of 0.94 on the test set. For a given set of sample data, the predicted category is 3.


Summary: In this experiment I gained a basic command of the naive Bayes classification model and applied it to predict wine categories.
The overall train of thought was fairly smooth. After combining the experiment with the content of the textbook, I have a deeper understanding of naive Bayes. Compared with other algorithms, it is a simple yet powerful predictive model. It assumes that the input variables are independent of one another given the class. This is a strong assumption that rarely holds in real data, yet the model still works well on many complex problems.

Origin blog.csdn.net/qq_43374681/article/details/118443100