Machine learning, part 3

1. The steps of machine learning

Data, model selection, training, testing, prediction
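
The five steps can be sketched end to end with scikit-learn. This is only an illustration: the iris data, the 70/30 split, the choice of a k-nearest-neighbors classifier, and the made-up measurement in step 5 are all assumptions, not something these notes prescribe.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# 1. Data: load the samples and split them into training and test sets
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# 2. Model selection: a k-nearest-neighbors classifier (an arbitrary choice)
model = KNeighborsClassifier(n_neighbors=5)

# 3. Training: fit the model on the training set
model.fit(X_train, y_train)

# 4. Testing: mean accuracy on the held-out samples
accuracy = model.score(X_test, y_test)
print(f"test accuracy: {accuracy:.2f}")

# 5. Prediction: classify a new, made-up measurement
new_sample = [[5.0, 3.4, 1.5, 0.2]]
prediction = model.predict(new_sample)
print(prediction)
```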

 

2. Install the machine learning library sklearn

pip list    (view the installed packages and their versions)

python -m pip install --upgrade pip

pip install -U scikit-learn

 

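If the installation is broken, uninstall the packages and install them again. The package name on PyPI is scikit-learn; the old sklearn alias is deprecated.

pip uninstall scikit-learn

pip uninstall numpy

pip uninstall scipy

pip install scipy

pip install numpy

pip install scikit-learn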

Official installation guide: https://scikit-learn.org/stable/install.html
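
To check that the installation worked, the three packages can be imported and their versions printed. A minimal sketch; the version numbers depend on your environment.

```python
import numpy
import scipy
import sklearn

# Each package exposes its version as a string
versions = {
    "numpy": numpy.__version__,
    "scipy": scipy.__version__,
    "scikit-learn": sklearn.__version__,
}
for name, version in versions.items():
    print(name, version)
```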

 

2. Import sklearn's dataset

from sklearn.datasets import load_iris

iris = load_iris()

iris.keys()

X = iris.data # feature matrix, one row per sample

y = iris.target # class label of each sample

iris.feature_names # names of the feature columns
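
Printing a few attributes shows what the loaded data looks like. A small sketch using only the load_iris call from the lines above:

```python
from sklearn.datasets import load_iris

iris = load_iris()
X = iris.data    # feature matrix: 150 samples, 4 features
y = iris.target  # integer class label (0, 1 or 2) for each sample

print(X.shape)             # (150, 4)
print(y.shape)             # (150,)
print(iris.feature_names)  # the four measurements, in cm
print(iris.target_names)   # the three iris species
```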

 

3. K-means algorithm

K-means is an iterative process. The algorithm is divided into four steps:

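  Notation (x, k, y): x is the sample data, k is the number of clusters, and y holds the cluster label of each sample. kc holds the k cluster centers.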

1) Select K objects from the data as the initial centers; each object represents one cluster center.

  def initcenter(x, k): ...  # returns kc, the k initial centers

2) Assign each sample to the cluster whose center is nearest to it, measured by Euclidean distance.

  def nearest(kc, xi): ...  # returns j, the index of the center nearest to sample xi

  def xclassify(x, y, kc): ...  # sets y[i] = j for every sample and returns y

3) Update the cluster centers: take the mean of all objects in each cluster as that cluster's new center, and compute the value of the objective function.

  def kcmean(x, y, kc, k): ...  # returns the updated kc and a flag that is True if any center changed

4) Check whether the cluster centers and the value of the objective function have changed. If they have not changed, output the result. If they have, return to step 2).

  kc = initcenter(x, k)

  flag = True

  while flag:

      y = xclassify(x, y, kc)

      kc, flag = kcmean(x, y, kc, k)
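
The four steps above can be sketched with NumPy as follows. This is one possible implementation, not a reference one. It keeps the function names from the outline and makes some assumptions of its own: the first k samples serve as the initial centers (random selection is more common), an empty cluster keeps its old center, the objective function is not computed, and the toy data is made up.

```python
import numpy as np

def initcenter(x, k):
    """Step 1: use the first k samples as the initial centers (an assumption)."""
    return x[:k].astype(float).copy()

def nearest(kc, xi):
    """Return the index of the center closest to sample xi (Euclidean distance)."""
    distances = np.linalg.norm(kc - xi, axis=1)
    return int(np.argmin(distances))

def xclassify(x, y, kc):
    """Step 2: label every sample with the index of its nearest center."""
    for i in range(x.shape[0]):
        y[i] = nearest(kc, x[i])
    return y

def kcmean(x, y, kc, k):
    """Step 3: move each center to the mean of its members.

    Returns the new centers and a flag that is True if any center moved.
    """
    new_kc = kc.copy()
    for j in range(k):
        members = x[y == j]
        if len(members) > 0:  # an empty cluster keeps its old center
            new_kc[j] = members.mean(axis=0)
    flag = not np.allclose(new_kc, kc)
    return new_kc, flag

def kmeans(x, k):
    """Step 4: repeat steps 2 and 3 until the centers stop changing."""
    kc = initcenter(x, k)
    y = np.zeros(x.shape[0], dtype=int)
    flag = True
    while flag:
        y = xclassify(x, y, kc)
        kc, flag = kcmean(x, y, kc, k)
    return y, kc

# Made-up one-dimensional data with three obvious groups near 1, 5 and 9
x = np.array([[1.0], [5.0], [9.0], [1.2], [5.2], [9.2], [0.8], [4.8], [8.8]])
y, kc = kmeans(x, 3)
print(y)
print(kc.ravel())
```

Because the result depends on the initial centers, a different sample order or a different initialization can end in a different, sometimes worse, clustering.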

 

Refer to the official documentation: 

http://scikit-learn.org/stable/modules/generated/sklearn.cluster.KMeans.html#sklearn.cluster.KMeans
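
As a rough usage sketch of the class documented there, the following clusters the iris petal length column into three groups. The parameter values (n_clusters=3, n_init=10, random_state=0) are illustrative choices, not requirements.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris

iris = load_iris()
# Column 2 is "petal length (cm)"; KMeans expects a 2-D array, hence reshape
petal_length = iris.data[:, 2].reshape(-1, 1)

model = KMeans(n_clusters=3, n_init=10, random_state=0)
labels = model.fit_predict(petal_length)  # cluster index for each sample
centers = model.cluster_centers_          # one center per cluster

print(labels[:10])
print(centers.ravel())
```

The cluster indices are arbitrary: they identify groups but do not correspond to the species labels in iris.target.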

 

4. Homework:

1). Work through the k-means clustering process by hand with playing cards: more than 30 cards, 3 clusters.

2). * Write the k-means algorithm yourself, cluster the iris petal length data, and display the result with a scatter plot. (Bonus points)

3). Use sklearn.cluster.KMeans to cluster the iris petal length data and display the result with a scatter plot.

4). Cluster the complete iris data set and display the result with a scatter plot.

5). Think about it: what can the k-means algorithm be used for?

Answer: Grouping items into categories, planning how to process each group together, and so on.

Origin www.cnblogs.com/Gidupar/p/12715273.html