Learning diary (2.15 --- 2.16)

1. Data Standardization

Data standardization subtracts the feature's mean from each value and divides the result by the feature's standard deviation (not the variance). This brings the values of every feature close to 0, which makes computation easier while preserving the characteristics of the data.
from sklearn.preprocessing import StandardScaler
ss = StandardScaler()
X = ss.fit_transform(X)
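To check what fit_transform actually computes, here is a minimal sketch that compares StandardScaler with the formula written out by hand. The toy array is made up for illustration and is not the data used in this diary.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical toy data: 4 samples, 2 features on very different scales.
X = np.array([[1.0, 100.0],
              [2.0, 200.0],
              [3.0, 300.0],
              [4.0, 400.0]])

ss = StandardScaler()
X_scaled = ss.fit_transform(X)

# The same result by hand: (x - mean) / standard deviation, per column.
# StandardScaler uses the population standard deviation (ddof=0),
# which is also NumPy's default for std().
X_manual = (X - X.mean(axis=0)) / X.std(axis=0)

print(np.allclose(X_scaled, X_manual))  # True
print(X_scaled.mean(axis=0))            # each column is approximately 0
print(X_scaled.std(axis=0))             # each column is approximately 1
```

After scaling, both columns are on the same scale even though the raw values differed by a factor of 100.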


2. KNN cross-validation algorithm

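As a rough illustration of KNN with cross-validation, the sketch below tries several values of k and keeps the one with the best mean 5-fold score. The iris dataset, the range of k, and the number of folds are all assumptions chosen for the example; they are not taken from the original notes.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

# Assumption: the built-in iris dataset stands in for the real data.
X, y = load_iris(return_X_y=True)

scores_by_k = {}
for k in range(1, 11):
    knn = KNeighborsClassifier(n_neighbors=k)
    # 5-fold cross-validation: fit on 4 folds, score on the held-out fold.
    scores = cross_val_score(knn, X, y, cv=5)
    scores_by_k[k] = scores.mean()

# Pick the k with the highest mean accuracy across the folds.
best_k = max(scores_by_k, key=scores_by_k.get)
print(best_k, scores_by_k[best_k])
```

Averaging over the folds gives a steadier estimate for each k than a single train/test split would.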

3. Using Jupyter to predict the relationship between wine quality and harvest time

1. When plotting with matplotlib.pyplot, write the x-axis and y-axis labels in English; otherwise the characters are not displayed.
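A minimal sketch of setting English axis labels is shown here. The data points and the label text are hypothetical, and the Agg backend is selected only so the snippet also runs outside a notebook.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; not needed inside Jupyter
import matplotlib.pyplot as plt

# Hypothetical points, only to have something to draw.
fig, ax = plt.subplots()
ax.scatter([1, 2, 3], [2.0, 4.0, 6.0])

# English labels render with matplotlib's default font.
ax.set_xlabel("harvest time")
ax.set_ylabel("wine quality")
ax.set_title("wine quality vs. harvest time")
```

In a Jupyter notebook the figure is shown automatically at the end of the cell; in a script, call plt.show() or fig.savefig(...).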
2. plt.scatter(data[1:, 0], data[1:, 1]) draws a scatter plot; the arguments use Python slicing syntax.
(data[rows from row 1 to the last row, column 0], data[rows from row 1 to the last row, column 1])
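The slicing can be checked on a small array. This is a sketch with made-up numbers; row 0 stands in for a row to be skipped, which is presumably why the original starts at row 1.

```python
import numpy as np

# Hypothetical 4x2 array; row 0 is the row that the slice leaves out.
data = np.array([[0.0, 0.0],
                 [1.0, 10.0],
                 [2.0, 20.0],
                 [3.0, 30.0]])

x = data[1:, 0]  # rows 1..end, column 0
y = data[1:, 1]  # rows 1..end, column 1

print(x)  # [1. 2. 3.]
print(y)  # [10. 20. 30.]
```

Both results are one-dimensional arrays of the same length, which is what plt.scatter expects for its x and y arguments.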
3. x_train, x_test, y_train, y_test = train_test_split(data[1:, 0], data[1:, 1], test_size=0.3)
Explanation:
a. train_test_split comes from sklearn.model_selection and randomly splits the data into a training set and a test set.
b. train_test_split is a function commonly used in cross-validation. It randomly selects train_data and test_data from the samples according to the proportion given by test_size.
train_test_split(train_data, train_target, test_size, random_state)
train_data: the sample features to be split

train_target: the sample results (labels) to be split

test_size: the proportion of samples placed in the test set; if it is an integer, it is the number of test samples

random_state: the seed for the random number generator.
Random number seed: this is effectively the identifier of a set of random numbers. When an experiment has to be repeated, it guarantees that the same set of random numbers is obtained. For example, if you pass the same value every time and leave the other parameters unchanged, you get the same split. If it is not set, the default is None, and each split uses the same ratio but produces a different partition.

The random numbers generated depend on the seed. The relationship between the seed and the random numbers follows two rules:

Different seeds produce different random numbers; the same seed produces the same random numbers, even across different instances.
Running a small experiment is a good way to reinforce this.
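One way such an experiment might look is sketched here. The arrays and the seed value 42 are arbitrary choices for illustration, not values from the original notebook.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical data: 10 samples with one target value each.
X = np.arange(10)
y = X * 2

# Same seed twice: the split is identical.
a_train, a_test, _, _ = train_test_split(X, y, test_size=0.3, random_state=42)
b_train, b_test, _, _ = train_test_split(X, y, test_size=0.3, random_state=42)
print(np.array_equal(a_test, b_test))  # True

# A float test_size is a proportion: 30% of 10 samples is 3.
print(len(a_train), len(a_test))  # 7 3

# An integer test_size is an absolute number of test samples.
_, c_test, _, _ = train_test_split(X, y, test_size=4, random_state=42)
print(len(c_test))  # 4
```

Leaving out random_state and running the split several times will usually give a different partition on each run, while the 7/3 ratio stays the same.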
4. X_train = X_train[:, np.newaxis]
adds a dimension to the data
np.newaxis is functionally equivalent to None; in fact, it is an alias for None.
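The effect on the array shape can be seen in a short sketch. The values are made up; the point is the change from a 1-D array to a single-column 2-D array, the (n_samples, n_features) layout that scikit-learn estimators expect for features.

```python
import numpy as np

# Hypothetical 1-D feature array with 3 samples.
X_train = np.array([1.0, 2.0, 3.0])
print(X_train.shape)  # (3,)

# Add a second axis: each sample becomes a row with one feature.
X_col = X_train[:, np.newaxis]
print(X_col.shape)  # (3, 1)

# np.newaxis is literally None, so both spellings give the same result.
print(np.newaxis is None)                       # True
print(np.array_equal(X_col, X_train[:, None]))  # True
```

X_train.reshape(-1, 1) produces the same (3, 1) array and is a common alternative.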


Origin www.cnblogs.com/Eldq/p/12319712.html