In the real world we handle large volumes of raw data. Machine learning algorithms expect that data to be formatted in a particular way before training starts, so it has to be preprocessed first. First, the sample data is defined as follows:
import numpy as np

# sample data: 4 rows, 3 features each
input_data = np.array([[5.1, -2.9, 3.3],
                       [-1.2, 7.8, -6.1],
                       [3.9, 0.4, 2.1],
                       [7.3, -9.9, -4.5]])
Binarization
Binarization converts every value greater than the threshold to 1 and every value less than or equal to the threshold to 0.
from sklearn import preprocessing

# binarize data: values > 2.1 become 1, all others become 0
data_binarized = preprocessing.Binarizer(threshold=2.1).transform(input_data)
print("\nBinarized data:\n", data_binarized)
Calling this built-in function produces the following output:
Binarized data:
 [[1. 0. 1.]
 [0. 1. 0.]
 [1. 0. 0.]
 [1. 0. 0.]]
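The same rule can be reproduced with a plain NumPy comparison, which makes the strict "greater than" test explicit. The following is a minimal sketch, not the library's internal implementation; it assumes the same sample data and threshold as above, and the names `threshold` and `manual_binarized` are introduced here only for illustration.

```python
import numpy as np

# same sample data as above
input_data = np.array([[5.1, -2.9, 3.3],
                       [-1.2, 7.8, -6.1],
                       [3.9, 0.4, 2.1],
                       [7.3, -9.9, -4.5]])

threshold = 2.1  # illustrative name; same value as used with Binarizer

# elementwise strict comparison gives a boolean array; cast it to 0.0 / 1.0
manual_binarized = (input_data > threshold).astype(float)
print(manual_binarized)
```

Note that the value 2.1 in the third row is equal to the threshold, not greater than it, so it maps to 0.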