[Draft for personal use] One-hot encoding and recovery in Python

This note shows how to use NumPy in Python to one-hot encode a categorical feature, and how to restore the one-hot encoding back to the original categories.
In the code, one_hot_label holds the features after one-hot encoding, and label is the original categorical feature.
Steps:

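  • Generate an n*n identity matrix (1s on the diagonal, 0s elsewhere), where n is the number of categories
  • Provide the input categories as integers in the range 0 to n-1
  • Select one row of the matrix per category to get the one-hot encoding
  • Restore the categories by finding the position of the 1 in each row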
import numpy as np

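one_hot = np.eye(28) # n*n identity matrix, where n is the number of categories (28 in this example)
label = np.array([1, 4, 8, 9, 5, 0]) # input categories (integers in the range 0-27)

# One-hot encode
one_hot_label = one_hot[label.astype(np.int32)] # indexing with the labels selects the matching rows of the identity matrix

# Recover
label = [one_label.tolist().index(1) for one_label in one_hot_label] # find the index at which the 1 sits in each row
# The commented-out code below prints each step of the recovery
# for one_label in one_hot_label:
#     print(one_label)
#     print('*'*50)
#     print(one_label.tolist()) # the row as a one-dimensional list
#     print('-'*50)
#     print(one_label.tolist().index(1)) # index of the first occurrence of 1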
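
The recovery step can also be done without a Python loop. The sketch below is one possible variant, not part of the original note: it wraps the same np.eye indexing in a helper and uses np.argmax along each row to find the position of the 1. The function names one_hot_encode and one_hot_decode are made up for illustration, and the range check is an added assumption (without it, a negative label would silently index from the end of the matrix).

```python
import numpy as np


def one_hot_encode(labels, num_classes):
    """Return a (len(labels), num_classes) one-hot matrix for integer labels."""
    labels = np.asarray(labels, dtype=np.int64)
    # Assumption: labels must lie in [0, num_classes); reject anything else
    if labels.size and (labels.min() < 0 or labels.max() >= num_classes):
        raise ValueError("labels must lie in [0, num_classes)")
    # Row i of the identity matrix is the one-hot vector for category i
    return np.eye(num_classes)[labels]


def one_hot_decode(one_hot_label):
    """Recover integer labels as the column index of the largest entry per row."""
    return np.argmax(one_hot_label, axis=1)


label = np.array([1, 4, 8, 9, 5, 0])
one_hot_label = one_hot_encode(label, num_classes=28)
recovered = one_hot_decode(one_hot_label)

print(one_hot_label.shape)  # (6, 28)
print(recovered.tolist())   # [1, 4, 8, 9, 5, 0]
```

np.argmax returns a NumPy integer array rather than a Python list, so call .tolist() on the result if a plain list is needed, as in the original recovery step.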

Origin blog.csdn.net/qq_44319167/article/details/130362273