One-hot encoding (string and numeric labels)

When validating labels in PyTorch, it is advisable to use one-hot encoding, because it makes the intended functionality easier to implement.

Straight to the code. Code 1: string type. This code converts the string 'hello world' into a matrix with one row per character.

from numpy import argmax
# define input string
data = 'hello world'
print(data)
# define universe of possible input values
alphabet = 'abcdefghijklmnopqrstuvwxyz '
# define a mapping of chars to integers
char_to_int = dict((c, i) for i, c in enumerate(alphabet))
print("char_to_int:",char_to_int,len(char_to_int))
int_to_char = dict((i, c) for i, c in enumerate(alphabet))
print("int_to_char:",int_to_char)
# integer encode input data
integer_encoded = [char_to_int[char] for char in data]
print(integer_encoded,len(integer_encoded))   # the integer code looked up for each character
# one hot encode
onehot_encoded = list()
for value in integer_encoded:
    # one row per character: all zeros except a single 1 at the character's index
    letter = [0 for _ in range(len(alphabet))]
    letter[value] = 1
    onehot_encoded.append(letter)
print(onehot_encoded)
# invert encoding
inverted = int_to_char[argmax(onehot_encoded[0])]
print(inverted)
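
The listing above only inverts the first row. As a minimal sketch, the same steps can be wrapped into two helpers that encode a whole string and decode it back. The function names `one_hot_encode` and `one_hot_decode` are illustrative and not part of the original code, and plain lists are used instead of numpy so that the snippet has no dependencies.

```python
def one_hot_encode(text, alphabet):
    """Return one one-hot row (a list of 0/1 ints) per character of text."""
    char_to_int = {c: i for i, c in enumerate(alphabet)}
    rows = []
    for char in text:
        row = [0] * len(alphabet)
        # raises KeyError if char is not in alphabet
        row[char_to_int[char]] = 1
        rows.append(row)
    return rows


def one_hot_decode(rows, alphabet):
    """Map each one-hot row back to its character and join the result."""
    # row.index(1) finds the position of the single 1,
    # which is what argmax returns for a one-hot row
    return ''.join(alphabet[row.index(1)] for row in rows)


alphabet = 'abcdefghijklmnopqrstuvwxyz '
encoded = one_hot_encode('hello world', alphabet)
decoded = one_hot_decode(encoded, alphabet)
print(len(encoded), len(encoded[0]))  # 11 rows, 27 columns
print(decoded)                        # hello world
```

Each row has as many columns as the alphabet has characters, so the round trip only works for characters that appear in `alphabet`.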

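Code 2: numeric type. This approach is not correct: in a one-hot label, exactly one position is 1 and all the others are 0.

The code here converts the labels into zero-padded binary strings, so a label such as 3 becomes 011, which contains two 1s.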

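label = [1, 2, 3, 4, 5]
print(label)
# despite its name, this list ends up holding binary strings, not one-hot vectors
ont_hot_encode = []
for i in label:
    a = bin(i).replace('0b', '')   # e.g. 5 -> '101'
    print(a.zfill(3))              # pad with leading zeros to width 3
    ont_hot_encode.append(a.zfill(3))
print(ont_hot_encode)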
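
For comparison, here is a minimal sketch of what a one-hot encoding of the same numeric labels could look like. The helper `one_hot_labels` is a hypothetical name, and the choice to order the columns by the sorted distinct labels is an assumption made for illustration; the original post does not specify a column order.

```python
def one_hot_labels(labels):
    """One-hot encode integer labels; columns follow the sorted distinct labels."""
    classes = sorted(set(labels))
    index_of = {c: i for i, c in enumerate(classes)}
    encoded = []
    for value in labels:
        row = [0] * len(classes)
        row[index_of[value]] = 1  # exactly one position per label is set to 1
        encoded.append(row)
    return encoded, classes


label = [1, 2, 3, 4, 5]
one_hot, classes = one_hot_labels(label)
for value, row in zip(label, one_hot):
    print(value, row)
# 1 [1, 0, 0, 0, 0]
# 2 [0, 1, 0, 0, 0]
# 3 [0, 0, 1, 0, 0]
# 4 [0, 0, 0, 1, 0]
# 5 [0, 0, 0, 0, 1]
```

In PyTorch, `torch.nn.functional.one_hot` provides this for integer tensors. It treats the values as class indices starting at 0, so labels 1 to 5 would likely need to be shifted to 0 to 4 first to get five columns.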

Reposted from blog.csdn.net/weixin_42528089/article/details/103903104