Python version pitfall: an MD5 example (MD5 differences between Python 2 and Python 3)

This article describes a Python version pitfall: an MD5 example where Python 2 and Python 3 give different results. Readers who need it can refer to the following.
Start

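For some characters, the MD5 digest computed by Python 2 is not the same as the one computed by Python 3. (MD5 is a hash function rather than encryption, but the pitfall is the same either way.)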

# python2.7
import hashlib

pwd = "xxx" + chr(163) + "fj"  # a byte string in Python 2
checkcode = hashlib.md5(pwd).hexdigest()
print checkcode # ea25a328180680aab82b2ef8c456b4ce
 
# python3.6
import hashlib

pwd = "xxx" + chr(163) + "fj"  # a Unicode string in Python 3
checkcode = hashlib.md5(pwd.encode("utf-8")).hexdigest()
print(checkcode) # b517e074034d1913b706829a1b9d1b67

The difference between the two snippets is that Python 3 requires the string to be encoded first. Without the encoding step, an error occurs:

checkcode = hashlib.md5(pwd).hexdigest()
TypeError: Unicode-objects must be encoded before hashing

This is because the string must be converted to the bytes type before it can be hashed. The default encoding in Python 3 is UTF-8, so I encoded with UTF-8.
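The behaviour described above can be sketched as follows. This is a minimal illustration for Python 3, with illustrative variable names. The exact wording of the error message differs between Python 3 releases, so the sketch only checks the exception type.

```python
import hashlib

pwd = "xxx" + chr(163) + "fj"

# Passing a str directly is rejected in Python 3.
try:
    hashlib.md5(pwd)
    raised = False
except TypeError:
    raised = True

# Encoding first gives hashlib the bytes it expects.
digest = hashlib.md5(pwd.encode("utf-8")).hexdigest()
print(raised, len(digest))
```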

Analysis

If the string does not contain chr(163), the two versions produce the same result, so the problem lies in chr(163):

# python2.7
>>> chr(163)
'\xa3'
 
# python3.6
>>> chr(163)
'\xa3'

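The results of chr look the same here, although in Python 2 the value is a one-byte str and in Python 3 it is a one-character Unicode str (many terminals display it as '£'). Next, look at it as the bytes type: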

# python2.7
>>> bytes(chr(163))
'\xa3'
 
# python3.6
>>> chr(163).encode()
b'\xc2\xa3'

In Python 3, when num < 128, chr(num).encode('utf-8') yields a single byte, identical to the ASCII byte. When num >= 128 (up to 2047, which covers the whole 128-255 range), chr(num).encode('utf-8') yields two bytes.
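A small sketch makes the byte counts concrete. The sample code points are chosen here for illustration only; 2048 is included to show where UTF-8 moves from two bytes to three.

```python
# Number of UTF-8 bytes for a few code points (Python 3).
samples = [65, 127, 128, 163, 255, 2047, 2048]
utf8_lengths = {n: len(chr(n).encode("utf-8")) for n in samples}

for n, length in utf8_lengths.items():
    print(n, chr(n).encode("utf-8"), length)
```

So chr(163) becomes the two bytes b'\xc2\xa3' under UTF-8, which is why the digest no longer matches the one from Python 2.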

Solution
Encode with latin1 instead of UTF-8:

# python3.6
import hashlib

pwd = "xxx" + chr(163) + "fj"
# latin1 maps chr(163) to the single byte 0xa3, as Python 2 did
checkcode = hashlib.md5(pwd.encode("latin1")).hexdigest()
print(checkcode)  # ea25a328180680aab82b2ef8c456b4ce
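If this is needed in more than one place, the fix could be wrapped in a small helper. The sketch below is only an illustration: md5_hex_like_py2 is a hypothetical name, not part of hashlib, and it assumes every character in the input falls in the range 0-255.

```python
import hashlib


def md5_hex_like_py2(text):
    """Hash text the way Python 2 hashed the equivalent byte string.

    Hypothetical helper: assumes every character is in the range 0-255.
    """
    return hashlib.md5(text.encode("latin1")).hexdigest()


pwd = "xxx" + chr(163) + "fj"

# The latin1 route hashes the same bytes that Python 2 held in memory.
same_as_raw_bytes = md5_hex_like_py2(pwd) == hashlib.md5(b"xxx\xa3fj").hexdigest()
# The UTF-8 route hashes different bytes, so the digest differs.
differs_from_utf8 = md5_hex_like_py2(pwd) != hashlib.md5(pwd.encode("utf-8")).hexdigest()
print(same_as_raw_bytes, differs_from_utf8)
```

A character above 255 cannot be encoded as latin1 and raises UnicodeEncodeError, so the helper only suits data that was a plain byte string in Python 2.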

Extra
Why the latin1 encoding? The answer is quite interesting.

First, the chr function. Its documentation can be viewed with help(chr):

chr(...)
  chr(i) -> Unicode character
  Return a Unicode string of one character with ordinal i; 0 <= i <= 0x10ffff.

That is, it returns the single character at the given Unicode code point. Python 3 represents characters internally as Unicode, i.e. the str type, and converts them to the bytes type with encode.

ASCII encodes every character as one byte, but it only covers 0-127. The 128-255 range belongs to the various "extended ASCII" encodings, and the ascii codec in Python 3 does not include it. So running pwd.encode("ascii") on the string above raises UnicodeEncodeError: 'ascii' codec can't encode character '\xa3' in position 3: ordinal not in range(128)
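The error can be reproduced with a short sketch. It encodes the whole pwd string from the earlier examples, so the failing character sits at index 3, after "xxx".

```python
pwd = "xxx" + chr(163) + "fj"

try:
    pwd.encode("ascii")
    error = None
except UnicodeEncodeError as exc:
    error = exc

# start is the index of the first character that could not be encoded.
print(error.encoding, error.start, error.reason)
```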

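So we need an encoding that covers the 128-255 range and uses a fixed width of one byte per character, such as ISO 8859-1, i.e. latin1. latin1 maps each code point from 0 to 255 to the byte with the same value, which is exactly what chr produced in Python 2. Other encodings, such as cp1252, also contain this character, although cp1252 differs from latin1 in the 128-159 range.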
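The one-to-one mapping can be checked with a short sketch in Python 3. The variable names are illustrative.

```python
# Every code point 0-255 encodes to the single byte with the same value.
latin1_is_identity = all(chr(i).encode("latin1") == bytes([i]) for i in range(256))

# Any byte string survives a decode/encode round trip through latin1.
all_bytes = bytes(range(256))
round_trip_ok = all_bytes.decode("latin1").encode("latin1") == all_bytes

# cp1252 happens to agree with latin1 for the pound sign used in this article.
cp1252_pound = chr(163).encode("cp1252")

print(latin1_is_identity, round_trip_ok, cp1252_pound)
```

Because of that round-trip property, latin1 is a convenient choice whenever Python 3 code has to reproduce bytes that Python 2 built with chr.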

Origin blog.csdn.net/haoxun06/article/details/104504642