Python Crawler Series 9: the "salt" in JS encryption, ajax requests

One, the "salt" in JS encryption

1. What a "salt" is: a salt is used in cryptography. For example, our bank card password has six digits, but when we enter it, the bank's system actually appends several extra characters to the original password to form one that is harder to crack. We call this process "salting".
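As a minimal sketch of the idea, the snippet below hashes the same six-digit password with and without a salt. SHA-256, the 16-byte random salt, and the helper name `hash_password` are assumptions chosen for illustration; they are not taken from any bank's actual system.

```python
import hashlib
import os


def hash_password(password, salt):
    """Return the hex SHA-256 digest of salt + password (illustrative helper)."""
    return hashlib.sha256(salt + password.encode("utf-8")).hexdigest()


password = "123456"     # a weak six-digit password
salt = os.urandom(16)   # random bytes that would be stored next to the hash

plain_digest = hashlib.sha256(password.encode("utf-8")).hexdigest()
salted_digest = hash_password(password, salt)

print(plain_digest)     # the same for everyone who picks "123456"
print(salted_digest)    # changes whenever the salt changes
```

Because the salt is mixed in before hashing, two users with the same password end up with different stored digests.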

 

"""

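Handle JS encryption

"""
import hashlib
import random
import time


def getSalt():
    """
    Salt formula in the JS: "" + ((new Date).getTime() + parseInt(10 * Math.random(), 10))
    :return: the salt as an integer
    """
    # Millisecond timestamp plus a random integer from 0 to 9, matching the JS formula
    salt = int(time.time() * 1000) + random.randint(0, 9)
    return salt


def getMD5(v):
    """
    :param v: the string to hash
    :return: the hexadecimal MD5 digest of v
    """
    md5zhi = hashlib.md5()
    # update() expects bytes, so encode the string as UTF-8 first
    md5zhi.update(v.encode("utf-8"))
    sign = md5zhi.hexdigest()
    return sign


if __name__ == "__main__":
    salt = getSalt()
    print(salt)
    # Hash the salt itself as a sample input
    print(getMD5(str(salt)))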
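In JS-encrypted forms, a salt like this is commonly concatenated with other values and hashed to produce a signature field. The sketch below assumes a hypothetical scheme (client name + keyword + salt + secret key, hashed with MD5); the function names, field names, and all four values are placeholders and do not describe any particular site.

```python
import hashlib
import random
import time


def make_salt():
    # Millisecond timestamp plus a random integer from 0 to 9, as a string
    return str(int(time.time() * 1000) + random.randint(0, 9))


def make_sign(client, keyword, salt, secret):
    # Hypothetical scheme: MD5 over the concatenation of the four strings
    raw = client + keyword + salt + secret
    return hashlib.md5(raw.encode("utf-8")).hexdigest()


salt = make_salt()
# Placeholder values, not taken from any real site
sign = make_sign("demo-client", "hello", salt, "demo-secret")
form = {"keyword": "hello", "salt": salt, "sign": sign}
print(form)
```

The server can recompute the same hash from the submitted fields, so a request whose salt and sign do not match is easy to reject.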

 

Two, ajax requests

1. An ajax request is an asynchronous request.

2. It always has a URL and a request method, and it may also carry data.

3. The response is generally in JSON format.

4. Case: crawling the Douban movie chart

 

"""

Crawl the Douban movie chart

"""
from urllib import request
import json

url = "https://movie.douban.com/typerank?type_name=%E5%89%A7%E6%83%85&type=11&interval_id=100:90&action="
# Send the request with urllib's default headers
rsp = request.urlopen(url)
data = rsp.read().decode()

# Parse the response body as JSON
data = json.loads(data)
print(data)

This code raises an error because Douban has an anti-crawler mechanism. We modify the code so that Python disguises itself as a browser when it accesses the site.
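To see why a site can tell the script apart from a browser, the following minimal sketch prints the User-Agent that urllib attaches by default and then shows a header set on the Request itself. It sends nothing over the network; the Douban home page URL is used only as a placeholder target.

```python
from urllib import request

# The default opener carries the headers urllib adds to every request
opener = request.build_opener()
default_headers = dict(opener.addheaders)
print(default_headers["User-agent"])   # starts with "Python-urllib/"

# A header set on the Request itself is used instead of the default one
browser_ua = ("Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 "
              "(KHTML, like Gecko) Chrome/80.0.3987.116 Safari/537.36")
req = request.Request("https://movie.douban.com/", headers={"User-Agent": browser_ua})
print(req.get_header("User-agent"))
```

The modified script that follows sets this header in the same way.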

 

"""

Crawl the Douban movie chart

"""
from urllib import request
import json

url_u = "https://movie.douban.com/typerank?type_name=%E5%89%A7%E6%83%85&type=11&interval_id=100:90&action="
# A browser User-Agent so that the request looks like it comes from Chrome
headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.116 Safari/537.36"
}
# Wrap the URL and the headers in a Request object
req = request.Request(url_u, headers=headers)
rsp = request.urlopen(req)
data = rsp.read().decode()

print(data)
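The properties of an ajax request listed in this section (a URL, a request method, optional data, and a JSON response) can be sketched without touching the network. The endpoint, the form fields, and the sample response below are all made up for illustration.

```python
import json
from urllib import parse, request

# Hypothetical endpoint and form fields, used only for illustration
url = "https://example.com/api/list"
form = {"start": 0, "limit": 20}

# Passing encoded bytes as data turns the request into a POST
body = parse.urlencode(form).encode("utf-8")
req = request.Request(url, data=body, headers={"User-Agent": "Mozilla/5.0"})
print(req.get_method())   # POST
print(req.full_url)

# A made-up JSON response body, parsed the same way a real one would be
sample = '[{"title": "Movie A", "score": "9.1"}, {"title": "Movie B", "score": "8.7"}]'
movies = json.loads(sample)
for movie in movies:
    print(movie["title"], movie["score"])
```

Without the `data` argument the same Request would default to GET, which is how the scripts above fetch the page.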

Three, source code

Reptitle9_1_JSEncryption.py

Reptitle9_2_ajaxResponse.py

https://github.com/ruigege66/PythonReptile/blob/master/Reptitle8_1_JSEncryption.py

https://github.com/ruigege66/PythonReptile/blob/master/Reptitle9_2_ajaxResponse.py

2. CSDN: https://blog.csdn.net/weixin_44630050

3. cnblogs (Blog Park): https://www.cnblogs.com/ruigege0000/

4. You are welcome to follow the WeChat official account "Fourier transform", a personal account used only for learning and exchange. Reply "gifts" in the account's chat to get big data learning materials.

 

 


Origin www.cnblogs.com/ruigege0000/p/12343908.html