Example 1: crawling a JD (Jingdong) product page
import requests

url = "https://item.jd.com/100004770237.html"
try:
    r = requests.get(url)
    r.raise_for_status()                 # raise an exception for 4xx/5xx status codes
    r.encoding = r.apparent_encoding     # use the encoding guessed from the content
    print(r.text[:1000])
except:
    print("Crawl failed")
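The try/except pattern above is a general-purpose fetching framework that every example in this article reuses. It can be wrapped in a reusable function; the name `get_html_text` below is a hypothetical choice, not from the original code:

```python
import requests

def get_html_text(url, timeout=30):
    """Fetch a page with the same error handling as the example:
    raise on bad status, guess the encoding from the content, and
    return a fixed message on any failure. (Hypothetical helper.)"""
    try:
        r = requests.get(url, timeout=timeout)
        r.raise_for_status()                 # raises for 4xx/5xx responses
        r.encoding = r.apparent_encoding     # content-sniffed encoding
        return r.text
    except Exception:
        return "Crawl failed"

# A malformed URL takes the except branch without touching the network:
print(get_html_text("not-a-url"))
```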
Example 2: crawling an Amazon product page
import requests

url = "https://www.amazon.cn/dp/B071HXVPXG/ref=lp_659039051_1_2?s=books&ie=UTF8&qid=1580353560&sr=1-2"
try:
    # Amazon rejects the default python-requests User-Agent,
    # so send a browser-like one instead
    kv = {'user-agent': 'Mozilla/5.0'}
    r = requests.get(url, headers=kv)
    r.raise_for_status()
    r.encoding = r.apparent_encoding
    print(r.text[1000:2000])
except:
    print("Crawl failed")
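You can confirm which User-Agent requests would transmit without actually contacting Amazon, by preparing the request instead of sending it. This is a minimal offline sketch; the shortened URL is used only for illustration:

```python
import requests

url = "https://www.amazon.cn/dp/B071HXVPXG"   # shortened for illustration
kv = {'user-agent': 'Mozilla/5.0'}

# Build the request without sending it, then inspect the headers
# that requests would transmit on the wire.
req = requests.Request('GET', url, headers=kv).prepare()
print(req.headers['User-Agent'])
```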
Example 3: submitting search keywords to Baidu/360
import requests

keyword = "python"
try:
    kv = {'q': keyword}                  # 360's keyword parameter is q
    r = requests.get("http://www.so.com/s", params=kv)
    print(r.request.url)                 # the full URL actually requested
    r.raise_for_status()
    print(len(r.text))
except:
    print("Crawl failed")
Note: keyword-submission interfaces of the two search engines
Baidu keyword interface: http://www.baidu.com/s?wd=keyword
360 keyword interface: http://www.so.com/s?q=keyword
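Both interfaces can be checked offline by building prepared requests without sending them, which shows exactly how the params dict is encoded into the query string:

```python
import requests

# Build (but do not send) requests for both interfaces to confirm
# how the params dict becomes the query string.
baidu = requests.Request('GET', 'http://www.baidu.com/s',
                         params={'wd': 'python'}).prepare()
so = requests.Request('GET', 'http://www.so.com/s',
                      params={'q': 'python'}).prepare()
print(baidu.url)
print(so.url)
```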
Example 4: crawling a network image and saving it to disk
import requests
import os

url = "http://img1.3lian.com/2015/w7/97/d/25.jpg"
# Set the directory and file name for the downloaded image; the name can
# reuse the image's original name or be customized
root = "E://Python//"
path = root + url.split('/')[-1]
try:
    if not os.path.exists(root):
        os.mkdir(root)
    if not os.path.exists(path):
        r = requests.get(url)
        with open(path, 'wb') as f:
            f.write(r.content)           # r.content is the binary response body
        print("File saved successfully")
    else:
        print("File already exists")
except:
    print("Crawl failed")
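The file-naming and binary-writing logic above can be exercised without a network or a hard-coded `E://Python` directory. This sketch uses a temporary directory and placeholder bytes standing in for `r.content`:

```python
import os
import tempfile

url = "http://img1.3lian.com/2015/w7/97/d/25.jpg"

# Derive the file name from the last path segment of the URL,
# exactly as the example does with url.split('/')[-1].
name = url.split('/')[-1]

# A temporary directory replaces E://Python so the sketch runs anywhere;
# the bytes written are a placeholder for the downloaded image content.
root = tempfile.mkdtemp()
path = os.path.join(root, name)
if not os.path.exists(path):
    with open(path, 'wb') as f:
        f.write(b'\xff\xd8\xff')   # placeholder for r.content
print(os.path.basename(path))
```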
Example 5: automatic lookup of an IP address's location
import requests

url = "http://m.ip138.com/ip.asp?ip="
try:
    r = requests.get(url + '202.204.80.112')
    r.raise_for_status()
    r.encoding = r.apparent_encoding
    print(r.text[-500:])                 # the location info is near the end of the page
except:
    print("Crawl failed")
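Here the query string is built by plain string concatenation rather than the `params` argument. Splitting the resulting URL back apart, offline, confirms the `ip` parameter is carried as expected:

```python
from urllib.parse import urlsplit, parse_qs

# Rebuild the URL the example requests, then parse it back apart
# to check the path and the ip query parameter.
url = "http://m.ip138.com/ip.asp?ip=" + "202.204.80.112"
parts = urlsplit(url)
print(parts.path)
print(parse_qs(parts.query)['ip'])
```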