Fixing garbled text when requesting a web page with Python's requests package

When fetching a page with requests, the returned page content is sometimes garbled, as with the following code:

import requests

headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/65.0.3325.181 Safari/537.36'
}


def get_all(url, key):
    params = {
        'keyword': key,
        'enc': 'utf-8'
    }
    response = requests.get(url=url, params=params, headers=headers)

    # response.text is decoded with response.encoding; the saved page comes out garbled
    with open('jd.html', 'w', encoding='utf-8') as f:
        f.write(response.text)


if __name__ == '__main__':
    key = input('Enter a search term: ')
    url = 'https://search.jd.com/Search?'
    get_all(url, key)

Part of the returned page:

        <div class="p-name p-name-type-2">
            <a target="_blank" title="æ ???? å ° ¼ è¯'ç ?? ?? and ?? ¢ TAE ?? ¤ç å¤ ?? ?? · å £ é © ç ?? ???? æ½®æμ ?? ç ?? ç ?? and ?? ¢ · å £ «å ¢ ?? ???? is Tae? ? ¤â ???? ?? é ¢ ¼ ?? and ?? ²ä¿®èº "大ç º ?? ?? ???? å ¢ ?? is æ½®ç is ???? ?? å¹'å|ç TAE ?? ???? ?? ???? ¤è¡ £ æ · ç ?? and ?? £ 430E "?? and ?? m²" href="//item.jd.com/51029271063.html" onclick="searchlog(1,51029271063,8,1,'','flagsClk=1077936264')">
                <em>æ ???? å ° ¼ ?? and ?? è¯'ç Tae ¢ ?? ?? ¤<font class="skcolor_ljg">ç ?? ·</font>å¤ ?? å £ é © ç ?? ???? æ½®æμ ?? ç ?? ç ?? and ?? ¢ · å £ «å ¢ ?? ???? is Tae ¤â ?? ?? ?? and ?? ¢ ¼ ?? and ?? ²ä¿®èº "大ç º ?? ???? ?? å is æ½®ç ¢ ?? ???? ???? is å¹'å Tae |ç ???? ?? ¤è¡ £ æ ????<font class="skcolor_ljg">ç ?? · is £ ??</font> 430E "?? and ?? M²</em>
                <i class="code-words" id="J_AD_51029271063"></i>
            </a>
        </div>

Solution and reasoning:
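
The console output further down shows response.encoding as ISO-8859-1 while response.apparent_encoding is UTF-8. requests falls back to ISO-8859-1 for text responses whose Content-Type header names no charset, which is presumably what happened here, so the UTF-8 bytes of the page get decoded with the wrong codec. The effect can be reproduced offline. The following is only a minimal sketch: it uses the search term from the output as sample text, makes no request, and its variable names are illustrative.

```python
# UTF-8 bytes, as a server would send them for the text "男装".
raw = "男装".encode("utf-8")

# Decoding those bytes with the wrong codec produces garbled text:
# each of the six bytes becomes one unrelated Latin-1 character.
garbled = raw.decode("iso-8859-1")

# ISO-8859-1 maps every byte to exactly one character, so nothing is lost:
# re-encoding recovers the original bytes, which can then be decoded correctly.
restored = garbled.encode("iso-8859-1").decode("utf-8")

print(repr(garbled))
print(restored)
```

Because the wrong decoding is lossless, encoding the text back with response.encoding and decoding it with response.apparent_encoding, as the code in this post does, recovers the page intact.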

Code:

import requests

# `headers` is the same dict as in the first listing


def get_all(url, key):
    params = {
        'keyword': key,
        'enc': 'utf-8'
    }
    response = requests.get(url=url, params=params, headers=headers)
    # Print the encoding requests used to decode the page
    print(response.encoding)
    # response.apparent_encoding is the encoding detected from the content itself, here UTF-8
    print(response.apparent_encoding)
    # Transcode: undo the wrong decoding, then decode with the detected encoding
    content = response.text.encode(response.encoding).decode(response.apparent_encoding)
    print(content)
    with open('jd.html', 'w', encoding='utf-8') as f:
        f.write(content)


if __name__ == '__main__':
    key = input('Enter a search term: ')
    url = 'https://search.jd.com/Search?'
    get_all(url, key)

Console output (excerpt):

E:\anaconda\python.exe E:/practice/final phase of/0808/jd.py
Enter a search term: 男装
ISO-8859-1
UTF-8
<!DOCTYPE html>
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<meta http-equiv="X-UA-Compatible" content="IE=edge">
<meta name="renderer" content="webkit">
<meta http-equiv="Cache-Control" content="max-age=300" />
<link rel="dns-prefetch" href="//search.jd.com" /><link rel="dns-prefetch" href="//item.jd.com" /><link rel="dns-prefetch" href="//list.jd.com" /><link rel="dns-prefetch" href="//p.3.cn" /><link rel="dns-prefetch" href="//misc.360buyimg.com" /><link rel="dns-prefetch" href="//nfa.jd.com" /><link rel="dns-prefetch" href="//d.jd.com" /><link rel="dns-prefetch" href="//img12.360buyimg.com" /><link rel="dns-prefetch" href="//img13.360buyimg.com" /><link rel="dns-prefetch" href="//static.360buyimg.com" /><link rel="dns-prefetch" href="//csc.jd.com" /><link rel="dns-prefetch" href="//mercury.jd.com" /><link rel="dns-prefetch" href="//x.jd.com" /><link rel="dns-prefetch" href="//wl.jd.com" /><title>男装 - 商品搜索 - 京东</title><meta name="Keywords" content="男装,京东男装" /><meta name="description" content="Found 260,867 items similar to menswear on JD.com, including “menswear” and other types of menswear products." /><script>
window.loadFa_toJson_data={query:'%E7%94%B7%E8%A3%85'};
window.jdpts={};jdpts._st=new Date().getTime();window.pageConfig={
    closeJpg : 1,
    compatible: false,
    searchType: 0,
    jdfVersion: '2.0.0',
    floatnav: 1,
    price_pdos_off: 0,
    actName: '',
    pSource: 'search_pc',
    queryParam: {
        c1: 0,
        c2: 1342,
        c3: 0,
        brand: '',
        price: '',
        keyword: '男装',
        page: '1'
    }
};
window.searchUnit={
    resizeOnebox: function(g,f,j){var g=parseInt(g),i=typeof f,h=typeof j;if(!isNaN(g)){if("string"==i&&f!=""&&g>0){$("#J_oneBoxFrame_"+f).css("height",g+10);h=="function"&&j()}else{if(i=="undefined"||i=="function"){$("#virtualWareIFrame").css("height",g>0?g+10:0);i=="function"&&f()}}}},
    resizeShopbox: function(e,d){var f=0;switch(e){case 1:case 2:f=145;break;case 3:f=75;break;case 4:f=80;break;default:break}f&&$("#shopboxIFrame").css("height",f).show();typeof(d)=="string"&&(new Image().src=d)},
    coupon: {}};
window.QUERY_KEYWORD='男装';
window.REAL_KEYWORD='男装';
</script>
<link type="text/css" rel="stylesheet" href="//misc.360buyimg.com/??jdf/1.0.0/unit/ui-base/5.0.0/ui-base.css,jdf/1.0.0/unit/shortcut/5.0.0/shortcut.css,jdf/1.0.0/unit/global-header/5.0.0/global-header.css,jdf/1.0.0/unit/myjd/5.0.0/myjd.css,jdf/1.0.0/unit/nav/5.0.0/nav.css,jdf/1.0.0/unit/shoppingcart/5.0.0/shoppingcart.css,jdf/1.0.0/unit/global-footer/5.0.0/global-footer.css,jdf/1.0.0/unit/service/5.0.0/service.css,jdf/1.0.0/unit/global-header-photo/5.0.0/global-header-photo.css,jdf/1.0.0/ui/area/1.0.0/area.css" />
<link type="text/css" rel="stylesheet" href="//misc.360buyimg.com/product/search/1.0.7/css/search.css" />
<script type="text/javascript" src="//misc.360buyimg.com/??jdf/1.0.0/unit/base/5.0.0/base.js,jdf/lib/jquery-1.6.4.js,product/module/es5-shim.js"></script>
<script>
window.SEARCH = {
    cid: 1349,
    ui_ver: '1.0.7',
    c_category: 1342,
    p_category: 0,
    enable_adv: 1,
    enable_prom_adwords: 1,
    enable_prom_flag: 1,
    enable_price: 1,
    enable_stock: 2,
    enable_yyk: 0,
    lottery_code: '',
    is_correct_hash: function(e){var a=["keyword","brand_id","activity_id","coupon_batch","ecard_id"];for(var c=0,b=a.length;c<b;c++){var d=new RegExp("(^|\\?|&)"+a[c]+"=([^&]*)(\\s|&|$)");if(d.test(e)){return true}}return false},
    get_real_hash:function () {var a=window.location.hash.substr(1);if(a&&$.browser.mozilla){return location.href.substr(location.href.indexOf("#")+1)}else{return a}}
};
(function(a,b){var c=b.get_real_hash();if(b.is_correct_hash(c)){a.location.href=a.location.pathname+"?"+c;return false}else{if(a.self!=a.top||$.browser.msie&&$.browser.version<=9){var f=null,e=function(){var d=$(a).width();return 1210>d?$("html").removeClass():$("html").removeClass().addClass(d>=1210&&1390>d?"resp01":"resp02"),true};e();$(a).resize(function(){clearTimeout(f),f=setTimeout(e,20)})}}})(window,SEARCH);
</script>
</head>
<body>
<!--shortcut start-->
<div id="shortcut-2014">
    <div class="w">
        <ul class="fl">
            <li id="ttbar-home"><i class="iconfont">&#xe608;</i><a href="//www.jd.com/" target="_blank">京东首页</a></li>
            <li class="dorpdown" id="ttbar-mycity"></li>
        </ul>

Done.
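
A shorter variant of the transcoding step in get_all is to skip response.text and decode the raw bytes in response.content with the detected encoding. The sketch below is an assumption-laden illustration, not part of the original script: decode_body is a hypothetical helper, and a SimpleNamespace stands in for a real requests.Response so that the example runs without a network call.

```python
from types import SimpleNamespace


def decode_body(response, fallback="utf-8"):
    """Decode the raw response bytes with the detected encoding.

    Hypothetical helper: `response` is assumed to expose `content` (bytes)
    and `apparent_encoding` (str or None), as requests.Response does.
    """
    # apparent_encoding is a guess made from the bytes; it can be None
    # when there is nothing to analyse, hence the fallback.
    encoding = response.apparent_encoding or fallback
    # errors="replace" substitutes U+FFFD for undecodable bytes instead of raising.
    return response.content.decode(encoding, errors="replace")


# Offline stand-in for a real response (assumed values, mirroring the output above).
fake_response = SimpleNamespace(
    content="男装 - 商品搜索 - 京东".encode("utf-8"),
    encoding="ISO-8859-1",
    apparent_encoding="utf-8",
)
print(decode_body(fake_response))
```

With a real response, setting response.encoding = response.apparent_encoding before reading response.text should give the same result without a helper. Either way the detected encoding is a guess, so it is worth checking the saved file by eye.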


Origin www.cnblogs.com/nmsghgnv/p/11322287.html