Simulating a Sina Weibo login with Python

1. Introduction

A simulated login says a lot about a company's technical level and how seriously it takes security. I previously wrote a simulated login for Douban, where sending a single POST request was enough. On Sina Weibo that approach does not work at all. Sina Weibo is anything but simple!!! All kinds of encryption, all kinds of redirects, and a login process that is a real pain!!! After reading many blog posts and failing countless times, I finally logged in successfully! (●'◡'●)


2. Analysis of the login process

I normally use Chrome, so I started by watching the login process in Chrome, but I did not get much useful information out of it. I then saw Fiddler recommended in many places for packet capture, but its output was too cluttered and I never really learned to analyze with it. Then I switched to Firefox and it was a revelation: the whole login process is visible, and the response of each request is easy to inspect (in Chrome I could not see the responses). So for analyzing a complex login, I recommend Firefox!

Now for the moment of truth. Here is what the login actually goes through:

First, after the username is entered, a pre-login request is sent to:

http://login.sina.com.cn/sso/prelogin.php?entry=weibo&callback=sinaSSOController.preloginCallBack&su=ZW5nbGFuZHNldSU0MDE2My5jb20%3D&rsakt=mod&checkpin=1&client=ssologin.js(v1.4.18)&_=1443156845536

The response looks like this:

sinaSSOController.preloginCallBack({"retcode":0,"servertime":1443156842,"pcid":"gz-e88b75a929252baec7c12c741985eaa45627","nonce":"2L...","pubkey":"EB2A38568661887FA180BDDB5CABD5F21C7BFD59C090CB2D245A87AC253062882729293E5506350508E7F9AA3BB77F4333231490F915F6D63C55FE2F08A49B353F444AD3993CACC02DB784ABBB8E42A9B1BBFFFB38BE18D78E87A0E41B9B8F73A928EE0CCEE1F6739884B9777E4FE9E88A1BBE495927AC4A799B3181D6442443","rsakv":"1330428213","showpin":0,"exectime":16})

From it we get four useful variables: servertime, nonce, pubkey and rsakv.
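The pre-login response is JSON wrapped in a JavaScript callback, so the wrapper has to be stripped before parsing. Below is a minimal sketch of that step, written for Python 3 (the listing in section 4 uses Python 2). The sample response and the helper name parse_prelogin are made up for illustration and are not real server output.

```python
import json
import re

# Hypothetical sample response; the field values are placeholders.
SAMPLE = ('sinaSSOController.preloginCallBack({"retcode":0,'
          '"servertime":1443156842,"nonce":"ABC123","pubkey":"EB2A",'
          '"rsakv":"1330428213","showpin":0})')


def parse_prelogin(body):
    """Strip the JSONP callback wrapper and return the four login fields."""
    match = re.search(r'\((.*)\)', body, re.S)
    if match is None:
        raise ValueError('no JSONP payload found')
    data = json.loads(match.group(1))
    return {key: data[key] for key in ('servertime', 'nonce', 'pubkey', 'rsakv')}


fields = parse_prelogin(SAMPLE)
print(fields)
```

Raising an error when the wrapper is missing makes a failed pre-login visible immediately instead of failing later during encryption.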

Sina Weibo currently encodes the username with Base64 (an encoding, not real encryption) and encrypts the login password with RSA (the form sends pwencode=rsa2). This is the key point of the simulated login. An RSA public key has to be created first, and both of its parameters are fixed values supplied by Sina Weibo: the first is the pubkey returned by the pre-login step, and the second is the '10001' found in the JS encryption file (update in response to readers' questions: it is actually in the response of ssologin.js). Both values are hexadecimal and must be converted to decimal; 10001 in hexadecimal is 65537 in decimal. The password is then combined with servertime and nonce and the result is encrypted with this key.
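Here is a rough sketch of the two preparation steps, written for Python 3 with only the standard library. The helper names encode_username and rsa_inputs and all sample values are hypothetical. The plaintext layout (servertime, tab, nonce, newline, password) follows the listing in section 4. The RSA step itself is left out because it needs the third-party rsa package, which the listing calls as rsa.PublicKey(modulus, exponent) and rsa.encrypt(message, key).

```python
import base64
from urllib.parse import quote


def encode_username(username):
    """URL-quote the username, then Base64-encode it (the su field)."""
    quoted = quote(username)
    return base64.b64encode(quoted.encode('ascii')).decode('ascii')


def rsa_inputs(pubkey_hex, servertime, nonce, password):
    """Return the modulus, exponent and plaintext bytes for the RSA step."""
    modulus = int(pubkey_hex, 16)    # pubkey from the pre-login response
    exponent = int('10001', 16)      # fixed value from ssologin.js
    message = '{}\t{}\n{}'.format(servertime, nonce, password)
    return modulus, exponent, message.encode('utf-8')


# Placeholder values, for illustration only.
su = encode_username('user@example.com')
modulus, exponent, message = rsa_inputs('EB2A', 1443156842, 'ABC123', 'secret')
print(su, exponent, message)
```

Because Base64 is reversible, su only hides the username from a casual glance; the password is the only field that is actually encrypted.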

With the preparation done, we can look at what data the login itself requires. Switch to the POST request, whose URL is http://login.sina.com.cn/sso/login.php?client=ssologin.js(v1.4.18), and view the submitted form data.

The main data that needs to be submitted:

su: the Base64-encoded username

servertime, nonce, rsakv: obtained from the pre-login response

sp: the RSA-encrypted password


Once the form data is assembled, the natural next step is to submit it. Think that is the end of it? Too naive!

What comes back after submission is not the Weibo personal home page but a piece of redirect code containing a URL.

Look at the retcode in it: retcode=0 means success; anything else means something went wrong in an earlier step.

Extract the redirect URL with a regular expression, request it, and you're done!
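Here is a small sketch of this last step in Python 3. The sample HTML is a made-up stand-in for the redirect code, and the helper name extract_redirect is hypothetical. It assumes retcode appears as a query parameter of the redirect URL.

```python
import re
from urllib.parse import urlparse, parse_qs

# Hypothetical stand-in for the redirect code returned by the login POST.
SAMPLE_HTML = """<html><head><script>
location.replace('http://weibo.com/ajaxlogin.php?framelogin=1&callback=parent.sinaSSOController.feedBackUrlCallBack&retcode=0');
</script></head></html>"""

REDIRECT_RE = re.compile(r'''location\.replace\(['"](.*?)['"]\)''')


def extract_redirect(html):
    """Return (redirect_url, retcode), or (None, None) if no redirect is found."""
    match = REDIRECT_RE.search(html)
    if match is None:
        return None, None
    url = match.group(1)
    query = parse_qs(urlparse(url).query)
    retcode = query.get('retcode', [None])[0]
    return url, retcode


url, retcode = extract_redirect(SAMPLE_HTML)
if retcode == '0':
    print('redirect looks successful:', url)
else:
    print('login failed, retcode =', retcode)
```

Checking retcode before requesting the URL avoids reporting success when an earlier step, such as the password encryption, went wrong.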



3. Technical points

3.1 Several Libraries

cookielib: The cookie module defines classes for abstracting the concept of cookies, an HTTP state management mechanism. It supports both simple string-only cookies, and provides an abstraction for having any serializable data-type as cookie value. Used here to store cookies.

urllib2: The urllib2 module defines functions and classes which help in opening URLs (mostly HTTP) in a complex world: basic and digest authentication, redirections, cookies and more. Used here to send requests and fetch page data; combined with cookielib, it can make requests that carry cookies.

json: JSON is a lightweight data interchange format inspired by JavaScript object literal syntax (although it is not a strict subset of JavaScript). Used here to parse the pre-login response.
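For readers on Python 3, where cookielib became http.cookiejar and urllib2 was split into urllib.request and urllib.error, the cookie setup might look roughly like this. This is only a sketch; the helper name build_cookie_opener is hypothetical.

```python
import http.cookiejar
import urllib.request


def build_cookie_opener():
    """Create an opener that stores and resends cookies automatically."""
    jar = http.cookiejar.CookieJar()
    handler = urllib.request.HTTPCookieProcessor(jar)
    opener = urllib.request.build_opener(handler)
    return opener, jar


opener, jar = build_cookie_opener()
# No request has been made yet, so the jar is still empty.
print(len(jar))
```

Calling urllib.request.install_opener(opener) afterwards would make plain urlopen() calls use this opener, which mirrors what enableCookies does in the listing in section 4.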

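3.2 Regular expressions

Regular expressions are a powerful tool for processing strings, with their own syntax and a separate matching engine. They may be less efficient than the built-in str methods, but they are far more capable.

For details, see: Python Regular Expression Guide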


4. Source code

# -*- coding: utf-8 -*-
########################
#author:Andrewseu
#date:2015/9/23
#login weibo
########################

import sys
import urllib
import urllib2
import cookielib
import base64
import re
import json
import rsa
import binascii
#import requests
#from bs4 import BeautifulSoup

# Simulated login to Sina Weibo
class weiboLogin:
    def enableCookies(self):
        # Create an object that stores cookies
        cj = cookielib.CookieJar()
        # Bind the cookie store to an HTTP cookie processor
        cookie_support = urllib2.HTTPCookieProcessor(cj)
        # Build an opener with handlers for opening HTTP URLs
        opener = urllib2.build_opener(cookie_support, urllib2.HTTPHandler)
        # Install the opener; later urlopen() calls will use it
        urllib2.install_opener(opener)

    # Pre-login: fetch servertime, nonce, pubkey and rsakv
    def getServerData(self):
        # Note: the su parameter is the Base64-encoded username captured from the browser session
        url = 'http://login.sina.com.cn/sso/prelogin.php?entry=weibo&callback=sinaSSOController.preloginCallBack&su=ZW5nbGFuZHNldSU0MDE2My5jb20%3D&rsakt=mod&checkpin=1&client=ssologin.js(v1.4.18)&_=1442991685270'
        data = urllib2.urlopen(url).read()
        # The JSON payload sits inside the callback's parentheses
        p = re.compile(r'\((.*)\)')
        try:
            json_data = p.search(data).group(1)
            data = json.loads(json_data)
            servertime = str(data['servertime'])
            nonce = data['nonce']
            pubkey = data['pubkey']
            rsakv = data['rsakv']
            return servertime, nonce, pubkey, rsakv
        except (AttributeError, ValueError, KeyError):
            print 'Get servertime error!'
            return None
                

    # Encrypt the password
    def getPassword(self, password, servertime, nonce, pubkey):
        rsaPublickey = int(pubkey, 16)
        key = rsa.PublicKey(rsaPublickey, 65537)  # create the public key
        # Plaintext layout taken from the JS encryption file
        message = str(servertime) + '\t' + str(nonce) + '\n' + str(password)
        passwd = rsa.encrypt(message, key)  # encrypt
        passwd = binascii.b2a_hex(passwd)  # convert the ciphertext to hexadecimal
        return passwd

    # Encode the username
    def getUsername(self, username):
        # URL-quote first, then Base64-encode (b64encode adds no trailing newline)
        username_ = urllib.quote(username)
        username = base64.b64encode(username_)
        return username

    # Build the form data to submit
    def getFormData(self,userName,password,servertime,nonce,pubkey,rsakv):
        userName = self.getUsername(userName)
        psw = self.getPassword(password,servertime,nonce,pubkey)
        
        form_data = {
            'entry':'weibo',
            'gateway':'1',
            'from':'',
            'savestate':'7',
            'useticket':'1',
            'pagerefer':'http://weibo.com/p/1005052679342531/home?from=page_100505&mod=TAB&pids=plc_main',
            'vsnf':'1',
            'su':userName,
            'service':'miniblog',
            'servertime':servertime,
            'nonce':nonce,
            'pwencode':'rsa2',
            'rsakv':rsakv,
            'sp':psw,
            'sr':'1366*768',
            'encoding':'UTF-8',
            'prelt':'115',
            'url':'http://weibo.com/ajaxlogin.php?framelogin=1&callback=parent.sinaSSOController.feedBackUrlCallBack',
            'returntype':'META'
            }
        formData = urllib.urlencode(form_data)
        return formData

    # Log in
    def login(self, username, psw):
        self.enableCookies()
        url = 'http://login.sina.com.cn/sso/login.php?client=ssologin.js(v1.4.18)'
        serverData = self.getServerData()
        if serverData is None:
            # Pre-login failed, so there is nothing to encrypt with
            print 'Login error!'
            return 0
        servertime, nonce, pubkey, rsakv = serverData
        formData = self.getFormData(username, psw, servertime, nonce, pubkey, rsakv)
        headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 6.3; WOW64; rv:41.0) Gecko/20100101 Firefox/41.0'}
        req = urllib2.Request(
            url=url,
            data=formData,
            headers=headers
        )
        result = urllib2.urlopen(req)
        text = result.read()
        print text
        # Not done yet! The response embeds a redirect URL in a script;
        # the login only completes after that URL has been requested
        p = re.compile(r'location\.replace\([\'"](.*?)[\'"]\)')
        try:
            login_url = p.search(text).group(1)
            print login_url
            # The opener installed earlier stores the cookies automatically
            urllib2.urlopen(login_url)
            print "Login success!"
        except (AttributeError, ValueError, urllib2.URLError):
            print 'Login error!'
            return 0

        # Fetch the home page and write it to a file
        url = 'http://weibo.com/u/2679342531/home?topnav=1&wvr=6'
        request = urllib2.Request(url)
        response = urllib2.urlopen(request)
        text = response.read()
        with open("e://weibo.html", "w") as fp_raw:
            fp_raw.write(text)
            
wblogin = weiboLogin()
print 'Sina Weibo simulated login:'
username = raw_input('Username: ')
password = raw_input('Password: ')
wblogin.login(username, password)

5. Result screenshots

Enter the username and password when prompted:

The home page file:


And then? Do whatever you like~~~




Never stop striving as long as you live!





