Monitor your school's academic affairs system with a Python crawler and get new grades as soon as they are posted!

We have had a handful of exams lately, big and small, and the academic affairs system has no grade notification feature. I was anxious to find out how many courses I had failed, so I wrote this script.

Design:
The design is simple. First, parse the grades that are already published into a list. Then crawl the grade query page of the academic affairs system on a timer and parse that result into a second list, newList. If newList is longer than the old list, find the added entries and email them to me.
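The design above can be sketched in a few lines, independent of any real website. This is only an illustration under stated assumptions: `find_new_entries`, `check_once`, `fetch`, and `notify` are hypothetical names that do not appear in the script, and the grade strings are made-up sample data.

```python
from typing import Callable, List


def find_new_entries(old_list: List[str], new_list: List[str]) -> List[str]:
    """Return the entries of new_list that are not in old_list, in order."""
    seen = set(old_list)
    return [entry for entry in new_list if entry not in seen]


def check_once(old_list: List[str],
               fetch: Callable[[], List[str]],
               notify: Callable[[str], None]) -> List[str]:
    """Run one polling round and return the list to keep for the next round.

    fetch and notify are hypothetical stand-ins for the crawl step and
    the email step described in the design.
    """
    new_list = fetch()
    if len(new_list) > len(old_list):
        # The list grew, so report every entry that was not there before.
        for entry in find_new_entries(old_list, new_list):
            notify(entry)
        return new_list
    return old_list


if __name__ == "__main__":
    sent = []
    old = ["_Math_90", "_English_85"]  # made-up sample data
    check_once(old, lambda: old + ["_Physics_78"], sent.append)
    print(sent)  # prints ['_Physics_78']
```

Unlike a first-match comparison, this sketch reports every new entry, which may matter if two grades are published between two checks.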

Script in action:
Server: (screenshot not preserved)

Email notification: (screenshot not preserved)

The code is as follows:

import datetime
import time
from email.header import Header
import requests
import re
import smtplib
from email.mime.text import MIMEText
from bs4 import BeautifulSoup

def listener():
    # Log in by simulating the login request.
    # Normally you would fill in a username and password here,
    # but our school's backend expects both to be encrypted.
    # I worked out the encrypted value by watching the request data
    # in the browser and reading the page source.
    data = {
        # For the school's security, the encryption method is not given here
        'encoded': 'xxxxxxxxxxxxxxxxxxx'
    }
    session = requests.Session()
    session.post('http://jwc.sgu.edu.cn/jsxsd/xk/LoginToXk', data=data)
    # Request all grades for the 2019-2020-1 term
    r_data = {
        'kksj': '2019-2020-1',
        'kcxz': '',
        'kcmc': '',
        'xsfs': 'all'
    }
    r = session.post('http://jwc.sgu.edu.cn/jsxsd/kscj/cjcx_list', data=r_data)
    # Parse the crawled page
    soup = BeautifulSoup(r.text, 'html.parser')
    # Build the list of grades that are already published
    oldList = toList(soup)
    max = len(oldList)
    # Poll the grade page in an infinite loop and check whether new grades have been published
    while True:
        # POST and GET are not interchangeable here; the wrong one returns bad data
        r = session.post('http://jwc.sgu.edu.cn/jsxsd/kscj/cjcx_list', data=r_data)
        soup = BeautifulSoup(r.text, 'lxml')
        # print(soup.prettify())
        length = len(soup.find_all(string=re.compile('2019-2020-1'))) - 1
        print("course_length: ", length)
        if r.status_code == 200 and length != 0:
            if length > max:
                # Build the new grade list
                newlist = toList(soup)
                # The difference between the two lists is the new grade
                diflist = compareTwoList(oldList, newlist)
                oldList = newlist
                if diflist == '':
                    send("unknown error", "unknown error")
                else:
                    # There is a new grade, so email it to me
                    send('You have a new course score!!', diflist)
                max = length
            print('Last time running was: ', datetime.datetime.now())
            # Check once every 500 seconds
            time.sleep(500)
        else:
            # The session was lost: send an email and stop
            print("had disconnected...")
            send("your server is disconnected!!!", "your server is disconnected!!!")
            break

def send(title, msg):
    mail_host = 'smtp.qq.com'
    # Your QQ mailbox name, without .com
    mail_user = 'your QQ mailbox name, without .com'
    # Password (for some mailboxes this is an authorization code)
    mail_pass = 'authorization code'
    # Sender email address
    sender = 'sender email address'
    # Recipient email addresses; keep the [] so that several addresses can be listed
    receivers = ['[email protected]']

    # Build the email
    # Message body
    message = MIMEText(msg, 'plain', 'utf-8')
    # Subject
    message['Subject'] = Header(title, 'utf-8')
    # Sender
    message['From'] = sender
    # Recipient
    message['To'] = receivers[0]

    # Log in and send
    try:
        # smtpObj = smtplib.SMTP()
        # # Connect to the server
        # smtpObj.connect(mail_host, 25)
        smtpObj = smtplib.SMTP_SSL(mail_host)
        # Log in to the server
        smtpObj.login(mail_user, mail_pass)
        # Send
        smtpObj.sendmail(
            sender, receivers, message.as_string())
        # Quit
        smtpObj.quit()
        print('success')
    except smtplib.SMTPException as e:
        print('error', e)  # print the error

def toList(soup):
    flag = True
    list = []
    strs = ''
    # Walk the td values under each tr tag
    for tr in soup.find_all('tr'):
        # Skip the first row, which is the table header
        if flag:
            flag = False
            continue
        i = 1
        for td in tr.stripped_strings:
            # Skip the first two cells of each row
            if i == 1 or i == 2:
                i += 1
                continue
            strs += "_" + td
            i += 1
        list.append(strs)
        strs = ''
    return list

def compareTwoList(oldList,newList):
    diflist=''
    for sub in newList:
        # Check whether this entry is missing from the old list
        if oldList.count(sub) == 0:
            diflist = sub
            break
    return diflist

if __name__ == '__main__':
    listener()
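
The message-building part of send can be tried on its own, without logging in to any SMTP server. The following is a minimal sketch using only the standard library; `build_message` is a hypothetical helper that is not part of the script, and the example.com addresses and the grade string are placeholders.

```python
from email.header import Header
from email.mime.text import MIMEText


def build_message(title, body, sender, receiver):
    """Build a UTF-8 plain-text email the same way send() does, without sending it."""
    message = MIMEText(body, "plain", "utf-8")
    message["Subject"] = Header(title, "utf-8")
    message["From"] = sender
    message["To"] = receiver
    return message


if __name__ == "__main__":
    # Placeholder addresses and content, for illustration only
    msg = build_message("New score", "_Physics_78",
                        "sender@example.com", "receiver@example.com")
    # Print the raw message text that smtplib would transmit
    print(msg.as_string())
```

Printing the raw message first makes it easier to spot header or encoding mistakes before real credentials are involved.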

If nothing goes wrong, this script will keep running until all of my grades are out. But I cannot guarantee that my computer will stay on for that many days, so I run the script on a server.

Origin www.cnblogs.com/huma/p/12158170.html