Monitoring a school's academic affairs system with a Python crawler: get your exam results the moment they are posted!
Over the past few days I have sat a string of exams, big and small. Our academic affairs system has no grade notification feature, and I was anxious to find out how many courses I had failed, so I wrote this script.
Design:
The design is very simple. First, the grades that have already been published are parsed into a list. Then the script periodically crawls the grade query page of the academic affairs system and parses the result into a list as well. If the new list is longer than the old one, the script identifies the added entries and emails them to me.
The script in action:
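The comparison step described above can be sketched roughly as follows. This is only an illustration under assumptions: `find_new_rows` and the sample row strings are hypothetical names, not part of the script. It reports every added row rather than a single one, which may help if several grades are published between two checks.

```python
from collections import Counter


def find_new_rows(old_rows, new_rows):
    """Return every row of new_rows that is not accounted for in old_rows.

    Duplicates are handled by counting: if a row appears once in the old
    list and twice in the new one, it is reported once.
    """
    remaining = Counter(old_rows)
    added = []
    for row in new_rows:
        if remaining[row] > 0:
            # This row was already known; use up one occurrence.
            remaining[row] -= 1
        else:
            added.append(row)
    return added


if __name__ == '__main__':
    # Hypothetical rows in the "_field_field" shape the crawler builds.
    old = ['_Calculus_90']
    new = ['_Calculus_90', '_Physics_85', '_English_77']
    print(find_new_rows(old, new))
```

When the returned list is empty although the page grew, something unexpected happened and an error notification is probably the safer choice.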
Server:
Email notification:
The code is as follows:
import datetime
import re
import smtplib
import time
from email.header import Header
from email.mime.text import MIMEText

import requests
from bs4 import BeautifulSoup


def listener():
    # Log in by simulating the login form submission.
    # Normally a username and password would be posted here, but our
    # school's backend expects them in encoded form. I worked out the
    # encoding by watching the request data in the browser and reading
    # the page source.
    data = {
        # For the school's security, the encoding scheme is not given here.
        'encoded': 'xxxxxxxxxxxxxxxxxxx'
    }
    session = requests.Session()
    session.post('http://jwc.sgu.edu.cn/jsxsd/xk/LoginToXk', data=data)
    # Request all grades for the 2019-2020-1 term.
    r_data = {
        'kksj': '2019-2020-1',
        'kcxz': '',
        'kcmc': '',
        'xsfs': 'all'
    }
    r = session.post('http://jwc.sgu.edu.cn/jsxsd/kscj/cjcx_list', data=r_data)
    # Parse the returned page.
    soup = BeautifulSoup(r.text, 'html.parser')
    # The grades that have already been published.
    oldList = toList(soup)
    max_count = len(oldList)
    # Infinite loop: crawl the grade page on a timer and check whether
    # new grades have been published.
    while True:
        # POST and GET are not interchangeable here; the wrong method
        # returns bad data.
        r = session.post('http://jwc.sgu.edu.cn/jsxsd/kscj/cjcx_list', data=r_data)
        soup = BeautifulSoup(r.text, 'lxml')
        # print(soup.prettify())
        # Count the strings containing the term name, minus one.
        length = len(soup.find_all(string=re.compile('2019-2020-1'))) - 1
        print("course_length: ", length)
        # length > 0 instead of length != 0: a page with no match at all
        # (for example after the session expires) gives -1 and must also
        # count as disconnected.
        if r.status_code == 200 and length > 0:
            if length > max_count:
                # Build the new grade list.
                newlist = toList(soup)
                # The difference between the two lists is the new grade.
                diflist = compareTwoList(oldList, newlist)
                oldList = newlist
                if diflist == '':
                    send("unknown error", "unknown error")
                else:
                    # A new grade is out: email it to me.
                    send('You have a new course score!!', diflist)
                max_count = length
            print('Last time running was: ', datetime.datetime.now())
            # Check once every 500 s.
            time.sleep(500)
        else:
            # The session was lost: send an email and stop.
            print("Had disconnected...")
            send("Your server is disconnected!!!", "Your server is disconnected!!!")
            break


def send(title, msg):
    mail_host = 'smtp.qq.com'
    # Your QQ mailbox name, without .com
    mail_user = 'your qq mailbox name, not .com'
    # Password (for some mailboxes this is an authorization code)
    mail_pass = 'authorization code'
    # Sender address
    sender = 'Sender mail address'
    # Recipient addresses; it is a list, so several can be given
    receivers = ['[email protected]']

    # Message body
    message = MIMEText(msg, 'plain', 'utf-8')
    # Subject
    message['Subject'] = Header(title, 'utf-8')
    # Sender
    message['From'] = sender
    # Recipient
    message['To'] = receivers[0]

    # Log in and send.
    try:
        # smtpObj = smtplib.SMTP()
        # smtpObj.connect(mail_host, 25)
        smtpObj = smtplib.SMTP_SSL(mail_host)
        smtpObj.login(mail_user, mail_pass)
        smtpObj.sendmail(sender, receivers, message.as_string())
        smtpObj.quit()
        print('success')
    except smtplib.SMTPException as e:
        print('error', e)


def toList(soup):
    first_row = True
    rows = []
    # Walk every tr and join the text of its cells.
    for tr in soup.find_all('tr'):
        # Skip the first tr (the first row of the page's table).
        if first_row:
            first_row = False
            continue
        strs = ''
        i = 1
        for td in tr.stripped_strings:
            # Skip the first two strings of each row.
            if i == 1 or i == 2:
                i += 1
                continue
            strs += "_" + td
            i += 1
        rows.append(strs)
    return rows


def compareTwoList(oldList, newList):
    diflist = ''
    for sub in newList:
        # A row that does not occur in the old list is new.
        # Only the first such row is returned.
        if oldList.count(sub) == 0:
            diflist = sub
            break
    return diflist


if __name__ == '__main__':
    listener()
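The mail-building part of send() can also be kept apart from the SMTP delivery, so that the message can be inspected without logging in to a mail server. The sketch below is an assumption-laden illustration, not the script's own code: `build_message` and all the example.com addresses are hypothetical. It also lists every recipient in the To header rather than only the first.

```python
from email.header import Header
from email.mime.text import MIMEText


def build_message(title, body, sender, receivers):
    """Build (but do not send) a UTF-8 plain-text notification email."""
    message = MIMEText(body, 'plain', 'utf-8')
    message['Subject'] = Header(title, 'utf-8')
    message['From'] = sender
    # Put all recipients in the To header, separated by commas.
    message['To'] = ', '.join(receivers)
    return message


if __name__ == '__main__':
    # Hypothetical addresses and grade row, for illustration only.
    demo = build_message('New grade', '_Calculus_90',
                         'me@example.com',
                         ['a@example.com', 'b@example.com'])
    print(demo.as_string())
```

The returned object can then be handed to `smtpObj.sendmail(sender, receivers, message.as_string())` exactly as in the script.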
If nothing goes wrong, the script will keep running until all of my grades are out. But I cannot guarantee that my computer will stay switched on for that many days, so I run the script on a server.
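On an unattended server, a single network hiccup would ideally not end the monitoring. One possible way to structure the loop is sketched below. All names here (`poll`, `fetch_count`, `on_new`, `on_error`, `max_failures`, `max_rounds`) are hypothetical and not taken from the script; `fetch_count` stands in for "request the page and count the grade rows". The sleep function is passed in so that the loop can be tried without waiting.

```python
import time


def poll(fetch_count, on_new, on_error, interval=500, max_failures=3,
         sleep=time.sleep, max_rounds=None):
    """Call fetch_count() every `interval` seconds.

    on_new(n) is called when the count grew by n since the highest count
    seen so far. After `max_failures` consecutive exceptions, on_error(exc)
    is called once and the loop stops. max_rounds limits the number of
    rounds (None means run forever).
    """
    known = None
    failures = 0
    rounds = 0
    while max_rounds is None or rounds < max_rounds:
        rounds += 1
        try:
            count = fetch_count()
        except Exception as exc:  # in real use, e.g. requests.RequestException
            failures += 1
            if failures >= max_failures:
                on_error(exc)
                return
        else:
            failures = 0
            if known is not None and count > known:
                on_new(count - known)
            known = count if known is None else max(known, count)
        sleep(interval)


if __name__ == '__main__':
    # Simulated counts instead of real requests; no actual waiting.
    counts = iter([2, 2, 3])
    poll(lambda: next(counts),
         on_new=lambda n: print(n, 'new grade(s)'),
         on_error=lambda e: print('giving up:', e),
         sleep=lambda seconds: None,
         max_rounds=3)
```

In the real script, `on_new` and `on_error` would both end in a call to send().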