Python Crawler 1: Blog View Counts

Foreword

Many Python packages can be used to write crawlers, but requests bills itself as "HTTP for Humans". It is a bold claim, but the library does live up to it.

This is the first post in my crawler series. It implements a very simple feature: getting the view counts of the posts listed on my own blog.

Of course, a crawler can rarely avoid regular expressions (regex), so Python's re package is also used very often.

 

Analysis

Open the cnblogs site, log in, and click "My essays" on the left:

Press F12 to view the page source:

Each post shows its view count as "read:" followed by a number. Note that the colon is an ASCII (half-width) colon, and the number of digits varies.

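In a regular expression, a single digit is written '\d'.

'*' matches the preceding item 0 to n times, '+' matches it 1 to n times, and '?' matches it 0 or 1 times.

Here, "read:" is always followed by at least one digit, so either '*' or '+' works. '+' is the stricter of the two, because it refuses to match a label with no number after it.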
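A quick way to see the difference between the three quantifiers is to try them on a short string. The sample text below is made up for illustration and is not taken from the real page.

```python
import re

sample = 'read:128 read: read:7'   # made-up text, not real page content

star = re.findall(r'read:\d*', sample)   # '*': digits may be absent
plus = re.findall(r'read:\d+', sample)   # '+': at least one digit required
opt = re.findall(r'read:\d?', sample)    # '?': at most one digit

print(star)   # ['read:128', 'read:', 'read:7']
print(plus)   # ['read:128', 'read:7']
print(opt)    # ['read:1', 'read:', 'read:7']
```

With '*', the entry that has no digits still matches as a bare 'read:', and converting the empty remainder with int() would raise a ValueError. With '+', that entry is simply skipped.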

 

Code

import re
import requests

url = 'https://cnblogs.com/maoerbao/p/'   # URL of the page that lists all my posts
f = requests.get(url).text                # fetch the page and get its HTML as text
label = 'read:'                           # view-count label as it appears on the page (shown here in translation)
a = re.findall(label + r'\d+', f)         # regex: extract the view-count text of each post

zydl = 0   # total views
L = []     # views of each post
for i in a:
    ydl = int(i[len(label):])   # drop the label and keep the number
    zydl = zydl + ydl
    L.append(ydl)

print('Alberta blog:\n-')
print('Total number of posts: %d' % len(L))
print('Total views: %d' % zydl)
print('Most views on a single post: %d' % max(L))
print('Fewest views on a single post: %d' % min(L))
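
The counting step can also be separated from the download, so it can be tried without touching the network. The following is only a sketch: the function name summarize_views, the sample_html string, and the 'read:' label are assumptions made for illustration, and the label has to match whatever text the real page uses.

```python
import re

def summarize_views(html, label='read:'):
    """Return (count, total, largest, smallest) of the view counts found in html."""
    # The capture group keeps only the digits, so no slicing is needed.
    views = [int(n) for n in re.findall(re.escape(label) + r'(\d+)', html)]
    if not views:   # no matches: avoid calling max()/min() on an empty list
        return 0, 0, 0, 0
    return len(views), sum(views), max(views), min(views)

# Made-up snippet standing in for the downloaded page
sample_html = '<span>read:12</span> <span>read:305</span> <span>read:7</span>'
print(summarize_views(sample_html))   # (3, 324, 305, 7)
```

To use it on the real page, the text returned by requests.get(url).text could be passed in place of sample_html.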

 

Result

 

Origin www.cnblogs.com/maoerbao/p/11518575.html