Python crawls and analyzes the historical lottery data of Super Lotto

As a beginner at web crawling, the blogger uses the requests and BeautifulSoup libraries to crawl the data this time.

Website crawled: http://datachart.500.com/dlt/history/history.shtml (500彩票网, the 500 lottery site)
(Analysis shows that the site does not fetch different data through page jumps, so the Network panel opened with F12 can be used to find the page that actually stores all historical lottery results.)
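The request found this way is presumably the history.php address used in the crawler part below, which takes a start and an end value as query parameters. Here is a minimal sketch of building that address. build_history_url is a hypothetical helper, and reading 07001 and 21018 as issue numbers is an assumption based on the URL itself:

```python
from urllib.parse import urlencode

# Endpoint copied from the crawler code in this post.
HISTORY_ENDPOINT = "http://datachart.500.com/dlt/history/newinc/history.php"


def build_history_url(start: str, end: str) -> str:
    """Build the history URL for a range of issues (hypothetical helper).

    Assumption: start and end are issue numbers. They are kept as strings
    so that leading zeros such as "07001" are preserved.
    """
    query = urlencode({"start": start, "end": end})
    return f"{HISTORY_ENDPOINT}?{query}"


print(build_history_url("07001", "21018"))
```

Keeping the range in parameters makes it easy to request a shorter span while testing.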

Crawler part:

from bs4 import BeautifulSoup   # HTML parsing
import requests                 # HTTP requests
import csv                      # writing the result file

url = 'http://datachart.500.com/dlt/history/newinc/history.php?start=07001&end=21018'
r = requests.get(url, timeout=10)
r.raise_for_status()            # stop early if the request failed
r.encoding = 'utf-8'
soup = BeautifulSoup(r.text, "html.parser")
tbody = soup.find('tbody', id="tdata")

lst = []
# Loop over the rows that are actually present instead of a hard-coded count
for tr in tbody.find_all('tr'):
    td = tr.find_all('td')
    if len(td) < 8:             # skip rows without a full set of cells
        continue
    # Issue number followed by the 7 winning numbers
    lst.append([cell.text for cell in td[:8]])

# Write the file once, after all rows have been collected
with open("Lottery_data.csv", 'w', newline='', encoding='utf-8') as csvfile:
    writer = csv.writer(csvfile)
    # Header: issue number, number 1 ... number 7
    writer.writerow(['期号', '号码1', '号码2', '号码3', '号码4', '号码5', '号码6', '号码7'])
    writer.writerows(lst)

Data analysis:
First, display all issue numbers and the corresponding winning numbers.
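The blogger's own display code is not included in this post. The following is a rough sketch of how the CSV written by the crawler could be read back and printed, using only the standard csv module. load_draws and format_draw are hypothetical names, and the layout (one header row, then issue number plus 7 numbers per row) is assumed from the crawler code:

```python
import csv


def load_draws(path="Lottery_data.csv", encoding=None):
    """Read the crawler's CSV into a list of (issue, numbers) tuples.

    encoding=None uses the platform default; pass it explicitly if the
    file was written with a different encoding.
    """
    draws = []
    with open(path, newline="", encoding=encoding) as f:
        reader = csv.reader(f)
        next(reader, None)              # skip the header row
        for row in reader:
            if len(row) < 8:            # skip blank or incomplete rows
                continue
            draws.append((row[0], [int(n) for n in row[1:8]]))
    return draws


def format_draw(issue, numbers):
    """Format one draw as 'issue: five front numbers + two back numbers'."""
    front = " ".join(f"{n:02d}" for n in numbers[:5])
    back = " ".join(f"{n:02d}" for n in numbers[5:])
    return f"{issue}: {front} + {back}"


# Usage, once Lottery_data.csv exists:
# for issue, numbers in load_draws():
#     print(format_draw(issue, numbers))
```

Converting the numbers to int at load time keeps the later counting step simple.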

Then, following the 5+2 format (five front-area numbers plus two back-area numbers), the most frequent combination in each of the two groups was analyzed separately, and this combination's probability of winning was roughly estimated at 3 times the average probability of winning (the final result is not displayed directly but is marked in red in the csv file).
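The exact counting method is only in the linked source code, so the following is just one possible reading of this step: count how often each number appears in the front area and in the back area, then take the 5 and 2 most frequent. most_common_numbers is a hypothetical helper. The pool sizes of 35 front numbers and 12 back numbers are an assumption about the game rules, used here to get the number of possible tickets as a baseline for the "average probability":

```python
from collections import Counter
from math import comb


def most_common_numbers(draws, k_front=5, k_back=2):
    """Return the most frequent front and back numbers (hypothetical helper).

    draws: iterable of 7-number lists, 5 front numbers then 2 back numbers.
    """
    front = Counter()
    back = Counter()
    for numbers in draws:
        front.update(numbers[:5])       # front-area numbers
        back.update(numbers[5:7])       # back-area numbers
    top_front = sorted(n for n, _ in front.most_common(k_front))
    top_back = sorted(n for n, _ in back.most_common(k_back))
    return top_front, top_back


# Assumption: 5 of 35 front numbers and 2 of 12 back numbers per ticket.
TOTAL_COMBINATIONS = comb(35, 5) * comb(12, 2)

# Baseline chance of one specific 5+2 ticket if every ticket is equally likely.
AVERAGE_PROBABILITY = 1 / TOTAL_COMBINATIONS
```

With these assumptions TOTAL_COMBINATIONS is 21,425,712, so the baseline chance of any single 5+2 ticket is about 1 in 21.4 million.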
Source code and corresponding csv file
Link: https://pan.baidu.com/s/16wEHnpvrzMsK1ijW0AkhiA
Extraction code: nmjx

Tips: Thank you for the one-click triple support~ You are also welcome to point out any shortcomings to the blogger directly!


Origin blog.csdn.net/xucan_123/article/details/113943714