Do a ridiculous thing: use AI to reason about the next double color ball result v0.1

Do a ridiculous thing: use AI to reason about the next double color ball result v0.1

Github address: https://github.com/yinqishuo/Bicolorballs-AI

introduction

The cause of the matter was that my father was cheated by a relative and suddenly fell in love with Double Color Ball. He didn't even understand the rules and lottery results, so he asked me to study this matter and choose a number for him. He also said that several people in his hometown won millions and bought a car and a house. I..., who can refuse a million-dollar bonus dream that can be bought for 2 yuan? --------------------------20231204

1. Double Color Ball Rules

Question: Shuangseqiu Lotto! Why do you always draw huge prizes at the same time?_Lottery_Sina Sports Storm_Sina.com

  1. Two-color balls consist of red balls and blue balls. The red ball has a total of 33 numbers (01-33), and the blue ball has a total of 16 numbers (01-16).
  2. In each double-color ball lottery draw, 6 numbers are selected from the red ball and 1 number is selected from the blue ball. A total of 7 numbers are selected as betting numbers.
  3. You can choose to select numbers manually or use the machine selection function. When selecting numbers manually, you can choose the numbers of 6 red balls and 1 blue ball by yourself. The machine selection function will randomly generate a set of numbers.
  4. The betting amount for each double-color ball is RMB 2.
  5. You can choose single bets or multiple bets. Single betting refers to selecting only one set of numbers for betting, while multiple betting refers to selecting multiple sets of numbers for betting to increase the chance of winning.
  6. Shuangseqiu holds lottery draws twice a week, on Tuesdays, Thursdays and Sundays at 9pm.
  7. During the lottery draw, 6 numbers will be drawn from the red ball as the winning number, and then 1 number will be drawn from the blue ball as the blue ball number.
  8. Winning rules are determined based on how well the numbers you select match the winning numbers. The prizes are divided into first prize, second prize, third prize, fourth prize, fifth prize, sixth prize and seventh prize. The first prize is 6 red + 1 blue, and the second prize is 6 red. ,So on and so forth.

The following are the winning probabilities and bonuses for each award of Shuangseqiu (the total winning probability is 6.71%):

Awards Probability of winning Bonus (estimated)
First prize (win 6 red + 1 blue) 1/17721088 millions of dollars
Second prize (win 6 reds) 0.0000846% About 500,000 yuan
Third Prize (5 red + 1 blue) 0.000914% 3000 yuan
Fourth Prize (win 5 red or 4 red + 1 blue) 0.0434% 200 yuan
Fifth prize (win 4 red or 3 red + 1 blue) 0.7758% 10 yuan
Sixth Prize (win 1/2 red + 1 blue or only blue ball) 5.889% 5 yuan

If you invest 2 yuan in a two-color ball, the average return you can get is less than 1 yuan, which is 0.9377 yuan.

2. Research ideas and purposes

It is obviously impossible to study an algorithm for winning the jackpot, so the initial goal is to increase the probability of winning the 4th prize To1/100. In this way, you can guarantee that for every 100 bets you buy, you will win a 4th prize, with no profit or loss. The ideal is very full, hahahaha, let us start taking action now.

  • Obviously, each double-color ball is an independent random event, and the results before and after are not connected. But I still try to build it into a time series model first. The next winning result is derived from the previous N results.
  • Since you are engaged in metaphysics, you must engage in deep learning. Alchemy and metaphysics are simply not a good match. The preferred deep learning network LSTM
  • Once you have the model, you still need to use data to train it. Then you must use a crawler to find out the lottery results in recent decades.

Just do it!

3. Obtain historical winning results

Analyze web page source code

image-20231204082407590

  1. Select Zhongcai.com, open the browser source code and use F12 to view the source code. The web page will send an asynchronous jQuery request every time it is queried and return the query results in json format. (Non-computer major, only a simple understanding of the front-end and back-end mechanisms of web pages)

  2. Use the python request library to imitate requests and crawl historical data

  3. Simply directly request this URL and the data returned is empty; a simple search shows that this is a phenomenon caused by jQuery using jsonp for processing when making cross-domain requests. Note a few points: 1) You need to set the Referer header because this is a cross-domain request; 2) _=1701651618276 is a timestamp;

  4. To be on the safe side, cookie and User-Agent headers are added

    import requests
    import json
    import random
    url = 'https://jc.zhcw.com/port/client_json.php'
    headers = {
          
          
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/119.0.0.0 Safari/537.36 Edg/119.0.0.0",
        "Referer": "https://www.zhcw.com/"
    }
    cookies = {
          
          
        "Hm_lvt_692bd5f9c07d3ebd0063062fb0d7622f": "1701622572",
        "_gid": "GA1.2.1218667687.1701622572",
        "Hm_lvt_12e4883fd1649d006e3ae22a39f97330": "1701622573",
        "PHPSESSID": "6k70gq2h44nksmou3n8374jq13",
        "_ga_9FDP3NWFMS": "GS1.1.1701622572.1.1.1701623257.0.0.0",
        "_ga": "GA1.2.1720843243.1701622572",
        "Hm_lpvt_12e4883fd1649d006e3ae22a39f97330": "1701623257",
        "Hm_lpvt_692bd5f9c07d3ebd0063062fb0d7622f": "1701623258"
    }
    params = {
          
          
            'callback':'jQuery112208474410773064831_1701622567918',
            'transactionType':10001001,
            'lotteryId':1,
            'issueCount':0,
            'startIssue':'',
            'endIssue':'',
            'startDate':'2003-02-01',
            'endDate':'2003-04-01',
            'type':2,
            'pageNum':1,'pageSize':30,
            "tt": random.random(),
            "_": str(int(time.time() * 1000))
        }
    # 发送HTTP GET请求
    response = requests.get(url,headers=headers,cookies=cookies,params=params)
    
    # 提取JSONP响应中的JSON数据
    json_data = response.text.split('(')[1].split(')')[0]
    # 解析JSON数据
    data = json.loads(json_data)
    # 提取双色球号码信息
    for entry in data['data']:
        issue = entry['issue']
        openTime = entry["openTime"]
        front_winning_num = entry['frontWinningNum']
        back_winning_num = entry['backWinningNum']
        saleMoney = entry["saleMoney"]
        print(f"{
            
            issue} |{
            
            openTime}|{
            
            front_winning_num} |{
            
            back_winning_num} |{
            
            saleMoney}")
    
    
    结果输出:
    2003011 |2003-03-30|04 05 11 12 30 32 |15 |12782494
    2003010 |2003-03-27|01 02 08 13 17 24 |13 |12402130
    2003009 |2003-03-23|05 09 18 20 22 30 |09 |12386072
    ...
    

save to file

image-20231204092839207

  1. There is a lot of information in each issue: date, lottery date, winning number, sales, prize pool, and winning status. But some data in some early data are missing, so I chose to keep Lottery date, Issue number , Winning numbers, Sales, Prize pool .

  2. The easiest way to save is to save to CSV as plain text.

  3. Because we can only crawl 30 pieces of data at a time, we divide it by month and crawl it once every 2 months. There will be no more than 30 lottery draws in 2 months. We started crawling Shuangseqiu from 20230201 on February 16, 2003, and generated each time node string.

    import datetime
    def add_two_months(date_str):
        date_format = "%Y-%m-%d"
        current_date = datetime.datetime.strptime(date_str, date_format)
        now = datetime.datetime.now()
        datestart = []
        dateend = []
        while current_date < now:
            datestart.append(current_date.strftime(date_format))
            current_date_end = current_date + datetime.timedelta(days=59)
            if  current_date_end < now:
                dateend.append(current_date_end.strftime(date_format))
            else:
                dateend.append(now.strftime(date_format))
            current_date = current_date + datetime.timedelta(days=60)
        return datestart,dateend
    start_date = "2003-02-01"
    datestart,dateend = add_two_months(start_date)
    print(datestart)
    print(dateend)
    
    输出结果:
    ['2003-02-01', '2003-04-02', '2003-06-01', '2003-07-31',...,'2023-02-16', '2023-04-17', '2023-06-16', '2023-08-15', '2023-10-14']
    ['2003-04-01', '2003-05-31', '2003-07-30', '2003-09-28',..., '2023-04-16', '2023-06-15', '2023-08-14', '2023-10-13', '2023-12-04']
    
  4. Loop crawl and save

    file = open('Bicolorballs.csv', 'w')  # 打开Bicolorballs.csv文件,写模式
    for s,e  in zip(datestart,dateend):
        params = {
          
          
            'callback':'jQuery112208474410773064831_1701622567918',
            'transactionType':10001001,
            'lotteryId':1,
            'issueCount':0,
            'startIssue':'',
            'endIssue':'',
            'startDate':s,
            'endDate':e,
            'type':2,
            'pageNum':1,'pageSize':30,
            "tt": random.random(),
            "_": str(int(time.time() * 1000))
        }
        # 发送HTTP GET请求
        response = requests.get(url,headers=headers,cookies=cookies,params=params)
    
        # 提取JSONP响应中的JSON数据
        json_data = response.text.split('(')[1].split(')')[0]
        # 解析JSON数据
        data = json.loads(json_data)
        # 提取双色球号码信息
        for entry in data['data']:
            issue = entry['issue']
            openTime = entry["openTime"]
            front_winning_num = entry['frontWinningNum']
            back_winning_num = entry['backWinningNum']
            saleMoney = entry["saleMoney"]
            prizePoolMoney = entry["prizePoolMoney"]
            data = f"{
            
            issue},{
            
            openTime},{
            
            front_winning_num},{
            
            back_winning_num},{
            
            saleMoney},{
            
            prizePoolMoney}\r\n"
            file.write(data)
    file.close()  # 关闭文件
    

    image-20231204104235092

  5. Because of multiple requests and different response return times, the order of the data will be messed up, but this is not important. Just open it in excel and sort it.

  6. An interesting thing suddenly occurred to me. What will be the sales curve of double color balls in the past 20 years? Hahaha, will it be completely consistent with the national economic development curve?

    image-20231204105508973

    The figure below shows the changes in national GDP from 2003 to 2023:

    China GDP

    It cannot be said that there is no relationship, it can only be said that they are almost exactly the same. In the long run, lottery sales are roughly related to the national economic development. Its sales have basically remained stable after 2016. It is speculated that there may be two reasons: the market size has grown to the upper limit or It is related to the slowdown in China’s economic growth in recent years. In addition, sales will also be impacted by various emergencies. For example, the epidemic in 2020 caused a clear trough in sales; after the epidemic recovers in 2022, it will rebound to a small peak. In addition, sales also show a cyclical nature. They are higher at the beginning and end of each year and lower in the middle of the year. This is an interesting phenomenon. It is speculated that the reasons for this phenomenon are 1) Going home for the New Year 2) Year-end bonus 3) At the end of the year Pay wages.

    Finally, here is a historical change curve of sales and bonus pool:

    image-20231204115553164

4. Build algorithm model

1. Original data distribution

Let's first start with the probability distribution of the original data and see if we can find some patterns. Calculate the probability distribution of all balls winning:

image-20231204133949925

Approximate maximum 5 red spheres: 14, 24, 1, 22, 617;Blue sphere1. Third division 614, 614, 597, 594, 586, 585 sum 213.

2.Model v0.1

Dataset construction

We first simply assume that the number of the next ball is only affected by the previous winning number. Then the data set only needs to introduce two variables: red ball number and blue ball number. The red balls are numbered as five non-repeating numbers from 1 to 33, and the blue balls are numbered as a single number from 1 to 16. I construct it as
R = I 33 × 1 , B = I 16 × 1 , where i award = 1 , i none = 0 R = I_{33\times 1 }, B = I_{16\times 1}, where i_award = 1, i_none = 0 R=I33×1,B=I16×1,insidei=1,i=0

I have previously observed the periodic changes in lottery sales within the year. I simply assume that the next winning ball number is affected by the winning results of the previous year (about 150 times), that is:< a i=1> R t , B t = f ( R t − 1 , B t − 1 , R t − 2 , B t − 2 , . . . R t − 150 , B t − 150 ) R_t,B_t = f (R_{t-1},B_{t-1},R_{t-2},B_{t-2},...R_{t-150},B_{t-150})
Rt,Bt=f(Rt1,Bt1,Rt2,Bt2,...Rt150,Bt150)
Our purpose is to construct a time series reasoning model f.

Based on the knowledge I have learned so far, I choose the LSTM deep learning model (Long Short-term Memory Recurrent Neural Network).

Based on the rules of the two-color ball game, winning with the blue ball is much more important than winning with the red ball. We should obviously pay more attention to the blue ball within the model. But as an early version of this model, we assume that the red basketball winning results are independent of each other, further simplifying the above model to:
R t = f 1 ( R t − 1 , R t − 2 , . . . R t − 150 ) , B t = f 2 ( B t − 1 , B t − 2 , . . . B t − 150 ) R_t = f_1(R_{t-1},R_{t-2},...R_{t-150}) ,\\ B_t = f_2(B_{t-1},B_{t-2},. ..B_{t-150}) Rt=f1(Rt1,Rt2,...Rt150),Bt=f2(Bt1,Bt2,...Bt150)

That is, two LSTM models are trained to infer red balls and blue balls separately. At this point we can start constructing our data set:

Read data

import numpy as np
import tensorflow as tf

with open('Bicolorballs.csv', 'r') as file:
    reader = csv.reader(file)
    red_balls = []
    blue_balls = []
    for row in reader:
        red = np.array([int(i)-1 for i in row[2].split(' ')])
        # 编码
        red = tf.one_hot(red, depth=33)
        red = tf.reduce_sum(red, axis=0)
        red_balls.append(red)
        blue = np.array([int(row[3])-1])
        blue = tf.one_hot(blue, depth=16)
        blue = tf.reduce_sum(blue, axis=0)
        blue_balls.append(blue)
print("数据集长度为:",len(red_balls))
print("红球编码后为:",red_balls[0].numpy())
print("蓝球编码后为:",blue_balls[0].numpy())
#输出结果:
#数据集长度为: 3083
#红球编码后为: [0. 0. 0. 0. 0. 0. 0. 0. 0. 1. 1. 1. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1. 0. 1. 0. 0. 0. 0. 0.]
#蓝球编码后为: [0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1. 0. 0. 0. 0. 0.]

Construct a red ball data set with a time step of 150. The first 80% of the data is the training set and 20% is the test set.

data = tf.stack(red_balls)
# 定义超参数
n_steps = 150  # 时间步数,即每个样本包含的历史时间步数
T = len(data)
features = []
for i in range(T-n_steps):
    features.append(data[i: n_steps + i,:])
labels = data[n_steps:,:]
features = tf.stack(features) 

# 将数据集划分为训练集和测试集
train_size = int(len(data) * 0.8)
train_X, train_y = features[:train_size], labels[:train_size]
test_X, test_y = features[train_size:],labels[train_size:]

# 创建tf.data.Dataset对象
train_dataset = tf.data.Dataset.from_tensor_slices((train_X, train_y))
test_dataset = tf.data.Dataset.from_tensor_slices((test_X, test_y))

# 可选:对数据集进行一些预处理操作,例如乱序、批量化和缓存等
train_dataset = train_dataset.shuffle(100).batch(32).prefetch(tf.data.AUTOTUNE)
test_dataset = test_dataset.batch(32).cache().prefetch(tf.data.AUTOTUNE)

#测试数据集的的形状
#data,label = iter(train_dataset.take(1)).next()
#print(data.shape)
#(32, 150, 33)     #(B,T,C)
Construct model

First build a simpledouble-layer LSTM network, the number of LSTM units is 64/128/256, and the output layer uses a fully connected layer ( 33), change the output to the output of 33 channels, and then connect the sigmoid layer to map the output vector to the probability between 0-1. The loss function uses multi-classificationcross-entropy loss function.

import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense
# 定义输入和输出形状
input_shape = (150, 33)
output_shape = (1, 33)
# 创建双层LSTM网络
model = Sequential()
model.add(LSTM(64, return_sequences=True, input_shape=input_shape))  # 第一层LSTM
model.add(LSTM(64))  # 第二层LSTM
model.add(Dense(output_shape[-1], activation='sigmoid'))  # 输出层
# 编译模型
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
# 打印模型结构
model.summary()
################################################
#Model: "sequential"
#_________________________________________________________________
# Layer (type)                Output Shape              Param #   
#=================================================================
# lstm (LSTM)                 (None, 150, 64)           25088     
#                                                                 
# lstm_1 (LSTM)               (None, 64)                33024     
#                                                                 
# dense (Dense)               (None, 33)                2145      
#                                                                 
#=================================================================
#Total params: 60,257
#Trainable params: 60,257
#Non-trainable params: 0
Model training

Start training and witness the results of my day's research. Save the training log and optimal parameters, and the learning rate defaults to 0.001

from datetime import datetime
from tensorflow import keras
EPOCHS = 200
NetNAME = 'LSTM2'
tf.debugging.set_log_device_placement(True)
timestamp = datetime.now().strftime("%Y%m%d-%H%M%S")
NAME = timestamp + NetNAME +"_batchsize";print(NAME)
logdir = "./logs/" + NAME
modeldir = "./model/"+NAME+".h5"
tensorboard_callback = keras.callbacks.TensorBoard(log_dir=logdir)
checkpoint = tf.keras.callbacks.ModelCheckpoint(filepath=modeldir, monitor='val_accuracy', verbose=1, save_best_only=True, mode = 'max')
with tf.device('/GPU:0'):
    model.fit(
        x=train_dataset, 
        epochs=EPOCHS,     
        validation_data=test_dataset,    
        callbacks=[tensorboard_callback,checkpoint])

After training several times, the results that seem to be satisfactory so far are as follows. The loss function of the test set and training set will drop a little, but the absolute value is still very large, and the model does not converge. Looking at the accuracy change curve, during the 200 rounds of training, there were several times when the accuracy was relatively high, indicating that the model accidentally guessed part of the data correctly.

image-20231204191600099

image-20231204191637930

In addition, we use the same method to trainblue ball model. Because the blue ball has only one result in the end, we choose the activation function of the output layersoftmax layer and the loss function oftwo Categorical cross-entropy loss function.

import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense
from tensorflow.keras.callbacks import LearningRateScheduler
from datetime import datetime
from tensorflow import keras
from tensorflow.keras.callbacks import LearningRateScheduler

data = tf.stack(blue_balls)
# 定义超参数
n_steps = 150  # 时间步数,即每个样本包含的历史时间步数
T = len(data)
features = []
for i in range(T-n_steps):
    features.append(data[i: n_steps + i,:])
labels = data[n_steps:,:]
features = tf.stack(features) 

# 将数据集划分为训练集和测试集
train_size = int(len(data) * 0.8)
train_X, train_y = features[:train_size], labels[:train_size]
test_X, test_y = features[train_size:],labels[train_size:]

# 创建tf.data.Dataset对象
train_dataset = tf.data.Dataset.from_tensor_slices((train_X, train_y))
test_dataset = tf.data.Dataset.from_tensor_slices((test_X, test_y))

# 可选:对数据集进行一些预处理操作,例如乱序、批量化和缓存等
train_dataset = train_dataset.shuffle(100).batch(32).prefetch(tf.data.AUTOTUNE)
test_dataset = test_dataset.batch(32).cache().prefetch(tf.data.AUTOTUNE)

# 定义输入和输出形状
input_shape = (150, 16)
output_shape = (1, 16)
# 创建双层LSTM网络
model = Sequential()
model.add(LSTM(64, return_sequences=True, input_shape=input_shape))  # 第一层LSTM
model.add(LSTM(64))  # 第二层LSTM
model.add(Dense(output_shape[-1], activation='softmax'))  # 输出层
# 编译模型
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
# 打印模型结构


EPOCHS = 400
NetNAME = 'B_LSTM2'
tf.debugging.set_log_device_placement(True)
timestamp = datetime.now().strftime("%Y%m%d-%H%M%S")
NAME = timestamp + NetNAME +"_batchsize";print(NAME)
logdir = "./logs/" + NAME
modeldir = "./model/"+NAME+".h5"

def lr_scheduler(epoch, lr):
    if epoch % 100 == 0 and epoch != 0:
        lr = lr / 2
    return lr
lr_callback = LearningRateScheduler(lr_scheduler)
tensorboard_callback = keras.callbacks.TensorBoard(log_dir=logdir)
checkpoint = tf.keras.callbacks.ModelCheckpoint(filepath=modeldir, monitor='val_accuracy', verbose=1, save_best_only=True, mode = 'max')
with tf.device('/GPU:0'):
    model.fit(
        x=train_dataset, 
        epochs=EPOCHS,     
        validation_data=test_dataset,    
        callbacks=[tensorboard_callback,checkpoint])

Why! You can't learn basketball at all, you're seriously overfitting.

image-20231204194418091

Model reasoning

Load the red ball and blue ball models for inference. The red balls will take the 6 highest probability values, and the blue balls will take the 5 highest probability values. Guess five times, each time the red ball remains the same and the blue ball changes. Judge how many prizes you have won based on the true value, let’s play! ! ! ! !

import tensorflow as tf
from tensorflow.keras.models import load_model

def CalculateTheAwards(R,B,Rlabel,Blabel):
    R = R.numpy()
    B = B.numpy()
    Rlabel = Rlabel.numpy()
    Blabel = Blabel.numpy()
    Rcount = 0
    Bcount = Blabel[B]
    for r in R:
        Rcount = Rcount + Rlabel[r]
    if Bcount+Rcount ==7:
        print('恭喜你,你中一等奖了!!')
    elif Rcount ==6 and Bcount ==0:
        print('恭喜你,你中二等奖了!!')
    elif Rcount ==5 and Bcount ==1:
        print('恭喜你,你中三等奖了!!')
    elif Bcount+Rcount ==5:
        print('恭喜你,你中四等奖了!!')
    elif Bcount+Rcount ==4:
        print('恭喜你,你中五等奖了!!')
    elif Bcount ==1 :
        print('恭喜你,你中六等奖了!!')
    else:
        print('很遗憾,你没有中奖')
# 加载模型
Rmodel = load_model('model/20231204-174742LSTM2_lr100_batchsize.h5')   #20231204-173622LSTM2_batchsize.h5
Bmodel = load_model('model/20231204-192802B_LSTM2_batchsize.h5')

i = 2   #测试数据编号
# 加载新的数据进行推理
R = tf.expand_dims(tf.stack(red_balls[-150-i:-i]), axis=0)
B = tf.expand_dims(tf.stack(blue_balls[-150-i:-i]), axis=0)
Rlabel = red_balls[-i+1]
Blabel = blue_balls[-i+1]
print(Rlabel.numpy())
print(Blabel.numpy())
# 对新数据进行预测
Rpredictions = Rmodel.predict(R,verbose=0)
Bpredictions = Bmodel.predict(B,verbose=0)
# # 打印预测结果
# print(Rpredictions)
# print(Bpredictions)

# 选择出最大的5/6个元素的索引
_, Rindices = tf.math.top_k(tf.constant(Rpredictions[0]), k=6)
_, Bindices = tf.math.top_k(tf.constant(Bpredictions[0]), k=5)
print(f'预测结果是红球编号{
      
      (Rindices+1).numpy()},蓝球编号为{
      
      (Bindices[0]+1).numpy()}')
CalculateTheAwards(Rindices,Bindices[0],Rlabel,Blabel)
print(f'预测结果是红球编号{
      
      (Rindices+1).numpy()},蓝球编号为{
      
      (Bindices[1]+1).numpy()}')
CalculateTheAwards(Rindices,Bindices[1],Rlabel,Blabel)
print(f'预测结果是红球编号{
      
      (Rindices+1).numpy()},蓝球编号为{
      
      (Bindices[2]+1).numpy()}')
CalculateTheAwards(Rindices,Bindices[2],Rlabel,Blabel)
print(f'预测结果是红球编号{
      
      (Rindices+1).numpy()},蓝球编号为{
      
      (Bindices[3]+1).numpy()}')
CalculateTheAwards(Rindices,Bindices[3],Rlabel,Blabel)
print(f'预测结果是红球编号{
      
      (Rindices+1).numpy()},蓝球编号为{
      
      (Bindices[4]+1).numpy()}')
CalculateTheAwards(Rindices,Bindices[4],Rlabel,Blabel)
simulation game

We use our model to simulate games based on historical lottery data. The result of each bet is obtained through model reasoning using the winning numbers of the previous 150 times. Each purchase of 5 bets costs 10 yuan. Let’s play it 10 times first and see

红球开奖结果: [0 0 1 0 0 0 0 0 0 0 0 1 0 0 0 0 1 0 1 0 0 0 1 0 0 0 1 0 0 0 0 0 0]
蓝球开奖结果: [0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0]
预测结果是红球编号[14 16 29 20 11 33],蓝球编号为10    ---->     很遗憾,你没有中奖
预测结果是红球编号[14 16 29 20 11 33],蓝球编号为14    ---->     很遗憾,你没有中奖
预测结果是红球编号[14 16 29 20 11 33],蓝球编号为11    ---->     恭喜你,你中六等奖了!!5块钱
预测结果是红球编号[14 16 29 20 11 33],蓝球编号为15    ---->     很遗憾,你没有中奖
预测结果是红球编号[14 16 29 20 11 33],蓝球编号为4    ---->     很遗憾,你没有中奖
红球开奖结果: [0 0 1 0 1 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 1 0 0 0 0 0 0 0]
蓝球开奖结果: [0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0]
预测结果是红球编号[14 16 29 20 11 33],蓝球编号为8    ---->     很遗憾,你没有中奖
预测结果是红球编号[14 16 29 20 11 33],蓝球编号为14    ---->     很遗憾,你没有中奖
预测结果是红球编号[14 16 29 20 11 33],蓝球编号为10    ---->     恭喜你,你中六等奖了!!5块钱
预测结果是红球编号[14 16 29 20 11 33],蓝球编号为5    ---->     很遗憾,你没有中奖
预测结果是红球编号[14 16 29 20 11 33],蓝球编号为15    ---->     很遗憾,你没有中奖
红球开奖结果: [0 0 0 0 1 1 0 0 0 0 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0 0]
蓝球开奖结果: [0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0]
预测结果是红球编号[14 16 29 20 11 33],蓝球编号为8    ---->     很遗憾,你没有中奖
预测结果是红球编号[14 16 29 20 11 33],蓝球编号为5    ---->     很遗憾,你没有中奖
预测结果是红球编号[14 16 29 20 11 33],蓝球编号为4    ---->     很遗憾,你没有中奖
预测结果是红球编号[14 16 29 20 11 33],蓝球编号为12    ---->     很遗憾,你没有中奖
预测结果是红球编号[14 16 29 20 11 33],蓝球编号为15    ---->     很遗憾,你没有中奖
红球开奖结果: [0 0 0 0 1 0 0 1 1 0 0 0 0 0 0 0 1 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1]
蓝球开奖结果: [0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0]
预测结果是红球编号[14 16 29 20 11 33],蓝球编号为12    ---->     很遗憾,你没有中奖
预测结果是红球编号[14 16 29 20 11 33],蓝球编号为4    ---->     恭喜你,你中六等奖了!!5块钱
预测结果是红球编号[14 16 29 20 11 33],蓝球编号为14    ---->     很遗憾,你没有中奖
预测结果是红球编号[14 16 29 20 11 33],蓝球编号为9    ---->     很遗憾,你没有中奖
预测结果是红球编号[14 16 29 20 11 33],蓝球编号为3    ---->     很遗憾,你没有中奖
红球开奖结果: [0 0 0 1 1 0 0 0 0 1 0 0 0 1 0 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0]
蓝球开奖结果: [0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0]
预测结果是红球编号[14 16 29 20 11 33],蓝球编号为14    ---->     很遗憾,你没有中奖
预测结果是红球编号[14 16 29 20 11 33],蓝球编号为10    ---->     很遗憾,你没有中奖
预测结果是红球编号[14 16 29 20 11 33],蓝球编号为11    ---->     很遗憾,你没有中奖
预测结果是红球编号[14 16 29 20 11 33],蓝球编号为1    ---->     很遗憾,你没有中奖
预测结果是红球编号[14 16 29 20 11 33],蓝球编号为3    ---->     恭喜你,你中六等奖了!!5块钱
红球开奖结果: [0 0 1 0 0 0 0 0 1 0 1 0 0 0 1 0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 0 0]
蓝球开奖结果: [1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
预测结果是红球编号[14 16 29 20 11 33],蓝球编号为1    ---->     恭喜你,你中六等奖了!!5块钱
预测结果是红球编号[14 16 29 20 11 33],蓝球编号为5    ---->     很遗憾,你没有中奖
预测结果是红球编号[14 16 29 20 11 33],蓝球编号为10    ---->     很遗憾,你没有中奖
预测结果是红球编号[14 16 29 20 11 33],蓝球编号为12    ---->     很遗憾,你没有中奖
预测结果是红球编号[14 16 29 20 11 33],蓝球编号为8    ---->     很遗憾,你没有中奖
红球开奖结果: [0 0 0 0 0 1 1 0 0 0 0 0 0 1 0 0 0 0 0 1 0 0 0 0 0 1 0 0 0 0 0 1 0]
蓝球开奖结果: [1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
预测结果是红球编号[14 16 29 20 11 33],蓝球编号为1    ---->     恭喜你,你中六等奖了!!5块钱
预测结果是红球编号[14 16 29 20 11 33],蓝球编号为5    ---->     很遗憾,你没有中奖
预测结果是红球编号[14 16 29 20 11 33],蓝球编号为4    ---->     很遗憾,你没有中奖
预测结果是红球编号[14 16 29 20 11 33],蓝球编号为3    ---->     很遗憾,你没有中奖
预测结果是红球编号[14 16 29 20 11 33],蓝球编号为8    ---->     很遗憾,你没有中奖
红球开奖结果: [0 0 0 1 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0 1 1 0 0 0 0 0 0 0 0 1 0]
蓝球开奖结果: [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1]
预测结果是红球编号[14 16 29 20 11 33],蓝球编号为13    ---->     很遗憾,你没有中奖
预测结果是红球编号[14 16 29 20 11 33],蓝球编号为9    ---->     很遗憾,你没有中奖
预测结果是红球编号[14 16 29 20 11 33],蓝球编号为7    ---->     很遗憾,你没有中奖
预测结果是红球编号[14 16 29 20 11 33],蓝球编号为15    ---->     很遗憾,你没有中奖
预测结果是红球编号[14 16 29 20 11 33],蓝球编号为3    ---->     很遗憾,你没有中奖
红球开奖结果: [0 0 0 0 0 0 0 1 1 0 0 1 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1]
蓝球开奖结果: [0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0]
预测结果是红球编号[14 16 29 20 11 33],蓝球编号为9    ---->     很遗憾,你没有中奖
预测结果是红球编号[14 16 29 20 11 33],蓝球编号为16    ---->     很遗憾,你没有中奖
预测结果是红球编号[14 16 29 20 11 33],蓝球编号为15    ---->     很遗憾,你没有中奖
预测结果是红球编号[14 16 29 20 11 33],蓝球编号为3    ---->     很遗憾,你没有中奖
预测结果是红球编号[14 16 29 20 11 33],蓝球编号为4    ---->     恭喜你,你中六等奖了!!5块钱
红球开奖结果: [0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1 0 0 0 1 1 0 0 0 0 0 1 0 0 0 0 0 1]
蓝球开奖结果: [1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
预测结果是红球编号[14 16 29 20 11 33],蓝球编号为9    ---->     很遗憾,你没有中奖
预测结果是红球编号[14 16 29 20 11 33],蓝球编号为16    ---->     很遗憾,你没有中奖
预测结果是红球编号[14 16 29 20 11 33],蓝球编号为4    ---->     很遗憾,你没有中奖
预测结果是红球编号[14 16 29 20 11 33],蓝球编号为13    ---->     很遗憾,你没有中奖
预测结果是红球编号[14 16 29 20 11 33],蓝球编号为1    ---->     恭喜你,你中五等奖了!!10块钱

hahahahahahahahahaha, the output result of the red ball model is different every time, but the highest probability is always these 6 numbers, There is a problem , it seems to only like to buy these numbers. But no matter what, the probability of winning the 6th prize is not bad. Today's work was not in vain (it took 11 hours). I will work on v0.2 when I have time in the future.

conclusion of issue

To facilitate next improvement, 1) the red ball output is fixed, 2) the blue ball model is seriously overfitted. Both models are basically unable to converge, but this is normal, but I think there should be room for optimization.

5. Shuangseqiu application

Crawl the latest 150 winning results from the website, input them into the model for prediction, and get the red and blue ball numbers.

import requests
import json
import random
import tensorflow as tf
import numpy as np
from datetime import datetime,date
from tensorflow.keras.models import load_model
# 加载模型
Rmodel = load_model('model/20231204-174742LSTM2_lr100_batchsize.h5')   #20231204-173622LSTM2_batchsize.h5
Bmodel = load_model('model/20231204-192802B_LSTM2_batchsize.h5')

url = 'https://jc.zhcw.com/port/client_json.php'
headers = {
    
    
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/119.0.0.0 Safari/537.36 Edg/119.0.0.0",
    "Referer": "https://www.zhcw.com/"
}
cookies = {
    
    
    "Hm_lvt_692bd5f9c07d3ebd0063062fb0d7622f": "1701622572",
    "_gid": "GA1.2.1218667687.1701622572",
    "Hm_lvt_12e4883fd1649d006e3ae22a39f97330": "1701622573",
    "PHPSESSID": "6k70gq2h44nksmou3n8374jq13",
    "_ga_9FDP3NWFMS": "GS1.1.1701622572.1.1.1701623257.0.0.0",
    "_ga": "GA1.2.1720843243.1701622572",
    "Hm_lpvt_12e4883fd1649d006e3ae22a39f97330": "1701623257",
    "Hm_lpvt_692bd5f9c07d3ebd0063062fb0d7622f": "1701623258"
}

history = []
for i in range(1,6):
    params = {
    
    
            'callback':'jQuery112208474410773064831_1701622567918',
            'transactionType':10001001,
            'lotteryId':1,
            'issueCount':150,
            'startIssue':'',
            'endIssue':'',
            'startDate':'',
            'endDate':'',
            'type':0,
            'pageNum':i,'pageSize':30,
            "tt": random.random(),
            "_": str(int(time.time() * 1000))
        }
    # 发送HTTP GET请求
    response = requests.get(url,headers=headers,cookies=cookies,params=params)

    # 提取JSONP响应中的JSON数据
    json_data = response.text.split('(')[1].split(')')[0]
    # 解析JSON数据
    data = json.loads(json_data)
    # 提取双色球号码信息
    for entry in data['data']:
        issue = int(entry['issue'])
        front_winning_num = entry['frontWinningNum']
        back_winning_num = entry['backWinningNum']
        history.append([issue,front_winning_num,back_winning_num])
# 按照期号进行排序
history = sorted(history, key=lambda x: x[0])
#print(history)        

red_history = []
blue_history = []
# 构造模型的输入数据
for h in history:
        red = np.array([int(i)-1 for i in h[1].split(' ')])
        # 编码
        red = tf.one_hot(red, depth=33)
        red = tf.reduce_sum(red, axis=0)
        red_history.append(red)
        blue = np.array([int(h[2])-1])
        blue = tf.one_hot(blue, depth=16)
        blue = tf.reduce_sum(blue, axis=0)
        blue_history.append(blue)        
#print(blue_balls)       



# # 加载新的数据进行推理
R = tf.expand_dims(tf.stack(red_history), axis=0)
B = tf.expand_dims(tf.stack(blue_history), axis=0)


# 对新数据进行预测
Rpredictions = Rmodel.predict(R,verbose=0)
Bpredictions = Bmodel.predict(B,verbose=0)
# # 打印预测结果
#print(Rpredictions)
# print(Bpredictions)

# # 选择出最大的5个元素及其索引
_, Rindices = tf.math.top_k(tf.constant(Rpredictions[0]), k=6)
_, Bindices = tf.math.top_k(tf.constant(Bpredictions[0]), k=5)
print('今天是:', date.today(),'  推理下一次双色球的号码是:')
print(f'1:红球编号{
      
      (Rindices+1).numpy()},蓝球编号为{
      
      (Bindices[0]+1).numpy()}')
print(f'2:红球编号{
      
      (Rindices+1).numpy()},蓝球编号为{
      
      (Bindices[1]+1).numpy()}')
print(f'3:红球编号{
      
      (Rindices+1).numpy()},蓝球编号为{
      
      (Bindices[2]+1).numpy()}')
print(f'4:红球编号{
      
      (Rindices+1).numpy()},蓝球编号为{
      
      (Bindices[3]+1).numpy()}')
print(f'5:红球编号{
      
      (Rindices+1).numpy()},蓝球编号为{
      
      (Bindices[4]+1).numpy()}')

The result is as follows:

今天是: 2023-12-04   推理下一次双色球的号码是:
1:红球编号[14 16 29 20 11 33],蓝球编号为1
2:红球编号[14 16 29 20 11 33],蓝球编号为10
3:红球编号[14 16 29 20 11 33],蓝球编号为12
4:红球编号[14 16 29 20 11 33],蓝球编号为7
5:红球编号[14 16 29 20 11 33],蓝球编号为13

Guess you like

Origin blog.csdn.net/weixin_41099712/article/details/134795644