Second pair programming assignment

Links

Teammate
GitHub repository

Foreword

I prefer mahjong.

Division of labor

Qiu Chang Jie: back end
Sun Cheng Kai: front end
A very simple division of labor.


PSP table

| PSP2.1 | Personal Software Process Stages | Estimated time (minutes) | Actual time (minutes) |
| --- | --- | --- | --- |
| Planning | Planning | 120 | 80 |
| Estimate | Estimate how much time this task requires | 120 | 80 |
| Development | Development | 2990 | 3330 |
| Analysis | Requirements analysis (including learning new technologies) | 60 | 60 |
| Design Spec | Produce design documents | 30 | 30 |
| Design Review | Design review | 30 | 30 |
| Coding Standard | Coding standards (define appropriate conventions for the development) | 30 | 180 |
| Design | Detailed design | 180 | 240 |
| Coding | Coding | 2500 | 2550 |
| Code Review | Code review | 60 | 120 |
| Test | Testing (self-test, modify code, commit changes) | 100 | 120 |
| Reporting | Reporting | 80 | 100 |
| Test Report | Test report | 30 | 30 |
| Size Measurement | Workload calculation | 10 | 10 |
| Postmortem & Process Improvement Plan | Postmortem and process improvement plan | 40 | 60 |
| Total | | 3190 | 3510 |

Problem-solving approach: design and implementation

Using the network interface

For convenience, we use Python's requests module to call the server's RESTful API, wrapped in five functions as follows.

| Interface function | Purpose |
| --- | --- |
| login | Send a login request |
| register | Send a registration request |
| find_info | Query match information |
| rank | Get the leaderboard |
| Srank | Get the personal match history |

Internal code organization: design and implementation


Before starting the design, our team did some simple mathematical analysis of the problem. When we are dealt 13 cards, the goal is to play the strongest hand type our cards allow, one that nobody else can beat. Under this condition, let N be the total number of hand types and t the number of other hand types that the chosen hand type can beat. Then t / N can be taken as a simple ideal win rate for the hand.

\[ c=\frac{t}{N} \]

The weight can then simply be set to

\[ w=\lg\left(\frac{N}{N-t}\right) \]

But looking only at our own hand is often not enough to make a decision. The probability that an opponent holds a combination that beats ours should also be considered; we call this probability s. Then c * (1 - s) can be taken as a simple success rate for the hand:

\[ p=c\cdot(1-s) \]

Usually, the larger c is, the smaller s is, although we have not proven this.
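To make the three formulas concrete, here is a minimal sketch that evaluates them. The function names and the sample numbers are ours, for illustration only, and we assume lg means the base-10 logarithm.

```python
import math


def ideal_win_rate(t: int, n: int) -> float:
    """c = t / N: share of hand types that the chosen hand type beats."""
    return t / n


def base_weight(t: int, n: int) -> float:
    """w = lg(N / (N - t)); only defined for t < N."""
    if not 0 <= t < n:
        raise ValueError("need 0 <= t < N")
    return math.log10(n / (n - t))


def success_rate(c: float, s: float) -> float:
    """p = c * (1 - s), with s the chance an opponent holds something stronger."""
    return c * (1 - s)


# Illustrative numbers only: 10 hand types, ours beats 9 of them, s = 0.2.
c = ideal_win_rate(9, 10)
print(c, base_weight(9, 10), success_rate(c, 0.2))
```

Note that w grows without bound as t approaches N, which is why the sketch rejects t = N.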

The algorithm is implemented as follows:

Preparation: Finding an existing card library saves the time of building the low-level framework ourselves. Here we use a very simple third-party library called poker (Chinese documentation), and only its most basic feature, initializing each card. Both the front end and the back end are written in Python. My earlier front-end experience was limited to HTML + CSS, so building the front-end pages with PyQt was a first for me.

Step 1: Define a level_dict that labels each hand type, initialize the weights, and make the other preparations in advance

| Helper function | Purpose |
| --- | --- |
| make_suit | Get the suit dict of the current hand |
| make_rank | Get the rank dict of the current hand |
| Weight_init | Initialize the hand-type weight matrix |
| Times_init | Initialize the curiosity matrix |
| Compare_down | Compare the first piers of two hands |
| Compare_up | Compare the second or third piers of two hands (can also check whether one hand's second and third piers are fouled, i.e. out of order) |
| Compare_gap | Check whether one hand's first and second piers are fouled |
| Compare_combo | Compute the overall win or loss between two complete hands |
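As an illustration of the first two helpers in the table, here is a minimal sketch. It is not the project's code: we assume each card is a two-character string of rank then suit (such as "T@"), and the dict layout is our guess.

```python
from collections import defaultdict


def make_suit(hand):
    """Group the ranks in the hand by suit (hypothetical layout)."""
    suits = defaultdict(list)
    for card in hand:
        rank, suit = card[0], card[1]
        suits[suit].append(rank)
    return dict(suits)


def make_rank(hand):
    """Group the suits in the hand by rank (hypothetical layout)."""
    ranks = defaultdict(list)
    for card in hand:
        rank, suit = card[0], card[1]
        ranks[rank].append(suit)
    return dict(ranks)


hand = ["A$", "A&", "K$", "T@"]
print(make_suit(hand))
print(make_rank(hand))
```

With both dicts in hand, flushes can be read off the suit dict and pairs, three of a kind, and bombs off the rank dict.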

Tip: for example, if the first pier's hand type is 0 (high card), the second pier's is 3 (consecutive pairs), and the third pier's is 5 (straight), then Weight[0][3][5] is initialized to (0 + 3 + 5) / 100, and so on for the other entries. Every entry of the curiosity matrix is initialized to 1; it comes into play later, during training.
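The initialization described in the tip can be sketched as follows. The lowercase function names are ours, and plain nested lists are an assumption about how the matrices are stored.

```python
FRONT_LEVELS = 5    # first pier: levels 0-4
OTHER_LEVELS = 10   # second and third piers: levels 0-9


def weight_init():
    """Weight[i][j][k] = (i + j + k) / 100 for every level combination."""
    return [[[(i + j + k) / 100 for k in range(OTHER_LEVELS)]
             for j in range(OTHER_LEVELS)]
            for i in range(FRONT_LEVELS)]


def times_init():
    """Every visit counter starts at 1, so sqrt(N(s)) is never zero."""
    return [[[1 for _ in range(OTHER_LEVELS)]
             for _ in range(OTHER_LEVELS)]
            for _ in range(FRONT_LEVELS)]


Weight = weight_init()
Times = times_init()
print(Weight[0][3][5])  # (0 + 3 + 5) / 100
```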

Step 2: The dfs class. It simply brute-forces every possibility (the first pier can be level 0-4, from high card to three of a kind; the second pier can be level 0-9, from high card to straight flush; likewise for the third pier), 5 * 10 * 10 combinations in total, and works out which hand-type combinations the current cards can form
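The size of that search space is easy to check. The sketch below only enumerates the level triples; it does not test whether a given hand can actually form them, and the names are ours.

```python
from itertools import product

FRONT_LEVELS = range(5)    # first pier: levels 0-4
OTHER_LEVELS = range(10)   # second and third piers: levels 0-9


def all_level_combos():
    """Yield every (first, second, third) hand-type level triple."""
    return product(FRONT_LEVELS, OTHER_LEVELS, OTHER_LEVELS)


combos = list(all_level_combos())
print(len(combos))  # 5 * 10 * 10 = 500
```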

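Class 1: choose_action_option(Hand, color_dict, rank_dict)
Parameters: the hand and the two dicts produced by the two functions above
Purpose: build a dict of every hand type available in the current hand
Returns:
(image not preserved)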

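| Class method | Purpose | Returns |
| --- | --- | --- |
| Flush_Collection | Detect straight flushes | List of each straight flush's suit and the rank of its last card |
| Bomb | Detect bombs (four of a kind) | List of the bomb ranks |
| Gourd | Detect full houses | List made of the three-of-a-kind part and every possible pair |
| Flush | Detect flushes | The flush suit and the highest rank among the 5 cards that can form the flush |
| Collection | Detect straights | List of the rank of each straight's last card |
| Three | Detect three of a kind | List of the three-of-a-kind ranks |
| Pair | Detect pairs | List of the pair ranks |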
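Several of these detectors come down to counting ranks. Here is a minimal sketch under our own assumptions: cards are two-character rank-then-suit strings, the ace is high only, and the function names are hypothetical rather than the class's real methods.

```python
from collections import Counter

RANK_ORDER = "23456789TJQKA"


def rank_groups(hand):
    """Return (pairs, threes, bombs): ranks held at least 2, at least 3, and 4 times."""
    counts = Counter(card[0] for card in hand)
    by_strength = sorted(counts, key=RANK_ORDER.index)
    pairs = [r for r in by_strength if counts[r] >= 2]
    threes = [r for r in by_strength if counts[r] >= 3]
    bombs = [r for r in by_strength if counts[r] == 4]
    return pairs, threes, bombs


def straight_ends(hand):
    """Ranks that can be the last (highest) card of a 5-card straight."""
    present = {RANK_ORDER.index(card[0]) for card in hand}
    return [RANK_ORDER[hi] for hi in range(4, len(RANK_ORDER))
            if all(hi - k in present for k in range(5))]


hand = ["2$", "3&", "4*", "5#", "6$", "6&", "6*",
        "K$", "K&", "9#", "9$", "9&", "9*"]
print(rank_groups(hand))
print(straight_ends(hand))
```

A rank that forms a bomb also shows up in the pair and three-of-a-kind lists here, because those cards could still be split across piers.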

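Tip: why is everything after this point a function rather than a class? To make debugging easier, of course! (I got lazy, and functions are more convenient.)

| Main function | Purpose | Returns |
| --- | --- | --- |
| choose_step_sb(choose_) | Return every possible hand type from the dict returned by the class | Dict covering all possible hand types 0-9 |
| action_step_sb(Hand, choose_, shape, rank, suit) | Update the current hand according to the hand-type dict | The hand type of one pier and the remaining cards |
| action_step_add(Hand_three, Hand_two, Hand_rest, shape, rank, suit) | Complete the three piers | Dict containing the three piers and the hand-type level of each |
| judge_right | Check that the arrangement is valid | True if the three piers are arranged legally |

Tip: this yields the dict for just one arrangement of the hand, containing the three piers and the level of each.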
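A rough sketch of the legality rule, under loud assumptions: we take an arrangement to be a dict with keys level_1, level_2, and level_3, each holding (cards, level), which is the layout the back-end code reads from begin_game's result. Whether the intermediate dicts look the same is our guess. The sketch compares levels only; two piers with the same level would need a card-by-card comparison, which is not reproduced here.

```python
def is_legal_by_level(arrangement):
    """True when the piers do not get weaker from front to back (levels only)."""
    first = int(arrangement["level_1"][1])
    second = int(arrangement["level_2"][1])
    third = int(arrangement["level_3"][1])
    return first <= second <= third


# Hypothetical arrangements; the card lists are left empty on purpose.
legal = {"level_1": ([], 0), "level_2": ([], 3), "level_3": ([], 5)}
fouled = {"level_1": ([], 4), "level_2": ([], 1), "level_3": ([], 5)}
print(is_legal_by_level(legal), is_legal_by_level(fouled))
```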

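Step 3: Train the weights on the hand-type combinations obtained in step 2

That's right! Again for ease of debugging, I wrote just one function to train it, so I don't even have the nerve to draw a table!
Main function: begin_game(myhand_need, Weight, Times, beta, is_Train, Auto, b, real)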

| Parameter | Description |
| --- | --- |
| Weight | Hand-type matrix; determines how likely each hand-type combination is to be chosen |
| Times | Curiosity matrix; records how many times each hand-type combination has been met in training |
| beta | What I call the curiosity decay index; the more often a hand-type combination has been met, the more its training update is reduced |
| is_Train | Whether this is a training run: True to train, False to output the most likely hand-type combination for the current hand |
| Auto | Manual or automatic training |
| b | The other players' hands: the other three hands on the table when is_Train is True, otherwise None |
| real | Whether this is a real game scenario |
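When is_Train is False, the function outputs the most likely combination. One plausible way to do that, sketched here with hypothetical names, is to take the candidate level triple with the largest entry in the weight matrix.

```python
def best_combo(weight, candidates):
    """Return the (first, second, third) level triple with the largest weight."""
    return max(candidates, key=lambda c: weight[c[0]][c[1]][c[2]])


# Toy weight matrix using the initial values (i + j + k) / 100.
weight = [[[(i + j + k) / 100 for k in range(10)]
           for j in range(10)]
          for i in range(5)]

# Hypothetical legal arrangements of one hand, as level triples.
candidates = [(0, 1, 5), (1, 1, 6), (0, 2, 4)]
print(best_combo(weight, candidates))
```

On ties, max keeps the first candidate it meets, so the order of the candidate list matters.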

Training stage 1, starting from a brainless baseline: the arrangement rule here is greedy. From the results of step 2, mindlessly take the largest hand type for the third pier, then mindlessly the largest for the second pier. The other three players at the table all arrange their cards this way. Our own player goes through all of its combinations, compares each one with the greedy arrangements of the others, and changes the weights by regression (it probably doesn't deserve to be called regression) according to the reward, that is, the win or loss.
Training stage 2, brains against the brainless: stage 1 gives us a more reasonable weight matrix. Set real = True. The other three players still arrange their cards greedily, but our player looks up the most likely arrangement for its hand types in the weight matrix, compares the result with the others' hands, and regresses.
Training stage 3, brains against brains: all four players at the table look up the highest-weight arrangement for their hands in the weight matrix, and we regress on the result.
Tip: the reward is the number of points ("water") won or lost.
Regression rule:

\[ r=r(s)\cdot\frac{\beta}{\sqrt{N(s)}} \]

Here r(s) is the number of points won or lost, β is the curiosity decay index, and N(s) is the count stored in the Times matrix.

Key parts of the algorithm and a flowchart description

Get the suit dict and the rank dict from the two initialization functions


Get the dict of each hand type (present or absent) through choose_step_sb

Get one complete arrangement of the hand through action_step_sb and action_step_add

Check the legality of this arrangement through judge_right

If it is legal, start training.

The rest of the front end and back end

The front-end pages are divided as follows

Each interface corresponds to one class in its own .py file, and main.py organizes them all: a single mainBox with multiple forms.

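| Interface | Class name |
| --- | --- |
| Login and welcome | Index |
| Registration | Register |
| Main interface | MainIndex |
| Query interface | Search |
| Auto-play result interface | ResultSingle |
| Match information interface | Result |
| User center | Home |
| Leaderboard | Rank |
| Personal record | SingleRank |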

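The interface to the back end is as follows:

| Interface function | Purpose |
| --- | --- |
| login | Login verification |
| register | Registration |
| play | Play a game automatically |
| find_info | View match information |
| rank | View the leaderboard |
| srank | View the personal record |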

The final result is as follows:
Because of the upload size limit, the demo is split into 5 parts.







Key code explanation

Algorithm

reward = 0
# Points won or lost against each of the other three players at the table
reward += Compare_Combo(finish_hand,b[0])
reward += Compare_Combo(finish_hand,b[1])
reward += Compare_Combo(finish_hand,b[2])
# Curiosity decay: scale the reward down for combinations seen many times
reward = reward * ((beta)/math.sqrt(Times[int(shape_1)][int(shape_2)][int(shape_3)]))
Times[int(shape_1)][int(shape_2)][int(shape_3)]+=1
# Regression: adjust the weight
Weight[int(shape_1)][int(shape_2)][int(shape_3)]+=(int(reward)/1000)
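The fragment above depends on state from the rest of begin_game. To try the update rule on its own, here is a self-contained sketch of the same three steps; the function name and the tiny 1 x 1 x 1 matrices are ours, for illustration only.

```python
import math


def update_weight(weight, times, shape, raw_reward, beta):
    """Apply one curiosity-damped update to weight[shape] and bump its counter."""
    i, j, k = shape
    reward = raw_reward * beta / math.sqrt(times[i][j][k])
    times[i][j][k] += 1
    weight[i][j][k] += int(reward) / 1000
    return reward


# Toy matrices with a single entry that has already been visited 4 times.
weight = [[[0.08]]]
times = [[[4]]]
damped = update_weight(weight, times, (0, 0, 0), raw_reward=10, beta=0.9)
print(damped, weight[0][0][0], times[0][0][0])
```

One thing this makes visible: int() truncates toward zero, so any damped reward smaller than 1 in magnitude leaves the weight unchanged.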

Front end:

# Login screen -> main screen
ind.show_mainindex_sg.connect(show_mainindex)
# Login screen -> registration screen
ind.show_register_sg.connect(show_register)
# Registration screen -> login screen
regind.register_ok_sg.connect(register_ok)
# Main screen -> auto-play
mainind.auto_pressed_sg.connect(show_result)
# Main screen -> user center
mainind.home_pressed_sg.connect(show_home)
# Main screen -> search
mainind.search_pressed_sg.connect(show_search)
# Search -> result
searchind.search_sg.connect(show_id)
# Back from search
searchind.back_sg.connect(back_off)
# Back from auto-play
sresultind.result_exit_sg.connect(sresult_exit)
# Back from result
resultind.result_exit_sg.connect(result_exit)
# Back from user center
homeind.home_exit_sg.connect(home_exit)
# User center -> leaderboard
homeind.rank_sg.connect(show_rank)
# User center -> personal record
homeind.single_rank_sg.connect(show_single_rank)
# Back from leaderboard
rankind.rk_comeback_sg.connect(rank_exit)
# Back from personal record
sgrankind.single_rk_comeback_sg.connect(single_rank_exit)
# Details
sgrankind.detail_sg.connect(show_de)
sys.exit(app.exec())

Back end:

import json
import pickle

import requests
from poker import Card  # third-party poker library (modified by hand in this project)

# Game (which provides begin_game) and level_dict are defined elsewhere in the project.

API_URL = "https://api.shisanshui.rtxux.xyz"


def to_server(cards):
    # Library cards print as rank+suit ("T@"); the server expects suit+rank ("*10").
    text = ' '.join(str(c)[1] + str(c)[0] for c in cards)
    return text.replace("T", '10').replace("@", '*')


def to_display(cards):
    # Suit+rank strings for the front end, with "T" expanded back to "10".
    return [str(c)[1] + str(c)[0].replace("T", '10') for c in cards]


def play(token):
    try:
        # Open a game and convert the dealt cards to the library's format.
        response = requests.post(API_URL + "/game/open", headers={'x-auth-token': token})
        result = json.loads(response.text)
        print(result)
        game_id = result['data']['id']
        card = result['data']['card'].replace("10", 'T').replace("*", '@')
        myhand = [Card(c[1] + c[0]) for c in card.split()]
        print(myhand)
        myhand.sort()

        # Load the trained matrices and pick the highest-weight arrangement.
        with open('./resource/model/a.txt', 'rb') as f:
            Weight = pickle.load(f)
        with open('./resource/model/b.txt', 'rb') as f:
            Times = pickle.load(f)
        beta = 0.9
        best = Game.begin_game(myhand, Weight, Times, beta, is_Train=False, Auto=False, b=None, real=True)
        three, shape_3 = best['level_3'][0], best['level_3'][1]
        two, shape_2 = best['level_2'][0], best['level_2'][1]
        one, shape_1 = best['level_1'][0], best['level_1'][1]

        # Submit the three piers, front to back, as compact JSON.
        payload = json.dumps({'id': game_id, 'card': [to_server(one), to_server(two), to_server(three)]},
                             separators=(',', ':'))
        headers = {'content-type': "application/json", 'x-auth-token': token}
        response = requests.post(API_URL + "/game/submit", data=payload, headers=headers)
        result = json.loads(response.text)
        status = result['status']
        msg = result['data']['msg']
        if msg != 'Success':
            # The server rejected the arrangement: dump it for debugging and give up.
            print("submit failed:", myhand, three, two, one)
            return None
        flag_hand = to_display(myhand)
    except Exception:
        # Any network or parsing failure is reported to the front end as status 1.
        need = {'status': 1}
        print(need)
        return need

    need = {'status': status, 'id': game_id, 'msg': msg, 'origin_cards': flag_hand, 'cards': []}
    for shape, pier in ((shape_1, one), (shape_2, two), (shape_3, three)):
        need['cards'].append({'lv': level_dict[str(shape)], 'card': to_display(pier)})
    print(need)
    return need

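Performance analysis and improvement

Describe your improvement ideas

1. The original regression simply used the points won or lost / 100. This can skew situations that are rarely met: a rare hand type that should weigh more than a common one gets misjudged, because the common one shows up so often that its weight is adjusted far too much. So we used curiosity, a method recently applied to the sparse reward problem in RL (in practice it is just the simplest kind of counting).
2. The profiling table shows that make_suit, the function that traverses the hand to build the suit dict, is the most time-consuming, presumably because every hand's preprocessing uses it. So we still let it traverse all 13 cards for the first pier chosen; for the second pier, with 8 cards left, we subtract the suits of the 5 removed cards from the previous dict; the third pier is handled the same way. This saves a little time.
3. After training on 100,000 units, to cut training time, there is no need to traverse every case: each case already visited is marked with 1 in a matrix, so the same hand-type case is not visited again next time.
4. On the front end, Python's nature makes startup slow. The only fixes we could think of are loading fewer modules at startup or writing the system interface in C.
5. Whenever network requests are involved, they tend to take the most time, and this assignment is no exception.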
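Idea 2 in the list above can be sketched with a Counter: scan the full hand once, then derive the suit counts of the remaining cards by subtracting the pier just removed. This is an illustration under our own assumptions (two-character rank-then-suit card strings, counts rather than the project's actual dict layout), with hypothetical function names.

```python
from collections import Counter


def suit_counts(hand):
    """Full pass over the hand: how many cards of each suit it holds."""
    return Counter(card[1] for card in hand)


def remove_pier(counts, pier):
    """Derive the counts for the remaining cards without rescanning them."""
    remaining = counts.copy()
    remaining.subtract(card[1] for card in pier)
    return +remaining  # unary plus drops suits whose count fell to zero


hand = ["2$", "3$", "4$", "5$", "6$", "7&", "8&",
        "9*", "T*", "J#", "Q#", "K#", "A#"]
pier = hand[:5]   # the five cards taken for one pier
rest = hand[5:]   # the eight cards still to arrange
print(remove_pier(suit_counts(hand), pier))
```

The subtraction touches only the 5 removed cards instead of the 8 remaining ones, which is where the small saving comes from.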

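Show the profiling chart and the most expensive function in the program

I never expected it to be make_suit. All it does is traverse the hand and record the suits, and yet it takes this much time. Shocking.

Unit tests

Coverage

poker is the base library used to initialize the cards, so its coverage is not critical. The coverage of Game, which controls the dealing and card-playing logic, is 87%: low, but just about acceptable.

GitHub commit history.

GitHub README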

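Code module problems or pairing difficulties encountered, and their solutions

Problem description (2 points)

1. Checking the legality of a hand
2. Because we traverse every hand-type case, two arrangements from two cases can have one pier that is exactly identical, which cannot be compared.
3. The Card type in the underlying poker library is completely different from the card format the API provides, and its suit type also clashes with the front-end interface, producing characters such as ❤ that cannot be saved
4. A failed network request crashes the whole program.
5. Memory consumption caused by windows left suspended when switching between multiple interfaces.
6. My teammate really does have a nice butt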

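What was tried (2 points)

1. Originally I wanted to write a function that determines a hand's type and its highest card, which would also have been handy for the front end. But since training later needs to compute the points won or lost between two hands, I simply wrote the comparison between piers as separate functions. An arrangement is legal when the third pier beats the second and the second beats the first.
2. Following the ostrich algorithm, we just........ return a draw, neither win nor loss
3. Modified the underlying library by hand
4. Added exception handling
5. Changed the parent-child relationships of the widgets.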

Was it solved (2 points)

1. Solved
2. Solved, per the ostrich algorithm
3. Solved, in tears
4. Solved
5. Solved, imperfectly

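What was gained (2 points)

1. The world is full of wonders; a change of approach opens up a new world
2. Operating systems taught us a very important algorithm, and I benefited a lot from it
3. Before starting a project, agree on the front-end and back-end interface design first, and separate the functionality sensibly to reduce coupling.
4. Unlocked the achievement of hand-coding the front end without using PS.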

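Evaluate your teammate

Things worth learning from (2 points)

1. You are the light of the team
2. You are the chosen one
3. You are a grandmaster of artificial stupidity
4. You are a symbol of strength
5. Agile and decisive
6. Very fast
7. Really nice butt

Areas for improvement (2 points)

1. Stop using Jupyter to develop this kind of project; it will be the death of someone.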

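Learning progress bar

Learning table

| Week N | New code (lines) | Cumulative code (lines) | Study time this week (hours) | Cumulative study time (hours) | Key growth |
| --- | --- | --- | --- | --- | --- |
| 1 | 407 | 407 | 15 | 15 | Learned how to play Thirteen Water (十三水) and how to use prototyping tools |
| 2 | 230 | 637 | 14 | 29 | Settled the basic implementation approach and the front-end window framework |
| 3 | 1200 | 1837 | 20 | 49 | Implemented the basic interaction logic |
| 4 | 1045 | 2882 | 20 | 69 | Implemented the interface integration and back-end processing |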

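Gripes

On the algorithm alone: adopting it was really just an experiment. Decision-making in Thirteen Water is not complicated, and the set of solutions for any single hand is fairly small, so the most accurate approach is probably still exhaustive enumeration by win rate. In that respect the advantage of AI is less obvious. During training we also found that some hand types occur rarely, which leaves their weights low, yet the hands with a low probability are often the stronger ones, which is dangerous. That is why we introduced base weights from the mathematical model at the start and adjusted from there. We do still occasionally see some wildly imaginative plays come out of the weight system.
On the design itself: our interface design is fairly concise, which also reduced the difficulty of development on each side; each of us did our own job with little interference. The front end uses PyQt. Emmm, how to put it: it has pros and cons. I would probably prefer the HTML + CSS combination. QSS, as a CSS-like style sheet, is somewhat weaker than CSS in both its selectors and its features. Overall it is still decent. Several times I held back the hand that wanted to open PS, but in the end I went and drew a background image anyway.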


Origin www.cnblogs.com/pullself/p/11710790.html