ASE (Advanced Software Engineering): first pair programming

Problem Definition

Teacher Zou Yan's blog describes a game: "Innovation Timing: The Golden Point Game".

There are N players. Each writes one or two rational numbers between 0 and 100 (excluding 0 and 100) and submits them to the server. At the end of the round the server computes the average of all submitted numbers and multiplies it by 0.618 (the so-called golden constant) to obtain the value G. The player whose number is closest to G (by absolute difference) gets N points, the player whose number is farthest from G gets -2 points, and all other players get 0 points. When only one player participates, no points are scored.
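The scoring rule can be sketched in a few lines of Python. This is only an illustration of the rules as stated above, not the server's actual code: the function names are hypothetical, each submitted number is judged on its own, and ties are broken by taking the first entry found, which the rules do not specify.

```python
from typing import Dict, List

GOLDEN_CONSTANT = 0.618


def golden_point(submissions: Dict[str, List[float]]) -> float:
    """Average of every submitted number, multiplied by 0.618."""
    numbers = [x for nums in submissions.values() for x in nums]
    return GOLDEN_CONSTANT * sum(numbers) / len(numbers)


def score_round(submissions: Dict[str, List[float]]) -> Dict[str, int]:
    """Return the points each player earns in one round."""
    n_players = len(submissions)
    scores = {player: 0 for player in submissions}
    if n_players < 2:
        # With a single player nobody scores.
        return scores
    g = golden_point(submissions)
    # One (distance, player) entry per submitted number.
    entries = [
        (abs(x - g), player)
        for player, nums in submissions.items()
        for x in nums
    ]
    # Assumption: ties go to the first entry encountered.
    closest = min(entries, key=lambda e: e[0])
    farthest = max(entries, key=lambda e: e[0])
    scores[closest[1]] += n_players
    scores[farthest[1]] -= 2
    return scores
```

For example, if three players submit 10, 20 and 60, the average is 30 and G is 0.618 × 30 = 18.54, so the player who wrote 20 gets 3 points and the player who wrote 60 gets -2.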

Difficulties

  • It is hard to know other players' tactics; they can only be estimated from the history of golden point values, and how to use that information is one difficulty.
  • Judging from the rules, the reward for winning a round is very high (equal to the number of players), while the player farthest from the golden point loses only 2 points, so the risk of loss is small. The mechanism seems to encourage risk-taking, but it is very hard to predict when to take a risk and move toward the golden point while its value is changing sharply.
  • The winner-takes-all rule means that only the bot with the most accurate prediction scores, so scoring is hard, and scoring reliably is harder still.

Modeling method

Motivation and Introduction to Algorithms

We feel the golden point game resembles a series forecasting problem. When several groups organized test rooms among themselves, we found that the Demo, which uses the average of the past five golden points as its prediction for the next one, did very well and did not lose to reinforcement learning. Intuitively the task looks like the sequence prediction problems found in communications, and we thought a good RNN might solve it and learn to predict better than taking the average, so we used the LSTM in pytorch.
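The Demo's baseline, as described above, is just a moving average. A minimal sketch follows; the function name and the behavior when fewer than five values exist are assumptions, not the Demo's actual code.

```python
from typing import Sequence


def moving_average_prediction(history: Sequence[float], window: int = 5) -> float:
    """Predict the next golden point as the mean of the last `window` values.

    Assumption: if fewer than `window` values are available, average
    whatever history exists.
    """
    if not history:
        raise ValueError("need at least one past golden point")
    recent = history[-window:]
    return sum(recent) / len(recent)
```

Any learned model, including the LSTM, has to beat this simple baseline to be worth its extra complexity.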

Treating the problem as series prediction, both the value of a curve and its rate of change matter, so we chose two kinds of input. The first uses the past k golden point values as the input sequence; the second uses the differences between consecutive values among the past k golden points, that is, k-1 differences. The two inputs produce two predicted values, number1 and number2, and we expected this to give better results.
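The two input modes can be illustrated as follows. This sketch only builds the input sequences; the helper names are hypothetical, and how each sequence is fed to the LSTM and turned into number1 and number2 is not shown because the post does not describe it.

```python
from typing import List, Sequence


def value_window(history: Sequence[float], k: int) -> List[float]:
    """First input mode: the last k golden point values."""
    if len(history) < k:
        raise ValueError("history is shorter than k")
    return list(history[-k:])


def difference_window(history: Sequence[float], k: int) -> List[float]:
    """Second input mode: the k-1 differences between consecutive
    values among the last k golden points."""
    window = value_window(history, k)
    return [later - earlier for earlier, later in zip(window, window[1:])]
```

With a history of 10, 12, 11, 15 and k = 3, the first mode yields 12, 11, 15 and the second yields -1, 4.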

Flow chart

  • The initial process

  • The process after the first test, revised to force scoring

Some other ideas

In theory, the numbers in a multiplayer golden point game tend toward small values, and when there are few players a single large number can shift the trend of the golden point. Since the rules allow two numbers to be submitted, one can use a large number as a disturbance while raising the other number accordingly to increase its chance of scoring.

First, the disturbance should be random, because a deterministic disturbance is equivalent to adding a fixed bias to the result of the game and does not interfere with the other players.

A simple idea is to set one number (number1) to 99 (the maximum allowed) and to add 0.618 × (99 - the previously predicted value) / (number of players × 2) to the other number (number2). Strictly speaking, increasing number2 also affects the golden point, but this can be ignored unless the number of players is very small. At the time I thought that, since this disturbance is known to me in advance, I could use the information to gain some initiative, but the effect later turned out to be mediocre.

Actual testing showed that the disturbance easily backfires. Setting one of the two numbers to a large value effectively cuts off one of our own options, leaving us with one fewer guess at the golden point than the other players. Unless a high scoring rate can be guaranteed, this is a net loss, and it also raises the risk of being penalized. The other number can therefore be chosen relatively conservatively: even if it does not score, it should at least avoid losing points.
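The adjustment can be written out as a small function. This is a sketch of the idea only; the function name is hypothetical, and it assumes every player submits two numbers, so that the total count of numbers is twice the number of players. Replacing one guess with 99 raises the sum by (99 - prediction), the average by that amount divided by the count of numbers, and the golden point by 0.618 times that.

```python
GOLDEN_CONSTANT = 0.618
DISTURBANCE = 99.0  # the large number submitted as number1


def disturbed_submission(predicted: float, n_players: int) -> tuple:
    """Return (number1, number2) for the disturbance strategy.

    Assumption: all n_players submit two numbers each, so the
    average is taken over 2 * n_players values.
    """
    if n_players < 1:
        raise ValueError("need at least one player")
    shift = GOLDEN_CONSTANT * (DISTURBANCE - predicted) / (2 * n_players)
    return DISTURBANCE, predicted + shift
```

For a prediction of 9 with 10 players, the shift is 0.618 × 90 / 20 = 2.781, so number2 becomes 11.781. The shift shrinks as the number of players grows, which matches the observation that a disturbance matters most in small rooms.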

Result analysis

First-round game

In the first collective test of 1000 rounds we ranked near the bottom (third from last). We analyzed the reasons afterwards. Many bots were modified from the Demo and kept the Demo's disturbance strategy, so the curve fluctuated heavily and a small share of the golden point values were very large. Our RNN was probably affected by this: its predictions were consistently too large and rarely fell below 1, so a large share of the golden points were out of our reach. This environment left our bot in a very passive position.

On the other hand, we found that many bots simply used the previous golden point, or the previous golden point multiplied by 0.618. In many rounds the bots using the same strategy effectively teamed up to share the points, and bots that did not use that strategy scored nothing in those rounds. This happened frequently. Unable to beat them, we joined them: we changed one of our numbers to the previous golden point multiplied by 0.618, so that we could share points with the other bots.
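The change amounts to replacing one of the two submitted numbers with a fixed rule. A minimal sketch, assuming the other number still comes from the model; the function and parameter names are hypothetical.

```python
GOLDEN_CONSTANT = 0.618


def follow_the_crowd(last_golden_point: float, model_prediction: float) -> tuple:
    """Return (number1, number2): number1 copies the common strategy
    (previous golden point times 0.618), number2 keeps the model's guess."""
    return last_golden_point * GOLDEN_CONSTANT, model_prediction
```

The rule makes sense when most players reason the same way: if everyone submits roughly the previous golden point, the next golden point is roughly 0.618 times it.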

Official game

In the official game we placed 5th. It seems an RNN really is not suited to forecasting such a highly volatile series; the top few teams all used reinforcement learning and achieved good results. Viewed as a whole, the golden point curve is so volatile that predicting it feels like "metaphysics", but a closer look shows that some rounds do follow local patterns, such as a continuous decline after a peak, or a rebound. Perhaps the strategies of the groups that did well handle these cases better, which is why they scored highly.

Reflection and summary

  • Did the results of the golden point game match your expectations?

    The results of the first round did not match our expectations. Before it, students had set up private rooms for testing, and judging by those tests we expected a decent result, but after 1000 rounds we were near the bottom. Post hoc analysis showed that some bots in the test room adopted more aggressive strategies that kept us from scoring, and that our disturbance strategy was badly designed, so we were penalized many times. Together these led to our poor performance in the first round.

  • Before the official game, what strategy did you use to evaluate the quality of the model?

    We tested our bot against other students' bots before the game and observed how points were scored and lost.

  • If the number of values submitted per round were changed to 3, or more participants joined the competition, would your method still be suitable?

    We use an RNN, so the method extends well. Judging from our results, however, the model is strongly affected by large disturbance values, so there is no guarantee of better results.

  • Please evaluate your partner's work, using the sandwich method from the discussion as a reference, and propose areas where your partner could improve.

    My partner is Shengnan_An, a very positive and efficient person. I was very busy during the first few days of the pair programming assignment; he shared his ideas with me and quickly completed a Demo. Most of our final code was also written by him. I think our main problem was scheduling: I was busy for the first few days and my partner had to go back to school for the last few days, so we had little time to discuss, finished in a hurry, and only analyzed the shortcomings during testing. The lack of time really could not be helped. I think my partner was quite capable.

Origin www.cnblogs.com/QiLF/p/11563920.html