2023 "Huawei Cup" Chinese Graduate Mathematical Modeling Competition (Question C) In-depth Analysis | Complete Code of Mathematical Modeling + Full Analysis of the Modeling Process

Huawei Cup Mathematical Modeling Question C

Have you ever felt at a loss when faced with a complex mathematical modeling problem? As an Outstanding Winner (O Award) of the 2021 American Mathematical Contest in Modeling (MCM), I will share a set of solid problem-solving ideas so that you can tackle all kinds of problems with ease.
Let’s take a look at Question C of the research competition~!

Problem restatement

There are many innovation competitions nowadays, and the larger ones generally adopt a two-stage (online review, on-site review) or three-stage (online review, on-site review, and defense) review process. A defining characteristic of innovation competitions is that there is no standard answer: works must be reviewed independently by review experts according to the evaluation framework (or suggestions) proposed by the problem setter(s). As a result, different judges may give quite different scores to the same work. In fact, when the competition is large and the panel of judges is big, the problem of large score ranges (the gap between a work's highest and lowest scores) becomes more prominent. Clearly, simply ranking works by the sum of several judges' scores is not a good way to judge an innovation competition. It is therefore of far-reaching significance to explore review schemes for large-scale innovation competitions that are impartial, fair, and scientific.

Question 1

At each judging stage, works are usually distributed randomly, and each work must be reviewed independently by multiple judges. To make the scores given by different review experts more comparable, the sets of works reviewed by different experts should overlap to some extent. But if some intersections are large, others must be small, and comparability is weakened there. Please establish a mathematical model to determine the optimal "cross-distribution" scheme for 3,000 participating teams and 125 review experts, where each work is reviewed by 5 experts, and discuss the relevant indicators (of your own definition) and the implementation details of the scheme.

The goal is to determine the optimal “cross-distribution” scheme to increase the comparability of scores given by different reviewers. We can model this problem as a combinatorial optimization problem and find the best solution through mathematical modeling and solving methods.

  1. Determine the variable:
    Define a binary variable x(i, j), where i indexes the review experts and j indexes the works. x(i, j) = 1 means that expert i reviews work j, and x(i, j) = 0 means that expert i does not review work j.
  2. Define the objective function:
    The goal is to maximize the comparability between the scores given by different review experts, that is, to maximize the intersection between the collections of works reviewed by different experts. Therefore, we can define the objective function as:
    $\text{Maximize } \sum_{i,j} x(i, j)$
    This objective function expresses the need to maximize the number of cross-reviews between all review experts and works.
  3. Add constraints:
    Limit each reviewer to at most k works, which can be expressed as the constraint:
    $\sum_j x(i, j) \leq k \quad \text{for all } i$
    This constraint ensures that no reviewer is assigned more than k works.
    Each work needs to be reviewed by m review experts: this can be expressed as the following constraints:
    $\sum_i x(i, j) = m \quad \text{for all } j$
    this constraint ensures that each work will be reviewed by m review experts.
    Binary variable constraints: $x(i, j) \in \{0, 1\}$
  4. Solving optimization problems:
    ●Use optimization algorithms or mathematical programming tools to solve the optimization problems established above. These tools can help find the optimal x(i, j) variable value, that is, the best "cross-distribution" solution.
  5. Analyze the results:
    ●Once the solution is completed, the results can be analyzed to determine which review experts should review which works to maximize the comparability of reviews.
  6. Implementation details:
    ●In actual application, practical factors such as the availability of review experts, characteristics of the work, review time, etc. need to be considered. Additionally, multiple experiments and adjustments may be required to optimize the protocol.
    This is an Integer Linear Programming (ILP) problem, because the decision variables x(i, j) are binary (0 or 1) and the objective to maximize is linear.
import pulp

# Create the linear programming problem
model = pulp.LpProblem("CrossDistribution", pulp.LpMaximize)

# Number of review experts and number of works
num_experts = 125
num_works = 3000

# Maximum number of works per expert and number of experts required per work
k = 120  # maximum works per expert (must be at least 3000 * 5 / 125 = 120, otherwise the constraints are infeasible)
m = 5    # number of experts that must review each work

# Create the binary variables x(i, j)
x = pulp.LpVariable.dicts("x", ((i, j) for i in range(num_experts) for j in range(num_works)), cat='Binary')

# Objective: maximize the number of cross reviews
model += pulp.lpSum(x[i, j] for i in range(num_experts) for j in range(num_works))

# Constraints
# Each expert reviews at most k works
for i in range(num_experts):
    model += pulp.lpSum(x[i, j] for j in range(num_works)) <= k

# Each work is reviewed by exactly m experts
for j in range(num_works):
    model += pulp.lpSum(x[i, j] for i in range(num_experts)) == m

# Solve the linear programming problem
model.solve()

# Print the solver status
print("Status:", pulp.LpStatus[model.status])

# Print the list of works assigned to each expert
for i in range(num_experts):
    selected_works = [j for j in range(num_works) if x[i, j].value() == 1]
    print(f"Expert {i+1} reviews works: {selected_works}")

# Print the maximized number of cross reviews
print("Maximized number of cross reviews:", pulp.value(model.objective))


Question 2

We need to explore different review schemes, compare their effectiveness, and design a new standard score calculation model. First, choose two different review schemes and compare them. Then, based on the comparison results and the ranking of the first-prize works in Data 1 (consensus data obtained through expert consultation), improve the standard score calculation model. The specific steps for Question 2 are as follows:
Step 1: Choose two review schemes and compare them
1. Original review scheme:
a. Review with the standard score calculation method in Appendix 1 and rank works by standard score.
2. New review scheme (example; a sketch follows this step):
a. Use a weighted average of each expert's scores so that differences between experts are taken into account; experts' historical review performance can serve as the weights.
b. Rank works by their weighted-average review scores.
With these two schemes you obtain different rankings and award results for the works. Compare their strengths and weaknesses, including the consistency, fairness, and credibility of the review results.
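A minimal sketch of the weighted-average scheme in item 2, using a hypothetical score matrix and hypothetical expert weights (e.g. derived from historical review reliability); all numbers are illustrative:

import numpy as np

# Hypothetical raw scores: rows are experts, columns are works
scores = np.array([
    [85.0, 78.0, 92.0],
    [80.0, 75.0, 88.0],
    [90.0, 70.0, 95.0],
])

# Hypothetical expert weights based on historical review performance (sum to 1)
weights = np.array([0.4, 0.3, 0.3])

# Weighted-average score of each work
weighted_avg = weights @ scores

# Rank works from the highest weighted-average score to the lowest
ranking = np.argsort(-weighted_avg) + 1
print("Weighted-average scores:", weighted_avg)
print("Ranking of works (best first):", ranking)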
Step 2: Improve the standard score calculation model
1. Analyze data 1 (data 1 provided in Appendix 2):
a. Analyze the ranking of first prize works in data 1, which is the ranking agreed upon by experts. Understand the characteristics and distribution of these works.
2. Improve the standard score calculation model:
a. Based on the analysis results of Data 1 and the experience of comparing the two review schemes, design a new standard score calculation formula to better reflect the review characteristics of large-scale innovation competitions. This new calculation formula can take into account factors such as the expert's historical performance, the characteristics of the work and multi-dimensional review.
The example is as follows (this is just an example, you can modify it according to the actual situation):
New standard score = original score + expert historical review deviation.
Among them, the expert historical review deviation can be calculated based on the expert historical review performance and the characteristics of the work.
Step 3: Verify and adjust
1. Use the new standard score calculation model to rank the first-prize works in Data 1 and compare the result with the expert-agreed ranking (a rank-correlation sketch follows this step). Make sure the new model reproduces the agreed ranking better.
2. Run experiments and tests to verify that the new standard score calculation model also performs well on a wider range of competition data sets.
3. Continuously adjust and improve the new standard score calculation model to meet the needs of different competitions and review conditions.
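One way to carry out the comparison in Step 3 is a rank correlation between the new model's ranking and the expert-agreed ranking of the first-prize works. A minimal sketch with hypothetical rankings (the real ones would come from Data 1):

from scipy.stats import spearmanr

# Hypothetical rankings of the same six first-prize works (illustrative values only)
expert_agreed_rank = [1, 2, 3, 4, 5, 6]   # consensus ranking from Data 1
new_model_rank = [1, 3, 2, 4, 6, 5]       # ranking produced by the new standard score

rho, p_value = spearmanr(expert_agreed_rank, new_model_rank)
print("Spearman rank correlation:", rho)
print("p-value:", p_value)

A correlation close to 1 suggests the new standard score reproduces the agreed ranking well.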

4. Raw Score of each work:
a. Let $R_{ij}$ denote the original score given by expert $i$ to work $j$.
5. Each expert's historical review bias (Expert Bias):
a. Let $E_i$ denote the average historical review score of expert $i$.
b. The expert bias $B_{ij}$ can then be expressed as:
$B_{ij} = R_{ij} - E_i$
6. Multidimensional Score:
a. Let $M_j$ denote the multi-dimensional review score of work $j$.
7. Work Variation:
a. Let $V_j$ denote the variance of the original scores of work $j$.
8. Standard Score:
a. Let $S_{ij}$ denote the standard score of expert $i$ for work $j$, which can be expressed as:
$S_{ij} = R_{ij} + B_{ij} - \alpha \cdot V_j$
where $\alpha$ is an adjustment parameter used to balance the impact of the differences between works on the standard score; choose a suitable value of $\alpha$ according to actual needs.

import numpy as np

# Raw review score matrix: each row is one review expert's scores
# and each column is one work (the code below treats columns as works)
raw_scores = np.array([
    [85, 90, 88, 92],
    [78, 82, 80, 85],
    # add more experts' scores here
])

# Each expert's historical average review score
expert_average_scores = np.mean(raw_scores, axis=1)

# Variance of each work's raw scores
work_variations = np.var(raw_scores, axis=0)

# Adjustment parameter balancing the impact of differences between works on the standard score
alpha = 0.1  # adjust this parameter as needed

# Compute the standard scores
num_experts, num_works = raw_scores.shape
standard_scores = np.zeros((num_experts, num_works))

for i in range(num_experts):
    for j in range(num_works):
        expert_bias = raw_scores[i, j] - expert_average_scores[i]
        standard_scores[i, j] = raw_scores[i, j] + expert_bias - alpha * work_variations[j]

# Print each expert's standard score for each work
for i in range(num_experts):
    for j in range(num_works):
        print(f"Expert {i+1}'s standard score for work {j+1}: {standard_scores[i, j]}")


Question 3

Step 1: Analyze the score-range problem
First, we need to analyze the score-range (the gap between a work's highest and lowest scores) problem in large-scale innovation competition reviews. Works with a large range usually fall near the high or low award grades, which may make the review results unstable and unfair. To better understand the problem, the following analyses can be performed (a small sketch of items 1 and 2 follows this list):
1. Range statistics: compute each work's score range at the review stage to understand how the ranges are distributed.
2. Relationship between innovativeness and range: analyze whether innovativeness and range are correlated, i.e. whether more innovative works are more likely to receive widely spread scores.
3. Experts' range adjustment: analyze how experts adjust large-range works in the second review stage, and identify which works need their ranges adjusted to obtain more accurate review results.
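A minimal sketch of the range statistics in item 1 and the correlation check in item 2, using a hypothetical score matrix (rows are experts, columns are works) and hypothetical innovativeness scores:

import numpy as np

# Hypothetical review scores: rows are experts, columns are works
scores = np.array([
    [85, 78, 92, 60],
    [80, 75, 88, 90],
    [90, 70, 95, 65],
])

# Range (max - min) of each work's scores across experts
ranges = scores.max(axis=0) - scores.min(axis=0)
print("Per-work score range:", ranges)

# Hypothetical innovativeness scores of the same works
innovativeness = np.array([0.9, 0.4, 0.8, 0.7])

# Correlation between innovativeness and range (item 2)
corr = np.corrcoef(innovativeness, ranges)[0, 1]
print("Correlation between innovativeness and range:", corr)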

Step 2: Establish a range model
To address the range problem, a range model can be built that takes into account the characteristics of the works, their innovativeness, and the review experts' historical performance. The steps for building an example range model are as follows:
1. Feature engineering: Take the characteristics of each work (such as question difficulty, innovativeness score, paper structure, etc.) into consideration as the input features of the model.
2. Innovation index: Introduce an innovation index, which can evaluate the innovation from the description, method, experiment and other aspects of the work. This can be metrics based on natural language processing or other techniques.
3. Expert historical review data: Use experts’ past review historical data, including their review results and review deviations.
4. Model training: Use machine learning or statistical models (such as regression models, decision trees, neural networks, etc.) to associate input features with ranges.
5. Model adjustment: Adjust and verify based on the performance of the model to ensure that the model can accurately predict the extreme performance of the work.

Step 3: Range adjustment strategy
Once the range model is established, the following strategies can be adopted to adjust works with large ranges (a simple trimming sketch follows this list):
1. Automatic range adjustment: Use the model to predict ranges and automatically adjust works with large ranges. This may be accomplished by reassigning review weights, recalculating standard scores, or other methods.
2. Expert consultation: In the second stage of review, experts can negotiate and discuss works with large differences to reach a consistent review result. This requires collaboration and discussion among experts.
3. Model feedback: Feed back the range information predicted by the model to experts, allowing them to pay more attention to works with large ranges during review to reduce inconsistencies.
4. Multi-dimensional review: Use multi-dimensional review methods to reduce the extreme differences in review results.
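A minimal trimming sketch for item 1: works whose score range exceeds a threshold have their highest and lowest scores dropped before the mean is recomputed. The threshold and scores below are hypothetical:

import numpy as np

# Hypothetical review scores: rows are experts, columns are works
scores = np.array([
    [85, 78, 92, 60],
    [80, 75, 88, 90],
    [90, 70, 95, 65],
    [88, 72, 90, 85],
])
range_threshold = 15  # hypothetical threshold for a "large" range

for j in range(scores.shape[1]):
    col = scores[:, j]
    work_range = col.max() - col.min()
    if work_range > range_threshold:
        # Drop one highest and one lowest score, then average the remaining scores
        adjusted = np.sort(col)[1:-1].mean()
    else:
        adjusted = col.mean()
    print(f"Work {j+1}: range = {work_range}, adjusted score = {adjusted:.2f}")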
Step 4: Verification and improvement
Finally, the effect of the model and range adjustment strategy needs to be verified. Historical competition data can be used for model validation, and models and strategies can be continuously improved based on actual review situations. Ensure that they improve the consistency and fairness of review results while maintaining sensitivity to innovativeness.

import numpy as np
from sklearn.linear_model import LinearRegression

# Example data: X holds the works' features and Y holds the corresponding score ranges
X = np.array([
    [0.8, 0.9, 0.7],
    [0.6, 0.5, 0.8],
    [0.7, 0.6, 0.9],  # illustrative third work so that X and Y have matching lengths
    # add more works' features here
])

Y = np.array([0.2, 0.3, 0.1])  # corresponding range data, one value per work

# Create the linear regression model
model = LinearRegression()

# Train the model
model.fit(X, Y)

# Suppose a new work needs to be reviewed and its range must be predicted
new_work = np.array([[0.7, 0.8, 0.6]])  # features of the new work

predicted_variation = model.predict(new_work)

# Adjust the score using the predicted range
raw_score = 85.0  # placeholder raw score of the new work; use the actual score in practice
adjusted_score = raw_score - predicted_variation

# Print the results
print("Predicted range:", predicted_variation)
print("Adjusted score:", adjusted_score)


Question 4

Review model overview:
This review model aims to comprehensively consider the review experts, the characteristics of the works, and their innovativeness in order to produce fairer and more accurate review results. The model casts the review problem as an optimization problem and obtains the best review result with an optimization algorithm.
Model suggestions and steps:
1. Feature engineering:
a. Collect feature data of the work. These features can include question difficulty, innovation score, method complexity, experimental design, author background, etc.
b. Collect historical review data of review experts, including their review results and review deviations.
2. Construct an objective function:
a. Define an objective function that associates the review results with work characteristics and expert reviews.
b. The objective function can include the weight of innovativeness, the weight of different characteristics, the weight of expert historical performance, etc.
3. Optimization problem:
a. Convert the review problem into an optimization problem, with the goal of maximizing or minimizing the objective function. For example, you can try to maximize the overall score of the work to reflect the quality and innovation of the work.
4. Solve optimization problems:
a. Use optimization algorithms (such as linear programming, integer programming, genetic algorithm, etc.) to solve optimization problems to obtain the best review results.
b. This can be achieved through existing mathematical optimization libraries.
5. Verify and adjust the model:
a. Use historical competition data to verify and adjust the model to ensure that the model can produce accurate review results.
b. Consider using techniques such as cross-validation to evaluate model performance.
Improvement suggestions:
1. Data collection:
a. Collect more information about review experts, including professional fields, experience and review history.
b. Collect more information about the work, especially about innovation and contribution.
2. Multi-level review:
a. Consider a multi-level review method in which first-level experts conduct a preliminary review of the works and second-level experts further review the shortlisted works. This reduces score ranges and improves review accuracy.
3. Expert consultation:
a. Encourage experts to negotiate and discuss the review results to improve consistency and credibility.
4. Transparency:
a. Make the review process more transparent, including a clear explanation of the review criteria and weighting.
5. Feedback mechanism:
a. Introduce a feedback mechanism so that review experts can understand how their review results affect the final review results and make improvements in future reviews.
6. Algorithm improvement:
a. Continuously improve the optimization algorithm to solve review problems more efficiently.

Ablation experiment analysis:

Baseline model: Use the complete method, that is, matrix decomposition + value range control + sparsity constraints. Measure its approximation error RMSE and complexity C.
Remove the value range control: only use matrix decomposition + sparsity constraints, without limiting the value range. Measure RMSE and C.
Remove sparsity constraints: only matrix decomposition + value range control is used, sparsity is not required. Measure RMSE and C.
Matrix factorization only: range control and sparsity constraints are not used. Measure RMSE and C.
Compare the RMSE and C of different models. High RMSE indicates loss of approximation accuracy; high C indicates increased complexity.

Matrix decomposition is an effective method to achieve low-complexity approximation to DFT, but it requires design to achieve sparsity.
Constraining the value range of elements in the matrix can reduce the calculation amount of a single multiplication.
On the premise of meeting the accuracy requirements, the decomposition solution that minimizes complexity can be found through search.
Decomposing the Kronecker product matrix can decompose a large DFT into multiple small matrices, reducing the difficulty of optimization.
Ablation experiments can verify the impact of different design decisions on approximation error and complexity.
It is necessary to weigh the error accuracy and computational complexity, and determine the acceptable trade-off based on actual needs.
This method can be used as a low-complexity DFT implementation strategy that replaces FFT.
Details such as optimized search and code implementation need to be further improved.
Objective Function:
●Usually expressed as J or f(x), where x is the decision variable, which can be the review result.
●For example, maximizing the overall score of the work can be expressed as:
$J(x) = \sum_i w_i \cdot f_i(x)$
Constraints:
●Can be expressed as equality or inequality conditions, used to limit the value range of decision variables.
●For example, constrain the total score not to exceed a certain threshold:
$\sum_i w_i \cdot f_i(x) \leq C$

1. Standard Deviation:
a. Used to measure the degree of dispersion of a data set, usually expressed as σ.
b. The standard deviation of the review results can be used to measure the instability of the review results.
2. Correlation Coefficient:
a. Used to measure the correlation between two variables, usually expressed as ρ (rho).
b. Can be used to analyze the correlation between review results and work characteristics.
3. Weight:
a. Used to assign different importance to different features or review results.
b. can be expressed as w.
4. Optimization problem notation:
a. Maximization problems are usually represented by max, such as maximize J(x).
b. Minimization problems are usually expressed as min, such as minimize J(x).
5. Optimization Variables:
a. Indicates the decision variables that need to be optimized, usually expressed as x.
6. Model Parameters:
a. Indicates the parameters in the model, which can be weights, coefficients, etc., usually expressed as θ.
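A minimal sketch of the first two metrics above, the standard deviation σ of the review results and the correlation coefficient ρ between the results and one work feature, using hypothetical numbers:

import numpy as np

# Hypothetical final review scores of five works and one feature (e.g. an innovativeness score)
review_scores = np.array([82.0, 75.5, 90.0, 68.0, 88.5])
innovativeness = np.array([0.7, 0.5, 0.9, 0.3, 0.8])

sigma = np.std(review_scores)  # dispersion (instability) of the review results
rho = np.corrcoef(review_scores, innovativeness)[0, 1]  # correlation with the feature

print("Standard deviation sigma:", sigma)
print("Correlation coefficient rho:", rho)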

from scipy.optimize import minimize

# Define the objective function
def objective_function(x):
    # Example objective; replace with the actual review objective
    return x[0]**2 + x[1]**2

# Define the constraint function
def constraint(x):
    # Example constraint; replace with the actual review constraint
    return x[0] + x[1] - 1

# Initial guess
initial_guess = [0.5, 0.5]

# Define the constraint (equality constraint: constraint(x) == 0)
constraints = ({'type': 'eq', 'fun': constraint},)

# Minimize the objective function
result = minimize(objective_function, initial_guess, constraints=constraints)

# Print the results
print("Optimal solution:", result.x)
print("Optimal value:", result.fun)

Ablation Experiment Analysis

1. Clarify the research questions:
a. Determine which factors may affect the fairness and effectiveness of the review.
2. Define the factors to be ablated:
a. Based on the research question, determine the key factors to be ablated. These factors can include:
i. the number of review experts
ii. the review method (one-stage vs. two-stage)
iii. the score adjustment method (standardization, removing the highest and lowest scores, etc.)
3. Set up a benchmark experiment:
a. Conduct a benchmark experiment using the current review scheme as the baseline. Record all relevant data, including judging scores, awards, score ranges, etc.
4. Gradually ablate factors:
a. Run a gradual ablation experiment for each factor (see the sketch after this list). For example, for the number of review experts, use all experts for review in the benchmark experiment.
i. Then gradually reduce the number of review experts, e.g. use only 80%, 60%, 40%, etc. of the experts for review.
ii. Record the review results under each experimental condition.
5. Record and analyze data:
a. Record data under each experimental condition, including review scores at each stage, awards, range sizes, etc.
b. Use statistical analysis methods to compare data under different experimental conditions to determine which factors have a significant impact on the review results and range.
6. Draw conclusions:
a. Draw conclusions based on the results of data analysis. Determine which factors have an important impact on the fairness and effectiveness of the review plan, and which factors have an impact on the size of the range.
7. Improve the review plan:
a. Based on the conclusion, make possible improvement suggestions. For example, if the number of review experts significantly affects the results, consider increasing expert participation or improving expert training.
8. Verification experiment:
a. If possible, conduct verification experiments to confirm the reliability of the conclusions. Validation experiments can be conducted using different data sets or competitions.
9. Summarize the research results:
a. Write a report or summary detailing the experimental design, results and conclusions. This will help others understand your research work.
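A minimal sketch of the expert-count ablation in step 4, using randomly generated hypothetical scores: the ranking from a reduced panel is compared with the full-panel ranking through the Spearman rank correlation.

import numpy as np
from scipy.stats import spearmanr

rng = np.random.default_rng(0)
num_experts, num_works = 10, 30
scores = rng.normal(75, 10, size=(num_experts, num_works))  # hypothetical review scores

# Benchmark: mean score over the full expert panel
baseline_mean = scores.mean(axis=0)

# Ablation: keep only a fraction of the experts and compare the induced ranking
for fraction in (0.8, 0.6, 0.4):
    kept = rng.choice(num_experts, size=int(num_experts * fraction), replace=False)
    ablated_mean = scores[kept].mean(axis=0)
    rho, _ = spearmanr(baseline_mean, ablated_mean)
    print(f"Keeping {int(fraction * 100)}% of experts: Spearman rho vs. benchmark = {rho:.3f}")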



Origin blog.csdn.net/qq_25834913/article/details/133235216