Python Solution Code for Topic B of the 6th Hebei Province Postgraduate Mathematical Contest in Modeling (2023)

Topic B of the 6th Hebei Provincial Postgraduate Mathematical Contest in Modeling in 2023

Video walkthrough and download of the documents and code for this article: [2023 Hebei Province Graduate Mathematical Modeling Contest B Question Dataset and Code-哔哩哔哩] https://b23.tv/weulGAO

Anomaly detection of photovoltaic cells and prediction of power generation capacity

Under the strategic goals of carbon peaking and carbon neutrality, China's photovoltaic power generation technology is developing rapidly. The total installed capacity of wind and solar power in China is expected to exceed 1.2 billion kilowatts by 2030. Photovoltaic power generation converts light energy directly into electrical energy through the photovoltaic effect at a semiconductor interface. Its key component is the photovoltaic cell: cells are connected in series, then packaged and protected to form a large-area photovoltaic module, which is combined with power controllers and other components to form a photovoltaic power generation device.

Photovoltaic power generation is affected by factors such as module quality, meteorological conditions, solar irradiance, and usage, which cause the power output to fluctuate. Photovoltaic modules are mainly assembled from semiconductor silicon solar cells made of many crystalline silicon wafers. Various defects can arise during the manufacture, transportation, and use of modules, such as linear cracks, star cracks, broken grids, black cores, and thick lines (the same photovoltaic cell may have several defects). These defects reduce the photoelectric conversion efficiency and reliability of the module to varying degrees. For example, the electroluminescence (EL) image of the photovoltaic cell in Figure 1 is labeled defect-free, and the cell is visibly intact as a whole; the EL image of the photovoltaic cell in Figure 2 contains a linear crack defect.

Figure 1 Defect-free photovoltaic cell sample

Figure 2 Photovoltaic cell samples with surface defects

Meteorological factors also have an important effect on photovoltaic power generation. Wind speed, temperature, humidity, solar irradiance, and sudden extreme weather can all reduce generation efficiency and thereby affect the stability of the entire energy system. Cloudy days, air pollution, and the sun's altitude also affect the intensity of sunlight reaching the photovoltaic panels. In addition, changes in the intensity and direction of solar radiation can cause photovoltaic output to fluctuate, which in turn affects the stability of the entire energy system.

A batch of photovoltaic power generation data from a certain area is available, and staff have organized it into two kinds of attachments: images and data. Attachment 1 contains photovoltaic cell EL images, Attachment 2 contains the meteorological data for 2020 and 2021, Attachment 3 contains the power generation capacity data for 2020 and 2021, and Attachment 4 contains historical data.

Please conduct analysis and modeling based on the relevant data in the attachments to solve the following problems:

Question 1 Dataset

Question 1: Look up photovoltaic EL cell image data on your own and construct your own photovoltaic cell defect dataset. Build a model on the constructed dataset, perform defect detection on the images in Attachment 1, fill in the detection results in the table below, and analyze whether the detection results are reasonable.

The dataset found so far is described below. It is split into annotations, images, and test images, and is 400+ MB in total including the annotations.

Dataset: A PV EL anomaly detection dataset of solar cells, which contains 36,543 near-infrared images with various internal defects and heterogeneous backgrounds. The dataset contains 1 class of anomaly-free images and 12 different classes of anomalous images, including cracks (line and star), finger breaks, black cores, misalignments, thick lines, scratches, chips, corners, and material defects. In addition, 40,358 ground truth bounding boxes are provided for the 12 types of defects.

Dataset processing: You can train object detection models such as YOLO or SSD on this dataset and evaluate the trained models with metrics such as mAP.
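mAP is built on the intersection over union (IoU) between predicted and ground-truth boxes: a prediction is usually counted as a true positive when its IoU with a ground-truth box of the same class reaches a threshold such as 0.5. The following is a minimal sketch of that building block only. The function name `iou` and the `(x_min, y_min, x_max, y_max)` box layout are assumptions for illustration, not the annotation format of the dataset.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes.

    Boxes are (x_min, y_min, x_max, y_max); this layout is an assumption.
    """
    # Corners of the overlapping rectangle (if any)
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])

    # Clamp to zero when the boxes do not overlap
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


predicted = (0, 0, 2, 2)      # hypothetical predicted defect box
ground_truth = (1, 1, 3, 3)   # hypothetical annotated defect box
score = iou(predicted, ground_truth)   # overlap area 1, union area 7
is_true_positive = score >= 0.5        # 0.5 is a commonly used threshold
print(score, is_true_positive)
```

Detection frameworks compute mAP for you, so a helper like this is mainly useful for spot-checking individual predictions against the annotations.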

Question 2

2023 Hebei Province Postgraduate Mathematical Modeling Question B Question 2: Preprocess the meteorological data in Attachment 2 (15-minute time interval), build a meteorological model, predict the wind speed, wind direction, temperature, and other variables for November 12-18, 2021, and complete the table below.

ARIMA (Autoregressive Integrated Moving Average) and SARIMA (Seasonal Autoregressive Integrated Moving Average) are commonly used statistical models for time series forecasting. Both are fitted to time series data to capture characteristics such as trend and seasonality and to predict future values. ACF (autocorrelation function) and PACF (partial autocorrelation function) are important diagnostic tools for time series data and are used to choose the orders of ARIMA and SARIMA models.

1. ARIMA (Autoregressive Integrated Moving Average): ARIMA is a commonly used time series forecasting model that combines autoregressive (AR) and moving average (MA) components with differencing, so it can be fitted to non-stationary time series data.

  1. The AR (autoregressive) part uses lagged values of the series itself to predict future values, written AR(p); the MA (moving average) part uses lagged error terms to predict future values, written MA(q).

  2. Differencing (d) transforms non-stationary data into stationary data so that it satisfies the prerequisites of the AR and MA parts.

  3. ARIMA(p, d, q) is the combination of AR, I (differencing), and MA, where p, d, and q are the orders of AR, I, and MA, respectively.

2. SARIMA (Seasonal Autoregressive Integrated Moving Average): SARIMA is an extension of the ARIMA model designed for seasonal time series data.

  1. In addition to the p, d, and q parameters of the ARIMA model, SARIMA has seasonal AR, I, and MA parameters, written AR(P), I(D), and MA(Q).

  2. The seasonal period (s) specifies the length of one seasonal cycle in number of observations, such as 365 for a yearly cycle or 7 for a weekly cycle in daily data. For the 15-minute data used here, a daily cycle corresponds to s = 96.

  3. SARIMA(p, d, q) × (P, D, Q, s) is the notation for the SARIMA model, where p, d, and q are the orders of AR, I, and MA, P, D, and Q are the orders of the seasonal AR, I, and MA, and s is the seasonal period.

3. ACF (Autocorrelation Function): ACF measures the linear correlation between a time series and its own lagged values.

  1. Plotting the ACF helps identify the moving average (MA) order q: for a pure MA(q) process, the ACF cuts off after lag q.

  2. If the ACF instead decays gradually, an AR component is likely present; a very slow decay suggests that the series is non-stationary and needs differencing.

4. PACF (Partial Autocorrelation Function): PACF measures the direct linear relationship between a time series and its own lagged values, with the influence of the intermediate lags removed.

  1. Plotting the PACF helps identify the autoregressive (AR) order p: for a pure AR(p) process, the PACF cuts off after lag p.

  2. If the PACF instead decays gradually, an MA component is likely present.
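To make the behaviour of ACF and PACF concrete, here is a minimal pure-Python sketch on a simulated AR(1) series. For such a series the sample ACF should decay roughly geometrically, while the sample PACF should be close to zero after lag 1. The helper names `sample_acf` and `sample_pacf`, the coefficient 0.7, and the series length are assumptions chosen for illustration; in practice `plot_acf` and `plot_pacf` from statsmodels do this work.

```python
import random


def sample_acf(x, max_lag):
    """Sample autocorrelations r_0 .. r_max_lag of a sequence."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    var = sum(d * d for d in dev)
    return [
        sum(dev[t] * dev[t - k] for t in range(k, n)) / var
        for k in range(max_lag + 1)
    ]


def sample_pacf(acf_values):
    """Partial autocorrelations from ACF values (Durbin-Levinson recursion)."""
    pacf = [1.0]
    phi_prev = []  # coefficients phi_{k-1,1} .. phi_{k-1,k-1}
    for k in range(1, len(acf_values)):
        num = acf_values[k] - sum(
            phi_prev[j] * acf_values[k - 1 - j] for j in range(k - 1)
        )
        den = 1.0 - sum(phi_prev[j] * acf_values[j + 1] for j in range(k - 1))
        phi_kk = num / den
        phi_prev = [
            phi_prev[j] - phi_kk * phi_prev[k - 2 - j] for j in range(k - 1)
        ] + [phi_kk]
        pacf.append(phi_kk)
    return pacf


# Simulate x_t = 0.7 * x_{t-1} + noise (synthetic data, not the contest data)
rng = random.Random(0)
x = [0.0]
for _ in range(5000):
    x.append(0.7 * x[-1] + rng.gauss(0.0, 1.0))

acf = sample_acf(x, max_lag=5)
pacf = sample_pacf(acf)
print("ACF :", [round(v, 3) for v in acf])
print("PACF:", [round(v, 3) for v in pacf])
```

A gradually decaying ACF together with a PACF that is significant only at lag 1 points to p = 1 and q = 0 for this synthetic series.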

First load the data:

import pandas as pd
import matplotlib.pyplot as plt
from statsmodels.tsa.arima.model import ARIMA
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf
from statsmodels.stats.diagnostic import acorr_ljungbox
from statsmodels.tsa.stattools import adfuller

# 1. Data preprocessing
data = pd.read_excel("附件2.xlsx")
data['timestamp'] = pd.to_datetime(data['timestamp'])
data.set_index('timestamp', inplace=True)

Then fill in the missing values using linear interpolation.
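The interpolation code is not shown in the article. A minimal sketch of the idea with pandas follows, using a small synthetic 15-minute series rather than the contest data; the column name `wind_speed` and the values are assumptions for illustration.

```python
import numpy as np
import pandas as pd

# Synthetic 15-minute series with gaps (stand-in for the Attachment 2 data)
index = pd.date_range("2021-11-01 00:00", periods=6, freq="15min")
toy = pd.DataFrame(
    {"wind_speed": [1.0, np.nan, 3.0, np.nan, np.nan, 6.0]},
    index=index,
)

# Linear interpolation fills each gap on a straight line between its neighbours
filled = toy.interpolate(method="linear")
print(filled)
```

Here the gaps are filled with 2.0, 4.0, and 5.0. Missing values at the very start of a series have no earlier neighbour and stay missing with the default settings, so they need separate handling.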

Data visualization:

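plt.rcParams['font.sans-serif'] = ['SimHei']  # Step 1: use a sans-serif font that can render the Chinese labels
plt.rcParams['axes.unicode_minus'] = False    # Step 2: make the minus sign on the axes display correctly
# 2. Data visualization
plt.figure(figsize=(12, 6), dpi=300)
plt.plot(data['风速(m/15min)'])    # wind speed column
plt.title('风速趋势图')             # "Wind speed trend"
plt.xlabel('时间戳')                # "Timestamp"
plt.ylabel('风速(m/15min)')        # "Wind speed (m/15min)"
plt.tight_layout()  # Adjust the layout so that axis labels are not clipped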
plt.savefig('风速趋势图.png')

Then build a meteorological model for wind speed. Start with a stationarity check (ADF test), then examine the ACF and PACF:

# 3. Build the meteorological model - wind speed
wind_speed_data = data['风速(m/15min)']

# Stationarity check (augmented Dickey-Fuller test)
result = adfuller(wind_speed_data)
print(f'ADF Statistic: {result[0]}')
print(f'p-value: {result[1]}')
print('Critical Values:')
for key, value in result[4].items():
    print(f'  {key}: {value}')

Then fit the time series model and forecast:

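# Fit the ARIMA model - wind direction
# Assumption: wind_direction_data is the wind direction column of `data`,
# selected in the same way as wind_speed_data above.
# Assumption: the orders below are placeholders; choose p, d, q from the
# ADF test and the ACF/PACF plots.
p, d, q = 1, 0, 1
model_wind_direction = ARIMA(wind_direction_data, order=(p, d, q))
model_fit_wind_direction = model_wind_direction.fit()

# Residual analysis: plot the residuals and check for patterns or significant bias.
residuals = model_fit_wind_direction.resid
plt.figure(figsize=(10, 6))
plt.plot(residuals)
plt.title('ARIMA 模型残差(风向)')   # "ARIMA model residuals (wind direction)"
plt.xlabel('时间戳')                  # "Timestamp"
plt.ylabel('残差')                    # "Residual"
plt.tight_layout()  # Adjust the layout so that axis labels are not clipped
plt.savefig('ARIMA模型残差(风向).png')

# ARIMA model forecast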

To account for seasonality, you can also forecast with the seasonal model SARIMA (the code is omitted here; the download address is at the beginning of the article).

Question 3: Using the meteorological data obtained in Question 2, preprocess the data, build a mathematical model relating weather to power generation capacity, predict the photovoltaic power generation capacity on November 18, 2021, and complete the table below.

Establishing a mathematical model of meteorology and power generation capacity is a regression problem. You can use principal component analysis to extract the principal components, and then use a variety of regression methods, such as decision tree regression, support vector machine regression, XGBoost regression, etc.
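A minimal sketch of such a pipeline with scikit-learn is shown below: standardize the weather features, reduce them with PCA, then compare regressors on held-out data. The synthetic features, the choice of two components, and the hyperparameters are assumptions for illustration; XGBoost is left out only to keep the example free of extra dependencies.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in for weather features: 5 correlated columns driven by
# 2 hidden factors, plus a target that depends on those factors.
rng = np.random.default_rng(0)
latent = rng.normal(size=(400, 2))
mixing = np.array([[1.0, 0.5, 0.0, 1.0, -1.0],
                   [0.0, 1.0, 1.0, -0.5, 0.5]])
X = latent @ mixing + rng.normal(scale=0.05, size=(400, 5))
y = 2.0 * latent[:, 0] - 1.0 * latent[:, 1] + rng.normal(scale=0.1, size=400)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

models = {
    "decision_tree": DecisionTreeRegressor(max_depth=5, random_state=0),
    "svr": SVR(kernel="rbf", C=10.0),
}

scores = {}
for name, regressor in models.items():
    # Scale, project onto 2 principal components, then regress
    pipeline = make_pipeline(StandardScaler(), PCA(n_components=2), regressor)
    pipeline.fit(X_train, y_train)
    scores[name] = pipeline.score(X_test, y_test)  # R^2 on held-out data

print(scores)
```

With real data, a time-ordered split (train on earlier dates, test on later ones) is usually a fairer evaluation than a random split, because neighbouring 15-minute samples are strongly correlated.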


Origin blog.csdn.net/qq_45857113/article/details/131950102