Analysis and Identification of Chemical Composition of Ancient Glass Products

  • Question 1: Analyze the relationship between the surface weathering of these glass cultural relics and their glass type, decoration, and color; for each glass type, analyze the statistical patterns in the chemical composition content of weathered and unweathered sample surfaces; and, from the detection data at the weathered points, predict the chemical composition content before weathering.
  • Question 2: Using the attached data, analyze the classification rules for high-potassium glass and lead-barium glass; for each category, select appropriate chemical components to divide it into subcategories, give the specific division method and results, and evaluate the rationality and sensitivity of the classification results.
  • Question 3: Analyze the chemical composition of the glass cultural relics of unknown category in Form 3 of the annex, identify their types, and analyze the sensitivity of the classification results.
  • Question 4: For the different categories of glass cultural relic samples, analyze the relationships among their chemical components, and compare how these relationships differ between categories.

1. Reflections on participating

As a modeling novice, I took part in the 2022 national mathematical modeling competition. Since problems A and B were both physics problems, our team chose problem C. During the competition I mainly served as the modeler and programmer. As a computer science student I have learned several programming languages, with python as my first language and c++ as my second. To avoid adding to my workload I did not study matlab in depth, and for undergraduates I believe python is far more useful than matlab, so I chose python as the modeling language. I will also share the materials and code on using python for mathematical modeling that I collected while preparing for the competition. In this post I share my approach to this competition. I am new to modeling, so please bear with any shortcomings in my reasoning.

2. Question 1

First, we divide the question into the following three sub-questions:

  1. Analyze the relationship between the three data characteristics of the glass cultural relics and whether the surface is weathered.
  2. For each type of glass, summarize the statistical patterns of the chemical composition content on the glass surface.
  3. Predict the chemical composition of weathered cultural relics before weathering.

2.1. Relationship analysis

Analysis of variance uses data analysis to identify the factors that significantly affect an outcome, the interactions between those factors, and the optimal levels of the significant factors.
We therefore construct a model relating glass type, decoration, and color to surface weathering. First, the data in Form 1 are preprocessed: multiple imputation is used to fill in missing values, and deep data mining is carried out. We then explore the relationship between glass type, decoration, and color and whether the surface is weathered, and test whether each of the three characteristics has a significant impact on whether the surface of the cultural relic is weathered.
[Figure: flow chart]

2.1.1. Model solution

[Figure: model solving]
In the F test, the P value indicates whether a feature has an important effect on surface weathering, that is, whether a main effect exists. The smaller the P value, the stronger the evidence that the factor affects surface weathering and that a main effect exists. The specific results are as follows:
Decoration: the P value from the F test is 0.214, which is not significant; decoration has no significant impact on surface weathering, and there is no main effect.
Color: the P value from the F test is 0.012, which is significant; color has a significant impact on surface weathering, and there is a main effect.
Type: the P value from the F test is 0.276, which is not significant; type has no significant impact on surface weathering, and there is no main effect.
Interaction of decoration and color: the P value from the F test is 0.253, which is not significant; there is no significant impact on surface weathering, and there is no interaction.
Interaction of decoration and type: the P value from the F test is reported as 0, which is significant; there is a significant impact on surface weathering, and there is an interaction.
Interaction of color and type: the P value from the F test is 0.116, which is not significant; there is no significant impact on surface weathering, and there is no interaction.
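To make the F test above more concrete, here is a minimal sketch of how the F statistic is computed. It is simplified to a one-way layout (a single factor, no interaction terms), whereas the analysis above is multi-factor. The function name `one_way_f` and the sample data are made up for illustration. Turning the statistic into a P value needs the F distribution, which the standard library does not provide, so that step is left out.

```python
from statistics import mean

def one_way_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of each observation around its own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical data: weathering flag (1 = weathered) grouped by color.
weathered_by_color = {
    "blue-green": [1, 0, 0, 1, 0],
    "light blue": [1, 1, 1, 0, 1],
    "black": [1, 1],
}
f_stat, df_b, df_w = one_way_f(list(weathered_by_color.values()))
print(f"F = {f_stat:.3f} with ({df_b}, {df_w}) degrees of freedom")
```

A larger F means the group means differ more relative to the scatter inside each group, which corresponds to a smaller P value.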

2.2. Statistical Laws

We divide the data by glass type, then further divide it by whether the sample is weathered and by decoration, and compare the chemical composition of the glass surface across the groups.
[Tables: statistical law]
It can be seen from the above tables that most of the chemical components of high-potassium glass cultural relics decrease after weathering, while most of the chemical components of lead-barium glass cultural relics remain relatively stable after weathering. However, after lead-barium glass cultural relics weather, the variance of the chemical composition among objects of the same type is relatively large.

2.3. Composition prediction

Prediction of chemical composition before weathering: through in-depth data mining and correlation analysis, we found that the three indicators decoration, type, and color together have a significant impact on surface weathering, accounting for as much as 99.7% of it, so the relationship is very strong. We therefore make the approximation that two glass samples with the same decoration, type, and color have the same chemical composition. On this basis, the chemical composition of a weathered sampling point before weathering can be predicted.
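The approximation above can be sketched as a lookup: a weathered sample is assigned the mean composition of the unweathered samples that share its decoration, type, and color. This is only an illustration under that assumption. The field names (`decoration`, `type`, `color`, `composition`), the helper names, and all values are hypothetical and do not come from the competition data.

```python
from collections import defaultdict
from statistics import mean

def build_reference(unweathered):
    """Average the composition of unweathered samples per (decoration, type, color)."""
    grouped = defaultdict(list)
    for sample in unweathered:
        key = (sample["decoration"], sample["type"], sample["color"])
        grouped[key].append(sample["composition"])
    reference = {}
    for key, comps in grouped.items():
        components = comps[0].keys()
        reference[key] = {c: mean(comp[c] for comp in comps) for c in components}
    return reference

def predict_before_weathering(sample, reference):
    """Return the reference composition for the sample's group, or None if no match."""
    key = (sample["decoration"], sample["type"], sample["color"])
    return reference.get(key)

# Hypothetical unweathered samples sharing the same three characteristics.
unweathered_samples = [
    {"decoration": "A", "type": "high-potassium", "color": "blue-green",
     "composition": {"SiO2": 60.0, "K2O": 10.0}},
    {"decoration": "A", "type": "high-potassium", "color": "blue-green",
     "composition": {"SiO2": 70.0, "K2O": 12.0}},
]
weathered_sample = {"decoration": "A", "type": "high-potassium", "color": "blue-green"}

reference = build_reference(unweathered_samples)
predicted = predict_before_weathering(weathered_sample, reference)
print(predicted)
```

Returning `None` for an unseen combination keeps the gap visible, rather than silently falling back to a global average.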

3. Question 2

Question 2 requires analyzing the classification rules of high-potassium glass and lead-barium glass, then, on this basis, choosing appropriate chemical components to divide each category into subcategories, giving the method and results, and conducting rationality and sensitivity analysis. The approach to this problem is divided into the following five steps:
(1) Data standardization: the chemical composition data of the sampling points are first centered on their mean and then scaled by their standard deviation, so that each component has a mean of 0 and a variance of 1.
(2) Dimension reduction of the chemical composition of the sampling points: we first compute the correlation coefficient matrix of the standardized data, then compute its eigenvalues and the corresponding orthonormal eigenvectors, and finally compute the contribution rate of each principal component. The first three principal components are selected according to the cumulative contribution rate, which reduces the dimension of the chemical composition data in the form.
(3) Extraction of the classification rules of high-potassium glass and lead-barium glass: building on the above, the selected principal components are used as the features of a decision tree. Starting from the root node, the feature with the largest information gain is chosen as the node feature and new child nodes are generated; this is repeated over several iterations, and the tree is finally pruned to obtain the classification rules for the two types of glass.
(4) Subcategory division method: based on the form data and reference materials, the weathered glass is divided into subcategories with the K-means clustering algorithm. k is first set to 2 (two clusters: severe weathering and general weathering), and initial points are chosen at random as centroids. The Euclidean distance between each sample and each centroid is computed, each sample is assigned to the most similar cluster, and this is iterated until the centroids no longer change, which determines the subcategory of each sample.
(5) Sensitivity analysis: using the control variable method, we examine how the model accuracy changes when one parameter of the model changes while the others stay fixed. We also perturb the data by reasonable scaling and study the sensitivity of the model to the perturbation.

3.1. Classification law model

First, the Z-Score method is used to standardize the chemical composition proportion data in Form 2. Principal component analysis is then applied to the proportions of each chemical component in Form 2, and the principal components whose cumulative contribution rate exceeds 90% are retained in order to reduce the dimension. Of the 14 principal components, those retained, ranked by contribution, are the first principal component, the second principal component, and the third principal component.
According to the results of the principal component analysis, the parent-child pie chart of the contribution ratio of each principal component is:
[Figure: parent-child pie chart]
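As a rough sketch of the two steps just described (Z-Score standardization, then principal component analysis on the correlation matrix with a 90% cumulative contribution cutoff), the following uses NumPy. The function name `pca_by_correlation` and the random demo matrix are assumptions made for illustration. It is not the competition data, and it is not necessarily how the original solution was coded.

```python
import numpy as np

def pca_by_correlation(X, target=0.90):
    """Standardize X, then keep the leading principal components whose
    cumulative contribution rate first reaches `target`."""
    X = np.asarray(X, dtype=float)
    # Z-Score: subtract the column mean, divide by the column standard deviation.
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # Correlation coefficient matrix of the standardized data.
    R = np.corrcoef(Z, rowvar=False)
    # eigh returns ascending eigenvalues for a symmetric matrix; reverse them.
    eigvals, eigvecs = np.linalg.eigh(R)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    ratio = eigvals / eigvals.sum()          # contribution rate per component
    cumulative = np.cumsum(ratio)
    n_keep = min(int(np.searchsorted(cumulative, target)) + 1, len(eigvals))
    scores = Z @ eigvecs[:, :n_keep]         # samples projected onto kept components
    return scores, ratio, n_keep

# Hypothetical demo data: 30 samples, 4 columns built from 2 latent factors.
rng = np.random.default_rng(0)
base = rng.normal(size=(30, 2))
X_demo = np.column_stack([
    base[:, 0],
    2.0 * base[:, 0] + 0.1 * rng.normal(size=30),
    base[:, 1],
    -base[:, 1] + 0.1 * rng.normal(size=30),
])
scores, ratio, n_keep = pca_by_correlation(X_demo)
print(n_keep, np.round(ratio, 3))
```

Note that a column with zero variance would cause a division by zero in the standardization step, so constant components would need to be dropped first.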

3.1.1. Classification rules

The decision tree first tests the main component potassium oxide (K2O) at the root node. If the sample contains no potassium, it is classified as lead-barium glass; if it contains potassium, it proceeds to the next node, where a threshold is applied. If the content falls on the low side of the threshold, the sample contains potassium but too little of it, so it is judged to be lead-barium glass; otherwise the potassium content meets the standard and the sample is judged to be high-potassium glass.
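A threshold of this kind is what a decision tree finds when it maximizes information gain. The sketch below searches for the best single threshold on one feature. For readability it uses raw K2O values rather than principal components, and the values and labels are invented for illustration, so the resulting threshold is not the one from the competition data.

```python
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((labels.count(c) / n) * log2(labels.count(c) / n) for c in set(labels))

def best_threshold(values, labels):
    """Return (threshold, information_gain) of the best split `value <= threshold`."""
    base = entropy(labels)
    ordered = sorted(set(values))
    best = (None, 0.0)
    # Candidate thresholds are midpoints between consecutive distinct values.
    for lo, hi in zip(ordered, ordered[1:]):
        t = (lo + hi) / 2
        left = [l for v, l in zip(values, labels) if v <= t]
        right = [l for v, l in zip(values, labels) if v > t]
        weighted = (len(left) * entropy(left) + len(right) * entropy(right)) / len(labels)
        gain = base - weighted
        if gain > best[1]:
            best = (t, gain)
    return best

# Hypothetical K2O contents (%) and glass types.
k2o = [0.0, 0.2, 0.5, 1.0, 7.5, 9.8, 12.3, 14.5]
glass_type = ["lead-barium"] * 4 + ["high-potassium"] * 4

threshold, gain = best_threshold(k2o, glass_type)
print(threshold, gain)
```

A full decision tree repeats this search over every feature at every node and then prunes the result. Only the single-node step is shown here.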

3.2. Subclassification model

In this paper, the data in the appendix, already separated by glass type, are subclassified by degree of weathering into severe weathering and general weathering. K-means clustering is run separately on the weathered high-potassium glass and on the weathered lead-barium glass, and the resulting clusters are then identified using the known severely weathered points in the attachment. The model is solved with python.
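The clustering step can be sketched with a small K-means written from scratch, following the procedure described earlier: random initial centroids, assignment by Euclidean distance, and iteration until the centroids stop changing. The function name `kmeans`, the seed, and the two-dimensional demo points are assumptions for illustration. The real data has many more components per sample.

```python
import random
from math import dist

def kmeans(points, k=2, seed=0, max_iter=100):
    """Plain K-means on a list of coordinate tuples.
    Returns (labels, centroids, sse)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # random initial centroids
    for _ in range(max_iter):
        # Assign every point to its nearest centroid (Euclidean distance).
        labels = [min(range(k), key=lambda j: dist(p, centroids[j])) for p in points]
        new_centroids = []
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:
                new_centroids.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                new_centroids.append(centroids[j])  # keep an empty cluster's centroid
        if new_centroids == centroids:  # centroids no longer change
            break
        centroids = new_centroids
    sse = sum(dist(p, centroids[l]) ** 2 for p, l in zip(points, labels))
    return labels, centroids, sse

# Hypothetical 2-component samples forming two well-separated groups.
samples = [(1.0, 2.0), (1.2, 1.8), (0.8, 2.2), (8.0, 9.0), (8.2, 9.1), (7.9, 8.8)]
labels, centroids, sse = kmeans(samples, k=2)
print(labels, centroids, round(sse, 4))
```

The cluster numbers themselves carry no meaning. As in the text, a cluster is named "severe weathering" only after checking which cluster the known severely weathered points fall into.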

3.3. Model testing and sensitivity analysis

3.3.1. Decision tree model inspection

The Kappa coefficient can be used to evaluate the accuracy of the results of supervised learning. In this paper, it is used to evaluate the accuracy of the classification results of the decision tree model.
$$Kappa=\frac{p_0-p_e}{1-p_e}$$
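Here p_0 is the observed proportion of samples on which the predicted and true classes agree, and p_e is the proportion of agreement expected by chance. The computed value Kappa = 0.8374 > 0.81 indicates that the classifications of the decision tree model are in almost perfect agreement with the true classes, that is, the accuracy is very high.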
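The Kappa formula can be checked with a few lines of code. The sketch below computes p_0 and p_e directly from two label lists. The function name `cohen_kappa` and the ten example labels are made up for illustration and are unrelated to the 0.8374 result reported above.

```python
from collections import Counter

def cohen_kappa(y_true, y_pred):
    """Kappa = (p0 - pe) / (1 - pe) for two equally long label sequences."""
    n = len(y_true)
    # p0: observed proportion of agreement.
    p0 = sum(t == p for t, p in zip(y_true, y_pred)) / n
    # pe: agreement expected by chance, from the marginal class frequencies.
    true_counts, pred_counts = Counter(y_true), Counter(y_pred)
    pe = sum(true_counts[c] * pred_counts[c] for c in true_counts) / n ** 2
    return (p0 - pe) / (1 - pe)

# Hypothetical labels: H = high-potassium, L = lead-barium.
y_true = ["H", "H", "H", "H", "L", "L", "L", "L", "L", "L"]
y_pred = ["H", "H", "H", "L", "L", "L", "L", "L", "L", "L"]
kappa = cohen_kappa(y_true, y_pred)
print(round(kappa, 4))
```

In this example p_0 = 0.9 and p_e = 0.54, so Kappa = 0.36 / 0.46, which is noticeably lower than the raw accuracy because chance agreement has been discounted.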

3.3.2. K-means model test

The elbow method is used to find the best k value for K-means clustering. As the number of clusters increases, the SSE decreases; the inflection point after which the decrease slows down marks the best k value.
[Figure: elbow method]
It can be seen from the figure that after K=2 the downward trend of the SSE gradually slows down, so the optimal number of subcategories for question 2 is 2. That is, the K value set in the model (severe weathering and general weathering) is reasonable.

3.3.3. Model sensitivity analysis

We applied a small perturbation to the original data, that is, a reasonable scaling, and compared the predictions on the perturbed data with those on the original data. The results are as follows:
[Table: sensitivity analysis]
In the table, X is the original data set. The predictions after reasonable scaling are almost identical to those on the original data, with an accuracy rate of 96.875%, indicating that the model is robust and accurate.
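The perturbation test can be sketched as follows: scale each input by a small random factor, classify again, and measure how often the prediction matches the unperturbed one. Everything here is an assumption for illustration, including the single-threshold `classify` rule, the 4.25 threshold, the ±5% scaling range, and the sample values. It is not the model that produced the 96.875% figure.

```python
import random

def classify(k2o, threshold=4.25):
    """Hypothetical one-node rule standing in for the trained model."""
    return "high-potassium" if k2o > threshold else "lead-barium"

def agreement_under_scaling(values, scale_range=0.05, seed=0):
    """Fraction of samples whose predicted class is unchanged after each
    value is multiplied by a random factor in [1 - scale_range, 1 + scale_range]."""
    rng = random.Random(seed)
    baseline = [classify(v) for v in values]
    perturbed = [classify(v * (1 + rng.uniform(-scale_range, scale_range))) for v in values]
    return sum(b == p for b, p in zip(baseline, perturbed)) / len(values)

# Hypothetical K2O contents (%), all far from the threshold.
k2o_values = [0.0, 0.2, 0.5, 1.0, 7.5, 9.8, 12.3, 14.5]
agreement = agreement_under_scaling(k2o_values)
print(agreement)
```

Samples close to the decision threshold are the ones that flip under perturbation, so an agreement rate below 100% points to where the model is least stable.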

4. Question 3

Question 3 asks us to classify the glass cultural relics of unknown category in Form 3. The solution builds on question 2 and has two steps: first, the decision tree model from question 2 is used to classify each unknown glass as high-potassium glass or lead-barium glass; then the K-means clustering algorithm is used to divide the results of the previous step into subcategories. Following the model of question 2, principal component analysis is applied again to identify the unknown glass categories, and K-means cluster analysis is then used with appropriate subcategory indicators as the criterion for the subcategory analysis. Solving the above model with python gives the classification results for the 8 samples of unknown glass cultural relics shown in the following table:
[Table: Question 3 results]
The idea of the sensitivity analysis is the same as in question 2, so I will not repeat it here. Interested readers can refer to the sensitivity analysis method of question 2 and apply it.

5. Question 4

Question 4 requires separating the samples into high-potassium glass and lead-barium glass and analyzing the relationships among their chemical components. This paper uses Pearson coefficients to construct a correlation coefficient matrix for analyzing these relationships. The flow chart of the overall approach to question 4 is as follows:
[Figure: question 4 flow chart]
We find that the correlations among the chemical components of high-potassium glass differ significantly from those among the chemical components of lead-barium glass. In addition, analyzing high-potassium glass and lead-barium glass separately by surface weathering shows that surface weathering also affects the correlations among chemical components. There are therefore significant differences in the chemical composition relationships between different categories.
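A Pearson correlation matrix of the kind used here can be built with the standard library alone. In this sketch the function names and the four-sample columns are invented for illustration. In practice one matrix would be computed per glass category (and per weathering state), and the matrices would then be compared.

```python
from math import sqrt
from statistics import mean

def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def correlation_matrix(columns):
    """Map each pair of component names to its Pearson coefficient."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}

# Hypothetical component contents (%) for one glass category.
high_potassium = {
    "SiO2": [60.0, 65.0, 70.0, 75.0],
    "K2O": [14.0, 12.0, 10.0, 8.0],
    "CaO": [5.0, 6.5, 5.5, 7.0],
}
matrix = correlation_matrix(high_potassium)
for name, row in matrix.items():
    print(name, {other: round(r, 3) for other, r in row.items()})
```

In the demo data SiO2 and K2O move in exactly opposite directions, so their coefficient is -1. Real measurements would give values strictly between -1 and 1.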

6. Complete thesis

Analysis and Identification of Chemical Composition of Glass Cultural Relics
