Analysis of real questions | 2021 National Competition Question B: Preparation of C4 Alkenes by Ethanol Coupling

1. Preparation

1.1 Topic background

C4 olefins are widely used in the production of chemical products and medicines, and ethanol is a raw material for producing them. During preparation, both the catalyst combination (that is, the combination of Co loading, Co/SiO2 and HAP charge ratio, and ethanol concentration) and the temperature affect the C4 olefin selectivity and the C4 olefin yield (see the appendix for terminology). It is therefore valuable to explore, through catalyst combination design, the process conditions for preparing C4 olefins by catalytic coupling of ethanol. A chemical laboratory ran a series of experiments on different catalysts at different temperatures; the results are given in Appendix 1 and Appendix 2.
Please complete the following questions through mathematical modeling:
(1) For each catalyst combination in Appendix 1, study how ethanol conversion and C4 olefin selectivity relate to temperature, and analyze the test results in Appendix 2, which were measured at different times during one experiment with a given catalyst combination at 350 degrees.
(2) Explore the effects of different catalyst combinations and temperatures on ethanol conversion and C4 olefin selectivity.
(3) Under the same experimental conditions, how should the catalyst combination and temperature be chosen to make the C4 olefin yield as high as possible? If the temperature is lower than 350 degrees, how should they be chosen to make the C4 olefin yield as high as possible?
(4) If five more experiments may be added, how should they be designed? Give detailed reasons.
Appendix: Explanation of terms and description of attachments
Temperature: Reaction temperature.
Selectivity: The proportion of a given product among all products.
Time: The reaction time of the catalyst in an ethanol atmosphere, in minutes (min).
Co loading: The weight ratio of Co to SiO2. For example, "Co loading is 1wt%" means that the weight ratio of Co to SiO2 is 1:100, which is written as "1wt%Co/SiO2", and so on.
HAP: A catalyst support, hydroxyapatite.
Co/SiO2 and HAP charge ratio: The mass ratio of Co/SiO2 to HAP. For example, the catalyst combination numbered A14 in Appendix 1, "33mg 1wt%Co/SiO2-67mg HAP-ethanol concentration 1.68ml/min", means that the mass ratio of Co/SiO2 to HAP is 33mg:67mg and that ethanol is fed at 1.68ml per minute, and so on.
Ethanol conversion rate: The single-pass conversion of ethanol per unit time. Its value is 100% × (ethanol fed - ethanol remaining) / ethanol fed.
C4 olefin yield: Its value is ethanol conversion × C4 olefin selectivity.
Attachment 1: Performance data sheet. In the table, ethylene, C4 olefins, acetaldehyde, aliphatic alcohols with 4 to 12 carbon atoms, etc. are the products of the reaction. The catalyst experiments numbered A1~A14 use charging mode I, and the catalyst experiments numbered B1~B7 use charging mode II.
Attachment 2: Test data for a given catalyst combination at 350°C.

1.2 Problem-solving tools

Language: Python 3.8

Environment: SPSSPRO Notebook

Download link: SPSSPRO Notebook (free online use, recommended)

2. Tutorial for Question 1

2.1 Overall problem-solving ideas

Question 1 is answered in two parts:

1》For each catalyst combination in Appendix 1, study the relationship between ethanol conversion rate, C4 olefin selectivity and temperature.

For the first sub-question, note that the catalyst combinations should be analyzed one by one. For catalyst combination A1, for example, a three-dimensional plot of ethanol conversion, C4 olefin selectivity and temperature can be drawn for spatial analysis, a two-dimensional plot can be drawn to examine how the values fluctuate, and quantitative indicators such as the Pearson correlation coefficient can then be used to describe how strongly the variables are correlated.

Annex 1 Data

2》Analyze the test results of a given catalyst combination at 350 degrees in Appendix 2 at different times in one experiment.

For the second sub-question, since the temperature is fixed, the simplest approach is to draw a two-dimensional plot of ethanol conversion and C4 olefin selectivity against time and then analyze how the two are related.

2.2 Problem-solving flow chart

The first problem-solving ideas

2.3  Detailed problem-solving steps

First, read the data in Attachment 1.

Attachment 1 data

Note that the catalyst combination column has blank cells, which need to be filled in before the data can be grouped and plotted in code.

Simply call fillna with the ffill method, that is, a forward fill that copies the last non-blank value downward.

filled data
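As a minimal sketch of the forward fill described above, assuming the attachment has been read into a pandas DataFrame in which the catalyst label appears only on the first row of each group (the column names and values here are made up for illustration and are not taken from Attachment 1):

```python
import pandas as pd

# Toy stand-in for Attachment 1: the catalyst label is present only on the
# first row of each group, as typically happens with merged spreadsheet cells.
df = pd.DataFrame({
    "catalyst_id": ["A1", None, None, "A2", None],  # hypothetical column name
    "temperature": [250, 275, 300, 250, 275],
})

# Forward fill: each missing label takes the last non-missing value above it.
df["catalyst_id"] = df["catalyst_id"].ffill()

print(df["catalyst_id"].tolist())
```

Calling `ffill()` on the column is equivalent to the fillna-with-ffill approach mentioned in the text.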

Then group the rows by catalyst combination, as described in 2.1, and draw the following for each group:

1》A three-dimensional plot of ethanol conversion, C4 olefin selectivity and temperature, for spatial analysis

2》A two-dimensional plot of ethanol conversion and C4 olefin selectivity against temperature, for fluctuation analysis

3》A Pearson correlation coefficient analysis of ethanol conversion, C4 olefin selectivity and temperature

SPSSPRO Notebook implementation code

volatility analysis

Space Analysis

Pearson correlation coefficient analysis of its correlation
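The per-catalyst Pearson analysis can be sketched as follows. This is only an illustration: the column names are hypothetical and the numbers are invented, not the measurements from Attachment 1.

```python
import pandas as pd

# Invented illustrative values, NOT the measurements from Attachment 1.
df = pd.DataFrame({
    "catalyst_id": ["A1"] * 5 + ["A2"] * 5,
    "temperature": [250, 275, 300, 325, 350] * 2,
    "conversion":  [2.0, 5.0, 14.0, 19.0, 36.0, 4.0, 8.0, 17.0, 30.0, 43.0],
    "selectivity": [34.0, 37.0, 46.0, 49.0, 47.0, 18.0, 17.0, 19.0, 26.0, 41.0],
})

cols = ["temperature", "conversion", "selectivity"]

# One 3x3 Pearson correlation matrix per catalyst combination.
corr_by_catalyst = {
    name: group[cols].corr(method="pearson")
    for name, group in df.groupby("catalyst_id")
}

for name, corr in corr_by_catalyst.items():
    print(name)
    print(corr.round(3))
```

Each matrix can then be read row by row, for example the temperature row shows how strongly conversion and selectivity move with temperature for that catalyst.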

You can also use the correlation analysis of SPSSPRO to drag in the data and generate an analysis report with one click.

Grouped by catalyst combination, all generated charts

3. Tutorial for Question 2

3.1  Overall problem-solving ideas

Explore the effects of different catalyst combinations and temperatures on ethanol conversion and C4 olefin selectivity.

Before starting on this question, first work out what a catalyst combination is. The screenshot below shows the catalyst combinations.

Catalyst combination

To solve the problem, we need to extract the components of these combinations from the attachment.

Y (dependent variable): ethanol conversion rate, C4 olefin selectivity (modeled separately, one regression for each)

X (independent variable): different catalyst combinations and temperatures

For the regression analysis, multiple linear regression can be used. Machine learning regression is less suitable here, because this question is about explaining how the variables affect the outcome and the results of linear regression are easier to interpret.

3.2  Problem-solving flow chart

Second problem solving ideas

3.3 Detailed problem-solving steps

The first step is to extract the components of each combination according to this rule (the attachment description gives hints).

The newly generated columns are:

co-sio2: mass of Co/SiO2, for example 200mg in the figure below;
Co loading: weight ratio of Co to SiO2, for example 1wt% in the figure below;
HAP: mass of HAP, for example 200mg in the figure below;
ethanol: ethanol feed rate, for example 1.68 ml per minute in the figure below.
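One way to do this extraction is a regular expression. The sketch below assumes every description follows the same layout as the A14 example quoted in the appendix; the pattern, the function name `parse_catalyst` and the key names are hypothetical (Python group names cannot contain hyphens, so `cosio2` stands in for the co-sio2 column). Descriptions with a different layout are rejected rather than guessed at.

```python
import re

# Pattern modeled on the A14 example string; an assumption about the layout.
PATTERN = re.compile(
    r"(?P<cosio2>\d+(?:\.\d+)?)\s*mg\s*"
    r"(?P<co_loading>\d+(?:\.\d+)?)\s*wt%\s*Co\s*/\s*SiO2\s*-\s*"
    r"(?P<hap>\d+(?:\.\d+)?)\s*mg\s*HAP\s*-\s*"
    r"ethanol concentration\s*(?P<ethanol>\d+(?:\.\d+)?)\s*ml/min"
)


def parse_catalyst(description):
    """Split one catalyst description into its four numeric components."""
    match = PATTERN.search(description)
    if match is None:
        raise ValueError(f"unrecognized catalyst description: {description!r}")
    return {key: float(value) for key, value in match.groupdict().items()}


example = "33mg 1wt%Co/SiO2-67mg HAP-ethanol concentration 1.68ml/min"
print(parse_catalyst(example))
```

Applying the function to every row of the catalyst column yields the four new numeric columns.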

After extraction, the data looks as shown below. Save it as an Excel file.

Run a linear regression in SPSSPRO with Y (dependent variable) and X (independent variables) set as described in 3.1.

The results are as follows:

SPSSPRO linear regression output result 1

The F test gives a significance P value of 0.000***, which is significant, so the null hypothesis that all regression coefficients are 0 is rejected and the model basically meets the requirements. As for collinearity, the VIF values of the variables Co-SiO2 and HAP are greater than 10, which indicates a collinear relationship. The remedy is to remove the collinear independent variables, or to run ridge regression or stepwise regression.
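For readers who want to check the VIF values outside SPSSPRO, here is a minimal NumPy sketch. It relies on the identity that the VIF of each variable is the corresponding diagonal entry of the inverse correlation matrix. The data is synthetic and only demonstrates the "greater than 10" rule; it is not the attachment data.

```python
import numpy as np


def vif(X):
    """Variance inflation factor of each column of X (rows are samples)."""
    corr = np.corrcoef(X, rowvar=False)
    # VIF_j = 1 / (1 - R_j^2), which equals the j-th diagonal of inv(corr).
    return np.diag(np.linalg.inv(corr))


rng = np.random.default_rng(0)
n = 200
a = rng.normal(size=n)
b = a + 0.05 * rng.normal(size=n)  # almost a copy of a, so a and b are collinear
c = rng.normal(size=n)             # generated independently of a and b

vifs = vif(np.column_stack([a, b, c]))
print(np.round(vifs, 1))
```

In this synthetic case the first two columns come out far above 10 and the third stays close to 1.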

Here we rerun the regression using ridge regression. Ridge regression is a biased-estimation regression method suited to small samples and to independent variables with multicollinearity (generally a VIF value greater than 10). It removes the effect of collinearity by adding a positive constant K to the diagonal of the normal equations. When K=0 it reduces to the ordinary least squares estimate. Because ridge regression is biased, K should be kept as small as possible. Ridge regression gives up the unbiasedness of ordinary least squares and loses part of the information, so the R² of the ridge regression equation is usually slightly lower than that of ordinary least squares; in exchange, the estimated partial regression coefficients are often closer to the true values, which improves the stability and reliability of the model and works well for fitting ill-conditioned data.
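The idea of adding K to the diagonal of the normal equations can be sketched in a few lines of NumPy. This is a textbook closed form on standardized inputs and synthetic data, not SPSSPRO's implementation, and the K scale used here need not match the K that SPSSPRO reports.

```python
import numpy as np


def ridge_coefficients(X, y, k):
    """Solve (X'X + k*I) beta = X'y on standardized X and centered y."""
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)
    yc = y - y.mean()
    p = Xs.shape[1]
    return np.linalg.solve(Xs.T @ Xs + k * np.eye(p), Xs.T @ yc)


# Synthetic data with two nearly collinear predictors.
rng = np.random.default_rng(1)
n = 50
x1 = rng.normal(size=n)
x2 = x1 + 0.05 * rng.normal(size=n)
X = np.column_stack([x1, x2])
y = 3.0 * x1 + 2.0 * x2 + rng.normal(size=n)

beta_ols = ridge_coefficients(X, y, k=0.0)    # k = 0 is ordinary least squares
beta_ridge = ridge_coefficients(X, y, k=5.0)  # k > 0 shrinks the coefficients

print("k=0:", np.round(beta_ols, 3))
print("k=5:", np.round(beta_ridge, 3))
```

With k=0 the result coincides with ordinary least squares; as k grows the coefficient vector shrinks, which is what stabilizes the estimates when predictors are collinear.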

Ridge regression description quoted from SPSSPRO

Let SPSSPRO choose the K value automatically; from the ridge trace plot and the variance inflation factor method, K=0.113. (K can be understood as a penalty coefficient.)

Ridge regression output from SPSSPRO 1

The figure above shows where the standardized coefficients of each independent variable become stable. After K is determined, check the regression results. The ridge regression gives a model F significance value of 0.000***, which is significant, so the null hypothesis is rejected: there is a regression relationship between the independent variables and the dependent variable. The standardized coefficients then quantify how the different catalyst combination variables and temperature affect the ethanol conversion rate.

The fitted model is: ethanol conversion rate (%) = 26.153 + 0.042 × Co-SiO2 + 0.032 × Co loading + 0.059 × HAP - 9.261 × ethanol

In the same way, the influence of different catalyst combinations and temperatures on C4 olefin selectivity can be explored.
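To see how the fitted formula is used, the sketch below evaluates it for the A14 combination from the appendix (33mg Co/SiO2, 1wt% Co loading, 67mg HAP, 1.68ml/min ethanol). It assumes the inputs are in the same units as the extracted columns (mg, wt%, ml/min), which the text does not state explicitly. The printed formula has no temperature term, so the function takes only the four variables shown. The function name is hypothetical.

```python
def predicted_conversion(cosio2_mg, co_loading_wt, hap_mg, ethanol_ml_min):
    """Ethanol conversion (%) from the ridge regression formula in the text."""
    return (
        26.153
        + 0.042 * cosio2_mg
        + 0.032 * co_loading_wt
        + 0.059 * hap_mg
        - 9.261 * ethanol_ml_min
    )


# A14 from the appendix: 33mg 1wt%Co/SiO2, 67mg HAP, ethanol 1.68ml/min.
value = predicted_conversion(33, 1, 67, 1.68)
print(round(value, 3))
```

The negative coefficient on ethanol means that, under this model, a higher ethanol feed rate lowers the predicted conversion.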

4. Tutorial for Question 3

4.1  Overall problem-solving ideas

How to choose catalyst combination and temperature to make the yield of C4 olefins as high as possible under the same experimental conditions. If the temperature is lower than 350 degrees, how to choose the catalyst combination and temperature to make the yield of C4 olefins as high as possible.

The core idea of Question 3 is to build a model from X to Y. Once its goodness of fit R² has been verified to reach an acceptable level, use it as a simulation model: holding the other related variables fixed (same experimental conditions), vary the catalyst combination and temperature to make the C4 olefin yield as high as possible.

The approach is similar to Question 2, but Question 3 can draw on more regression models. Question 2 focuses on analyzing the relationships between variables, so goodness of fit is not critical there; Question 3 is about simulation, so the goodness of fit R² is the key.

The first step is to determine X and Y,

Y (dependent variable): C4 olefin yield

X (independent variable): different catalyst combinations and temperature, and other relevant variables (need to be constructed as much as possible)

C4 Olefin Yield Calculation Method

Therefore, the only change from Question 2 is the dependent variable: replace Y (ethanol conversion) with ethanol conversion × C4 olefin selectivity (%).
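As a small worked example of the yield definition from the appendix, the sketch below computes the new dependent variable from the two existing columns. The function name is hypothetical, and dividing by 100 is an assumption made here so that the result stays in percent when both inputs are given in percent.

```python
def c4_yield(conversion_pct, selectivity_pct):
    """C4 olefin yield in percent, from conversion and selectivity in percent."""
    return conversion_pct * selectivity_pct / 100.0


# If 40% of the ethanol is converted and 50% of the products are C4 olefins,
# the yield is 20%.
print(c4_yield(40.0, 50.0))
```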

After the model is built, several algorithms can be compared and tuned to select the best model. Then generate candidate data and, holding the other related variables unchanged (same experimental conditions), vary the catalyst combination and temperature to make the C4 olefin yield as high as possible.

The second sub-question: if the temperature is lower than 350 degrees, how should the catalyst combination and temperature be chosen to make the C4 olefin yield as high as possible?

This is handled simply by restricting the temperature to below 350 degrees.

Machine Learning Regression in SPSSPRO

4.2  Problem-solving flow chart

4.3 Detailed problem-solving steps

Starting from the data extracted in 3.3, take the C4 olefin yield as Y (the dependent variable) and the catalyst combination components and temperature as the independent variables, and fit a machine learning model for prediction.

Split the data into a training set and a test set as follows

Heuristic algorithms such as genetic algorithms, particle swarm optimization (PSO) or simulated annealing can be used for hyperparameter optimization

SPSSPRO- hyperparameter optimization

After the model is trained, check its evaluation results. The R² on the test set is 0.821, which indicates a good fit, so the model's performance is acceptable (this was a single run without tuning).
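The split-then-score workflow can be sketched with NumPy alone. The text does not name the regressor that was trained in SPSSPRO, so a plain least-squares fit stands in for it here, and the data is synthetic. The helper names and the 70/30 split are assumptions for illustration.

```python
import numpy as np


def train_test_split_indices(n_samples, test_fraction, seed):
    """Shuffle row indices and split them into (train, test)."""
    order = np.random.default_rng(seed).permutation(n_samples)
    n_test = int(round(n_samples * test_fraction))
    return order[n_test:], order[:n_test]


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return 1.0 - ss_res / ss_tot


# Synthetic data standing in for (catalyst variables, temperature) -> yield.
rng = np.random.default_rng(0)
X = rng.uniform(size=(100, 3))
y = X @ np.array([1.5, -2.0, 0.5]) + 4.0 + 0.01 * rng.normal(size=100)

train_idx, test_idx = train_test_split_indices(len(y), test_fraction=0.3, seed=42)

# Stand-in model: least squares with an intercept, fitted on the training rows.
A_train = np.column_stack([np.ones(len(train_idx)), X[train_idx]])
coef, *_ = np.linalg.lstsq(A_train, y[train_idx], rcond=None)

# Score on the held-out rows only.
A_test = np.column_stack([np.ones(len(test_idx)), X[test_idx]])
test_r2 = r_squared(y[test_idx], A_test @ coef)
print(round(test_r2, 4))
```

The point of scoring on the held-out rows is that a high training R² alone says nothing about how well the model will predict unseen catalyst combinations.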

Next, generate a large amount of simulation data by enumerating the Cartesian product of the catalyst combination variables. The range of each variable is taken from the maximum and minimum values in the existing data, as shown below

The maximum and minimum range of each catalyst combination variable

Then use five nested for loops to generate the different combinations. For example, as shown below, stepping temperature, cosio2 and HAP by 1 and Co loading and ethanol by 0.1 generated 130,000 different catalyst combinations
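The five nested loops can be written more compactly with `itertools.product`. In the sketch below the bounds and steps are hypothetical placeholders chosen to keep the example small; in practice they would come from the minimum and maximum of each column, with the finer steps described above.

```python
from itertools import product


def frange(start, stop, step, digits=1):
    """Inclusive float range built from integer counts to avoid float drift."""
    count = int(round((stop - start) / step)) + 1
    return [round(start + i * step, digits) for i in range(count)]


# Hypothetical bounds and coarse steps, for illustration only.
temperature = list(range(250, 401, 50))  # 250, 300, 350, 400
cosio2 = list(range(50, 201, 50))        # mg
hap = list(range(50, 201, 50))           # mg
co_loading = frange(0.5, 2.0, 0.5)       # wt%
ethanol = frange(0.3, 2.1, 0.6)          # ml/min

# Cartesian product: one tuple per candidate experiment.
grid = list(product(temperature, cosio2, hap, co_loading, ethanol))
print(len(grid), grid[0])
```

The grid size is the product of the five list lengths, so halving a step roughly doubles the number of candidates along that axis.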

These catalyst combinations (X) are then fed into the model for prediction, and the predictions are sorted in descending order. The catalyst combination with the highest predicted C4 olefin yield is the optimal result.

For the second sub-question, keep only the rows whose temperature is less than or equal to 350 degrees and take the best of those as the optimal result.
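The ranking and the temperature filter can be sketched as below. The predictor is a made-up stand-in so that the example runs on its own; in practice it would be the trained model's prediction function. The filter keeps temperatures up to and including the cap, as the text above does; switch to a strict comparison if the question's "lower than 350 degrees" is read literally.

```python
def rank_candidates(candidates, predict, max_temperature=None):
    """Sort candidates by predicted yield, highest first, with optional cap."""
    if max_temperature is not None:
        candidates = [c for c in candidates if c["temperature"] <= max_temperature]
    return sorted(candidates, key=predict, reverse=True)


def toy_predict(candidate):
    """Made-up stand-in for the trained model's predicted C4 olefin yield."""
    return 0.1 * candidate["temperature"] - 5.0 * candidate["ethanol"]


# Hypothetical candidates; only two variables are shown to keep it short.
candidates = [
    {"temperature": 300, "ethanol": 0.9},
    {"temperature": 400, "ethanol": 0.9},
    {"temperature": 350, "ethanol": 0.3},
    {"temperature": 350, "ethanol": 2.1},
]

best_overall = rank_candidates(candidates, toy_predict)[0]
best_capped = rank_candidates(candidates, toy_predict, max_temperature=350)[0]
print(best_overall, best_capped)
```

Filtering before sorting guarantees that the capped answer is the best combination among those that satisfy the temperature limit, not just the first one that happens to appear in the overall ranking.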

5. Tutorial for Question 4

If it is allowed to add 5 more experiments, how should it be designed and give detailed reasons.

The fourth question is actually a supplement to the third question.

As in Question 3, feed the candidate catalyst combinations (X) into the model for prediction and sort the predictions in descending order. The five combinations with the highest predicted C4 olefin yield are the proposed five additional experiments.

6. Code acquisition

All of the code and question data above can be obtained for free by following the SPSSPRO community account [follow Huanhuan to play digital simulation]:

Get code + questions + data for free

This post took real effort to write. If you found it useful, please like, bookmark and follow.

Origin blog.csdn.net/weixin_44099072/article/details/126402684