Correlation Analysis of Pyrolysis Yield Using a Linear Regression Model

Abstract: This paper examines how the pyrolysis combination and its mixing ratio affect the yields of the pyrolysis products (tar, water, coke residue, and syngas) in the context of catalytic pyrolysis. The data were first preprocessed by checking for outliers and missing values. Descriptive statistics were then used to gain an initial understanding of the data: the mean, range, and standard deviation of the pyrolysis products were calculated for each combination to reveal how strongly the mixing ratio influences yield. Correlation analysis and a linear regression model were then used to establish the relationship between mixing ratio and yield, to interpret the correlation, and to give the functional expressions. Finally, formulas and graphs were used to analyze the linear trend between product yield and mixing ratio for each pyrolysis combination. The study provides data analysis and modeling support for understanding the impact of the mixing ratio on pyrolysis product yield.


Introduction
As living standards rise, the search for renewable energy materials has become a global focus. Xinjiang is a region with abundant cotton stalk production, and its cotton planting area accounts for more than 80% of the country's total planting area. [1] The cellulose, lignin and other biomass components of cotton stalk have attracted much attention as a renewable energy source, and using biomass for energy has gradually become a trend. [2] Sun Zhiao [3] found that cotton stalk had better comprehensive combustion performance and lower ignition and burnout temperatures. Liu Simeng [4] found that hydrothermal oxidation treatment could produce a higher fixed carbon content after pyrolysis of cotton stalk. Zhao Jiaxing [5], in research on improving the quality of desulfurization ash and biomass pyrolysis products, found that desulfurization ash promoted the formation of small and medium molecular compounds during pyrolysis, increasing the yields of pyrolysis water and pyrolysis gas while significantly decreasing the yield of pyrolysis oil.
To investigate further the relationship between pyrolysis combinations and pyrolysis products, this paper builds on this earlier research and examines how changing the mixing ratio of a pyrolysis combination affects the yields of tar, water, coke residue and syngas. Using the data provided by https://www.nmmcm.org.cn/index/, descriptive statistical analysis was first carried out on the mean and standard deviation of each pyrolysis combination, and scatter plots were drawn to observe how the mixing ratio influences the pyrolysis products and whether the influence is significant. Correlation analysis and a linear regression model were then established, with the mixing ratio as the independent variable and the yield of each pyrolysis product as the dependent variable. The relationship between mixing ratio and product yield, and the yields at different mixing ratios, were analyzed quantitatively to obtain the corresponding influencing factors and patterns.

Establishment and Solution Based on the Linear Regression Model

Data Pre-Processing
The data were first preprocessed by checking for abnormal values and missing values. The three tables in the data set all give the yields of the decomposition products of a pyrolysis combination, expressed in percent, so the yields of the products at each mixing ratio should sum to 100. As can be seen from Table 1, there are no missing values in the sample data.
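The two checks described above (no missing entries, and product yields summing to 100 at each mixing ratio) can be sketched as follows. This is only an illustration: the yield values, the mixing-ratio labels such as "10/100", the tolerance, and the function names are all hypothetical and are not taken from the paper's tables.

```python
# Hypothetical yields in percent (illustrative values, NOT the paper's data).
# Keys are hypothetical mixing-ratio labels; None marks a missing entry.
PRODUCTS = ("tar", "water", "coke_residue", "syngas")
rows = {
    "10/100": {"tar": 17.25, "water": 30.10, "coke_residue": 29.65, "syngas": 23.00},
    "20/100": {"tar": 15.80, "water": 31.02, "coke_residue": 29.90, "syngas": 23.28},
    "30/100": {"tar": 14.10, "water": None, "coke_residue": 30.05, "syngas": 24.00},
    "40/100": {"tar": 13.00, "water": 32.50, "coke_residue": 30.20, "syngas": 25.30},
}


def find_missing(rows):
    """Return (ratio, product) pairs whose value is absent (None)."""
    return [
        (ratio, product)
        for ratio, row in rows.items()
        for product in PRODUCTS
        if row.get(product) is None
    ]


def find_bad_totals(rows, tol=0.05):
    """Return ratios whose rows are complete but do not sum to 100 within tol."""
    bad = []
    for ratio, row in rows.items():
        values = [row.get(product) for product in PRODUCTS]
        if any(v is None for v in values):
            continue  # incomplete rows are reported by find_missing instead
        if abs(sum(values) - 100.0) > tol:
            bad.append(ratio)
    return bad


print("missing:", find_missing(rows))
print("totals not equal to 100:", find_bad_totals(rows))
```

A small tolerance is used because yields rounded to two decimal places will rarely sum to exactly 100.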
Outliers, that is, values that differ from the rest of the sample data, were identified by inspection. Most values are reported to two decimal places, whereas some are integers or have only one decimal place. Because outliers account for a small proportion of the total sample, they were corrected to keep the data consistent; the corrected values are marked in red in Tables 2 and 3. After the outliers and missing values had been checked and corrected, descriptive statistics were computed for the three groups of data to understand the general distribution. The mean, range and standard deviation of each pyrolysis product were calculated, and some of the results are shown in Table 4. The mean of each product gives a rough picture of the average yield level. A large range indicates that a product is strongly affected by the mixing ratio: the ranges of tar yield in the three tables are 7.33, 10.86 and 9.87 respectively, all of which are large, indicating that tar yield is significantly affected by the mixing ratio of the pyrolysis combination. The standard deviation reflects the dispersion of a data set. In the DFA/CE table, the standard deviations of water yield and tar yield are large, indicating high dispersion and large differences in yield across mixing ratios, while in the DFA/CS table the standard deviation of coke residue yield is only 0.246, showing that this product is essentially unaffected by the mixing ratio.
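As a minimal sketch of the three descriptive statistics used here, the snippet below computes the mean, range and standard deviation of one product's yields. The input values are invented for illustration, and the use of the sample standard deviation (divisor n - 1) is an assumption, since the text does not say which form was used.

```python
import statistics

# Hypothetical yields of one product at five mixing ratios (NOT the paper's data).
yields = [10.0, 12.0, 14.0, 16.0, 18.0]


def describe(values):
    """Return mean, range and sample standard deviation, rounded to 2 decimals."""
    return {
        "mean": round(statistics.mean(values), 2),
        "range": round(max(values) - min(values), 2),
        # Assumption: sample standard deviation; use statistics.pstdev
        # instead if the population form is intended.
        "std": round(statistics.stdev(values), 2),
    }


summary = describe(yields)
print(summary)
```

A product with a large range and standard deviation relative to its mean is one whose yield changes noticeably with the mixing ratio, which is how the text reads Table 4.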

Model Establishment
The descriptive statistics give a rough picture of which products are sensitive to the mixing ratio. To show these relationships more intuitively, scatter plots were drawn for the different pyrolysis combinations, as shown in Figures 1, 2a and 2b. Figure 1 shows that desulfurization ash has a significant effect on tar yield, while the yields of the other pyrolysis products rise only slightly and remain relatively stable overall, indicating that, apart from its effect on tar, desulfurization ash plays a minor role in the pyrolysis of cotton stalk. In the desulfurization ash/cellulose combination, by contrast, desulfurization ash promotes the pyrolysis reaction relatively strongly, as can be read directly from the yields of the decomposition products. The tar yield increased with the mixing ratio, whereas the water yield of the desulfurization ash/cellulose combination was only slightly affected by it.
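The pyrolysis of desulfurization ash/lignin showed that the yields of tar, water, coke residue and syngas were not affected by the mixing ratio. The analysis above gives a preliminary indication of whether desulfurization ash promotes the pyrolysis of cotton stalk, cellulose and lignin. A correlation analysis was then carried out, using the Pearson correlation coefficient to describe the relationship with the mixing ratio. The modeling process is as follows.

Suppose there exist numbers $\lambda_1, \lambda_2, \cdots, \lambda_k$, not all zero, such that

$$\lambda_1 x_1 + \lambda_2 x_2 + \cdots + \lambda_k x_k + \mu = 0$$

holds, where $\mu$ is a random error term. When $\mu = 0$, there is complete collinearity between the explanatory variables; otherwise, it is called incomplete collinearity. Explanatory variables that have no significant impact on the dependent variable can be identified from the size of the semi-partial correlation coefficient. Consider the following two linear regression models, the second of which omits $x_1$:

$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_k x_k + \mu$$

$$y = \beta_0 + \beta_2 x_2 + \cdots + \beta_k x_k + \mu$$

Their coefficients of determination are $R^2$ and $R_1^2$ respectively, and the relationship between the two is

$$R^2 = R_1^2 + r_{y(1 \cdot 2 \cdots k)}^2$$

where $r_{y(1 \cdot 2 \cdots k)}$ is the semi-partial correlation coefficient between $y$ and $x_1$. The correlation analysis model can be obtained from the above formula.

Since the data studied are continuous variables, the Pearson correlation coefficient is used for the correlation test. Because $X$ and $Y$ fluctuate on their own scales, the covariance alone cannot fully show the correlation between the two variables, so the covariance is standardized to obtain the Pearson correlation coefficient. For $m$ paired observations $(x_i, y_i)$ with means $\bar{x}$ and $\bar{y}$ and sample standard deviations $s_X$ and $s_Y$, the steps are as follows:

$$\operatorname{cov}(X, Y) = \frac{1}{m-1} \sum_{i=1}^{m} (x_i - \bar{x})(y_i - \bar{y})$$

$$r = \frac{\operatorname{cov}(X, Y)}{s_X s_Y} = \frac{\sum_{i=1}^{m} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{m} (x_i - \bar{x})^2} \sqrt{\sum_{i=1}^{m} (y_i - \bar{y})^2}}$$

In addition, to obtain the relationship between the yield of the pyrolysis products and the mixing ratio more accurately, a linear regression model was constructed to find the functional expression of product yield for each pyrolysis combination. The Pearson correlation coefficients are 0.969, 0.979, 0.905, 0.949, 0.968, 0.968 and 0.959 respectively. The lowest exceeds 0.9, so the following multiple linear regression model is set up:

$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_n x_n + \varepsilon$$

where the independent variables $x_1, x_2, \cdots, x_n$ are the predictors, $y$ is the dependent variable, $\varepsilon$ is the random error, and $\beta = (\beta_0, \beta_1, \cdots, \beta_n)$ is the vector of parameters to be estimated. $\beta$ can be estimated by the least squares method, using the objective

$$Q(\beta) = \sum_{i=1}^{m} \left( y_i - \beta_0 - \beta_1 x_{i1} - \cdots - \beta_n x_{in} \right)^2$$

Finding the $\beta$ that minimizes $Q(\beta)$ gives the least squares estimate of $\beta$:

$$\hat{\beta} = (X^{T} X)^{-1} X^{T} y$$

where $X$ is the $m \times (n+1)$ matrix whose first column is all ones and whose remaining columns hold the observed values of the predictors.

Linear regression equations are usually verified by the goodness of fit $R^2$, analysis of variance (F test) and the t test. $R$ is the correlation coefficient between the predicted and observed values; it represents how well the model explains the observed data, and it is generally required to exceed 0.85. The expression for $R$ is as follows:

$$R = \sqrt{1 - \frac{\sum_{i=1}^{m} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{m} (y_i - \bar{y})^2}}$$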
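The following is a minimal sketch of the three computations named in this section: the Pearson correlation coefficient, a least squares fit, and the goodness of fit. It assumes a single predictor (the mixing ratio), which is what the slope and intercept pairs reported later suggest; the data values and function names are hypothetical and are not the authors' implementation.

```python
import math


def _centered_sums(x, y):
    """Return (Sxy, Sxx, Syy): sums of centered cross-products and squares."""
    m = len(x)
    mean_x, mean_y = sum(x) / m, sum(y) / m
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy, sxx, syy


def pearson(x, y):
    """Pearson r: the covariance standardized by both standard deviations."""
    sxy, sxx, syy = _centered_sums(x, y)
    return sxy / math.sqrt(sxx * syy)


def fit_line(x, y):
    """Least squares slope and intercept for y = intercept + slope * x."""
    sxy, sxx, _ = _centered_sums(x, y)
    slope = sxy / sxx
    intercept = sum(y) / len(y) - slope * sum(x) / len(x)
    return slope, intercept


def r_squared(x, y, slope, intercept):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_y = sum(y) / len(y)
    ss_res = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - mean_y) ** 2 for b in y)
    return 1.0 - ss_res / ss_tot


# Hypothetical mixing ratios and yields (NOT the paper's data).
mixing_ratio = [0.1, 0.2, 0.3, 0.4, 0.5]
product_yield = [12.1, 13.9, 16.2, 17.8, 20.0]

r = pearson(mixing_ratio, product_yield)
slope, intercept = fit_line(mixing_ratio, product_yield)
r2 = r_squared(mixing_ratio, product_yield, slope, intercept)
print(f"r = {r:.3f}, y = {slope:.2f} x + {intercept:.2f}, R^2 = {r2:.3f}")
```

With one predictor and an intercept, the coefficient of determination equals the square of the Pearson coefficient, so a Pearson value above 0.9 already implies a close linear fit.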

Model Solving and Analysis
Pearson correlation coefficients were computed to represent the relationship between the yields of the pyrolysis products (tar, water, coke residue, syngas) and the mixing ratio of the corresponding pyrolysis combination, and correlation diagrams were drawn for the desulfurization ash/cotton stalk, desulfurization ash/cellulose and desulfurization ash/lignin combinations and their pyrolysis products at different mixing ratios. Figure 3 takes the desulfurization ash/cotton stalk combination as an example. It shows that, apart from the weaker correlation between tar and the mixing ratio, the correlations are high overall, and the Pearson coefficient obtained is 0.983. The corresponding functional expressions were then obtained from the linear regression model, and line plots of product yield against mixing ratio were drawn; some of them are shown in Figures 4a and 4b. The fitted expressions are listed in Table 5. For the DFA/CE (cellulose) pyrolysis products, the regression coefficient (slope) relating tar yield to the desulfurization fly ash mixing ratio is 9.38, and the intercept is 37.51. This indicates that tar yield tends to increase as the desulfurization fly ash mixing ratio increases.
For the DFA/LG (lignin) pyrolysis products, the regression coefficient (slope) relating tar yield to the desulfurization fly ash mixing ratio is -8.36, and the intercept is 15.38. This indicates that tar yield tends to decrease as the desulfurization fly ash mixing ratio increases.
In both graphs, the actual data points are shown in blue and the lines predicted by the model in red. For DFA/CE, the model predicts that tar yield increases as the proportion of desulfurized fly ash increases. For DFA/LG, the model predicts that tar yield decreases as the proportion of desulfurized fly ash increases.
These results show that desulfurized fly ash has different catalytic effects on the pyrolysis of cellulose and lignin. In particular, desulfurized fly ash seems to promote the formation of tar from cellulose pyrolysis while inhibiting the formation of tar from lignin pyrolysis. Table 5 makes the relationship between product yield and the mixing ratio of each pyrolysis combination explicit, which further explains the results of the correlation analysis and reduces their uncertainty. Overall, the mixing ratio of the pyrolysis combination has a significant impact on the tar yield of desulfurization ash/cotton stalk, the tar yield of desulfurization ash/cellulose, and the water yield.
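To make the reported expressions concrete, the sketch below evaluates the two fitted tar-yield lines using the slopes and intercepts stated in the text. The mixing-ratio value of 0.5 is a hypothetical input chosen for illustration, since the scale on which the mixing ratio is expressed is not given here, and the function names are likewise assumptions.

```python
# Slope and intercept of the fitted tar-yield lines, as reported in the text.
TAR_MODELS = {
    "DFA/CE": {"slope": 9.38, "intercept": 37.51},
    "DFA/LG": {"slope": -8.36, "intercept": 15.38},
}


def predict_tar_yield(combination, mixing_ratio):
    """Evaluate yield = intercept + slope * mixing_ratio for one combination."""
    model = TAR_MODELS[combination]
    return model["intercept"] + model["slope"] * mixing_ratio


def trend(combination):
    """Describe the direction of the fitted line from the sign of its slope."""
    slope = TAR_MODELS[combination]["slope"]
    if slope > 0:
        return "increasing"
    if slope < 0:
        return "decreasing"
    return "flat"


# Hypothetical mixing ratio; the unit/scale is an assumption.
ratio = 0.5
for name in TAR_MODELS:
    print(f"{name}: {trend(name)}, predicted tar yield = "
          f"{predict_tar_yield(name, ratio):.2f}")
```

The sign of the slope alone determines whether the text describes the combination as promoting or inhibiting tar formation; the intercept only sets the level of the line.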

Conclusion
"Carbon peaking" and "carbon neutrality" were first included in the government work report during the 2021 Two Sessions. Vigorously developing new energy and seeking more renewable energy sources have become important tasks for China's sustainable development strategy. In this context, this paper studied the impact of the mixing ratio of pyrolysis combinations on the yields of the pyrolysis products. Using descriptive statistics such as the mean and standard deviation, together with scatter plots of the pyrolysis products, we observed that the desulfurization ash/cotton stalk combination has a significant inhibitory effect on tar yield, whereas the desulfurization ash/cellulose combination has a significant promoting effect on tar yield. A deeper correlation analysis then produced a correlation heatmap, which showed a high correlation between tar yield and the catalyst mixing ratio, with a Pearson coefficient of 0.983. A linear regression model was then established to obtain the functional expressions relating each pyrolysis combination to its product yields. Four relationships show an upward trend: water yield versus the DFA/CS mixing ratio, coke residue yield versus the DFA/CS mixing ratio, tar yield versus the DFA/CE mixing ratio, and water yield versus the mixing ratio. The mixing ratio of the pyrolysis combination has a significant impact on the tar yield of desulfurization ash/cotton stalk, the tar yield of desulfurization ash/cellulose, and the water yield.

Figure 1. Yield of decomposition products from DFA/CS pyrolysis

Figure 2a. Yield of DFA/CE pyrolysis decomposition products
Figure 2b. Yield of DFA/LG pyrolysis decomposition products

Figure 4a. Linear graph of tar production and different mixing ratios of desulfurization ash/cellulose
Figure 4b. Linear graph of tar yield and different mixing ratios of desulfurized ash/cotton stalk

Table 1. Total yield of some products: yield of decomposition products from DFA/CS pyrolysis, wt.% (daf)

Table 4. Mean value, range and standard deviation of the pyrolysis products

Table 5. Function expressions for each pyrolysis combination and pyrolysis product