Research on Glass Classification and Identification Based on Cluster Analysis and Multiple Regression Models

Abstract: During burial, ancient glass exchanges large quantities of its internal elements with elements from the environment. Its composition changes as a result, which affects the correct judgement of its category. To classify the types of ancient glass, this paper takes the glass type as the dependent variable and the grain, the presence or absence of weathering, and the chemical composition as the independent variables, so as to form an interpretable classification rule. To sub-classify glass of the same type, a chemical component is selected and the colour of the glass grain is then observed at that content. To identify the type of an unknown glass, a multiple regression model is built on the chemical composition and glass type data, and the type is identified from the model output. The models established in this paper have passed the stability test.


Introduction
Ancient glass is highly susceptible to weathering under the influence of the burial environment. During weathering, its internal elements are exchanged with environmental elements in large quantities, which changes the proportions of its composition and thus affects the correct judgement of its category. It is therefore important to investigate how different chemical components and their contents influence the weathering of glass surfaces and the degree of weathering [1][2].

Model solution
The pre-processed data were imported into SPSSPRO for decision tree analysis, which gives the proportion of importance of each variable, as shown in Table 1.
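The importance proportions produced by a decision tree are commonly built from the impurity decrease achieved at each split. The exact criterion used by SPSSPRO is not stated here, so the following is only a minimal sketch, assuming Gini impurity and a single candidate split. The PbO contents, the cut point and the function names are hypothetical and serve only as illustration.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def impurity_decrease(values, labels, cut):
    """Weighted Gini decrease obtained by splitting the samples at value <= cut."""
    left = [y for x, y in zip(values, labels) if x <= cut]
    right = [y for x, y in zip(values, labels) if x > cut]
    n = len(labels)
    weighted = len(left) / n * gini(left) + len(right) / n * gini(right)
    return gini(labels) - weighted


# Hypothetical PbO contents (%) and glass types, for illustration only.
pbo = [0.0, 0.3, 1.1, 0.5, 22.0, 35.4, 28.7, 41.2]
kind = ["high-potassium"] * 4 + ["lead-barium"] * 4

# Hypothetical cut point; a tree learner would search over candidate cuts.
decrease = impurity_decrease(pbo, kind, cut=5.0)
```

In this toy data the split separates the two types completely, so the decrease equals the impurity of the unsplit sample. A variable that produces large decreases of this kind would receive a large share of the importance.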

Analysis of results
The importance proportions show that lead oxide is far more important than the other compounds, which yields a classification rule that separates high-potassium glass from lead-barium glass.

Within each type, the chemical composition was sub-classified by cluster analysis, taking the mean ± standard deviation as the classification interval and dividing the samples into n categories. Table 2 shows the subclassification rules based on the lead-barium glass compounds (due to space limitations, only the rules for some of the compounds are shown) [3]. Table 3 shows the subclassification rules based on the high-potassium glass compounds (again, only the rules for some of the compounds are shown) [4]. The results of the lead-barium glass classification are shown in Table 5 (due to space constraints, only the first ten artefacts are shown for the clustered categories).

Cluster analysis is a multivariate statistical method that classifies samples or indicators on the principle of grouping like with like. Its object is usually a large number of samples, which can be reasonably classified according to their respective characteristics.

In the cluster analysis, the classification interval of the samples is calculated from the mean and the standard deviation, because the standard deviation reflects the dispersion of the samples: the larger the standard deviation, the larger the gap between samples, and hence the greater the sensitivity. For example, the lead-barium glass results show that the standard deviation of silica is generally larger than that of the other chemical components, which indicates that silica is the more sensitive component. The sensitivity of the other chemical components can be assessed in the same way.
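The mean ± standard deviation interval can be illustrated with a short sketch. It assumes the sample standard deviation and three categories (below, inside and above the interval). The text does not specify either choice, and the silica values and function names are hypothetical.

```python
from statistics import mean, stdev


def interval_bounds(values):
    """Return (mean - sd, mean + sd), using the sample standard deviation."""
    m, s = mean(values), stdev(values)
    return m - s, m + s


def assign_class(value, bounds):
    """Place a value below, inside, or above the mean ± sd interval."""
    low, high = bounds
    if value < low:
        return "low"
    if value > high:
        return "high"
    return "medium"


# Hypothetical silica contents (%), for illustration only.
sio2 = [20.0, 30.0, 40.0, 50.0, 60.0]
bounds = interval_bounds(sio2)
classes = [assign_class(v, bounds) for v in sio2]
```

A component with a larger standard deviation gets a wider interval, which is one way to read the remark that a larger standard deviation indicates a larger gap between samples.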

Identification Modelling and Analysis
Assume that the glass type is the dependent variable $m$, that the contents of silica, Na2O, K2O, CaO, MgO, Al2O3, FeO, CuO, PbO, BaO, P2O5, strontium oxide, SnO and sulphur dioxide are the independent variables $h_1, h_2, h_3, \ldots, h_{14}$, that the regression coefficients are $\beta_1, \beta_2, \ldots, \beta_{14}$, and that the constant term is $\beta_0$. The multiple linear regression equation is then set as:

$$m = \beta_0 + \beta_1 h_1 + \beta_2 h_2 + \cdots + \beta_{14} h_{14}$$

Regression analysis of the data in SPSSPRO yielded regression coefficients of 0.007, 0.051, -0.03, 0.004, 0.033, 0.033, 0.016, 0.029, 0.019, 0.034, 0.015, 0.097, 0.002 and 0.016. Substituting these regression coefficients into the regression equation, in the order in which the variables are listed, yields:

$$m = \beta_0 + 0.007 h_1 + 0.051 h_2 - 0.03 h_3 + 0.004 h_4 + 0.033 h_5 + 0.033 h_6 + 0.016 h_7 + 0.029 h_8 + 0.019 h_9 + 0.034 h_{10} + 0.015 h_{11} + 0.097 h_{12} + 0.002 h_{13} + 0.016 h_{14}$$

The analysis gives $R^2 = 0.852$, which shows that the regression model fits the data well and can be used. Figure 1 shows the fit of this regression equation.

The average of the results of the quantitative analyses of high-potassium and lead-barium glass is taken as the threshold value: samples below the threshold are classified as high-potassium, and samples above the threshold are classified as lead-barium. That is, when $m < 1.5$ the artefact is high-potassium glass, and when $m > 1.5$ the artefact is lead-barium glass.
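The regression equation and the goodness-of-fit measure can be sketched as follows. The coefficients are the fourteen values reported in the text, assumed to follow the order in which the variables are listed. The constant term is not given numerically, so `intercept` is a required argument rather than a fixed value. The function names are hypothetical.

```python
# Regression coefficients as reported in the text, assumed to be in the
# same order as the fourteen independent variables.
COEFFS = [0.007, 0.051, -0.03, 0.004, 0.033, 0.033, 0.016,
          0.029, 0.019, 0.034, 0.015, 0.097, 0.002, 0.016]


def predict_m(h, intercept):
    """Evaluate m = intercept + sum(coefficient_i * h_i) for one sample."""
    if len(h) != len(COEFFS):
        raise ValueError("expected 14 component contents")
    return intercept + sum(b * x for b, x in zip(COEFFS, h))


def r_squared(observed, fitted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((y - f) ** 2 for y, f in zip(observed, fitted))
    ss_tot = sum((y - mean_obs) ** 2 for y in observed)
    return 1.0 - ss_res / ss_tot
```

For example, `predict_m(sample, intercept=b0)` returns the score for one artefact once a fitted constant term `b0` is available, and `r_squared` compares the coded glass types with the fitted scores.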
The data were predicted with this model, and the predictions are shown in Table 6, which includes an entry labelled lead-barium with a value of 1.455. Note: After the covariance of some samples of the model was tested, the model's predictions were found to deviate slightly from the facts; according to the analysis of the chemical composition of A8, its type should be lead-barium.
Sensitivity analysis of the classification results: Because the model's predictions deviate somewhat from the true values, the classification may disagree with reality for samples whose predicted values lie close to the threshold. The sensitivity of the results is therefore moderate.
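The threshold rule and its sensitivity near the boundary can be illustrated with a minimal sketch. The threshold 1.5 and the value 1.455 come from the text. The handling of a score exactly equal to the threshold is an assumption, since the text defines only the cases m < 1.5 and m > 1.5. The function name is hypothetical.

```python
THRESHOLD = 1.5  # threshold value stated in the text


def classify(m, threshold=THRESHOLD):
    """Map a regression score to a glass type using the threshold rule."""
    if m < threshold:
        return "high-potassium"
    if m > threshold:
        return "lead-barium"
    # Assumption: the text does not define the case m == threshold.
    return "undetermined"


# The value 1.455 appears with the prediction results in the text.
score = 1.455
label = classify(score)
margin = abs(score - THRESHOLD)  # distance from the decision boundary
```

A score of 1.455 falls on the high-potassium side of the boundary by only 0.045, so a small prediction error is enough to change the assigned type.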

Conclusions
Ancient glass is highly susceptible to weathering under the influence of the burial environment. During weathering, its internal elements are exchanged with environmental elements in large quantities, which changes the proportions of its composition and thus affects the correct judgement of its category. It is therefore of great research value to investigate how different chemical components and their contents influence whether the surface of a glass article weathers and to what degree. The cluster analysis model can be used to address this problem. The grain, the presence or absence of weathering, and the chemical composition can be used to construct as many features as possible for each glass type, so as to form an interpretable classification rule. For subclassification, the glass types are classified first, then a chemical component is selected and the colour of the glass grain is observed at that content, and finally the subclassification is made.

Figure 1. Plot of the fitted effect of the regression equation

Table 1. Proportion of importance of chemical composition

Table 3. Subclassification rules for high-potassium glass compounds
The results of the classification of the high-potassium glass are shown in Table 4 below (due to space constraints, only the first eight artefacts are shown for the clustered categories).
Note: ***, ** and * represent the 1%, 5% and 10% significance levels, respectively.

Table 4. Results of the classification of high-potassium glasses

Table 5. Results of lead-barium glass classification

Table 6. Map of type prediction results