Composition analysis of ancient glass products based on logistic regression and principal component analysis

Abstract: To analyze the composition of ancient glass products, this paper preprocesses the data and uses statistical charts to qualitatively analyze the relationship between weathering and the color, type and decoration of the cultural relics, and then establishes a Logistic regression model for quantitative analysis. The relics are divided into four classes according to glass type and whether they are weathered. Through an evaluation model based on principal component analysis, the statistical law relating weathering to the chemical composition of excavated glass relics is derived from the comprehensive score range of each class. A multiple linear regression model is established to predict the component content before weathering. The correlations and differences between the chemical components of different kinds of glass relics are then analyzed: the correlation coefficients of high-potassium glass and lead-barium glass are computed separately, and correlation coefficient heat maps are drawn to determine the relationships between the chemical components and the differences between the two glass types.


Introduction
The Silk Road was a channel for cultural exchange between China and the West in ancient times, and glass is valuable evidence of early trade. From the 10th to the 12th century AD, the number of glass cultural relics in China reached its peak, and the glass relics unearthed from this period are widely distributed and culturally distinctive [1]. Ancient Chinese glass products are similar in appearance to foreign glass products, but their chemical composition is different. The main chemical component of glass is silicon dioxide (SiO2). However, the melting point of pure quartz sand is high, so flux and stabilizer must be added during refining to reduce the melting temperature. The main chemical composition of the glass differs according to the flux that is added [2]. Limestone added during refining acts as a stabilizer and helps prevent weathering, but the chemical composition ratio of ancient glass still changes after weathering [3]. Ancient glass is susceptible to weathering under the influence of the burial environment.
In the process of weathering, the elements inside the glass are exchanged extensively with elements in the environment, which changes the composition ratio and makes it harder to judge the category of the glass correctly. At different degrees of weathering, both the element content inside the glass and the color of the glass differ.
The purpose of this paper is to explore the relationship between weathering and the color, type and decoration of glass cultural relics by means of data preprocessing and statistical charts, and at the same time to classify four types of cultural relics with a logistic regression model. Through an evaluation model based on principal component analysis, the comprehensive score range of each class of relics is calculated, so as to understand the weathering condition and the law governing chemical composition content. A multiple linear regression model is also established, in which the maximum number of principal components is determined from the explained variance and the cumulative projection importance. The composition of the principal components is then obtained from the component matrix table, and the importance of the variables is obtained from the factor load coefficient table. Finally, the normalized formula for the degree of weathering is obtained by partial least squares regression [4], and the component content before weathering is predicted from the monitoring point data. On this basis, correlation coefficient analysis [5] and heat maps are used to analyze the correlations and differences between the chemical components of different types of glass cultural relics. Through this analysis, we hope to support the protection and restoration of cultural relics and to provide a reliable scientific basis for their identification, restoration and preservation with modern scientific and technological means.

Prediction model based on Logistic regression
In the chemical composition data of the classified glass relics, the proportions of all components should in theory sum to 100%, but because of the detection method and other factors the sum may deviate from 100%. Samples whose chemical components sum to between 85% and 105% are treated as valid data [6]. Samples whose sum falls outside this range are regarded as abnormal and are eliminated. A null value in the data indicates that the component was not detected and is treated as 0.
The relationship between glass decoration and weathering can be seen in Figure 1. Glass relics with decoration B are more easily weathered than those with decoration A or decoration C, and the weathering conditions of relics with decoration A and decoration C are similar. Figure 2 shows the relationship between glass type and weathering. Lead-barium glass is more susceptible to weathering than high-potassium glass: the proportion of weathered samples is higher among lead-barium glass than among high-potassium glass. Figure 3 shows the relationship between glass color and weathering. Black relics are almost all weathered, while dark blue and green relics are almost all unweathered. The unweathered proportion of green is higher than that of dark green and light green, and the unweathered proportion of light green is higher than that of dark green. Dark blue has a higher unweathered proportion than purple, light blue and turquoise.
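The validity rule described above (undetected components set to 0, component sums kept only between 85% and 105%) can be sketched as follows. This is a minimal illustration, not the authors' code: the function names and the two sample records are invented, and treating the interval as inclusive at both ends is an assumption.

```python
# Minimal sketch of the preprocessing rule; names and sample values are
# hypothetical and do not come from the paper's data set.

def clean_sample(sample):
    """Replace undetected components (None) with 0.0."""
    return {name: (0.0 if value is None else float(value))
            for name, value in sample.items()}

def is_valid(sample, low=85.0, high=105.0):
    """Keep a sample only if its component percentages sum to [low, high].

    Treating both bounds as inclusive is an assumption.
    """
    return low <= sum(sample.values()) <= high

def preprocess(samples):
    """Fill undetected components, then drop samples with abnormal sums."""
    cleaned = [clean_sample(s) for s in samples]
    return [s for s in cleaned if is_valid(s)]

# Invented records: component name -> percentage, None = not detected.
samples = [
    {"SiO2": 70.0, "K2O": 10.0, "CaO": 6.0, "Al2O3": 4.0, "PbO": None},  # sum 90
    {"SiO2": 36.0, "K2O": None, "CaO": 2.0, "Al2O3": 6.0, "PbO": 27.0},  # sum 71
]
valid = preprocess(samples)
print(len(valid), "of", len(samples), "samples kept")
```

Filling nulls before summing matters here, because a missing component would otherwise make the sum undefined rather than merely low.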
A Logistic regression model was established to find the quantitative relationship between glass weathering and glass type, decoration and color. First, the relationship between weathering and a single variable is considered, giving one Logistic equation each for glass type, decoration and color. Then the relationship between weathering and two variables is considered, that is, whether the single variables interact in their effect on weathering. This gives three Logistic equations: weathering against glass type and decoration; weathering against glass type and color; and weathering against decoration and color. Finally, a Logistic equation of weathering against glass type, decoration and color together is established. A total of seven Logistic models were obtained, as shown in Table 1.
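The seven models correspond to the seven non-empty subsets of the three categorical variables. The sketch below enumerates those subsets and fits a plain logistic regression to each one with gradient descent. It is an illustration under stated assumptions: the eight records are invented toy data, one-hot encoding and the training settings are our choices, and accuracy is measured on the training data only.

```python
import math
from itertools import combinations

FEATURES = ("type", "decoration", "color")

# Invented toy records for illustration only (not the paper's data).
ROWS = [
    {"type": "lead-barium", "decoration": "A", "color": "black", "weathered": 1},
    {"type": "lead-barium", "decoration": "B", "color": "light blue", "weathered": 1},
    {"type": "lead-barium", "decoration": "C", "color": "dark green", "weathered": 1},
    {"type": "lead-barium", "decoration": "A", "color": "dark blue", "weathered": 0},
    {"type": "high-potassium", "decoration": "B", "color": "light blue", "weathered": 1},
    {"type": "high-potassium", "decoration": "A", "color": "dark blue", "weathered": 0},
    {"type": "high-potassium", "decoration": "C", "color": "green", "weathered": 0},
    {"type": "high-potassium", "decoration": "C", "color": "dark blue", "weathered": 0},
]

def feature_subsets(features=FEATURES):
    """All non-empty subsets: 3 single + 3 pairwise + 1 full = 7 models."""
    return [subset
            for size in range(1, len(features) + 1)
            for subset in combinations(features, size)]

def one_hot(rows, subset):
    """Encode the chosen categorical columns as 0/1 indicator vectors."""
    levels = [(f, v) for f in subset for v in sorted({r[f] for r in rows})]
    return [[1.0 if r[f] == v else 0.0 for f, v in levels] for r in rows]

def fit_logistic(X, y, lr=0.5, epochs=3000):
    """Batch gradient descent on the mean logistic log-loss."""
    w, b, n = [0.0] * len(X[0]), 0.0, len(X)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for xi, yi in zip(X, y):
            z = b + sum(wj * xj for wj, xj in zip(w, xi))
            err = 1.0 / (1.0 + math.exp(-z)) - yi  # predicted probability - label
            grad_b += err
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
        b -= lr * grad_b / n
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
    return w, b

def accuracy(X, y, w, b):
    """Share of samples whose predicted class (z > 0) matches the label."""
    hits = sum((b + sum(wj * xj for wj, xj in zip(w, xi)) > 0) == bool(yi)
               for xi, yi in zip(X, y))
    return hits / len(y)

y = [r["weathered"] for r in ROWS]
results = {}
for subset in feature_subsets():
    X = one_hot(ROWS, subset)
    w, b = fit_logistic(X, y)
    results[subset] = accuracy(X, y, w, b)

for subset, acc in results.items():
    print(" + ".join(subset), "->", acc)
```

Comparing the per-subset accuracies in this way mirrors the comparison the paper reports, although the numbers produced by the toy data carry no meaning for the real relics.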
As shown in Figure 4, the type of glass has the greatest influence on the predicted weathering probability of a glass cultural relic. The Logistic equation based on glass type, color and decoration together predicted weathering with the highest accuracy.

Evaluation model based on principal component analysis
According to glass type and weathering state, the cultural relics are divided into four classes: weathered high-potassium glass, unweathered high-potassium glass, weathered lead-barium glass, and unweathered lead-barium glass. Assume that there are $m$ variables for principal component analysis and a total of $n$ evaluation objects, and let $x_{ij}$ be the value of the $j$th index of the $i$th evaluation object. Each index value is standardized using $\bar{x}_j$, the sample mean of the $j$th indicator, and $s_j$, the standard deviation of the $j$th indicator, as shown in Equation (1).
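A minimal sketch of this standardization, together with the correlation matrix that is computed from the standardized data in the next step, is given below. The usual z-score form $(x_{ij} - \bar{x}_j)/s_j$ with the sample ($n-1$) standard deviation is assumed, since the equation itself is not reproduced in the text; the function names and the small data matrix are invented.

```python
import math

def standardize(data):
    """Column-wise z-scores (x_ij - mean_j) / s_j.

    The sample standard deviation (divisor n - 1) is an assumption.
    """
    n, m = len(data), len(data[0])
    means = [sum(row[j] for row in data) / n for j in range(m)]
    stds = [math.sqrt(sum((row[j] - means[j]) ** 2 for row in data) / (n - 1))
            for j in range(m)]
    return [[(row[j] - means[j]) / stds[j] for j in range(m)] for row in data]

def correlation_matrix(z):
    """Correlation matrix of already standardized columns."""
    n, m = len(z), len(z[0])
    return [[sum(row[i] * row[j] for row in z) / (n - 1) for j in range(m)]
            for i in range(m)]

# Invented matrix: rows are evaluation objects, columns are indicators.
data = [
    [60.0, 10.0, 5.0],
    [65.0, 12.0, 4.0],
    [70.0, 14.0, 3.0],
]
Z = standardize(data)
R = correlation_matrix(Z)
for row in R:
    print([round(value, 3) for value in row])
```

In this toy matrix the second column rises with the first and the third falls, so the off-diagonal entries of `R` sit at the two extremes of the correlation scale.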
Calculate the correlation coefficient matrix $R$, where $R_{ij}$ is the correlation coefficient between the $i$th index and the $j$th index, as shown in equation (2). Then $n$ ($n \le m$) principal components are selected to calculate the comprehensive evaluation value. The information contribution rate and cumulative contribution rate of the eigenvalues $\lambda_j$ ($j = 1, 2, \ldots, m$) are calculated, as shown in equation (5). The information contribution rate of principal component $y_i$ is calculated as shown in equation (6). By analyzing the explained variance, the cumulative contribution rate is calculated, and the number of principal components $n$ is determined according to the principle that the cumulative contribution rate reaches 85%. By calculating and analyzing the principal component load coefficient matrix $G$, the importance of the hidden variables in each principal component can be obtained. The component matrix $H$ of each of the four types of glass is calculated and analyzed, as shown in equation (7). The coefficients in the component matrix are linear combination coefficients, from which the principal component scoring formula and the total scoring formula are obtained.
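The 85% rule can be illustrated with a short sketch. It assumes the common definition in which the contribution rate of an eigenvalue is its share of the sum of all eigenvalues; the eigenvalues below are invented, and computing them from $R$ is left to any standard linear algebra routine.

```python
def contribution_rates(eigenvalues):
    """Share of total variance carried by each eigenvalue."""
    total = sum(eigenvalues)
    return [value / total for value in eigenvalues]

def select_components(eigenvalues, threshold=0.85):
    """Smallest number of leading components whose cumulative
    contribution rate reaches the threshold."""
    rates = contribution_rates(sorted(eigenvalues, reverse=True))
    cumulative = 0.0
    for count, rate in enumerate(rates, start=1):
        cumulative += rate
        if cumulative >= threshold - 1e-12:  # tolerance for float rounding
            return count, cumulative
    return len(rates), cumulative

# Hypothetical eigenvalues of a 5 x 5 correlation matrix.
eigenvalues = [2.5, 1.5, 0.6, 0.3, 0.1]
n_components, cumulative_rate = select_components(eigenvalues)
print(n_components, round(cumulative_rate, 2))
```

With these invented eigenvalues the first two components explain 80% of the variance, so a third component is needed to reach the threshold.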
The principal component weight is $\alpha_j$, as shown in equation (8). Here $F_i$ is the score of the $i$th principal component, $b_j$ is the linear combination coefficient, and $Q$ is the content of each chemical component of a cultural relic; $F$ is the total score of a cultural relic, and $a_i$ is the weight of the $i$th principal component. The characteristic score ranges of weathered high-potassium glass, unweathered high-potassium glass, weathered lead-barium glass, and unweathered lead-barium glass are calculated, as shown in Table 2. From the monitoring points, the content of each chemical component before weathering was obtained for all cultural relics, and the distribution of component content in the three kinds of glass relics was predicted, as shown in Figure 5. As the degree of weathering increases, the contents of PbO and BaO increase significantly; the contents of K2O, CaO, CuO, P2O5, SrO and SO2 also increase; the contents of Na2O, MgO, Al2O3 and Fe2O3 decrease; and the content of SiO2 decreases significantly.
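The scoring step can be sketched as two weighted sums: each principal component score is a linear combination of the component contents, and the total score is a weighted sum of the principal component scores. The coefficients, weights and contents below are invented placeholders, and the weights are passed in directly because the weight formula of equation (8) is not reproduced in the text.

```python
def component_scores(coefficients, contents):
    """F_i = sum_j b_ij * Q_j for each principal component i."""
    return [sum(b * q for b, q in zip(row, contents)) for row in coefficients]

def total_score(weights, scores):
    """F = sum_i a_i * F_i."""
    return sum(a * f for a, f in zip(weights, scores))

# Hypothetical values: two principal components over two standardized contents.
B = [[0.5, 0.5],
     [0.5, -0.5]]          # linear combination coefficients (one row per component)
Q = [2.0, 1.0]             # contents of one relic
weights = [0.8, 0.2]       # principal component weights

scores = component_scores(B, Q)
F = total_score(weights, scores)
print(scores, round(F, 3))
```

A relic would then be assigned to the class whose characteristic score range contains its total score.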

Multiple linear regression prediction model
Pearson correlation analysis was adopted, as shown in equation (15). The Pearson correlation coefficient $r$ between two chemical components $x$ and $y$ measures the degree of linear correlation between them.
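For reference, the standard definition of the Pearson coefficient is given below; we assume that equation (15) takes this form. Here $x_i$ and $y_i$ are the contents of the two components in sample $i$, $\bar{x}$ and $\bar{y}$ are their means, and the sum runs over all samples of one glass type (the sample count is written $N$, a symbol chosen here for illustration).

```latex
r = \frac{\sum_{i=1}^{N} (x_i - \bar{x})(y_i - \bar{y})}
         {\sqrt{\sum_{i=1}^{N} (x_i - \bar{x})^2}\,
          \sqrt{\sum_{i=1}^{N} (y_i - \bar{y})^2}},
\qquad -1 \le r \le 1 .
```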
First, the data in the integrated table are divided into two categories according to glass type, and the correlation coefficient matrix between the compounds is obtained for each category. A two-dimensional heat map of the correlation coefficients is drawn from each matrix. The darker the color, the closer the correlation coefficient is to 1 and the stronger the positive correlation; the lighter the color, the closer the correlation coefficient is to -1 and the stronger the negative correlation. Intermediate colors correspond to values near 0, where the correlation is weak. An absolute correlation coefficient greater than 0.9 is used as the criterion for a strong correlation. By analyzing these figures, the correlations between the chemical components of the different types of glass samples were obtained, and the correlation coefficients of the two types were compared.
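The pairwise screening described above can be sketched as follows: compute the Pearson coefficient for every pair of components within one glass type and keep the pairs whose absolute value exceeds 0.9. The component columns are invented toy values and the function names are our own.

```python
import math
from itertools import combinations

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def strong_pairs(columns, threshold=0.9):
    """Component pairs whose |r| exceeds the threshold, with their r."""
    result = {}
    for a, b in combinations(sorted(columns), 2):
        r = pearson(columns[a], columns[b])
        if abs(r) > threshold:
            result[(a, b)] = r
    return result

# Invented contents (percent) for four samples of one glass type.
columns = {
    "SiO2": [60.0, 65.0, 70.0, 75.0],
    "PbO": [30.0, 26.0, 21.0, 15.0],  # falls as SiO2 rises
    "CaO": [2.0, 5.0, 1.0, 4.0],      # no clear trend
}
strong = strong_pairs(columns)
for pair, r in strong.items():
    print(pair, round(r, 3))
```

Running the same screening separately on the high-potassium and lead-barium samples gives the two sets of strongly correlated pairs that the heat maps are compared on.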
Comparing Figure 6 and Figure 7 shows that the color shades in Figure 6 are clearly divided, while those in Figure 7 are more evenly distributed. This indicates that there are more connections between the chemical components of lead-barium glass and fewer connections between the chemical components of high-potassium glass.
The differences in chemical composition correlation between the two types of glass are as follows: 1. There is a strong positive correlation between alumina and silica in lead-barium glass, but a strong negative correlation between alumina and silica in high-potassium glass. 2. There is a strong negative correlation between strontium oxide and phosphorus pentoxide in lead-barium glass, but a strong positive correlation between strontium oxide and phosphorus pentoxide in high-potassium glass. 3. In general, the correlations between the chemical components of lead-barium glass are mostly stronger than those between the chemical components of high-potassium glass. However, the small number of high-potassium glass samples may weaken the measured correlations for that type, so this third difference is less pronounced than the first two.

Conclusions
In this paper, principal component analysis eliminates the correlation between the chemical components and reduces the workload of selecting indicators, while the correlation coefficient method gives the correlation between chemical components more directly. The paper combines several analysis methods, namely data preprocessing, statistical charts, a logistic regression model, principal component analysis and a multiple linear regression model, to study glass relics from different angles. The logistic regression model was used to classify four types of cultural relics. The principal component analysis evaluation model was used to determine the comprehensive score range of each class of relics. The correlations and differences among the chemical components of different types of glass relics were analyzed through correlation coefficients and heat maps, providing a reference for the identification, restoration and preservation of cultural relics. The model can be applied to various types of cultural relics, so the results are broadly applicable and of practical value. However, the data sources and the sample size in this paper are limited, which may affect some results. Although several methods are used to build the model, the correlation coefficient method cannot detect strong nonlinear relationships, and the model itself carries uncertainty, so it cannot accurately predict the color, type, chemical composition and other related properties of cultural relics on its own; it needs to be combined with field observation and the judgment of professionals.
The model presented in this paper can be applied to the protection, restoration and reconstruction of glass cultural relics; with a larger number of samples, it can give a more comprehensive understanding of the degree of weathering and the chemical composition of cultural relics. The model can be applied to the actual identification, restoration and preservation of cultural relics.

Figure 1: The proportional relationship between glass ornamentation and weathering.

Figure 2: Relationship between glass type and weathering ratio.

Figure 3: The relationship between glass color and weathering ratio.

Calculating the eigenvalues of the correlation coefficient matrix and the corresponding characteristic vectors gives $m$ new indicator variables. As shown in equation (4), $y_1$ is the first principal component, $y_2$ is the second principal component, and $y_m$ is the $m$th principal component.

matrix, the sample number $N$ is introduced to reduce the influence of content differences between different samples, and a constant term is included. As shown in equation (13), $k_{ij}$ represents the content of the $j$th chemical component at the $i$th monitoring point. As shown in equation (14), $k'_{ij}$ represents the predicted content of the $j$th chemical component at the $i$th monitoring point.

Figure 5: Predicted composition content of 3 kinds of cultural glass

Figure 6: Lead barium correlation coefficient two-dimensional heat map

Table 1 :