The Impact of National Language Proficiency on Personal Life Satisfaction—An Empirical Analysis Based on the 2017 China General Social Survey

Abstract: This paper examines the relationship between Mandarin proficiency and life satisfaction in China using data from the 2017 China General Social Survey. Drawing on the literature on language and well-being, we hypothesize that Mandarin proficiency is positively associated with life satisfaction and that this relationship may be moderated by factors such as gender, region, urban-rural status, and household income. We employed multiple regression analyses to test our hypotheses, controlling for various demographic and socio-economic variables. The results show a positive and significant relationship between Mandarin proficiency and life satisfaction and support the moderating effects of gender, region, urban-rural status, and household income. Our findings have implications for language policy and suggest that promoting Mandarin proficiency can help improve individuals' life satisfaction and well-being in China.


Introduction
China's economic growth, spurred by policy reforms and liberalization, has improved living standards and shifted attention toward spiritual prosperity. Personal life satisfaction, which is shaped by both internal and external factors, has accordingly become more important. Language, specifically proficiency in Mandarin as highlighted by Chinese law, is a key variable in quality of life, affecting socio-economic status and life satisfaction [4,2,16,10,9,12]. However, the relationship between language skills and personal life satisfaction remains under-explored. This paper examines whether national language proficiency contributes to life satisfaction, considering factors such as gender, age, marital status, education, the urban-rural divide, income, and social equity. The results indicate a positive, robust relationship between national language proficiency and life satisfaction, which is more pronounced for female and urban residents.
Life satisfaction and its determinants, particularly language proficiency in multilingual societies like China, have been a focal point of research. Drawing on the China General Social Survey (CGSS) 2017, we investigated the relationship between Mandarin proficiency and life satisfaction. Two hypotheses were tested, positing a positive relationship between Mandarin proficiency and life satisfaction, and variations across different subgroups. Our results, despite potential limitations from self-reported data and cross-sectional design, confirm these hypotheses and underscore the role of language proficiency in subjective well-being, with implications for language policy and social wellbeing initiatives in China. The study's dataset was obtained from the China General Social Survey (CGSS) cross-sectional data for 2017. This is one of China's most systematic and reliable databases for understanding the history of the contemporary socio-economic system and the lives of its citizens. The survey contained 12,582 valid questionnaires for 2017; after further eliminating invalid and abnormally low values, the final dataset had 3,793 valid samples.

Description of data sources and variables
The dependent variable in this study is the respondent's satisfaction with their personal life, a measure of how satisfied people are with their lives. Consistent with the widely used social utility framework, satisfaction is a common outcome in socio-economic models. In the CGSS, the respondent is asked: "Overall, how satisfied are you with your current overall life situation?" The core explanatory variable is the level of proficiency in the national language. According to the "Standard of Mandarin Proficiency Test (for Trial Implementation)" issued by the State Language Commission, the main criteria for determining the level of Mandarin proficiency are pronunciation and use of free expressions, vocabulary and grammar, intonation, and completeness and fluency of content. In lay terms, national language proficiency refers primarily to the ability to express oneself orally (i.e., speaking). Therefore, we drew on two survey questions about national language proficiency, which measure the respondent's oral capability from "cannot speak at all" (1) to "very fluent" (5). This self-rated measure is commonly used in the applied economics, sociology, and related literature because it reflects the respondent's language proficiency in a relatively intuitive, comprehensive, and effective way.
For the control variables, demographic, economic, and social-environment factors were taken into account. The main control variables used in this study were gender, age, marital status, education level, urban-rural residence, income, and a social equity index [1,3,7,8,11,6]. The descriptive statistics of the variables are summarized in Table 1, while more detailed results are provided in Appendix 1. As shown in the table, the average personal satisfaction indicator for 2017 was 4.86, between "not really satisfied" and "quite satisfied". This suggests that the respondents' knowledge of the national language still has considerable room for improvement.
The sample had slightly more women than men, an average age of 50.8, and more than 76.2% of respondents were married. The average education level of respondents was between secondary and university level, 63.6% had lived in a city for a long time, and their self-identified level of social equity and family economic status was low (mean score of 2.5).

Models and methods
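As the dependent variable, life satisfaction, is an ordered discrete variable, this paper used an ordered probit model [13][14][15]. The baseline equation is

\[
\text{happiness}_i^{*} = \alpha + \beta\,\text{mandarin}_i + \gamma X_i + \varepsilon_i ,
\]

where \(\text{happiness}_i^{*}\) is the latent life satisfaction of respondent \(i\), \(\text{mandarin}_i\) is national language proficiency, \(X_i\) is the vector of control variables, and \(\varepsilon_i\) is a standard normal error term. The observed outcome is linked to the latent variable through the cut points \(r_j\):

\[
\text{happiness}_i = j \quad \text{if } r_{j-1} < \text{happiness}_i^{*} \le r_j , \qquad j = 1, 2, \dots, J ,
\]

with \(r_0 = -\infty\) and \(r_J = +\infty\). Based on the above equation, the likelihood contribution of the \(i\)th observation falling in category \(j\) can be written as

\[
P(\text{happiness}_i = j) = \Phi\!\left(r_j - \alpha - \beta\,\text{mandarin}_i - \gamma X_i\right) - \Phi\!\left(r_{j-1} - \alpha - \beta\,\text{mandarin}_i - \gamma X_i\right),
\]

where \(\Phi\) is the standard normal distribution function. Finally, the coefficients \(\alpha\), \(\beta\), \(\gamma\) and the cut points \(r_j\) (\(j < J\)) of the ordered Probit model are obtained by maximum likelihood estimation.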
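To make the ordered Probit likelihood concrete, the following is a minimal sketch of how category probabilities and the log-likelihood are computed from a linear index and a set of cut points. The cut points, the coefficient value, and the five-category outcome are hypothetical choices for illustration, not estimates or settings from this study, and the controls are omitted.

```python
import math

def norm_cdf(x):
    """Standard normal distribution function Phi(x)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def category_probs(index, cutpoints):
    """P(y = j), j = 1..J, given a linear index and J-1 increasing cut points."""
    bounds = [-math.inf] + list(cutpoints) + [math.inf]
    cdf = [norm_cdf(b - index) for b in bounds]
    return [cdf[j + 1] - cdf[j] for j in range(len(bounds) - 1)]

def log_likelihood(ys, indices, cutpoints):
    """Sum of log P(y_i = observed category); categories are coded 1..J."""
    return sum(
        math.log(category_probs(idx, cutpoints)[y - 1])
        for y, idx in zip(ys, indices)
    )

# Hypothetical values for illustration only (not estimates from the paper).
cutpoints = [-1.5, -0.5, 0.5, 1.5]   # J = 5 outcome categories
beta = 0.3                           # assumed coefficient on mandarin

probs_low = category_probs(beta * 1, cutpoints)   # mandarin = 1
probs_high = category_probs(beta * 5, cutpoints)  # mandarin = 5
```

With a positive coefficient, a higher proficiency score shifts probability mass toward the top satisfaction category, which is the pattern the sign of the estimated coefficient is meant to capture.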
To ensure that the regression results are realistic and reliable, we set dummy variables for personal life satisfaction and national language level. For personal satisfaction, '0' was given to scores 1-3, representing a low level of personal life satisfaction, and '1' to scores 4-7, indicating a high level of personal life satisfaction. For the national language level, '0' was given to scores 1-2, indicating low national language proficiency, and '1' to scores 3-5, representing high national language proficiency. An extended regression analysis was then conducted using a binary Probit model, with all control variables retained, to test the robustness of the baseline regression results.
Because empirical research may suffer from omitted variables and measurement error, the baseline regressions may also have endogeneity problems. For example, the influence of factors such as social circles, lifestyle habits, and differences in personality traits on personal satisfaction may be overlooked, resulting in omitted variables in the baseline regression. Different groups may also use different value dimensions when evaluating their satisfaction with their quality of life. As a result, there is often a degree of variation in how people understand, perceive, and judge specific satisfaction values, leading to measurement error. In addition, individuals with relatively high levels of life satisfaction may be more mature and confident and more likely to express their thoughts and feelings to those around them, which increases their national language proficiency; this reverse causality would bias the regressions.
To overcome these potential endogeneity issues, the regression analysis was followed by an instrumental variables approach to correct for the endogenous relationship between proficiency in the national language and life satisfaction. Table 2 reports the ordered Probit model regression results (columns 1-6). Frijters pointed out that, provided the model is set up correctly, neither OLS estimation nor ordered Probit estimation has a clear advantage over the other. Since the coefficients of the OLS regression are more intuitive and easier to interpret, they were also computed (columns 7-8) and compared with the ordered Probit results. The coefficients on national language proficiency in each equation were positive and significant at the 1% level, indicating a positive relationship between national language proficiency and personal life satisfaction.
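The recoding described above can be sketched as a pair of small helper functions. The function names are hypothetical; the cut-offs follow the thresholds stated in the preceding paragraph (life satisfaction 4-7 coded as high, national language level 3-5 coded as high).

```python
def to_dummy(score, threshold):
    """Return 1 if score is at or above the threshold, otherwise 0."""
    return 1 if score >= threshold else 0

def high_satisfaction(score):
    """Life satisfaction: scores 1-3 -> 0 (low), scores 4-7 -> 1 (high)."""
    return to_dummy(score, 4)

def high_mandarin(score):
    """National language level: scores 1-2 -> 0 (low), scores 3-5 -> 1 (high)."""
    return to_dummy(score, 3)

# Recode a few illustrative (made-up) responses.
satisfaction_dummies = [high_satisfaction(s) for s in [1, 3, 4, 7]]
mandarin_dummies = [high_mandarin(m) for m in [1, 2, 3, 5]]
```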

Baseline regression
Among the control variables, the regression coefficient on gender is significantly negative, indicating that personal life satisfaction differs significantly between men and women. The coefficient on age is significantly positive, indicating that personal life satisfaction rises with age. The coefficient for urban and rural areas was significant and negative, indicating that rural residents generally have higher personal life satisfaction than their urban counterparts. The results suggest that socio-cultural and geographical factors have a considerable and robust influence on people's life satisfaction.
For economic factors, the higher the income, the higher the self-reported personal life satisfaction. This suggests that material wealth has a significant positive effect on an individual's subjective life satisfaction. For social factors, the more equitable the social environment is, the higher the life satisfaction of the people. This means that a good and healthy human and social environment also tends to positively affect personal life satisfaction.
Note: ***, **, and * indicate significance at the 1%, 5%, and 10% levels, respectively. The t-values are in parentheses. Constant terms are omitted.

Robustness tests
First, columns (1) and (2) of Table 3 show the Probit regression results, with national language proficiency and subjective life satisfaction set as dummy variables; scores 1-3 were given the value '0', and scores 4-5 were given the value '1'. Consistent with the previous ordered Probit regression, the coefficient for national common language proficiency was significant and positive, which suggests that national common language proficiency helps to increase one's sense of life satisfaction.
Since listening ability can also represent an individual's national language proficiency to some extent, we replaced the core explanatory variable with the respondent's national language listening level and again ran ordered Probit and OLS regressions. Compared with the results in Table 2, the results in columns (2)-(3) of Table 3 show that the coefficient on national language listening level is still positive and significant at the 1% level. Notably, the coefficient on listening level is significantly larger than the coefficient on speaking level, indicating that listening ability has a greater impact on individual life satisfaction than the ability to express oneself verbally. The reason may be that, in daily work, study, and interaction with others, being able to listen and understand matters more than being able to express oneself, and that "not being able to understand" is more likely to affect one's psychological state and subjective feelings. Comparing Tables 2 and 3, the results from the different methods and the substitution of the explanatory variable confirm that national language proficiency is positively related to subjective life satisfaction.
As previously mentioned, endogeneity problems can arise between national language proficiency and individual life satisfaction due to omitted variables, measurement error, and reverse causality. One of the more effective ways of addressing endogeneity is the instrumental variables approach, of which the most common form is two-stage least squares applied to continuous variables. However, since the dependent variable (subjective life satisfaction) is an ordered discrete variable, the conditional mixed process (hereafter "CMP") proposed by Roodman was used to estimate the instrumental variable model for national language proficiency and life satisfaction. This method has been widely used in the academic community. Building on seemingly unrelated regression, CMP constructs a recursive system of equations and estimates it by maximum likelihood. The estimation can be viewed in two stages. In the first stage, the correlation between the instrumental variables and the core explanatory variable (i.e., national general language proficiency) is estimated. In the second stage, the results of the first stage are substituted into the model, and the exogeneity of national general language proficiency is tested. If the endogeneity test parameter is significantly different from zero, the model has an endogeneity problem, and the CMP estimates are more accurate; otherwise, the baseline model does not have an endogeneity problem, and the ordered probit estimates are reliable. Before performing CMP estimation, a suitable instrumental variable should be found. The selected instrumental variables have to satisfy three conditions: (1) correlated with national general language proficiency; (2) not correlated with subjective life satisfaction; and (3) not correlated with other explanatory variables and error terms.
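The recursive system described above can be written out as follows. This is a sketch under stated assumptions rather than the exact specification estimated here: \(Z_i\) (the instruments), \(\pi\), \(\delta\), \(u_i\), \(\sigma_u\), and \(\rho\) are notation introduced for illustration, the first stage is assumed to be linear, and the link between the latent and the observed satisfaction score is the usual ordered Probit cut-point rule.

```latex
\begin{aligned}
\text{mandarin}_i &= \pi Z_i + \delta X_i + u_i
  && \text{(first stage)} \\
\text{happiness}_i^{*} &= \alpha + \beta\,\text{mandarin}_i + \gamma X_i + \varepsilon_i
  && \text{(second stage, ordered Probit)} \\
\begin{pmatrix} u_i \\ \varepsilon_i \end{pmatrix}
  &\sim N\!\left( \begin{pmatrix} 0 \\ 0 \end{pmatrix},
  \begin{pmatrix} \sigma_u^{2} & \rho\,\sigma_u \\ \rho\,\sigma_u & 1 \end{pmatrix} \right),
  &&
\operatorname{atanhrho}_{12} = \tfrac{1}{2}\ln\frac{1+\rho}{1-\rho}
\end{aligned}
```

In this notation the endogeneity test parameter is the transformed error correlation: if it is significantly different from zero, then \(\rho \neq 0\), the errors of the two equations are correlated, and national language proficiency cannot be treated as exogenous in the baseline model.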
Three variables were considered as instrumental variables for this study: (1) frequency of newspaper use, (2) mother's education level, and (3) average national general language proficiency at the same provincial and municipal level. First, as newspapers are a vehicle for the standard national language, reading newspapers may have a positive impact on people's national language proficiency. Since language proficiency encompasses pronunciation and language organization, regularly reading newspapers can increase reading comprehension, improve vocabulary and organization, and enhance speaking skills. To some extent, newspaper use can therefore indicate proficiency in the national language. Second, parental education can also affect children's language proficiency. Since mothers have relatively more linguistic interactions with their children, their educational attainment was also used as an instrumental variable. Finally, drawing on existing literature in language economics, regional language proficiency can be used to characterize the population's actual language skills, but this parameter is not directly related to life satisfaction. Therefore, we used the mean of national common language proficiency in the same region as the third instrumental variable.
Table 4 presents the instrumental variable regressions estimated using the CMP model. The atanhrho_12 parameters are all significant at the 1% level, indicating that the baseline ordered Probit regressions have an endogeneity problem and that the CMP estimates are more reliable. In addition, while the coefficients on common language level in the ordered Probit and CMP models had the same sign, the CMP coefficients were somewhat larger. This could be because, among the endogeneity issues such as omitted variables and measurement error, the downward attenuation bias outweighs the upward bias, a common phenomenon in the labor and language economics literature. It also suggests that the baseline ordered Probit estimates are likely to be too low and that the true impact of national language proficiency on subjective life satisfaction is greater.
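The attenuation argument can be illustrated with a small simulation. This is a deliberately simplified linear stand-in, not the CMP estimator used in this study: it assumes a single continuous instrument, classical measurement error in the proficiency variable, and a continuous outcome, and all parameter values are hypothetical.

```python
import random

def cov(a, b):
    """Sample covariance of two equal-length lists."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / len(a)

def simulate(n=20000, beta=0.5, seed=0):
    """Compare OLS on a noisy regressor with a simple IV (Wald) estimate."""
    rng = random.Random(seed)
    z = [rng.gauss(0, 1) for _ in range(n)]            # instrument
    x_true = [zi + rng.gauss(0, 1) for zi in z]        # true proficiency
    y = [beta * xi + rng.gauss(0, 1) for xi in x_true] # outcome
    x_obs = [xi + rng.gauss(0, 1) for xi in x_true]    # mismeasured proficiency
    ols = cov(x_obs, y) / cov(x_obs, x_obs)  # biased toward zero by the noise
    iv = cov(z, y) / cov(z, x_obs)           # instrument is unrelated to the noise
    return ols, iv

ols_estimate, iv_estimate = simulate()
```

Under these assumptions the OLS slope converges to two thirds of the true coefficient, while the IV estimate converges to the true value, so an IV coefficient larger than the baseline one is what measurement error alone would produce.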

Heterogeneity analysis of the impact of national language proficiency on subjective life satisfaction
Considering that the relationship between national language proficiency and subjective life satisfaction may vary across groups, the sample was split into subgroups and the regressions were re-estimated for each. In the gender subgroups, the endogeneity test parameter was significant for both the female and male samples, indicating that the CMP results are more informative. The coefficient on national language proficiency was larger among women than among men (see Table 5), probably because women are more likely to work in the service industry, where fluency in Mandarin is vital for communicating with customers and improving work efficiency. Since women also invest more energy in communicating with their children, elders, and friends, they become more linguistically proficient. Thus women's proficiency in the national language has a greater impact on their personal life satisfaction than men's.
For the regional groupings, the Atlas of Mandarin Structure was used to divide the sample into two dialect areas: the northern dialect group (official) and the non-northern dialect group (nonofficial). The CMP results show that the coefficient on national language proficiency was greater among residents of non-northern dialect areas than among residents of northern dialect areas (see Table 6). This means that language proficiency had a greater impact on life satisfaction among those living in the southern region than among those in the north. China is a vast country with significant differences between the north and the south; within each region, there are many dialects and minority languages. Particularly in non-northern dialect areas, limited proficiency in the national language can restrict career choices, reduce day-to-day communication, and hinder social adaptation, thus lowering a person's life satisfaction.
In the urban and rural subgroups, while the endogeneity test parameter was significant for both, the impact of national language proficiency on life satisfaction was more pronounced among urban residents (see Table 7). One possible explanation is that although the use of Mandarin in rural communities has increased significantly, it remains relatively low and has little impact on people's daily lives and emotional relief. In comparison, for urban residents who have not yet mastered Mandarin (Putonghua), the language barrier prevents them from obtaining quality jobs, limits their salary levels, and hinders their ability to de-stress and seek emotional relief. However, as the popularity of Putonghua increases, the impact of language proficiency on life satisfaction is likely to increase in rural communities as well.

Discussion
This study aimed to examine the relationship between Mandarin proficiency and life satisfaction in China, taking into account intersecting factors such as gender, region, urban-rural, and household differences. The results affirmed a positive association between Mandarin proficiency and life satisfaction and highlighted significant variations in this relationship across different contexts. However, our study has certain limitations, such as the inability to establish causal relationships between Mandarin proficiency and life satisfaction, and potential reporting biases in our analysis that relied on self-reported language proficiency and life satisfaction measures. Future research can consider the impact of other languages on individual well-being and possible trade-offs between enhancing national language proficiency and maintaining linguistic diversity. Our study underscores the need for more nuanced analyses considering the complex interplay of language proficiency, personal traits, and contextual factors. This would offer a more comprehensive understanding of how language proficiency affects well-being in various contexts. Overall, our study provides substantial contributions to understanding how language proficiency shapes life satisfaction.

Conclusions
Language holds significant economic value and acts as an essential social resource. This paper employed 2017 CGSS survey data to explore the impact of national language proficiency on life satisfaction. The findings suggest that national language proficiency can enhance personal life satisfaction, and this effect is robust. Further analysis shows differences in the impact of national language proficiency on life satisfaction, influenced by variables such as gender, region, urban-rural dichotomy, and household characteristics. This study contributes to the literature on life satisfaction by offering a novel approach to enhancing individual life satisfaction through promoting the use of national languages.
Although the promotion of Mandarin is a fundamental language policy of China, Mandarin is yet to be fully popularized, especially in rural and remote ethnic communities. This study suggests that mastering the national language can effectively improve personal life satisfaction. China has prioritized national language work since the 1980s, promoting Mandarin across the country through various policies and regulations and integrating the use of the national language into the performance assessment of poverty alleviation work. Governments at all levels have begun to consider the impact of language on personal welfare, both in terms of improving Mandarin proficiency and in terms of individual life satisfaction.