Understanding the Determinants of K-12 Academic Success in the United States: OLS Regression

Abstract: This paper investigates the critical factors that influence K-12 students' academic performance in the United States. Utilizing the 2019 Parent and Family Involvement (PFI) in Education Survey data collected by the National Center for Education Statistics (NCES), three new indices are constructed that summarize parental involvement in school and family events, parental satisfaction with schools, and students' extracurricular learning time. Initially, data visualization is employed to examine the data characteristics, followed by Ordinary Least Squares (OLS) regression to analyze the relationship between students' grades and these three indices. The findings reveal a significant positive correlation between students' grades and the extent of parental involvement in activities, parents' satisfaction with school work, and students' studying time outside the classroom. These results can provide valuable and actionable insights for future educational practices and policies while guiding parental involvement to support their children's academic achievement.


Introduction
K-12 education is a crucial foundation for children's academic and future success. Through K-12 education, students not only lay the groundwork for higher education but also discover their own interests, and even future career directions, by studying a variety of subjects. Moreover, K-12 education plays a vital role in promoting cognitive and social development, with strong links between academic achievement and positive outcomes in adulthood. Necessary interventions during the K-12 stage for children with cognitive or psychological impairments can effectively reduce their crime rate as adults (Jones et al., 2015). In addition, K-12 education provides the foundation for lifelong learning and students' advancement. Children who receive a high-quality K-12 education are more likely to have better health outcomes [24], and healthier students have an advantage in learning [4].
As student grades typically serve as the primary benchmark for assessing educational accomplishments, the question of how to boost these grades has emerged as a top priority for educators, parents, and students.
Research highlights the significant impact of parental involvement on the academic performance of K-12 students [18]. Epstein et al. [9] suggest that actions such as monitoring children's homework, setting high expectations, and engaging in conversations about school can contribute to higher student achievement. A study by Khairul Islam and Tanweer J Shapla [12] shows that active parental engagement in a student's learning and life can lower absenteeism rates. Consistent school attendance promotes cohesive learning experiences, resulting in improved grades. Parental involvement can take various forms, including attending parent-teacher conferences, volunteering at school, and tracking a child's academic progress, all of which can lead to enhanced academic outcomes.
Furthermore, the extent of a student's extracurricular studies can influence their grades. Research indicates that students who receive tutoring exhibit superior performance in mathematics and reading compared to those who do not [23]. Similarly, students who dedicate more time to completing homework assignments achieve higher levels of academic success [11]. This indicates that investing additional time and effort in studying beyond the classroom can improve grades.
Distinct from previous research, this study examines parental involvement and students' extracurricular learning concurrently. Parental involvement encompasses various activities, including participation in school events and home-based interactions with children, such as reading together, playing games, and supervising homework completion. The frequency of parental engagement in these activities during an academic year serves as the variable for analysis. Students' extracurricular learning is assessed based on the number of days and hours dedicated to learning outside of school each week. Moreover, considering that parents' satisfaction with schools may indirectly affect children's academic performance, I have integrated this as a third factor in my analysis. This factor covers parents' satisfaction with school teachers, school standards, school order and discipline, and the methods of communication between school teachers and parents.
This study aims to examine the correlation between the three identified factors and students' academic performance through OLS regression analysis. Section 2 reviews prior research, while Section 3 presents the research methodology, detailing data, models, variables, and analysis methods. Section 4 presents the data visualization outcomes and the findings derived from the OLS regression models. Lastly, Section 5 concludes the paper.

Ordinary Least Squares (OLS) Regression
Ordinary Least Squares (OLS) regression is a popular statistical method for estimating the relationship between a dependent variable and one or more independent variables [20]. OLS regression offers several advantages, such as simplicity, interpretability, desirable statistical properties, and flexibility, making it an essential tool for researchers [5]. Its applicability extends to diverse disciplines, including economics, social sciences, engineering, and natural sciences.
Since the inception of OLS regression by Legendre [19] and its further development by Gauss [10], the model has been widely applied across these research areas. For instance, Angrist and Pischke [3] explored the application of OLS regression in causal analysis, while Kim [16] employed OLS regression to investigate the determinants of household electricity consumption. Moreover, Nasiri et al. [21] used the method to examine factors influencing land subsidence, further demonstrating its versatility and adaptability across disciplines.

Principal Component Analysis (PCA)
Principal Component Analysis (PCA) is a powerful statistical method that has garnered significant attention in recent years due to its diverse applications across various fields. PCA is a dimensionality reduction technique that transforms a large set of correlated variables into a smaller set of uncorrelated variables known as principal components, while retaining most of the variance in the original data [1].
The advantages of PCA are manifold. Firstly, PCA reduces the dimensionality of data, allowing for more manageable and efficient data analysis without losing essential information [14]. This reduction in dimensionality can lead to faster computational times and reduced storage requirements, which is particularly important when dealing with large datasets. Secondly, PCA can help mitigate multicollinearity problems in regression analysis by generating uncorrelated principal components (PCs), which can then be used as predictors [13]. This allows for more stable estimates and improved interpretation of the results. Thirdly, PCA can improve visualization and understanding of complex datasets by projecting high-dimensional data onto a lower-dimensional space, making it easier to identify patterns, trends, and relationships among variables [28]. Lastly, PCA is a highly flexible technique that can be applied to a wide range of data types and is not limited to specific distributions or assumptions, making it a versatile tool for data analysis across various domains [1].

Determinants of Student's Academic Performance
A large body of research aims to understand the factors that influence student academic performance. These determinants can be broadly classified into three categories: student-related factors, family-related factors, and school-related factors.
The first category, student-related factors, encompasses the individual attributes of students that directly affect their academic performance. Chih-Hao [6] suggests that good grades are inseparable from students' efforts, and that those who invest more time and energy in self-study activities tend to achieve better academic results. Furthermore, Spitzer [26] found a positive correlation between study time and math performance, highlighting that increasing study time is particularly effective in improving the academic ability of students with poor academic performance.
The second category, family-related factors, plays a critical role in shaping a student's academic outcomes. Dettmers et al. [7] found that high-quality parental homework engagement is linked to student well-being at school and at home, as well as to student achievement in math and language. High-quality parental homework engagement increases communication between students and parents, fosters better relationships, and promotes academic performance. Engin [8] asserts that a neglectful parental attitude towards a child's learning and life negatively affects students' performance, as weakened parent-child connections result in a lack of motivation to learn.
Lastly, school-related factors contribute to students' academic performance. Different teaching methods adopted by teachers have varying effects on students' performance. For instance, Tokac & Thompson [27] demonstrated that more instructional interventions, including video games, may improve students' math performance. Additionally, school discipline affects student performance. To maintain and enhance student achievement, school policies should prioritize a proactive approach that seeks to identify and address potential issues before they escalate, rather than reacting solely after a student experiences significant difficulties [2].

Data
In this study, the data on parent and family involvement in K-12 education were collected by the National Center for Education Statistics (NCES) in 2019 [22]. The data set contains information on 15,500 students, including students' basic identifying information, basic information about their families, students' academic grades, parents' participation in school activities, parents' participation in school meetings, parent-school communication, parents' involvement in family events, parents' satisfaction with the school, parents' satisfaction with teachers, parents' satisfaction with academic standards, parents' satisfaction with discipline, parents' satisfaction with school staff/parent interaction, and the days and hours that students spent doing homework in a week.
In previous research, student achievement was found to be related to parental involvement and the amount of after-school learning [17] [25]. Considering that parents' satisfaction with the school conveys a positive or negative emotional influence to students, this paper also takes parents' satisfaction with the school as one of the independent variables. As shown in Table 1, student grades are selected as the dependent variable, and the three independent variables are parental involvement in school and family events in this school year, parents' satisfaction with the school in this school year, and students' extracurricular learning time. Parental involvement is assessed based on the frequency of parents' participation in school and family activities, while parents' satisfaction with the school is measured using their responses to questions about the school's performance and their overall perception of the institution. Lastly, students' extracurricular learning time is gauged through self-reported hours spent on learning activities outside of regular school hours.
These variables are extracted and classified according to their relevance to the questionnaire questions, as shown in Table 1. Each question is scored on a scale of 0 to 1, and each student's score for each index is the sum of his or her scores for all questions included in that index. For example, a student's answers to the two questions included in Students' extracurricular learning time are

Ordinary Least Squares (OLS) Regression
This paper uses OLS regression to analyze the relationship between the three indices and student performance. The form of the OLS regression model is shown in (1), where $\hat{y}$ is the predicted value of the student's grade, $x_i$ represents the factors that determine grades, and $\beta_i$ is the regression coefficient [15].
$$\hat{y} = \beta_0 + \sum_{i=1}^{k} \beta_i x_i \qquad (1)$$
This paper first used three OLS regressions to obtain the models relating student grades to parental participation, parental satisfaction, and students' extracurricular learning time, as shown in (2), (3), and (4) below. The regression coefficients are represented by $\alpha$, $\gamma$, and $\delta$, respectively.
$$\mathit{Grade} = \alpha_0 + \alpha_1\,\mathrm{PA\_index} \qquad (2)$$
$$\mathit{Grade} = \gamma_0 + \gamma_1\,\mathrm{SA\_index} \qquad (3)$$
$$\mathit{Grade} = \delta_0 + \delta_1\,\mathrm{ST\_index} \qquad (4)$$
Then, I used multiple linear regression to obtain the model relating students' grades to all three indices, as shown in formula (5). The regression coefficients are represented by $\theta$.
$$\mathit{Grade} = \theta_0 + \theta_1\,\mathrm{PA\_index} + \theta_2\,\mathrm{SA\_index} + \theta_3\,\mathrm{ST\_index} \qquad (5)$$
Lastly, I derived a composite variable ($\mathit{PC1}$) through Principal Component Analysis (PCA) of the original three independent variables: Parental involvement in school and family events in this school year (PA_index), Parents' satisfaction with the school in this school year (SA_index), and Students' extracurricular learning time (ST_index). Then, I performed an OLS regression of grade on $\mathit{PC1}$, as shown in formula (6). The regression coefficients are represented by $\lambda$.
$$\mathit{Grade} = \lambda_0 + \lambda_1\,\mathit{PC1} \qquad (6)$$
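As a minimal sketch of the simple and multiple regressions described above, the following Python snippet fits both kinds of model with NumPy's least-squares solver. The data are synthetic, and the generating coefficients, value ranges, and the helper name `fit_ols` are illustrative assumptions; only the variable names PA_index, SA_index, and ST_index come from the text. It does not reproduce the paper's estimates.

```python
import numpy as np

def fit_ols(X, y):
    """Return OLS coefficients [intercept, slope_1, ...] for predictors X and response y."""
    X = np.asarray(X, dtype=float)
    if X.ndim == 1:
        X = X[:, None]  # a single predictor becomes one column
    design = np.column_stack([np.ones(len(X)), X])  # prepend the intercept column
    coef, *_ = np.linalg.lstsq(design, np.asarray(y, dtype=float), rcond=None)
    return coef

# Synthetic stand-in data (hypothetical; NOT the PFI survey).
rng = np.random.default_rng(0)
n = 500
pa_index = rng.integers(0, 25, size=n).astype(float)  # assumed range 0..24
sa_index = rng.uniform(0, 5, size=n)                   # assumed range 0..5
st_index = rng.uniform(0, 2, size=n)                   # assumed range 0..2
grade = (2.0 + 0.02 * pa_index + 0.2 * sa_index + 0.4 * st_index
         + rng.normal(0, 0.3, size=n))                 # assumed generating model

# Simple regressions: grade on one index at a time.
simple = {
    "PA_index": fit_ols(pa_index, grade),
    "SA_index": fit_ols(sa_index, grade),
    "ST_index": fit_ols(st_index, grade),
}

# Multiple regression: grade on all three indices at once.
multiple = fit_ols(np.column_stack([pa_index, sa_index, st_index]), grade)

for name, coef in simple.items():
    print(f"{name}: intercept={coef[0]:.3f}, slope={coef[1]:.3f}")
print("multiple:", np.round(multiple, 3))
```

With real survey data, a statistics package that also reports standard errors and p-values would be the more natural choice; the solver above returns point estimates only.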

Principal Component Analysis (PCA)
To use Principal Component Analysis (PCA) [28] for combining the three variables (PA_index, SA_index, ST_index) into a single predictor ($\mathit{PC1}$) in a regression model with the dependent variable $\mathit{Grade}$, first standardize each variable using formula (7), where $Z$ is the standardized variable value, $X$ is the original variable value, $\mu$ is the mean, and $\sigma$ is the standard deviation.
$$Z = \frac{X - \mu}{\sigma} \qquad (7)$$
Next, compute the covariance matrix, calculate its eigenvalues and eigenvectors, and select the principal component (PC1) with the largest eigenvalue. Compute the PC1 scores using formula (8), where $w_j$ are the eigenvector elements and $Z_j$ are the standardized values.
$$\mathit{PC1} = w_1 Z_{\mathrm{PA}} + w_2 Z_{\mathrm{SA}} + w_3 Z_{\mathrm{ST}} \qquad (8)$$
Finally, fit the regression model with formula (9).
$$\mathit{Grade} = \lambda_0 + \lambda_1\,\mathit{PC1} \qquad (9)$$
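The steps above (standardize, take the covariance matrix, keep the eigenvector with the largest eigenvalue, score, then regress) can be sketched in Python with NumPy as follows. The data are synthetic, and the helper name `first_principal_component`, the generating values, and the sign convention for the eigenvector are assumptions made for illustration rather than details taken from the paper.

```python
import numpy as np

def first_principal_component(X):
    """Standardize the columns of X; return (PC1 scores, weights, share of variance explained)."""
    X = np.asarray(X, dtype=float)
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)  # Z = (X - mu) / sigma
    cov = np.cov(Z, rowvar=False)                     # covariance matrix of the standardized data
    eigvals, eigvecs = np.linalg.eigh(cov)            # eigh returns eigenvalues in ascending order
    w = eigvecs[:, -1]                                # eigenvector of the largest eigenvalue
    if w.sum() < 0:
        w = -w                                        # eigenvector sign is arbitrary; fix it (assumed convention)
    return Z @ w, w, eigvals[-1] / eigvals.sum()

# Synthetic stand-in data (hypothetical; NOT the PFI survey).
rng = np.random.default_rng(1)
n = 400
common = rng.normal(size=n)  # shared factor that makes the three indices correlated
X = np.column_stack([
    12.0 + 4.0 * common + rng.normal(0, 3.0, n),  # stand-in for PA_index
    3.5 + 0.6 * common + rng.normal(0, 0.6, n),   # stand-in for SA_index
    1.0 + 0.2 * common + rng.normal(0, 0.3, n),   # stand-in for ST_index
])
grade = 3.0 + 0.3 * common + rng.normal(0, 0.3, n)

pc1, weights, share = first_principal_component(X)

# Regress grade on PC1: intercept plus one slope.
design = np.column_stack([np.ones(n), pc1])
coef, *_ = np.linalg.lstsq(design, grade, rcond=None)

print("PC1 weights:", np.round(weights, 3))
print(f"share of variance explained by PC1: {share:.2f}")
print(f"intercept={coef[0]:.3f}, slope={coef[1]:.3f}")
```

Because the variables are standardized first, the covariance matrix here equals the correlation matrix, so each index contributes on a comparable scale regardless of its original units.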

Characteristics of Students
Out of the 15,500 students who participated in the 2019 Parent and Family Involvement (PFI) in Education Survey, a total of 12,766 students were included in the study after excluding those who left some pertinent questions unanswered or indicated "no grades".
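To make the sample selection and the index construction from the Data section concrete, here is a small Python sketch. It sums the 0-to-1 question scores within each index and drops records with an unanswered question or without a letter grade. The field names, the shortened question lists, and the toy records are hypothetical and do not follow the PFI codebook.

```python
VALID_GRADES = {"A", "B", "C", "D"}  # anything else (e.g. "no grades") is excluded

def build_indices(record):
    """Return the three index scores for one student, or None if the record is excluded."""
    if record["grade"] not in VALID_GRADES:
        return None
    groups = {
        "PA_index": record["involvement"],   # parental involvement questions
        "SA_index": record["satisfaction"],  # parental satisfaction questions
        "ST_index": record["study_time"],    # days and hours of homework
    }
    # None marks an unanswered question (assumed encoding).
    if any(score is None for scores in groups.values() for score in scores):
        return None
    # Each question is already scaled to 0..1; an index is the sum of its questions.
    return {name: sum(scores) for name, scores in groups.items()}

# Toy records with only a few questions per index (illustrative values).
records = [
    {"grade": "A", "involvement": [1, 1, 0, 1], "satisfaction": [1.0, 0.75, 1.0], "study_time": [0.5, 0.25]},
    {"grade": "no grades", "involvement": [1, 0, 0, 0], "satisfaction": [0.5, 0.5, 0.5], "study_time": [0.0, 0.0]},
    {"grade": "C", "involvement": [0, None, 1, 0], "satisfaction": [0.5, 0.25, 0.5], "study_time": [0.25, 0.0]},
    {"grade": "B", "involvement": [0, 1, 0, 0], "satisfaction": [0.75, 0.75, 0.5], "study_time": [0.25, 0.25]},
]

sample = []
for record in records:
    indices = build_indices(record)
    if indices is not None:
        sample.append((record["grade"], indices))

print(f"kept {len(sample)} of {len(records)} records")
for grade, indices in sample:
    print(grade, indices)
```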

Distribution of Student's Grades
Figure 1 illustrates that out of the 12,766 students analyzed, more than half (7,282 students) obtained A grades, which suggests that achieving A grades may not be very challenging for students. Therefore, it is crucial to identify the factors that influence students' academic performance to provide guidance on improving their grades.
Figure 2 shows that, among the 24 school or family activities listed in the questionnaire, parents of students who received grades of A or B participated in approximately 13 activities, while parents of students who received grades of C or D participated in approximately 12 activities. The figure displays both the mean (represented by the blue line) and the median (represented by the red line). The figure indicates a positive correlation between parental involvement in activities and children's academic performance: the more activities parents participate in, the higher their children's grades tend to be.

Distribution of Parental Satisfaction with School
In Figure 3, the median (red line) and mean (blue line) of all parents' satisfaction with the school lie above the middle of the rating scale, signifying that parents generally have a favorable assessment of the school. Parents of students with grades A and B rate the school above 4 points (out of a total of 5 points), while parents of students with grades C and D give ratings around 3 points. The higher the parents' overall satisfaction with the school, the better the children's grades tend to be.

Distribution of Students' Extracurricular Learning Time
Figure 4 demonstrates that students with A, B, and C grades spend on average around three days per week doing homework outside of school, while students with D grades spend only about two days. Figure 5 reveals that students with A grades dedicate over 5 hours per week to studying outside of school, whereas those with D grades allocate only 3 to 4 hours. A correlation between longer extracurricular study time and better grades can be observed in both Figure 4 and Figure 5.
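The per-grade means and medians that the figures mark with blue and red lines can be computed with a short grouping step. The sketch below uses Python's standard library; the rows, the column name `hours`, and the helper name `summarize_by_grade` are hypothetical and are not survey values.

```python
from collections import defaultdict
from statistics import mean, median

def summarize_by_grade(rows, column):
    """Return {grade: (mean, median)} of `column` for each letter grade."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["grade"]].append(row[column])
    return {grade: (mean(values), median(values)) for grade, values in sorted(groups.items())}

# Hypothetical rows: weekly homework hours by grade (illustrative values only).
rows = [
    {"grade": "A", "hours": 6.0},
    {"grade": "A", "hours": 5.0},
    {"grade": "A", "hours": 7.0},
    {"grade": "D", "hours": 3.0},
    {"grade": "D", "hours": 4.0},
]

summary = summarize_by_grade(rows, "hours")
for grade, (avg, med) in summary.items():
    print(f"{grade}: mean={avg:.2f}, median={med:.2f}")
```

The same helper applies unchanged to any of the three indices by passing a different column name.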

Determinants of Student's Grades
The results obtained by OLS regression are shown in Table 2. In the first model, when parents do not participate in any activities, the student's predicted GPA is approximately 3.18. A student's GPA is expected to increase by about 0.02 for each additional school or family activity in which parents participate. In the second model, if parents are dissatisfied with all aspects of the school's work (for example, school discipline, teachers' teaching level, and the school's style of communication with parents), the student's predicted GPA is around 2.64. For each additional aspect of the school's work with which parents are satisfied, the student's GPA is expected to increase by approximately 0.20. In the third model, students who do not study at all outside the classroom have a predicted GPA of about 3.03. When these students change to studying outside of class every day, or increase their weekly study time outside of class by 75 hours, their GPA is expected to increase by roughly 0.50. In the fourth model, when parents do not participate in any activities and are dissatisfied with all aspects of the school's work, and the student never studies outside of class, the student's estimated GPA is approximately 2.24. Holding the other two indices constant, an additional unit of parental involvement in school or family activities is associated with an increase of about 0.01 in students' GPA; a one-unit increase in parental satisfaction with the school's work is associated with an average increase of around 0.18; and a one-unit increase in the extracurricular learning time index is associated with an increase of roughly 0.41. In the last model, in which the three indices are combined into a single composite index, each one-unit increase in that index raises the student's predicted GPA by about 0.18.
All the coefficients obtained by the five models are significant and positive. This indicates that greater parental involvement in school or family activities, higher parental satisfaction achieved through schools' efforts to improve the quality of their work, and longer extracurricular study time for students are each associated with higher student grades.
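As a worked illustration of how these estimates translate into predicted GPAs, the snippet below plugs example index values into the first and fourth models. The coefficients are the rounded values quoted in the text above (the exact estimates are in Table 2), and the input values of 13 activities, a satisfaction score of 4, and a study-time index of 1 are arbitrary choices for the example.

```python
# Rounded coefficients as quoted in the text; Table 2 holds the exact estimates.
MODEL_1 = {"intercept": 3.18, "PA_index": 0.02}
MODEL_4 = {"intercept": 2.24, "PA_index": 0.01, "SA_index": 0.18, "ST_index": 0.41}

def predict_gpa(model, **indices):
    """Linear prediction: intercept plus coefficient times value for each supplied index."""
    return model["intercept"] + sum(model[name] * value for name, value in indices.items())

# First model: parents take part in 13 of the 24 listed activities.
gpa_1 = predict_gpa(MODEL_1, PA_index=13)
print(f"{gpa_1:.2f}")  # 3.18 + 0.02 * 13 = 3.44

# Fourth model: 13 activities, satisfaction score 4, study-time index 1.
gpa_4 = predict_gpa(MODEL_4, PA_index=13, SA_index=4, ST_index=1)
print(f"{gpa_4:.2f}")  # 2.24 + 0.13 + 0.72 + 0.41 = 3.50
```

Because the inputs are rounded coefficients, the outputs are approximate and serve only to show how the intercept and slopes combine.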

Conclusion
In this paper, data from the 2019 Parent and Family Involvement (PFI) in Education Survey by the National Center for Education Statistics (NCES) were grouped into three indices. These indices were created based on the similarity of the questionnaire items and cover parent involvement in school and family events, parent satisfaction with schools, and the length of time students engage in extracurricular learning. By visualizing the three indices in bar charts and histograms grouped by students' grades, I observed that higher student performance goes together with higher median and mean values for parental involvement, parents' satisfaction with school work, and students' extracurricular learning time. To ascertain the relationship between the academic performance of American K-12 students and these three indices, I employed Ordinary Least Squares (OLS) regression. The separate OLS regressions of student performance on each of the three indices revealed significant positive correlations. The results of a multiple OLS regression, which included all three indices, were consistent with these initial findings. Finally, I used Principal Component Analysis (PCA) to create a new, integrated index from the three original indices and conducted an OLS regression of student performance on it. The results continued to display a significant positive correlation between the new index and student performance.
In conclusion, parents can help improve their children's academic performance by actively participating in school or family activities. Schools can enhance parental satisfaction with their operations by continually improving the quality of education and administration, thereby further boosting student performance. Meanwhile, students should consciously use their extracurricular time for additional learning to raise their academic achievement. For future research, exploring the factors that influence the academic performance of American K-12 students with non-linear models may prove beneficial, as the OLS regression model employed in this paper is linear.

Figure 3: This graph shows the distribution of parental satisfaction grouped by students' grades.

Figure 4: This graph illustrates the distribution of days students spent on homework weekly, categorized by their grades.

Figure 5: This graph shows the distribution of hours students spent on homework weekly, categorized by their grades.

Table 1: Variable Definitions

Table 2: OLS Regression Results