Grit as a Predictor of Foreign Language Proficiency: An Investigation of Grit and EFL Proficiency in Japanese University Students

obtained his Ph.D. in Education with a specialization in ESL Northcentral University. He foreign language lecturer Tokyo, Japan. His research interests include self-directed learning, computer-assisted language learning, and the role of individual differences in second language acquisition.

Abstract
Japanese learners of English as a foreign language often do not attain levels of English proficiency that allow them to conduct even simple conversations in English. If a predictor of foreign language learning outcomes were available, educators could potentially identify and support students at risk of poor outcomes. This study investigated the non-cognitive trait of grit as a possible predictor of foreign language learning outcomes in Japanese university students. An online anonymous survey was conducted at two universities in eastern Japan. In addition to demographic information such as sex, age, and year in school, respondents were asked to self-report their most recent score on a standardized test of English, such as TOEIC or TOEFL, as well as their current GPA. Additionally, participants were administered a Japanese translation of the Grit-S measure. After confirming the validity of both the data and the measure, regression analysis was conducted to determine the relationship between grit and the English proficiency test scores both before and after controlling for prior academic achievement as measured by GPA. It was found that higher grit was predictive of higher English proficiency test scores, even after controlling for GPA. After presenting these findings, the implications of these results and ideas for future research are discussed.

While a great deal of grit research has examined the relationship between grit and learning outcomes in academic contexts, only a handful of studies have investigated the role of grit in foreign language learning (e.g., Giordano, 2019; Robins, 2019; Wei et al., 2019; Teimouri et al., 2020), and of these studies none has investigated the relationship between grit and foreign language proficiency. This is surprising given that learning a foreign language requires persistent and concentrated effort over a long period of time in order for language acquisition to take place (Saville-Troike & Barto, 2017; VanPatten & Williams, 2015). If grit is a predictor of foreign language learning outcomes, it could be used by educators to identify and support students who may be at risk of struggling in foreign language classrooms. Such a predictor would be especially useful in Japan, where most learners do not achieve levels of English proficiency that enable them to conduct even simple conversations in English (Kobayashi, 2019; Lee, 2019). Although Japan continues to try to improve English education, for example by encouraging the development of English-medium undergraduate programs (Brown, 2017), such efforts have yet to produce a noticeable effect on the English language proficiency of Japanese students. In this study, the relationship between grit and foreign language proficiency was investigated in a sample of Japanese university students studying English as a Foreign Language (EFL).

Research Questions
This study set out to investigate the following two research questions.

RQ1: What is the relationship between grit and the English language proficiency of Japanese university students, as measured by scores on the TOEIC or TOEFL standardized tests of English?
RQ2: What is the relationship between grit and the English language proficiency of Japanese university students, as measured by scores on the TOEIC or TOEFL standardized tests of English, after controlling for prior academic achievement as measured by grade point average (GPA)?

Methodology and Design
This study employed a quasi-experimental ex post facto survey design to gather data about the research participants. Although experimental designs are the preferred way to statistically investigate research questions, quasi-experimental designs are more appropriate when the variables being studied are intrinsic to the participants, such as ethnicity, gender, or, as is the case in this study, grit (Price et al., 2015; Silva, 2010). Quasi-experimental ex post facto survey designs have been used in numerous grit studies in order to demonstrate grit's predictive relationship with positive outcomes (e.g., Strayhorn, 2014; Muenks et al., 2017; Robins, 2019; Wei et al., 2019; Wolters & Hussain, 2015), so this study aligns well with prior grit research.

Population and Sample
The population under study in this research is composed of Japanese university students attending four-year colleges. Despite having had six years of English classes in middle school and high school and an additional two years of EFL classes at the university level, Japanese students tend to have poor levels of English proficiency (Kobayashi, 2019; Lee, 2019). The situation is so severe that Japan ranks 41st out of 49 countries in average Test of English for International Communication (TOEIC) scores (ETS, 2018a). There are a variety of reasons for the poor performance of Japanese EFL students, including an overreliance on grammar-translation as an instructional methodology (Morita, 2015), lack of adequate teacher training (Otsu, 2017), and an entrance exam system which promotes rote memorization over the productive use of language (Lee, 2019).
For a study to have external validity, the research sample should represent the population as closely as possible (Mertens, 2015). There is some evidence that grit studies require large sample sizes in order for relationships to be detected. For example, while Palisoc et al. (2017) only investigated 98 students studying at a single pharmacy school and found no significant relationship between grit and student GPA, Pate et al. (2017) sampled 724 pharmacy students across three institutions and found a significant predictive relationship between grit and GPA. Similarly, whereas Salles et al. (2017) found no significant relationship between grit and retention in a surgical residency program with only 73 participants, Hakeem et al. (2020) found with a sample of 427 neurosurgery residents that grit was negatively related with burnout among residents. These studies imply that grit research may require larger sample sizes in order for statistically significant relationships to be detected. Therefore, to ensure a large enough sample size and allow for generalization of the results, participants were recruited from two four-year universities located in the eastern region of Japan. The first university research site specializes in science and technology majors and enrolls approximately 10,000 undergraduate students per year. The second university research site specializes in language majors and enrolls approximately 3,500 undergraduate students per year. The students at both research sites are required to periodically take standardized tests of English such as TOEIC or the paper-based Test of English as a Foreign Language (TOEFL). The scores on these tests are used by the universities to decide placement in streamed EFL programs as well as to determine eligibility for school-sponsored scholarships and study abroad programs.
Participants were recruited via an email announcement from each research site's Academic Affairs office. It was stressed in the recruitment email that participation in the research was voluntary and there was neither a reward for participating nor a penalty for not participating. The recruitment email contained a link to the consent forms and research instrument, which were hosted on SurveyMonkey (2020).

Instrumentation
The instrument used to collect data for this research was an online anonymous survey (Appendix A).
The screening questions, research consent forms, and survey itself were all written in Japanese, the native language of the participants. Potential respondents first needed to confirm that they were a Japanese national, over the age of 18, and currently matriculating as a second-, third-, or fourth-year student at the research site. First-year students were excluded from this study due to the fact that they had yet to establish a university GPA, which was used as a controlling variable representing prior academic achievement during the data analysis phase of the study.
Once respondents indicated they were eligible for participation in the study, they were provided access to the informed consent forms, which explained the purpose of the study and what would be required of participants, and were given the opportunity to withdraw from the study if they did not agree with the terms. After informed consent was obtained, participants were given access to the survey itself, which was divided into two parts. The first half of the survey collected demographic information about the participants, including sex, age, year in school, major, and self-reported GPA.
Students were also asked which standardized test of English, TOEIC or TOEFL, they had most recently taken and to provide their latest score on that test. Students at both research sites are required, at a minimum, to take either the TOEIC or paper-based TOEFL test once every two years. The scores students receive on these university mandated tests as well as their GPA are available on the students' personal university webpage, ensuring that all participants had access to their most recent scores.
The second half of the survey utilized a Japanese translation (Nishikawa et al., 2015) of the Grit-S measure (Duckworth & Quinn, 2009) to assess respondents' grit. The Grit-S is a self-report survey which consists of eight Likert-scale items, four of which measure the perseverance of effort subscale and four of which measure the consistency of interest subscale. As the name implies, the perseverance of effort subscale measures the tendency of the respondent to maintain sustained effort over time toward goals, whereas the consistency of interest subscale measures the respondent's tendency to stay focused and not be distracted from goals (Crede et al., 2017). Each item in the measure is self-rated by participants on a scale of 1 = not at all like me to 5 = very much like me. The respondent's total grit score is calculated by summing the scores on all items (with consistency of interest items reverse-scored) and dividing by the number of items (i.e., eight) to produce a score between 1 and 5, with higher values indicating that a respondent tends to be grittier.
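The scoring procedure described above can be illustrated with a minimal sketch. It assumes the item assignment reported later in this paper (items 2, 4, 7, and 8 for perseverance of effort, the remaining items for consistency of interest) and reverse-scores a 1 to 5 rating as 6 minus the rating. The function name and input layout are illustrative assumptions, not part of the original instrument.

```python
# Assumed item layout, taken from the reliability discussion in this paper:
# items 2, 4, 7, 8 = perseverance of effort; items 1, 3, 5, 6 = consistency of interest.
CONSISTENCY_ITEMS = (1, 3, 5, 6)


def grit_s_score(responses):
    """Return the mean Grit-S score (1 to 5) for eight ratings given in item order."""
    if len(responses) != 8:
        raise ValueError("The Grit-S has exactly eight items.")
    if any(rating not in (1, 2, 3, 4, 5) for rating in responses):
        raise ValueError("Each rating must be an integer from 1 to 5.")

    total = 0
    for item_number, rating in enumerate(responses, start=1):
        if item_number in CONSISTENCY_ITEMS:
            rating = 6 - rating  # reverse-score consistency of interest items
        total += rating
    return total / 8  # divide by the number of items


print(grit_s_score([2, 4, 3, 5, 1, 2, 4, 5]))  # 4.25
```

A respondent who rates every perseverance item 5 and every consistency item 1 receives the maximum score of 5.0 under this scheme.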

Ethical Assurances
Data collection for this study was conducted as part of a Ph.D. dissertation research project. Approval from Northcentral University's Institutional Review Board was obtained prior to the collection of data (Appendix B), as was permission from Academic Affairs offices of both research sites. All students agreeing to participate in the research study digitally agreed to an informed consent form which explained the purpose of the research and how data would be collected and managed, as well as promising confidentiality to all participants. The consent forms explicitly stated that students could withdraw from the research at any time without penalty, although no respondents chose to do so. No identifying information about participants was collected during the study and participants were promised that study results would only be published in aggregate form, without any individual answers displayed. Only the primary researcher had access to the study data, which was kept in a password-protected file in a password-protected computer. A password-protected backup of the data was also stored in the cloud.

Results
In total, 283 second-, third-, and fourth-year Japanese university students filled in the anonymous online survey to completion. Data were analyzed using IBM SPSS Statistics (Version 25). First, the validity of both the data and the instrument was investigated to check for outliers or potential bias.
Next, a descriptive analysis of the data was conducted. Finally, linear regression was utilized to answer the research questions.

Validity of the Data
While reviewing the completed surveys, three cases were identified as problematic due to the reporting of extremely low GPA scores: two students input a GPA of 0 and a third student input a GPA score of 0.8. These are improbably low scores but because the survey was anonymous, there was no way to ascertain the correct GPA value for each of the respondents. Because these cases would heavily skew the planned regression analysis of the data, they were removed from the study.
In addition to the above three cases, several issues were identified with demographic data input by respondents. For example, some non-science majors such as Project Management majors had erroneously identified themselves as science majors and conversely some science majors such as Engineering majors had misidentified themselves as non-science majors. These mistakes were corrected. Additionally, major names were modified to ensure consistency. For example, some Project Management majors identified themselves as simply "Management" majors and some International Communication majors identified themselves by their abbreviated name, "IC Department." All major names were therefore standardized to be consistent.

Validity of the Instrument
The validity of the Grit-S measure has been established in several published studies. Numerous studies have confirmed the construct validity of the Grit-S through Confirmatory Factor Analysis (e.g., Duckworth et al., 2007; Duckworth et al., 2009; Muenks et al., 2017). Criterion validity has also been demonstrated in studies which have shown grit's value in predicting positive outcomes such as higher GPA scores (Bowman et al., 2015; Duckworth et al., 2007; Duckworth & Quinn, 2009; Hwang et al., 2018; Muenks et al., 2017; Schmidt et al., 2019; Strayhorn, 2014; Wolters & Hussain, 2015) and psychological well-being (Datu et al., 2018; Salles et al., 2017; Wyszynska et al., 2017). Both Duckworth and Quinn (2009) and Nishikawa et al. (2015) have used the Grit-S with university student populations and found the Grit-S to have at least moderate reliability as measured by Cronbach's alpha scores. Furthermore, goodness-of-fit indices showed the Japanese translated version of the Grit-S to be a good fit for Japanese university students (Nishikawa et al., 2015).
To assess the internal reliability of the instrument used in this study, Cronbach's alpha was calculated for both the measure as a whole and each individual sub-factor, as is typically done in grit studies (Eskreis-Winkler et al., 2014). Values of .70 or higher are usually taken to represent adequate reliability in a measure (Field, 2016), although values of .60 or higher are also considered acceptable (Aron et al., 2013). Items 2, 4, 7, and 8 in the Grit-S represent the perseverance of effort subscale and demonstrated moderate reliability with a Cronbach's alpha of .76. The consistency of interest subscale, composed of the remaining items, was slightly lower but still within the acceptable range at a value of .67. The first item in the subscale, "New ideas and projects sometimes distract me from previous ones," demonstrated a slightly weak corrected item-total correlation of .28. Corrected item-total correlation values of less than .30 indicate that an item did not correlate strongly with the total score of the scale and may be an issue (Field, 2016). However, recalculating the internal reliability without the item resulted in only a slight increase of Cronbach's alpha to .69, and therefore the item was kept. The Grit-S as a whole displayed moderately strong internal reliability with a Cronbach's alpha of .78. The Cronbach's alphas found in this study align with those of Duckworth and Quinn (2009), who reported Cronbach's alphas ranging from .60 to .78 for the perseverance of effort subscale, .73 to .79 for the consistency of interest subscale, and .73 to .83 for the Grit-S overall in the populations they studied.
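For readers unfamiliar with the statistic, Cronbach's alpha can be computed from item variances and the variance of respondents' total scores. The following is a minimal sketch of the standard formula using only the Python standard library; it is not the SPSS procedure used in this study, and the function name and sample data are illustrative assumptions.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a list of respondents, each a list of k item ratings.

    alpha = k / (k - 1) * (1 - sum of item variances / variance of total scores)
    """
    k = len(item_scores[0])
    if k < 2:
        raise ValueError("At least two items are required.")
    columns = list(zip(*item_scores))  # one tuple of ratings per item
    item_variance_sum = sum(variance(column) for column in columns)
    total_variance = variance([sum(row) for row in item_scores])
    return (k / (k - 1)) * (1 - item_variance_sum / total_variance)


# Hypothetical ratings for four respondents on a two-item scale.
sample = [[1, 2], [2, 1], [3, 4], [4, 3]]
print(round(cronbach_alpha(sample), 2))  # 0.75
```

Applying the same function to the four items of a single subscale, or to all eight items after reverse-scoring, would yield subscale and whole-scale values analogous to those reported above.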

Demographic Questions
The first part of the survey instrument asked students to report on demographic variables including their sex, major, age, and year in school. Table 1 summarizes the results of the demographic questions.
Of particular note is that female respondents made up 63.9% (n = 179) of the sample, compared with males, who made up only 36.1% (n = 101). Yet in the general population of Japanese undergraduate students, female students make up only 44% of the population whereas males make up 56% (Statista, 2020). While the sample used in this study may not be representative of Japanese universities in general, it closely matches the demographic situation at the second research site, which specializes in language majors. This is likely a result of the fact that approximately two-thirds of the survey respondents were language major undergraduates from the second research site (n = 189).
Language majors were overwhelmingly studying English, in either the English department (n = 90) or the International Communications department (n = 39), although a variety of other language majors were represented, including Spanish (n = 18), Chinese (n = 10), and Thai (n = 9). Science and technology majors were more evenly distributed among the various majors, with Information Systems (n = 14) and Applied Chemistry (n = 12) being the most prevalent. Majors other than languages or sciences included Design (n = 9), Project Management (n = 6), and Urban Planning (n = 5). The majority of respondents indicated that they were sophomores (n = 153), and the most commonly reported age was 20 years old (n = 99). In fact, more sophomores responded to the survey than third-year and fourth-year students combined (n = 127). Likely due to the overwhelming response by sophomores, respondents reporting an age of 19 or 20 (n = 193) outnumbered all other age groups (n = 87) by a ratio of more than 2:1.
In summary, the majority of respondents to this survey were second-year female English majors of approximately 20 years of age. Obviously, this is not representative of Japanese university demographics in general. Therefore, caution must be used when attempting to generalize the results of this study to the larger population of Japanese university students as a whole.

GPA
GPA was utilized as a controlling variable during the data analysis phase of the research. Figure 1 below provides a histogram of the GPA reported by the respondents. As can be seen from the figure, the GPA scores appear non-normally distributed. However, because regression analysis does not require either the independent or dependent variables to be normally distributed (Field, 2016), the non-normal distribution should not be an issue for this study.

Standardized Test Scores
Respondent scores on the TOEIC and paper-based TOEFL tests of English proficiency were used as the dependent variable in the data analysis phase of this study. Before test scores on the TOEIC and TOEFL could be analyzed, they needed to be standardized to a common measure. Ideally, the test scores should have been converted to z-scores by subtracting the population mean from each score and dividing by the population standard deviation (Field, 2016). However, although the population means for Japanese university students on both the TOEFL and TOEIC were publicly available online (ETS, 2018b; The Institute for International Business Communication, 2019), no information could be found for the population standard deviation on either test. Therefore, TOEFL test scores were converted to their TOEIC near-equivalents using an online universal conversion table (The Edge Learning Center, 2020). Figure 2 shows a histogram of the standardized test scores after the conversion. As with GPA, the standardized test scores appear non-normally distributed. However, as mentioned previously, regression analysis does not require either the independent or dependent variables to be normally distributed (Field, 2016).
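For reference, the z-score standardization that could not be carried out here takes the usual form shown below. The symbols are notational assumptions introduced for illustration: x_i is a respondent's raw test score, and μ and σ are the population mean and population standard deviation for that test.

```latex
z_i = \frac{x_i - \mu}{\sigma}
```

Because σ was unavailable for both tests, this transformation could not be applied, which is why the conversion table was used instead.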

Figure 2. Histogram of Standardized Test Scores
Note: TOEFL scores have been converted to their TOEIC near-equivalents.

Grit
Grit scores were utilized as an independent variable in the data analysis phase of this study. Regression analysis requires that independent variables be either ordinal with only two values or continuous in nature (Field, 2016). However, Likert-scale data with two or more ordinal choices can be treated as a continuous variable in regression analysis if the data follows a normal distribution (Sullivan & Artino, 2013). As can be seen from Figure 3, the grit scores roughly follow a normal distribution. The normality of the distribution was confirmed by a Kolmogorov-Smirnov test of normality, which was not significant, D(280) = .05, p = .06.

Data Analysis
Regression analysis was used to investigate what relationship grit has with the English proficiency scores of Japanese university students both before and after controlling for the effects of prior academic achievement as measured by GPA. In order for the results of a regression analysis to be valid, four assumptions must be met: the relationship between independent and dependent variables must be linear, the errors should be normally distributed, there should be no collinearity between predictors, and the spread of residuals should be constant (Field, 2016). Each of these assumptions was checked independently. A scatterplot matrix of grit, GPA, and standardized test scores confirmed the linear relationship between the variables (Figure 4). A P-P plot of residuals confirmed that the errors were roughly normally distributed (Figure 5). Multicollinearity was checked using a bivariate correlation between grit and GPA. Pearson correlation coefficients of greater than .8 are considered indications of problems with multicollinearity (Steyn, 2016). While grit and GPA were correlated to a statistically significant degree (p < .001), the Pearson correlation coefficient was much less than .8 (r = .24). Finally, the spread of residuals was checked with a plot of standardized residuals against predicted values (Figure 6). The plot did not display any particular pattern, which indicates that the assumption of homoscedasticity was likely met (Field, 2016).
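The multicollinearity check described above amounts to computing a Pearson correlation between the two predictors and comparing its magnitude with the .8 threshold. A minimal sketch follows; the function names and the sample values are illustrative assumptions and do not reproduce the study data.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("Both sequences must have the same length (at least 2).")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cross / sqrt(ss_x * ss_y)


def collinearity_flag(r, threshold=0.8):
    """True when |r| exceeds the threshold cited in the text (.8)."""
    return abs(r) > threshold


# Hypothetical grit and GPA values for five respondents.
grit = [2.5, 3.0, 3.5, 4.0, 4.5]
gpa = [2.8, 2.6, 3.4, 3.0, 3.6]
print(collinearity_flag(pearson_r(grit, gpa)))

# The correlation reported in this study falls well below the threshold.
print(collinearity_flag(0.24))  # False
```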

Research Question #1
Having ensured that all assumptions for running a regression analysis had been met, a simple linear regression was conducted to answer the first research question: what is the relationship between grit and the English language proficiency of Japanese university students, as measured by scores on the TOEIC or TOEFL standardized tests of English? The regression was significant, F(1, 278) = 13.54, p < .001, with R² = .05, indicating that approximately 5% of the variance in the proficiency test scores was explained by grit. Table 2 provides a linear model of predictors for the simple regression analysis that was conducted.
Note: 95% bias-corrected confidence intervals are reported in brackets. R² = .05 (p < .001).
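The reported F statistic and R² are linked by a standard identity, F = (R² / df_model) / ((1 − R²) / df_residual), which can be rearranged to recover R² from F. The short sketch below applies that identity to the values reported above; the function name is an illustrative assumption.

```python
def r_squared_from_f(f_value, df_model, df_residual):
    """Recover R-squared from an overall regression F statistic.

    Rearranges F = (R2 / df_model) / ((1 - R2) / df_residual).
    """
    return (f_value * df_model) / (f_value * df_model + df_residual)


# Values reported for the simple regression: F(1, 278) = 13.54.
print(round(r_squared_from_f(13.54, 1, 278), 2))  # 0.05
```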

Research Question #2
Having determined that grit by itself predicted proficiency test score outcomes, a multiple regression analysis was run to answer the second research question: what is the relationship between grit and the English language proficiency of Japanese university students after controlling for prior academic achievement as measured by GPA? In Step One of the regression analysis, GPA was input as a controlling variable. The model demonstrated a good fit, F(1, 278) = 18.317, p < .001, with GPA explaining 6% of the variance in the data. In Step Two of the regression analysis, grit was input as the independent variable. This model also demonstrated a good fit, F(1, 277) = 7.84, p < .01. Additionally, grit was shown to explain 2% of the variance in the data beyond GPA. The results of the multiple regression analysis are summarized in Table 3.
Note: 95% bias-corrected confidence intervals are reported in brackets.
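The two-step procedure above is a hierarchical regression, in which the contribution of the added predictor is usually tested with an F-change statistic. The sketch below shows the general formula under stated assumptions: the function names are illustrative, and the R² values in the example are hypothetical round numbers rather than the values from this study, since the published figures are rounded to two decimals.

```python
def residual_df(n, k_full):
    """Residual degrees of freedom for a model with k_full predictors and n cases."""
    return n - k_full - 1


def f_change(r2_reduced, r2_full, n, k_full, k_added):
    """F statistic for the increase in R-squared when k_added predictors are entered.

    F = ((R2_full - R2_reduced) / k_added) / ((1 - R2_full) / (n - k_full - 1))
    """
    return ((r2_full - r2_reduced) / k_added) / ((1 - r2_full) / residual_df(n, k_full))


# With 280 analyzed responses and two predictors (GPA and grit), the residual
# degrees of freedom match the F(1, 277) reported for Step Two.
print(residual_df(280, 2))  # 277

# Hypothetical illustration: R-squared rises from .10 to .15 when one predictor
# is added to a two-predictor model fitted to 103 cases.
print(round(f_change(0.10, 0.15, n=103, k_full=2, k_added=1), 2))  # 5.88
```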

Discussion
This study utilized a quasi-experimental ex post facto survey design to investigate the relationship between grit and foreign language learning in Japanese university students. Grit was found to predict the English standardized test scores of Japanese university students to a statistically significant degree even after controlling for prior academic achievement as measured by GPA. In this section, the implications of these findings as well as suggestions for future research will be discussed.

Implications
Before discussing this study's implications, it must be stressed again that the results from this study should be generalized cautiously. A disproportionate number of respondents in the survey sample were second-year female language majors, which is not representative of the population of Japanese university students as a whole. Nevertheless, this study seems to confirm prior research (Robins, 2019; Wei et al., 2019) that higher grit is predictive of better foreign language learning outcomes. Robins (2019), for example, found grit to be predictive of both the grades and retention of Spanish and Portuguese EFL learners enrolled in an online course, even after controlling for demographic variables such as age, gender, and highest level of education received by the respondents' parents. Meanwhile, Wei et al. (2019) found that grit predicted the scores of Chinese middle-school students on school-wide English exams even after controlling for demographic variables such as age and gender.
In the current study, students with higher grit scores scored higher on standardized tests of English proficiency to a statistically significant degree compared with students with lower scores.
However, it should be noted that the effect sizes in both the current study and prior research (Robins, 2019; Wei et al., 2019) are rather small. Robins (2019), for example, reported that grit explained only about 1% of the variance in grades for the EFL students who participated in the research. Wei et al. (2019), on the other hand, did not regress grit separately from other predictors.
With all predictors, such as enjoyment of English lessons and classroom environment, included in the model, 23% of the variance in the student EFL test scores was accounted for. R² values of less than .30 are traditionally interpreted to be small (Field, 2016), and it can be assumed that if grit were regressed separately from the other predictors, the effect size in Wei et al.'s (2019) study would be even smaller. This study found that grit explained 5% of the variance in English proficiency test scores by itself and 2% of the variance beyond GPA, both of which are traditionally interpreted as small effect sizes.
Given the results of these studies, it does appear that although grit has a statistically significant effect on language learning outcomes, this effect is small. This suggests that while grit may be useful as a predictor of foreign language learning outcomes, it should not be used in any high-stakes settings, such as an admissions criterion for foreign language programs.
However, educators may find grit scores useful in the early identification of students who are at risk of poor foreign language learning outcomes. Students with grit scores on the lower end of the scale may need extra support both during and outside of class in order for them to achieve targeted levels of proficiency. Of course, grit scores should not be the only determining factor in deciding when and how to support students but rather should be used alongside other assessments to provide a more holistic picture of student progress.
At this time, it is not recommended that foreign language educators attempt to foster grit in their students in any way. There is no evidence to date that suggests that grit is a malleable trait which is susceptible to interventions (Crede et al., 2017). Indeed, attempts in the United States to foster grit in students have instead resulted in controversy, as the attempts have disproportionately targeted students of color and those with disadvantaged socioeconomic backgrounds in an attempt to gloss over the structural inequalities which exist within the United States education system (Golden, 2017; Herold, 2015; Ris, 2015; Saltman, 2014; Socol, 2014; Stokas, 2015; Thomas, 2017). However, there is some evidence that grit is mediated by self-regulated learning strategies, which have been shown to be receptive to interventions (Wolters & Hussain, 2015). Therefore, foreign language teachers interested in fostering their students' long-term efforts may consider introducing and teaching self-regulated learning strategies in their classes.

Recommendations for Future Research
Although this study found a significant positive relationship between grit and the language proficiency of Japanese university students, research into the role of grit in the foreign language learning process is still in its infancy. Therefore, more research is required before any definitive conclusions can be drawn. There are several aspects of grit's relationship with foreign language learning which need to be clarified. First, there is some debate in the field of grit as to whether grit is a domain-general personality trait which is applied to all aspects of a person's life or a domain-specific trait which varies depending on the context. Although the bulk of grit studies have utilized a domain-general approach to researching grit, there is a growing body of work that has investigated grit from a domain-specific perspective (e.g., Eskreis-Winkler et al., 2014; Mondak, 2020; Morell, 2020; Schmidt et al., 2019). Teimouri et al. (2020) suggest that investigations of grit's relationship to foreign language learning should be conducted from a domain-specific perspective. Future research will need to clarify which of these approaches is more useful when investigating grit. It may in fact be the case that the choice of a domain-specific or domain-general perspective depends on the research question being examined.
Second, future research into grit's role in foreign language learning will need to clarify more specifically how grit interacts with the language learning process. While the bulk of grit research has been cross-sectional in nature (Crede et al., 2017), more longitudinal studies that utilize mixed-methods approaches will likely be required to explicate grit's specific effects on foreign language learning. Clark (2016), for example, used a mixed-methods design to investigate grit's role in career success and found that her interviews with participants helped explain the quantitative results of her study, which did not find a significant relationship between grit and career outcomes. Therefore, future research should consider longitudinal, mixed-methods designs in order to more robustly explore the relationship between grit and foreign language learning and determine the pathways by which grit affects the language learning process.
In summary, this study investigated the relationship between grit and EFL proficiency of Japanese university students. An anonymous online self-report survey was administered at two Japanese university research sites. Data including demographic information about respondents, GPA, standardized test of English scores (TOEFL/TOEIC), and grit scores were collected. In total, 280 responses were analyzed using regression analysis. The findings indicate that not only was grit predictive of English standardized test scores, it also explained 2% of variance in the data beyond GPA. These findings align with prior research into grit's relationship with foreign language learning outcomes (Robins, 2019; Wei et al., 2019) in that although grit is predictive of outcomes to a statistically significant degree, the effect size tends to be rather small. This implies that grit measures are probably only useful as an addition to other assessments in holistically identifying students at risk of poor foreign language learning outcomes. Much remains unknown about the relationship between grit and foreign language learning, as investigation into this area is still in its early stages. The bulk of grit research thus far has concentrated on quantitative cross-sectional designs (Crede et al., 2017).
Therefore, future research into grit's role in the foreign language learning process may benefit from longitudinal and mixed-methods inquiries designed to shed more light on the perhaps subtle ways the non-cognitive trait of grit influences language learning.