Lexicogrammatical and Semantic Development in Academic Writing of EFL Learners: A Systemic Functional Approach

Many Japanese university students' English writing skills are insufficient despite at least six years of English language instruction before entering university. Several researchers have explored this topic. A corpus-based approach, for example, has improved the understanding of the writing skills of learners of English. In Japan, recent developments in corpus linguistics have enabled instructors and researchers to analyze the linguistic features of English written by Japanese EFL learners. For example, Mizusawa (2015) referred to the Japanese EFL Learner (JEFLL) Corpus, a collection of junior high and high school students' English essays, to investigate linguistic features such as lexical density, grammatical intricacy, and semantic variations framed by systemic functional linguistics (SFL). This paper aimed to examine English academic writings by 38 Japanese university students. Their writings were analyzed in terms of lexical density and semantic features within the SFL frameworks. The results highlighted a critical limitation in Japanese university students' writing skills and suggested the need to teach students the lexicogrammatical differences between written and spoken modes of the English language.

Many instructors recognize the importance of academic writing. The curricula in many Japanese universities offer English academic writing courses. Most courses, however, are not compulsory. Thus, Japanese students do not have enough opportunities to improve their English writing skills. According to a Test of English as a Foreign Language (TOEFL) score data summary of 2015, the writing scores of Japanese TOEFL examinees ranked worst among the Asian countries (ETS, 2016), demonstrating Japanese EFL students' low level of English proficiency.
Developments in corpus linguistics have improved the understanding of the writing skills of Japanese EFL students. Examples include the Japanese EFL Learner (JEFLL) Corpus, a seven-million-word collection of written language by Japanese junior high school and high school students developed by Tono (2007), and the International Corpus Network of Asian Learners of English (ICNALE), an international learner corpus developed by Ishikawa (2013). The ICNALE covers more than 10,000 topic-controlled speeches and essays by university students in 10 Asian countries: Japan, China, Indonesia, South Korea, Thailand, and Taiwan, where English is spoken as a foreign language, and the Philippines, Singapore, Pakistan, and Hong Kong, where English is spoken as a second language. The ICNALE also covers speeches and writings by English native speakers (NSs). Further details of the ICNALE are provided in the 'Methods' section.
Corpus-based research on Japanese EFL students has focused mainly on morphology and syntax, and few studies have addressed the lexicogrammatical and semantic development of their writing skills. Therefore, this research examines the lexicogrammatical and semantic features of Japanese university EFL learners' academic writings and identifies problems in their writings.

Literature Review
Several researchers have explored academic writings from various perspectives, such as a genre approach, lexical and syntactic improvements, and interdisciplinary work. For example, genre approaches towards academic writings have been developed by Swales (1990), Martin (1992), and Hyland (2004). Several works on lexical and syntactic developments have been conducted (Mazgutova & Kormos, 2015; Dulay & Burt, 1973; Krashen, 1977). Mazgutova and Kormos (2015) examined lexical and syntactic changes before and after academic writing instruction. Some studies (Dulay & Burt, 1973; Krashen, 1977) have revealed orders and sequences in morphology and syntax in the field of second language acquisition. Likewise, Japanese EFL learners' academic writings have been researched extensively; for example, genre approaches have been applied to academic writing by Japanese EFL learners (Watanabe, 2016; Kobayashi, 2003). Hirose (2003) identified similarities and differences between English and Japanese organizational patterns in argumentative essays written by Japanese EFL students. McKinley (2013) examined the critical thinking of Japanese EFL learners in their academic writing.
In corpus-based research on Japanese EFL learners' writing, some researchers have focused on error analysis of essays written by Japanese EFL learners (e.g., Okada, 2005; Yamaguchi & Usami, 2017). Regarding the use of particular vocabulary, McCrostie (2008) focused on the extensive use of first-person pronouns in argumentative essays written by Japanese university students compared with international students from Sweden, France, and the United States.
Lexicogrammatical and semantic developments, however, have not been fully explored. Johansson (2009) reported that the lexical density of essays written by NSs of English increases as they age and use different written and spoken language styles. Japanese EFL learners, however, hardly employ written language in their writings (Mizusawa, 2015). Many Japanese students do not recognize differences in style between spoken and written language. Mizusawa (2015) employed the JEFLL Corpus and identified features of writings by Japanese junior high school and high school students. The result demonstrates that the students used many features of spoken language in their essays. Mizusawa (2015) further examined Japanese EFL university students' writings with the same topic and methods as the JEFLL Corpus and found features similar to those in the previous research. However, the data size in the study was limited. Therefore, this research expanded the data size by referring to the ICNALE to obtain stable results and examined features of academic writings by Japanese EFL university students.

Methods
This research examined 24 essays written by Japanese EFL learners at the tertiary level, adopting frameworks within systemic functional linguistics (SFL). SFL was developed by M. A. K. Halliday (1978, 1985, 1994, 1999, 2007) to capture language as a meaning-making resource in a social context. This section describes the data and analytical frameworks.

Data Collection
The data used in this research were retrieved from the ICNALE-written data. The ICNALE comprises 1.3 million words from essays written by 2,600 university students in 10 Asian countries and 200 English NSs (Ishikawa, 2013). Ishikawa (2014) released the ICNALE-spoken data in 2014. Factors likely to impact the language in the ICNALE were controlled when it was collected (Ishikawa, 2013). All participants received the same instructions. For the ICNALE-written data, the participants were asked to write on two topics: Topic A, 'It is important for college students to have a part-time job,' and Topic B, 'Smoking should be completely banned at all restaurants in the country.' The following seven instructions were presented to the participants before they started the writing task.
1. Clarify your opinions and provide reasons for and examples of them.
2. You have 20 to 40 minutes to write each essay. Thus, you must complete two essays in 40 to 80 minutes.
3. You must use MS Word or a similar word processor.
4. Do not use dictionaries or other reference tools.
5. Do not plagiarise.
6. The length of each essay should be from 200 to 300 words (not letters).
Essays that are too short or too long are not accepted. Check the length of your essay by using the word count function of MS Word.
(Ishikawa, 2013)
The participants had not prepared for these topics beforehand. Data were collected on the participants' background, for example, the number of years of English study, grade, university major, academic area, sex, age, and motivation.
Topic B was chosen because it was expected to elicit a wider range of opinions than Topic A. The data comprised 24 written essays on Topic B: 12 randomly selected essays at the Common European Framework of Reference for Languages (CEFR) A2 level and 12 randomly selected essays at the CEFR B2+ level. The total word count for the A2 essays was 2,792, and that for the B2+ essays was 2,681. CEFR level A2 is an elementary level, and B2+ is an upper-intermediate level.

Data Analysis
For the analysis, this research employed two frameworks used in SFL: lexical density to identify features of written and spoken language and rhetorical unit analysis (RUA) for semantic features.
Lexicogrammatical Features. Two indicators distinguish written from spoken language: lexical density, a ratio of content words in a clause, and grammatical intricacy, which builds up elaborate clause complexes using a limited set of conjunctions, such as 'and' and 'or'. The higher the lexical density, the more a clause tends to be regarded as written language. By contrast, grammatical intricacy is indicated by paratactic and hypotactic relations. Clauses connected with paratactic and hypotactic relations in a sentence are regarded as spoken language (Halliday, 1994). Table 1 presents examples of lexical density and grammatical intricacy.
Table 1 Lexical Density and Grammatical Intricacy
(a)【A clause with high lexical density】
Advances in technology are speeding up the writing of business programs.
(b)【Clauses with high grammatical intricacy】
Because technology is getting better, people are able to write business programs faster.
(Halliday & Matthiessen, 2014, p. 727)
In Table 1, example (a), with high lexical density, comprises one clause in a sentence, whereas example (b), with high grammatical intricacy, comprises two clauses connected by a hypotactic conjunction. In example (a), the hypotactic clause of example (b) is nominalized as the subject 'Advances in technology'. The main clause of example (b) corresponds to the verb and object in example (a), 'are speeding up the writing of business programs'. In academic writing, texts with high lexical density are highly evaluated (Schleppegrell, 2004). Thus, nominalization is one of the essential features of advanced academic writing.
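The contrast between examples (a) and (b) in Table 1 can be illustrated with a minimal Python sketch that counts content words per clause. This is not the procedure used in this research: the function-word list is a small hypothetical sample chosen only to cover the two example sentences, and the number of clauses is supplied by the analyst rather than detected automatically.

```python
import re

# Hypothetical, deliberately small list of function words (an assumption
# for illustration); a real analysis would use a fuller list or a tagger.
FUNCTION_WORDS = {
    "a", "an", "the", "in", "on", "at", "of", "to", "up", "for", "with",
    "by", "and", "or", "but", "because", "is", "are", "was", "were",
    "be", "been", "it", "they", "we", "i", "you", "that", "this",
}

def content_words(text):
    """Return the tokens of `text` that are not listed as function words."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return [t for t in tokens if t not in FUNCTION_WORDS]

def density_per_clause(text, n_clauses):
    """Content words per clause; the clause count is given by the analyst."""
    return len(content_words(text)) / n_clauses

example_a = "Advances in technology are speeding up the writing of business programs."
example_b = ("Because technology is getting better, "
             "people are able to write business programs faster.")

density_a = density_per_clause(example_a, n_clauses=1)  # one clause
density_b = density_per_clause(example_b, n_clauses=2)  # two clauses
print(density_a, density_b)
```

Under these assumptions, example (a) packs its content words into a single clause, whereas example (b) spreads them over two, so the per-clause figure for (a) is higher.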
As noted above, Johansson (2009) examined differences in lexical density by the age of English NSs. Lexical density tends to increase as individuals age. In addition, at the same age, the lexical density of written language is much higher than that of spoken language. Table 2 summarises the lexical density of NSs and that of Japanese EFL learners as reported by Mizusawa (2015). NSs have a much higher lexical density than Japanese EFL learners do, and no difference is observed in the lexical density of Japanese EFL learners among the different ages. The lexical density of Japanese EFL learners is close to that of the spoken language of 10-year-old NSs. To examine the written language of Japanese EFL learners, this research employed lexical density and grammatical intricacy. The next section, 'Semantic Features', describes the analytical framework used to examine the semantic features of academic writing by Japanese university students learning EFL.

Semantic Features.
RUA is an analytical framework within SFL to specify the semantic features of texts. Cloran (1994, 2010) developed RUA by referring to both chronotopes (chronos = time; topos = space) by Bakhtin (1981) and message semantics by Hasan (1996). According to Cloran (1994, 2010), decontextualization of a text from here-and-now is related to semantic features. Highly decontextualized texts demonstrate abstract cognition by a writer, whereas highly contextualized texts demonstrate concrete cognition by a writer. RUA has been applied to discourse analysis such as English language NS conversations between mothers and their babies (Cloran, 1994, 1999, 2010), English language medical communication (Kealley, 2007), and English language non-NS writings (Mizusawa, 2015). Mizusawa (2015) analyzed the work of junior high school and high school students; this study examined semantic features in the writing of first-year university students.
Two steps are required to identify a rhetorical unit (RU): identifying the central entity (CE) and identifying the event orientation (EO). Cloran (1994, 2010) classified 11 categories from contextualization to decontextualization following the combination of CE and EO. EO has two categories: proposal and proposition. In the act of speaking (writing), a speaker (writer) assumes a particular speech role. For example, in putting forth a question, a speaker's role is to seek information while the listener acts as a supplier of the information. These classifications are identified with the combination of 'commodity exchanged' and 'role in exchange'. When the commodity exchanged is 'goods & services', the category is called 'proposal' (an offer or command, depending on its role in exchange). Similarly, when the commodity exchanged is information, the category is called 'proposition' (a statement or question, depending on its role in exchange). Children acquiring their first language tend to acquire proposal after proposition (Halliday, 1994). Table 3 summarises the details of these interactions, called speech functions in SFL.
Table 3 Speech Function in Systemic Functional Linguistics
Role in Exchange    Commodity Exchanged
                    Goods & Services    Information
Giving              Offer               Statement
Demanding           Command             Question
                    Proposal            Proposition
(Halliday, 1994, p.69)
Hasan (1985) conceptualized the role of language in the social process as a continuum between language as ancillary to a given task and language that constitutes social activity. Thus, the RUs closest to the ancillary end of this continuum are those in which the rhetorical configuration is such that (a) the CEs are the interactants and (b) the EOs occur concurrently with the moment of speaking or occur immediately as a consequence of the message (e.g., Action). Considering the RUs in this way makes it possible to postulate them as relevant categories in the realization of the role of language in social processes. Figure 1 presents the relationship between the RU classes and this contextual variable. In Figure 1, Action and Commentary are the most ancillary since they are relevant to here-and-now. Reflection and Observation have the same CEs; that is, participants are somebody or something here in the speech situation. However, unlike Action and Commentary, the events are not now but always. This value of the event moves these classes along the continuum. The other classes (Report, Recount, Plan, and Prediction) are intermediate between ancillary and constitutive and may involve a here or a now value of either the CE (Recount, Plan, Prediction) or the EO (Report). The classes at the most constitutive end involve entirely imaginary or timeless events (Conjecture) or CEs that are not situationally present or are generalized entities (Account and Generalisation). These RUs function to identify semantic features. The RUs under ancillary tend to be employed in spoken language, whereas those under constitutive tend to be employed in written language.
When RUs are applied to excerpts from the ICNALE, sentences categorized under Generalisation include 'A restaurant is a place where people can eat dishes and feel comfortable,' and 'Smoking is bad for health, not just for the smoker but also the people around the smoker.' By contrast, a sentence categorized under Commentary is 'I really hate the smell of smoke.'
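The grouping of RU classes along the ancillary-to-constitutive continuum, and the share of each group in a set of labelled sentences, can be sketched in Python as follows. The four band names are hypothetical labels introduced here for illustration; the assignment of the 11 classes follows the description of Figure 1 above, and the RU labels are entered by hand, as in a manual analysis.

```python
from collections import Counter

# Band names are hypothetical labels (an assumption for illustration);
# the class-to-band assignment follows the description of Figure 1.
RU_BANDS = {
    "Action": "ancillary",
    "Commentary": "ancillary",
    "Reflection": "near-ancillary",
    "Observation": "near-ancillary",
    "Report": "intermediate",
    "Recount": "intermediate",
    "Plan": "intermediate",
    "Prediction": "intermediate",
    "Conjecture": "constitutive",
    "Account": "constitutive",
    "Generalisation": "constitutive",
}

def band_shares(ru_labels):
    """Return the proportion of labelled units that fall in each band."""
    counts = Counter(RU_BANDS[label] for label in ru_labels)
    total = len(ru_labels)
    return {band: n / total for band, n in counts.items()}

# The three ICNALE excerpts quoted above, with their manually assigned RUs.
labelled = [
    ("A restaurant is a place where people can eat dishes and feel comfortable.",
     "Generalisation"),
    ("Smoking is bad for health, not just for the smoker but also the people "
     "around the smoker.", "Generalisation"),
    ("I really hate the smell of smoke.", "Commentary"),
]

shares = band_shares([ru for _, ru in labelled])
print(shares)
```

For these three excerpts, two of the three units fall in the constitutive band and one in the ancillary band; the same tally over a full essay would yield ratios of the kind reported for each RU class.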

Results
The results highlight the limitations of Japanese university students' writing skills and suggest the importance of teaching them lexicogrammatical and semantic variations of the English language.

Lexicogrammatical Features of English Academic Writing
The lexical density of the essays written by Japanese university students was similar to that of essays written by Japanese junior high school and high school students. In addition, no difference in lexical density was observed between the CEFR levels. The lexical density of the essays classified as A2 was 48.24; that of the essays classified as B2+ was 48.00 on average. Table 4 presents the figures after adding this result to the information in Table 2.
Table 4 Lexical Density of Essays by Japanese University Students Learning EFL
The lexical density figures in Table 4 differ little from those of the Japanese junior high and high school students assessed by Mizusawa (2015). Unlike that of NSs, the lexical density of Japanese EFL learners does not improve with age. The next section presents the results for semantic features.

Semantic Features of English Academic Writing
The semantic features of the essays by Japanese university students learning EFL were distinctive. Comparing the semantic features of the essays written by the Japanese junior high school and high school students with those written by Japanese university students revealed that the latter contained many features in the categories under the constitutive heading, which is considered a characteristic of written language. The ratio of Generalisation in the essays by the A2 Japanese university students learning EFL was 38%, and that in the essays by the B2+ Japanese university students learning EFL was 25% (Table 5).
Table 5 Ratio of Rhetorical Units in the Writing of Japanese University Students Learning EFL
In general, the categories under constitutive were frequently observed in the essays.

Lexical Density of English Academic Writing by Japanese EFL Learners
The lexical density of the essays by the Japanese university students learning EFL was almost the same as that of the essays by the Japanese junior high and high school students learning EFL. This finding implies that little progress in lexical density is made from junior high school through high school to university, which suggests that Japanese students' English writing skills do not improve much over time. A high frequency of clause-linking conjunctions in Japanese university students' writings was also confirmed. Specifically, the use of subordinate clauses starting with 'because' and of the coordinating conjunctions 'and' and 'but' was a common feature. In addition, sentences often started with a subordinate 'because' clause without a main clause, or with words such as 'and' or 'but,' rather than with conjunctive adverbs, such as 'in addition,' 'moreover' and 'however,' which demonstrate the thread of a logical structure. The use of coordinating conjunctions such as 'and' and 'but,' as noted by Halliday (1994) and Schleppegrell (2004), is a characteristic of spoken language. When the lexical density in Japanese EFL learners' writing was compared with that of the NSs' writing, a significant difference was identified.
The Japanese university students have studied for university entrance exams; they are, thus, likely to have read more English than students in the third year of high school. However, they have had little opportunity to learn the difference between written and spoken language, which is likely to account for the degree of lexical density observed. If Japanese university students study abroad, where assessment depends heavily on essays, earning satisfactory grades will be difficult without knowing the difference between spoken and written language. Moreover, university students are often required to read textbooks and academic journals written in English. These materials tend to be written in a style with high lexical density (Schleppegrell, 2004). Therefore, students unfamiliar with features that increase lexical density, such as nominalization, may gain an inadequate understanding of course content.
Other lexical features have been identified in the Japanese EFL learners' writing. Schleppegrell (2004) proposed three preferable features of academic writing:
1) Avoidance of emotional reactive feelings and attitudes
2) Adoption of the impersonality of a third-person declarative sentence
3) Presentation of writers as objective experts
In academic writing, declarative sentences are preferred to interrogative and exclamatory sentences, and sentences without emotional expressions are also preferred (Schleppegrell, 2004). Notably, Japanese EFL learners were unable to employ these preferred features of academic writing. The data included a sentence with an exclamation mark, 'How can I enjoy the food!' In addition, almost all essays began with the pronoun 'I'. The frequent use of 'I' in Japanese EFL learners' writing has been highlighted by Natsukari (2012) and McCrostie (2008). Schleppegrell (2004) argues that the use of first-person pronouns is discouraged in academic writing. This feature was not observed in the essays by NSs and English as a second language (ESL) learners; for example, those by Hong Kong university students learning ESL began with the following sentences:
HK1) In a small and non-ventilation restaurant, there is full of disgusting smoke when you are having a meal.
HK2) Smoking in restaurants is a crime against humanity.
HK3) Hong Kong Government had introduced a law banning smoking in all restaurants.
As these examples indicate, most essays do not use first-person pronouns. Furthermore, university students who were native English speakers wrote essays by using a much more elaborate structure in the first sentence:
NS1) People who object to smoking in restaurants are closet totalitarians, or they are unthinking sheep only too happy to bend a knee before any other garlic-breathed madman waving a manifesto about his unkempt hair.
Thus, sentence length and elaborate structure are distinguishing features of the essays by native English-speaking university students, although a few emotional and colloquial words were identified in some of them.

Semantic Features of Essays by Japanese University Students Learning EFL
The essays analyzed in this research displayed several features in the categories under the constitutive heading. Compared with the results from the Japanese junior high school and high school students (Mizusawa, 2015), a wider range of 'messages' was used. This result could be interpreted as a semantic development in the essays of Japanese university students learning EFL, since the figure for the RU Generalisation in the essays written by the students at the B2+ level, who are considered to have high English ability, was 25.19%. However, this is not an overall characteristic; it may be that A2 writers tend to adopt a wide range of messages.

Conclusion
This research applied lexicogrammatical and semantic perspectives to investigate the linguistic features of English written by Japanese university students. From the lexicogrammatical viewpoint, lexical density was measured as an indicator of written language, and the vocabulary and sentence types preferred in English academic writing were examined. The results show that the Japanese EFL learners have not improved in lexical density despite their six years of English instruction in junior high school and high school. This result indicates that they do not fully understand language style, that is, the differences between spoken and written language. The lexical density of the Japanese students' essays was dramatically lower than that of essays by NSs of the same age. In addition, features of spoken language, such as emotional expressions, first-person pronouns, and a limited set of conjunctions, were often identified in the data.
From the semantic viewpoint, a relatively wide range of RU messages was identified. In particular, the university students frequently employed the messages categorized under constitutive. They could write messages that are distant from here-and-now. The message of Generalisation, that is, an abstraction, is said to be farthest from here-and-now and to be acquired at a later stage as language development progresses (Cloran, 2010). This research demonstrates that Japanese university students used more RUs characteristic of written language than junior high school and high school students.