A Comparative Study of Corpus-based and Corpus-driven Approaches
Keywords:
Corpus-based approach, corpus-driven approach, advantages, disadvantages, similarities
Abstract
Using the comparative method, this article presents a feature-by-feature comparison of the corpus-based and corpus-driven approaches in corpus linguistics. The two approaches are similar in that both use a corpus as the primary tool for collecting and analysing data. They differ in four respects: top-down versus bottom-up procedure, selection and sampling methods, opposing views on corpus annotation, and paradigmatic claims. The most significant advantages of the corpus-based approach are the value added by annotation and its flexibility in corpus size; its disadvantages include subjective or incorrect annotation and overreliance on intuition. The primary advantages of the corpus-driven approach are its objective perspective, novel methodology, and full exploitation of corpus evidence, while its weaknesses are the difficulty of collecting meaningful data and of formulating a theory based on the corpus, and its refusal to annotate the corpus.
License
This work is licensed under a Creative Commons Attribution 4.0 International License.
Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under a Creative Commons Attribution (CC-BY) 4.0 License that allows others to share the work with an acknowledgment of the work’s authorship and initial publication in this journal.