CORPUS-BASED APPROACH TO SPECIALIZED TRANSLATION TRAINING: SKETCH ENGINE TOOLS AND CQL QUERIES
DOI:
https://doi.org/10.32782/2522-4077-2025-214.1-16
Keywords:
corpus-based research, teaching specialised texts translation, Sketch Engine, professional text, professional vocabulary, CQL universal query language
Abstract
Modern applied linguistics is hard to imagine without language corpora. They serve both as a powerful and efficient research tool and as a virtually unlimited source of linguistic data for the many disciplines that draw on language resources. Automating the compilation and analysis of text corpora creates new opportunities not only for philologists but also for specialists who use such data for practical purposes. Corpus-based methods play an important role in improving language teaching, and translation teaching in particular, because they allow accurate and systematic selection of the specialised language material needed to master vocabulary, to learn how key linguistic units are used and translated, and to identify current language trends in a particular field. Among the many platforms and software tools for corpus management and text analysis, Sketch Engine stands out for its power and for the size and versatility of its text collections: it allows users not only to analyse existing corpora but also to create their own, including multilingual ones; to research vocabulary, phrases, terminology and translation equivalents; and to generate learning materials using CQL queries and built-in linguistic functions. Sketch Engine is particularly effective in teaching professional texts and their translation: it supports research into specialised texts, the identification of key terminology and typical phrases, the analysis of translation approaches and the preparation of teaching materials for translation students, while the CQL query language makes work with specialised texts more focused and flexible. The article explores a comprehensive approach to using Sketch Engine in teaching the translation of professional texts, focusing on the practical aspects of working with corpora.
The author offers systematic methods that cover both basic skills in working with corpus data (creating specialised corpora, frequency analysis, terminology research) and advanced techniques for analysing professional vocabulary and specific grammatical structures. Particular emphasis is placed on applying these tools in the educational process, so that translation students can master the key aspects of specialised translation effectively.
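The frequency analysis, terminology research and CQL querying mentioned above can be illustrated with a minimal sketch. It is not taken from the article: the toy word lists, the function names `per_million`, `keyness` and `lemma_query`, and the smoothing value are all assumptions made for illustration. The score is a smoothed ratio of per-million frequencies, in the spirit of the "simple maths" keyness approach associated with Sketch Engine, and the generated strings follow the general CQL pattern `[attribute="value"]`. The `tag` values depend on the tagset of the corpus in use, so `N.*` is only a plausible choice for an English corpus.

```python
from collections import Counter
from typing import Optional


def per_million(count: int, corpus_size: int) -> float:
    """Normalise a raw frequency to occurrences per million tokens."""
    return count * 1_000_000 / corpus_size


def keyness(focus_count: int, focus_size: int,
            ref_count: int, ref_size: int,
            smoothing: float = 1.0) -> float:
    """Smoothed ratio of per-million frequencies (focus vs. reference corpus)."""
    focus_fpm = per_million(focus_count, focus_size)
    ref_fpm = per_million(ref_count, ref_size)
    return (focus_fpm + smoothing) / (ref_fpm + smoothing)


def lemma_query(lemma: str, next_tag: Optional[str] = None) -> str:
    """Build a CQL pattern for a lemma, optionally followed by a POS-tag regex."""
    query = f'[lemma="{lemma}"]'
    if next_tag is not None:
        query += f' [tag="{next_tag}"]'
    return query


# Hypothetical lemmatised token lists standing in for a specialised (focus)
# corpus and a general (reference) corpus.
focus = "lessee pay rent lessor terminate lease lessee breach lease".split()
reference = "people pay rent and people like the city and the park".split()

focus_freq, ref_freq = Counter(focus), Counter(reference)

# Score every lemma of the focus corpus; lemmas absent from the reference
# corpus get a reference count of 0 (Counter returns 0 for missing keys).
scores = {
    lemma: keyness(count, len(focus), ref_freq[lemma], len(reference))
    for lemma, count in focus_freq.items()
}

# Highest score first; ties are broken alphabetically for a stable order.
top_terms = sorted(scores, key=lambda lemma: (-scores[lemma], lemma))[:3]

# One CQL pattern per candidate term: the lemma followed by any noun tag.
queries = [lemma_query(lemma, "N.*") for lemma in top_terms]

for lemma, query in zip(top_terms, queries):
    print(f"{lemma}\t{scores[lemma]:.1f}\t{query}")
```

In a classroom setting, the printed patterns could be pasted into a concordance search to show students how each candidate term behaves in context. With real data, the counts would come from the corpus tool's word lists.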