COMPILATION OF A CORE VOCABULARY FOR SPECIALISED PROFESSIONAL TEXTS USING THE SKETCH ENGINE SOFTWARE FUNCTIONALITIES

Authors

Tarnavska M. M.

DOI:

https://doi.org/10.32782/2522-4077-2025-212-4

Keywords:

applied linguistics, corpus, corpus-based research, professional text, core vocabulary, Sketch Engine, CQL universal query language

Abstract

Corpora are among the most powerful tools of applied linguistics and are created and used in many fields of human activity. Automating the selection, compilation and analysis of text corpora of virtually unlimited size opens new opportunities not only for researchers in philology but also for experts who rely on corpus data to complete practical tasks. Corpus-based research therefore has great potential to make language teaching, and translation teaching in particular, more effective, because it allows linguistic material from a highly specialised field to be selected more accurately and efficiently. Future translators need such material to master the lexical minimum of professional texts, to learn how commonly used units function and are translated, and to analyse existing and emerging linguistic trends in the field. Sketch Engine, one of the best-known software products for compiling and managing corpora, is best suited to the tasks that arise when working with professional texts in translation training. It allows users not only to analyse the corpora available on the platform but also to create their own, including multilingual ones. Such corpora support quick, high-quality analysis of industry-specific texts; selection of active vocabulary, significant terminology and typical collocations; analysis of translation peculiarities and of difficulties in rendering certain units; and creation of glossaries and exercises that develop and improve the translation skills of future translators. A thorough analysis of all the functions of the Sketch Engine corpus manager can significantly increase the efficiency of methodological work with professional texts, and the ability to write search queries in CQL can improve the accuracy of the linguistic results obtained.
The study describes the main capabilities and methods for searching, analysing and selecting typical lexical material from professional texts, using as an example a corpus of English-language legal texts that accompany IT products, namely licence agreements and contracts.
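
To illustrate what selecting typical lexical material can involve, the following minimal Python sketch ranks words by frequency and by range (the number of texts in which they occur). It is not the Sketch Engine workflow described in the study: the function name `core_vocabulary`, the thresholds, the stopword list and the toy sentences are all assumptions made for illustration, and a real workflow would normally work with lemmatised, tagged corpus data rather than raw word forms.

```python
import re
from collections import Counter


def core_vocabulary(documents, min_freq=2, min_range=2, stopwords=frozenset()):
    """Return (word, frequency, range) tuples for words meeting both thresholds.

    frequency: total occurrences of the word across all documents
    range:     number of documents in which the word occurs at least once
    """
    freq = Counter()
    doc_range = Counter()
    for text in documents:
        # Naive tokenisation: lower-cased alphabetic words, hyphenated forms kept whole.
        tokens = [t for t in re.findall(r"[a-z]+(?:-[a-z]+)*", text.lower())
                  if t not in stopwords]
        freq.update(tokens)
        doc_range.update(set(tokens))  # count each word once per document
    selected = [(w, freq[w], doc_range[w]) for w in freq
                if freq[w] >= min_freq and doc_range[w] >= min_range]
    # Sort by descending frequency, then alphabetically for a stable order.
    return sorted(selected, key=lambda item: (-item[1], item[0]))


# Hypothetical toy corpus, invented for illustration only.
toy_corpus = [
    "The licensee shall not copy the software. The licensor grants a licence.",
    "The licensor grants the licensee a non-exclusive licence to use the software.",
    "This agreement terminates if the licensee breaches the licence.",
]
# Hypothetical stopword list; a real one would be much longer.
toy_stopwords = frozenset({"the", "a", "to", "if", "this", "not", "shall"})

for word, frequency, doc_count in core_vocabulary(toy_corpus, stopwords=toy_stopwords):
    print(word, frequency, doc_count)
```

Combining a frequency threshold with a range threshold keeps words that are both common and spread across texts, so a term repeated many times in a single contract does not enter the list on frequency alone.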

Published

2025-04-10

How to Cite

Tarnavska, M. M. (2025). COMPILATION OF A CORE VOCABULARY FOR SPECIALISED PROFESSIONAL TEXTS USING THE SKETCH ENGINE SOFTWARE FUNCTIONALITIES. Наукові записки. Серія: Філологічні науки, (212), 31–37. https://doi.org/10.32782/2522-4077-2025-212-4