ON CHOOSING FREE TOOLS FOR TEACHING DATA MINING COURSES IN HIGHER EDUCATION INSTITUTIONS
DOI:
https://doi.org/10.32782/cusu-pmtp-2024-2-12
Keywords:
data mining; analytical packages; free software; cluster analysis; training of specialists
Abstract
Data mining (DM) is one of the most important areas in the development of information technologies, so DM-related disciplines are included in the educational standard for specialists in the field of computer sciences. The choice of training software nevertheless remains a relevant question. On the one hand, the tools commonly used in practice by enterprises, large IT companies, and agencies specializing in data analysis are proprietary and quite expensive. On the other hand, future specialists need to acquire knowledge of and skills in the basic methods and algorithms of data analysis, the specifics of preparing data for different types of analysis, the formats for presenting results, and the ability to interpret those results. For educational purposes, free tools are therefore quite acceptable, provided that their functionality complies with the objectives of the discipline. The article examines three types of software that can be used for data analysis in teaching: spreadsheet programs, specialized packages, and programming languages. Some of these tools are compared, and examples of using SPSS, RapidMiner, KNIME, Orange, JASP, and R for cluster analysis are given. The results of the pedagogical experiment show that the quality of learning does not depend on which software was used while studying the discipline. Therefore, when choosing software, it is advisable to evaluate its cost and functionality (coverage of data mining methods, visualization tools, quality of results, etc.). The article concludes that free software can be used if its functionality matches the learning objectives.
References
Стандарт вищої освіти України першого (бакалаврського) рівня ступеня «бакалавр» за галуззю знань 12 «Інформаційні технології» спеціальністю 122 «Комп’ютерні науки» : МОН України, 2019. URL: https://mon.gov.ua/storage/app/media/vishcha-osvita/zatverdzeni%20standarty/2019/07/12/122-kompyut.nauk.bakalavr-1.pdf.
Chakrabarty P., Halder K., Rao P. Tools and Methods of Educational Data Mining: A Review. Easy Chair Preprint №9763, 2023. URL: https://easychair.org/publications/preprint_download/zQDg.
Dol S. M., Jawandhiya P. M. A Review of Data Mining in Education Sector. Journal of Engineering Education Transformations. 2023. no. 36 (Special Issue 2), pp. 13–22. https://doi.org/10.16920/jeet/2023/v36is2/23003.
Shrivastava A., Jain J. K., Chauhan D. Literature Review on Tools & Applications of Data Mining. International Journal of Computer Sciences and Engineering, 2023. vol. 11, Issue 4, pp. 46–54. https://doi.org/10.26438/ijcse/v11i4.4654. URL: https://www.ijcseonline.org/pdf_paper_view.php?paper_id=5560&8-IJCSE-09093.pdf.
Altalhi A. H., Luna J. M., Vallejo M. A., Ventura S. Evaluation and comparison of open source software suites for data mining and knowledge discovery. WIREs Data Mining and Knowledge Discovery, 2017. Vol. 7, Issue 3. https://doi.org/10.1002/widm.1204.
Pawar S., Stanam A. Scalable, Reliable and Robust Data Mining Infrastructures. 2020 Fourth World Conference on Smart Trends in Systems, Security and Sustainability (WorldS4), London, UK, 2020. pp. 123–125. https://doi.org/10.1109/WorldS450073.2020.9210388.
Almeida P., Gruenwald L., Bernardino J. Evaluating Open Source Data Mining Tools for Business. Proceedings of the 5th International Conference on Data Management Technologies and Applications – DATA, 2016. pp. 87–94. http://dx.doi.org/10.5220/0005939900870094.
Almeida P., Bernardino J. A Survey on Open Source Data Mining Tools for SMEs. In: Rocha, Á., Correia, A., Adeli, H., Reis, L., Mendonça Teixeira, M. (eds) New Advances in Information Systems and Technologies. Advances in Intelligent Systems and Computing, Springer, Cham, 2016. vol 444, pp. 253–262. https://doi.org/10.1007/978-3-319-31232-3_24.
Özkan S. B., Apaydin S. M. F., Özkan Y., Düzdar I. Comparison of Open Source Data Mining Tools: Naive Bayes Algorithm Example. 2019 Scientific Meeting on Electrical-Electronics & Biomedical Engineering and Computer Science (EBBT), Istanbul, Turkey, 2019. pp. 1–4. https://doi.org/10.1109/EBBT.2019.8741664.
Jovic A., Brkic K., Bogunovic N. An overview of free software tools for general data mining. 2014 37th International Convention on Information and Communication Technology, Electronics and Microelectronics (MIPRO), Opatija, Croatia, 2014. pp. 1112–1117. https://doi.org/10.1109/MIPRO.2014.6859735.
Al-Odan H. A., Al-Daraiseh A. A. Open Source Data Mining tools. 2015 International Conference on Electrical and Information Technologies (ICEIT), Marrakech, Morocco, 2015, pp. 369–374. https://doi.org/10.1109/EITech.2015.7162956.
Журан О. А., Донченко К. В. Методи та засоби інтелектуальної обробки інформації. Інформатика. Культура. Технології: матеріали VІІІ-ї Міжнародної науково-практичної конференції, Одеса, Україна, 2021, с. 14–16. URL: http://dspace.op.edu.ua/jspui/bitstream/123456789/12104/1/%D0%86%D0%9A%D0%A2-2021%20%20%D1%81%D0%B1%D0%BE%D1%80%D0%BA%D0%B0%203-14-16.pdf.
Лупан І. В., Авраменко О. В., Акбаш К. С. Комп’ютерні статистичні пакети: навчально-методичний посібник. Кіровоград, Україна: «КОД», 2015. 236 с. URL: https://dspace.cusu.edu.ua/server/api/core/bitstreams/37868982-7a62-4c67-a0c6-acf17c99b48c/content.
Лупан І. В. Інтелектуальний аналіз даних Data Mining: навчально-методичний посібник. Кропивницький, Україна: М. А. Піскова, 2022. 112 с. URL: https://dspace.cusu.edu.ua/server/api/core/bitstreams/9df9df5f-ff91-4d35-8497-9a8ac98de872/content.
Талах Т., Талах В. Використання функцій Excel в аналітичних дослідженнях та в економічній аналітиці. Економіка та суспільство, №50, 2023. http://doi.org/10.32782/2524-0072/2023-50-58.
Data analytics and AI platform | Altair RapidMiner. URL: http://altair.com/altair-rapidminer.
KNIME Analytics Platform. URL: https://www.knime.com/knime-analytics-platform.
Orange Data Mining. URL: http://orangedatamining.com.
JASP. A fresh way to do statistics. URL: http://jasp-stats.org.