по
Security Issues
12+
Journal Menu
> Issues > Rubrics > About journal > Authors > About the Journal > Requirements for publication > Editorial collegium > Peer-review process > Policy of publication. Aims & Scope. > Article retraction > Ethics > Online First Pre-Publication > Copyright & Licensing Policy > Digital archiving policy > Open Access Policy > Article Processing Charge > Article Identification Policy > Plagiarism check policy
Journals in science databases
About the Journal

MAIN PAGE > Back to contents
Publications of Romanova Ekaterina Vladimirovna
Software systems and computational methods, 2022-3
Pleshakova E.S., Gataullin S.T., Osipov A.V., Romanova E.V., Marun'ko A.S. - Application of Thematic Modeling Methods in Text Topic Recognition Tasks to Detect Telephone Fraud pp. 14-27

DOI:
10.7256/2454-0714.2022.3.38770

Abstract: The Internet has emerged as a powerful infrastructure for worldwide communication and human interaction. Some unethical use of this technology spam, phishing, trolls, cyberbullying, viruses caused problems in the development of mechanisms that guarantee affordable and safe opportunities for its use. Currently, many studies are being conducted to detect spam and phishing. The detection of telephone fraud has become critically important, as it entails huge losses. Machine learning and natural language processing algorithms are used to analyze a huge amount of text data. Fraudsters are identified using text mining and can be implemented by analyzing the terms of a word or phrase. One of the difficult tasks is to divide this huge unstructured data into clusters. There are several thematic modeling models for these purposes. This article presents the application of these models, in particular LDA, LSI and NMF. A data set has been formed. A preliminary analysis of the data was carried out and signs were constructed for models in the task of recognizing the subject of the text. The approaches of keyword extraction in the tasks of text topic recognition are considered. The key concepts of these approaches are given. The disadvantages of these models are shown, and directions for improving text processing algorithms are proposed. The evaluation of the quality of the models was carried out. Improved models thanks to the selection of hyperparameters and changing the data preprocessing function.
Other our sites:
Official Website of NOTA BENE / Aurora Group s.r.o.