Cybernetics and programming
Reference: Borovskii A.A. Prospects for the use of machine learning techniques in processing large volumes of historical data // Cybernetics and programming. 2015. № 1. P. 77-114. DOI: 10.7256/2306-4196.2015.1.13730 URL: https://en.nbpublish.com/library_read_article.php?id=13730
Prospects for the use of machine learning techniques in processing large volumes of historical data
Borovskii Aleksandr Aleksandrovich
Chief Technology Director, LLC "EST"
119034, Russia, Moscow, Sechenovskiy pereulok, 9, office 18
aborovsky@est-it.ru
Review date: 18-11-2014
Publish date: 20-01-2015
Abstract. In the context of developing the information-analytical platform “The history of modern Russia”, the author examines the analytical capabilities of modern machine learning methods and the prospects for their practical use in processing and analyzing large volumes of historical data. The article reviews different strategies for applying machine learning techniques, taking into account the peculiarities of the data under study. Special attention is given to the interpretability of the different types of results obtained with machine learning algorithms, as well as to the ability to recognize trends and anomalies. The methodological basis of the research comprises the theory of information systems, database theory, induction, deduction, and comparative, systematic, formal-logical, and other methods. The author concludes that machine learning algorithms can be used effectively to solve a large class of problems related to the analysis of historical data, including finding hidden dependencies and patterns. The establishment of large-scale digital repositories of evidence of historical events makes it possible to examine the data as a specific kind of time series, which allows changes in the state of the social system over time to be investigated.
Keywords: history of modern Russia, data analysis, Data Mining, Digital Humanities, machine learning, algorithms, identification of patterns, interpretability of results, time series, anomalies
This article is written in Russian; the full text is available in Russian.