
Authors

Meksheneva Zhanna V.

Degree
Cand. Sci. (Econ.), Associate Professor, Information Management and Information and Communication Technologies Department named after Professor V. V. Dik, Synergy University
E-mail
zhmeksheneva@synergy.ru
Location
Moscow, Russia
Articles

Text sentiment analysis in banking

The paper presents the author's approach to sentiment analysis of Russian-language online messages about the activities of banks. The data are customer reviews of banks in general and of their products, services, and quality of service, posted on the Banki.ru portal. Sentiment analysis is treated as a binary classification task on a set of positive and negative reviews. The collected and preprocessed texts were represented with a vector space model using a tf-idf weighting scheme. The following algorithms, with optimal parameters selected by grid search, were used for the binary classification task: naive Bayes classifier, support vector machine, logistic regression, random forest, and gradient boosting. Standard metrics, such as accuracy, recall, and F-measure, were used to evaluate classification quality. On these metrics, the best results were obtained with the support vector machine model. Topic modeling was also carried out using latent Dirichlet allocation to identify the most typical topics of customer messages. It was concluded that the most popular message topics are "cards" and "quality of service". The results can be used by banks to automate monitoring of their reputation in the media and to route client requests. The work relied heavily on Python libraries for web scraping, machine learning, and natural language processing.
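The tf-idf weighting mentioned in the abstract can be illustrated with a minimal sketch. It uses only the Python standard library and the plain textbook formula (term frequency times the logarithm of inverse document frequency); the function name `tf_idf` and the toy reviews are hypothetical, and this is not the authors' pipeline. Library implementations typically add smoothing and normalization, so their numbers will differ.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document."""
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        weights.append({
            # Relative term frequency times log inverse document frequency.
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Toy reviews (hypothetical, not taken from the Banki.ru data).
reviews = [
    "card was blocked without warning".split(),
    "card delivered quickly great service".split(),
    "great service friendly staff".split(),
]
vectors = tf_idf(reviews)
```

In this scheme a term that occurs in every document gets weight zero, while a term confined to a single review is weighted most heavily, which is what makes the resulting vectors useful as classifier input.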

Accuracy estimating of highly noisy signals digital processing using heuristic algorithms

Heuristic algorithms are often used as an alternative when solving problems that have high computational complexity or lack an exact solution, since they allow the desired result to be obtained quickly. They usually have no strict mathematical justification, but their use is justified on practical grounds. Formally, algorithms that use approximate methods can be classified as heuristic. However, their lack of determinism often makes it difficult to evaluate the accuracy of the solution obtained. The paper considers a methodical approach to assessing the accuracy of heuristic algorithms designed to determine the shape and parameters of a useful signal against a background of strong noise. The approach is based on the method of analogy and consists in modeling an artificial signal with given parameters together with background noise whose characteristics are similar to additive white Gaussian noise. The noise component is generated in software using a pseudo-random number generator. Such generators are among the built-in functions of almost all high-level programming languages. A comparative analysis of the characteristics of real and artificial noise is presented, showing that the problem can be solved by numerical modeling. Accuracy estimates are obtained for the parameters of the artificial signal after it is separated from the noise component using piecewise linear approximation and averaging heuristic algorithms. The paper also considers smoothing of empirical data by replacing the discrete signal with an equivalent set of quadratic functions whose parameters provide a piecewise parabolic approximation of its shape. This procedure eliminates the residual signal bounce that inevitably results from linearization and allows the signal to be subsequently recorded at any sampling rate.
Thus, the proposed approach makes it possible to quantify the accuracy of heuristic algorithms used to determine the expected signal parameters.
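The evaluation idea described in the abstract can be sketched roughly as follows: build an artificial signal with known parameters, add pseudo-random Gaussian noise, run a heuristic smoother, and measure the error against the known signal. Everything specific here is an assumption made for illustration, not the authors' setup: the Gaussian pulse shape, the noise level, the window size, and a simple moving average standing in for the averaging algorithm.

```python
import math
import random


def moving_average(samples, window):
    """Smooth samples with a centered window, truncated at the edges."""
    half = window // 2
    smoothed = []
    for i in range(len(samples)):
        chunk = samples[max(0, i - half): i + half + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


def rmse(a, b):
    """Root-mean-square error between two equally long sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


rng = random.Random(42)  # fixed seed so the run is repeatable
n = 1000
# Artificial signal with known parameters (assumed shape: Gaussian pulse).
clean = [math.exp(-((i - 500) / 100) ** 2) for i in range(n)]
# Pseudo-random noise standing in for additive white Gaussian noise.
noisy = [s + rng.gauss(0.0, 1.0) for s in clean]
recovered = moving_average(noisy, window=51)

print("RMSE before smoothing:", rmse(noisy, clean))
print("RMSE after smoothing: ", rmse(recovered, clean))
```

Because the clean signal is known exactly, the error of the heuristic can be measured directly, which is not possible with real recordings.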