№ 3(87)
6 June 2020
Rubric: Algorithmic efficiency Authors: Borisov V. V., Bulygina O. V., Dli M. I., Kozlov P. |
One of the key areas in the informatization of public authorities is the development and implementation of systems for the automated processing of electronic appeals (applications, complaints, suggestions) that individuals and legal entities submit through official government websites and portals. Rubrication plays an important role in solving this problem: appeals are distributed among thematic rubrics that correspond to the areas of activity of the departments responsible for processing them and preparing a response. An analysis of the specific features of such text messages (small size, lack of markup, presence of errors, an unstable thesaurus, etc.) confirmed that traditional approaches to rubrication are unsuitable and justified the use of data mining methods. The article proposes a new approach to the analysis and rubrication of unstructured electronic text documents received through official websites and portals of public authorities. It involves forming a tree-like structure of the rubric field based on fuzzy relationships of differences between the syntactic characteristics of documents. The analysis determines the fuzzy correspondence between the syntactic characteristics of a document and the values of the cluster centers, and proceeds sequentially from the root to the leaves of the constructed fuzzy decision tree. The proposed rubrication method has been implemented in software and tested in the automated processing and analysis of citizens' appeals (applications, complaints, and suggestions) received by the Administration of Smolensk Region. This made it possible to update rubrics and analyze documents promptly and with high quality when the composition of the thesaurus and the importance of rubric words are non-stationary. |
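The root-to-leaf analysis described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the rubric names, the two-dimensional feature vectors, and the membership function `1 / (1 + d^2)` are all assumptions chosen only to show how a document could be matched against cluster centers level by level.

```python
from dataclasses import dataclass, field
from math import dist


@dataclass
class Node:
    rubric: str                      # hypothetical rubric label
    center: tuple                    # assumed cluster center in feature space
    children: list = field(default_factory=list)


def membership(x, center):
    """Assumed fuzzy degree of correspondence in (0, 1]; 1 means x equals the center."""
    return 1.0 / (1.0 + dist(x, center) ** 2)


def classify(root, x):
    """Walk from the root to a leaf, choosing the best-matching child at each level."""
    node, path = root, []
    while node.children:
        node = max(node.children, key=lambda c: membership(x, c.center))
        path.append((node.rubric, round(membership(x, node.center), 3)))
    return path


# Toy rubric tree; labels and coordinates are invented for illustration.
tree = Node("root", (0.0, 0.0), [
    Node("housing", (1.0, 0.0), [
        Node("heating", (1.0, 1.0)),
        Node("repairs", (1.0, -1.0)),
    ]),
    Node("transport", (-1.0, 0.0)),
])

print(classify(tree, (0.9, 0.8)))
```

The returned path lists the rubric chosen at each level together with its membership degree, so a low degree at some level could be used to flag an appeal for manual routing.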
---|---|
№ 3(99)
31 May 2022
Rubric: Performance management Authors: Stroev S., Lyublinskaya N., Meksheneva Z., Nechaev A., Shokolov V., Zakharov A. |
The paper presents the authors' approach to sentiment analysis of online Russian-language messages about the activities of banks. The study data are customer reviews of banks in general and of their products, services, and quality of service, posted on the Banki.ru portal. Text sentiment analysis is treated as a binary classification task on a set of positive and negative reviews. A vector space model with a tf-idf weighting scheme was used to represent the collected and preprocessed texts. The following algorithms, with optimal parameters selected by grid search, were used for binary classification: naive Bayes classifier, support vector machine, logistic regression, random forest, and gradient boosting. Standard metrics (precision, recall, and F-measure) were used to evaluate classification quality. On these metrics, the best results were obtained by the classification model based on the support vector machine. Topic modeling was also carried out using latent Dirichlet allocation (LDA) to identify the most typical topics of customer messages. It was concluded that the most popular message topics are "cards" and "quality of service". The results can be used by banks to automate monitoring of their reputation in the media and to route client requests. The work relied on Python libraries for web scraping, machine learning, and natural language processing. |
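The tf-idf weighting mentioned in this abstract can be sketched in a few lines of plain Python. This is an illustration only: it uses the unsmoothed textbook formula (term frequency times `ln(N / df)`), the sample reviews are invented, and it does not reproduce the authors' preprocessing, classifiers, or grid search.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return one {term: weight} dict per tokenized document.

    Uses tf = count / len(doc) and idf = ln(N / df); libraries usually
    add smoothing and vector normalization on top of this basic form.
    """
    n = len(docs)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vectors.append(
            {t: (c / len(doc)) * math.log(n / df[t]) for t, c in counts.items()}
        )
    return vectors


# Invented toy reviews, already lowercased and tokenized.
reviews = [
    "card blocked without warning".split(),
    "card delivered quickly".split(),
    "support solved the problem quickly".split(),
]
vectors = tfidf(reviews)
print(vectors[0])
```

A term that occurs in most reviews (here "card") receives a lower weight than a term specific to one review (here "blocked"), which is why the scheme suits sentiment and topic features.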
Research and development (R&D) ensures stable functioning and forms the innovative potential of most companies in the production sector. Ineffective R&D management causes many initiated projects to exceed their planned deadlines and budgets, and leaves many intermediate R&D results unfinished. The complexity of R&D management stems from high information uncertainty about R&D performance and employee productivity. The paper considers a multi-model method of decision support for R&D management in companies. To reduce information uncertainty in various management problems, it proposes using an ontological model of the company's intellectual capital, simulation models of R&D processes and their individual stages, and fuzzy logic models for obtaining integral assessments of management decisions. The method provides a basis for decisions on the possibility and expediency of reusing previously obtained R&D results (the scientific and technological reserve); on the feasibility of a proposed project; on project organization (scheduling); on the allocation of resources to tasks; on incentives for performers; and on planning additional training and information support. The paper gives a general description of the method and an example of its use to support a decision on the feasibility of an R&D project. Two structures for organizing the R&D process in a manufacturing company are considered as alternatives. After the best structure is selected, the impact of staffing quality on the integral feasibility assessment is evaluated. | |
The problems of high-level synthesis of very-large-scale integrated circuits (VLSI) are considered. A review of the subject area shows that the imperative model and the corresponding programming languages do not allow algorithms and programs to be parallelized efficiently, which makes it impossible to achieve the required technical characteristics. The reason lies in the nature of VLSI, which is essentially a scheme for parallel processing of information flows. An original VLSI synthesis method is presented. The method is based on the functional-flow paradigm of parallel computing and provides architectural independence and maximum coverage of implementation options. The design flow of the functional-flow VLSI synthesis method is outlined. The problem of estimating the required hardware resources and clock frequency is formulated; it must be solved at the early stages of design. A method for estimating resources in the process of functional-flow synthesis is proposed. The method is based on an additional meta-layer (the HDL graph). Given the polymorphism of solutions to the resource estimation problem, the new method uses machine learning. It is shown that applying this method during synthesis provides the most accurate assessment of resources, because the HDL graph is a data flow graph typed and structured in accordance with the functional-flow model of parallel computing. Machine learning makes it possible to solve the problem of optimally selecting the required resources most effectively. The classes of resources that require assessment are identified, and the parameters for building a resource assessment model are selected.
The proposed resource estimation method is implemented in software using linear regression models, neural networks, and gradient boosting, and is compared with known approaches. It is shown that applying the proposed method for estimating the required resources and performance within functional-flow synthesis increases the accuracy of the estimate at the high-level stage. | |
№ 3(99)
31 May 2022
Rubric: Models and methods Authors: Trubin A., Aleksahin A., Batishchev A., Filimonova E., Morozov A., Ozheredov V. |
This article builds and analyzes machine learning models for short-term forecasting in the cryptocurrency market, using bitcoin, one of the most popular cryptocurrencies in the world, as an example. The initial data show that over the long period of its existence bitcoin has been highly volatile, especially in comparison with traditional financial instruments. The article substantiates that this market is influenced by a multitude of factors. No one can say for certain what determines the value of a particular cryptocurrency, because it depends on a range of factors that cannot be fully taken into account. To address this problem, the principle of the recurrent neural network is considered. The article explains why networks with memory make better time series predictions than conventional autoregressive models and standard feed-forward networks. The initial data processing algorithm and transformation methods are defined. The sample was reduced to speed up the network by decreasing the number of weight recalculations. A model from the family of recurrent neural networks was built and trained to test the hypothesis that such networks adapt better owing to short-term and long-term memory. The model is evaluated on test data representing the bitcoin exchange rate for 2021–2022, since this period is characterized by high volatility. It is concluded that models of this type are reasonable to use for short-term forecasting of cryptocurrency rates. |
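The abstract does not specify the data transformation used, but a common way to prepare a price series for a recurrent network is to cut it into overlapping windows, each paired with the next value as the target. The sketch below assumes that scheme; the function name `make_windows`, the `lookback` parameter, and the sample values are hypothetical.

```python
def make_windows(series, lookback):
    """Split a series into (inputs, target) pairs for one-step-ahead forecasting."""
    samples = []
    for i in range(len(series) - lookback):
        # The previous `lookback` values are the inputs; the next value is the target.
        samples.append((series[i:i + lookback], series[i + lookback]))
    return samples


prices = [10.0, 11.0, 12.0, 13.0, 14.0]   # made-up values, not real bitcoin quotes
pairs = make_windows(prices, lookback=3)
for inputs, target in pairs:
    print(inputs, "->", target)
```

A longer `lookback` gives the network more history per sample but yields fewer samples from the same series, which is one way the sample size and training time can be traded off.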
The work presents an analysis of the possible application of self-generating neural networks, which can independently generate a topological map of neuron connections by modelling biological neurogenesis, in multi-threaded information communication systems. A basic optical neural network cell that performs data processing is designed on the basis of the applied layered composition. The map of neuron connections is not an ordered structure providing a regular graph for the exchange of information between neurons, but a cognitive reserve represented as an unconnected set of neuromorphic cells. Modelling neuron death (apoptosis) and the creation of dendrite-axon connections makes it possible to implement a stepwise neural network growth algorithm. Despite the challenges of implementing this process, creating a growing network in an optical neural network framework solves the problem of initially forming the neural network architecture, which greatly simplifies the learning process. Applying the network growth algorithm to the neural network cells produced network structures that use internal self-sustaining rhythmic activity to process information. This activity results from spontaneously formed closed neural circuits that share neurons among neuronal cells. Such an organisation of recirculation memory leads to solutions that depend on this intra-network activity. As a result, the response of the network is determined not only by stimuli, but also by the internal state of the network and its rhythmic activity. Network functioning is affected by internal rhythms, which depend on the information passing through the neuron clusters, and this forms a specific rhythmic memory. This can be used for tasks in which solutions must be worked out from certain parameters but must not be reproducible when the network is repeatedly stimulated by the same inputs.
Such tasks include ensuring the security of information transmission over a set of carriers. The number of frequencies and their frequency plan depend on external factors. To prevent the same carrier allocation from being generated repeatedly, it is necessary to use networks of the configuration under consideration, which can influence the generation of solutions through accumulated experience. |