“Journal of Applied Informatics” is a peer-reviewed scientific journal with an internationally represented editorial board and authorship, covering a significant part of the Russian IT field. Its publications address the theory and application of computer modeling and information technologies in various professional areas. The journal is indexed in the Russian Science Citation Index on the Web of Science platform.

In accordance with the decision of the Higher Attestation Commission of the Ministry of Education and Science of the Russian Federation, the journal is included in the «List of Leading Peer-Reviewed Scientific Journals and Publications authorized to publish main dissertation results».

A method for predicting bank customer churn based on an ensemble machine learning model

The paper presents the results of research aimed at developing a method for predicting customer churn at a commercial bank using machine learning models (including deep artificial neural networks) to process client data, together with software tools implementing the method. The object of the study is a commercial bank; the subject is its activities in the B2C segment, i.e. commercial interaction between the business and individual customers. The relevance of the chosen research area stems from banks' growing adoption of digital services to reduce non-operating costs, in particular those associated with customer retention, since attracting new customers is considerably more expensive than keeping existing ones. The scientific novelty lies in the developed churn-prediction method and in the algorithm underlying the software that implements it. The proposed ensemble forecasting model combines three classification algorithms: k-means, random forest, and a multilayer perceptron. To aggregate the outputs of the individual models, a learning tree of Mamdani-type fuzzy inference systems is used. The ensemble is trained in two stages: first the three classifiers are trained, and then the tree of fuzzy inference systems is trained on the data obtained from their outputs. Within the proposed method, the ensemble model produces a static forecast whose results feed a dynamic forecast performed in two variants: one based on recursive least squares and one based on a convolutional neural network. Model experiments on a synthetic dataset taken from the Kaggle website showed that the ensemble achieves higher binary-classification quality than any of the individual models.
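As an illustration of the two-stage scheme described above, the following sketch trains the three base classifiers and then a simple aggregator on their outputs using synthetic data (assumptions: scikit-learn stands in for the authors' tooling, and a plain logistic-regression meta-learner replaces the Mamdani-type tree of fuzzy inference systems):

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

def train_ensemble(X, y):
    """Stage 1: fit k-means, random forest and MLP; stage 2: fit the aggregator
    on their outputs (a logistic meta-learner stands in for the fuzzy tree)."""
    km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
    # Align cluster ids with churn labels by majority vote inside each cluster.
    cluster_label = {c: int(np.round(y[km.labels_ == c].mean())) for c in (0, 1)}
    rf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
    mlp = MLPClassifier(hidden_layer_sizes=(32,), max_iter=500, random_state=0).fit(X, y)

    def base_outputs(data):
        km_pred = np.array([cluster_label[c] for c in km.predict(data)])
        return np.column_stack([km_pred,
                                rf.predict_proba(data)[:, 1],
                                mlp.predict_proba(data)[:, 1]])

    meta = LogisticRegression().fit(base_outputs(X), y)   # stage-2 aggregator
    return lambda data: meta.predict(base_outputs(data))

X, y = make_classification(n_samples=600, n_features=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
predict = train_ensemble(X_tr, y_tr)
acc = float((predict(X_te) == y_te).mean())
```

The two-stage order matters: the meta-learner is fitted only after the base models, on the features they produce.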

A neural network algorithm for identifying and removing outliers in noisy data sets

Outliers in statistical data, typically the result of erroneously collected information, often obstruct the successful application of machine learning methods in many subject areas. The presence of outliers in training data sets reduces the accuracy of machine learning models and in some cases makes applying these methods impossible. Currently existing outlier detection methods are unreliable: they are fundamentally unable to detect some types of outliers, while observations that are not outliers are often misclassified as such. Recently emerged neural network methods for outlier detection are free of this drawback, but they are not universal, since the ability of a neural network to detect outliers depends both on its architecture and on the problem being solved. The purpose of this study is to develop an algorithm for creating and using neural networks that can correctly detect outliers regardless of the problem being solved. The goal is achieved by exploiting the property of certain specially constructed neural networks to exhibit their largest training errors on the observations that are outliers. The use of this property, a series of computational experiments, and the generalization of their results by a mathematical formula that modifies a corollary of the Arnold–Kolmogorov–Hecht-Nielsen theorem made it possible to achieve the stated goal.
The developed algorithm proved especially effective for forecasting and controlling the interdependent thermophysical and chemical-energy-technological processes of ore processing at operating metallurgical enterprises, where outliers in statistical data are almost inevitable and where, without their identification and removal, building neural network models of acceptable accuracy is generally impossible.
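The core idea, that a suitably constructed network shows its largest training errors on outliers, can be sketched as follows (assumptions: a small scikit-learn regressor on synthetic one-dimensional data, not the authors' architecture or formula):

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 1))
y = 2.0 * X.ravel() + rng.normal(0.0, 0.3, 200)    # clean linear dependence
outlier_idx = [10, 50, 120]
y[outlier_idx] += 100.0                            # corrupt three observations

# A deliberately small network: it fits the bulk of the data but cannot
# chase isolated corrupted points, so its largest training errors land on them.
net = MLPRegressor(hidden_layer_sizes=(4,), max_iter=500, random_state=0)
net.fit(X, y)
residuals = np.abs(net.predict(X) - y)             # per-observation training error
flagged = np.argsort(residuals)[-3:]               # top-3 errors -> outlier candidates
```

In practice the number of points to flag is unknown, so a threshold on the residual distribution would replace the fixed top-3 cut.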

Algorithm for identifying threats to information security in distributed multiservice networks of government bodies

The results of studies are presented whose purpose was to develop an algorithm for identifying information security threats in the distributed multiservice networks that support information interaction among regional government bodies and their communication with the population of the region. The relevance of the topic stems from the significant growth in cyberattacks on the computer networks of public authorities and the need to raise the security level of these networks by intellectualizing the methods used to counter information security threats. The algorithm applies machine learning methods to the analysis of incoming traffic in order to identify events that affect the information security state of public authorities. Input traffic first undergoes preprocessing, which produces a set of images (signatures) obtained from Wasm binary files; the image classifier is then launched. The classifier chains two deep neural networks: a convolutional network that classifies the signatures and a recurrent network that processes the sequences produced at the convolutional network's output. The way the signatures, as well as the sequences fed into the recurrent network, are formed in the proposed algorithm makes it possible to obtain a resulting information security assessment that takes the history of its current state into account. The output of the recurrent network is aggregated with the result of comparing the actual signatures against those stored in the database. The aggregation is performed by a type-2 fuzzy inference system using Mamdani implication, which produces the final assessment of information security threats.
Software implementing the proposed algorithm was developed, and experiments on a synthetic data set demonstrated the algorithm's efficiency and confirmed the feasibility of its further improvement.
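The final aggregation step can be illustrated with a deliberately simplified type-1 Mamdani system (the paper uses a type-2 system; the rule base and membership functions below are illustrative assumptions):

```python
import numpy as np

def tri(x, a, b, c):
    """Triangular membership function on [a, c] with peak at b."""
    return np.maximum(np.minimum((x - a) / (b - a), (c - x) / (c - b)), 0.0)

def threat_level(nn_score, db_match):
    """Aggregate the recurrent network's score and the signature-database
    match degree (both in [0, 1]) into a final threat estimate."""
    z = np.linspace(0.0, 1.0, 201)                   # output universe
    low = lambda v: tri(v, -0.5, 0.0, 0.5)
    high = lambda v: tri(v, 0.5, 1.0, 1.5)
    # Illustrative rule base with min-implication (Mamdani):
    r_high = min(high(nn_score), high(db_match))     # both indicators high -> threat high
    r_low = min(low(nn_score), low(db_match))        # both indicators low  -> threat low
    agg = np.maximum(np.minimum(r_high, high(z)),
                     np.minimum(r_low, low(z)))
    return float((agg * z).sum() / (agg.sum() + 1e-9))  # centroid defuzzification
```

A type-2 system would additionally carry an uncertainty interval around each membership function and reduce it before defuzzification.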

Algorithm for steganographic information protection in video files based on a diffusion-probabilistic model with noise reduction

The results of a study are presented whose purpose was to develop a steganographic algorithm for hiding text messages in video files. The algorithm is based on a denoising diffusion probabilistic model implemented by a deep artificial neural network and consists of two parts, one for the sending side and one for the receiving side. The transmitting side synthesizes handwritten images of the characters (signatures) of a line of the hidden message, equalizes their frequency of occurrence, and applies forward diffusion to the signatures, generating a noisy image that is embedded into a video stego container. The receiving side extracts the signatures from the video content and performs reverse diffusion to recover the signatures of the handwritten characters, which are then recognized by a convolutional neural network. The novelty of the research lies in the developed algorithm for steganographic information protection in video files and in a modified embedding method based on least-significant-bit substitution: the bits of each byte encoding the pixel brightness level of a signature are embedded into the same bit position of the blue-channel brightness across a sequence of 8 frames of the video stego container. This method significantly reduced the visible changes introduced into the video content when the middle significant bits of the stego container, rather than the least significant ones, are replaced, which in turn provides greater resistance to compression attacks when information is transmitted over the stego channel.
The practical significance of the results lies in the developed software used to test the algorithm for steganographic information protection in video files, which showed high values of the peak signal-to-noise ratio and the structural similarity index when information is embedded in the middle bits of the bytes that set the pixel brightness of the stego container.
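A minimal sketch of the frame-wise embedding idea, assuming bit 4 of the blue channel as the "middle significant bit" and one pixel position per embedded byte (the authors' actual bit layout may differ):

```python
import numpy as np

BIT = 4  # middle bit of the blue-channel byte (assumption; plain LSB would use bit 0)

def embed_byte(frames, byte, row=0, col=0):
    """Spread the 8 bits of `byte` over bit BIT of the blue channel of one
    pixel position across 8 consecutive frames (frames: uint8 [8, H, W, 3])."""
    out = frames.copy()
    for i in range(8):
        bit = (byte >> (7 - i)) & 1
        b = int(out[i, row, col, 2])
        out[i, row, col, 2] = (b & ~(1 << BIT)) | (bit << BIT)
    return out

def extract_byte(frames, row=0, col=0):
    """Recover the byte by reading bit BIT of the blue channel frame by frame."""
    byte = 0
    for i in range(8):
        byte = (byte << 1) | ((int(frames[i, row, col, 2]) >> BIT) & 1)
    return byte

frames = np.random.default_rng(1).integers(0, 256, size=(8, 16, 16, 3), dtype=np.uint8)
stego = embed_byte(frames, 0xA7)
recovered = extract_byte(stego)
```

Writing to bit 4 changes any single byte by at most 16 brightness levels, while surviving quantization better than the least significant bit.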

Algorithms for composing efficient business models

Solving the problems of effective business management involves a variety of current goals facing the same business and therefore requires the construction of corresponding models of an efficient business. The article presents two business problems which, apart from their common target of improving business efficiency, have different current goals. The creation or development of any business involves drawing up a specific business plan that includes a list of development areas whose implementation will increase efficiency. The first problem considered in the article concerns the phased implementation of all efficiency improvement areas so as to ultimately obtain the greatest efficiency from their realization. The second concerns increasing efficiency by implementing only part of the improvement areas from the initial list, subject to certain limitations, for example under constrained company resources. To build models that meet the problems set, the article substantiates and proposes an efficiency criterion and develops Algorithms 1 and 2, which make it possible to build efficient business models that account for the difference in current goals. The authors developed a multi-stage Algorithm 1 for generating individual sets of efficiency improvement areas to be used in solving the tasks at hand. Algorithm 2, executed at each stage of Algorithm 1, was developed using the Pareto optimality method, supplemented to take into account the features and objectives of the current tasks set for the business.
The use of these algorithms made it possible to build efficient business models that not only capture the economic effect inherent in each efficiency improvement area but also provide additional growth driven by the properties of the developed algorithms.
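The Pareto selection underlying Algorithm 2 can be illustrated with a minimal sketch (the improvement areas, effects, and costs below are hypothetical; the authors' criterion additionally accounts for the current goals of the business):

```python
def pareto_front(options):
    """options: (name, effect, cost) triples; keep those not dominated by
    another option with effect >= and cost <= (at least one strict)."""
    front = []
    for n1, e1, c1 in options:
        dominated = any(
            e2 >= e1 and c2 <= c1 and (e2 > e1 or c2 < c1)
            for _, e2, c2 in options
        )
        if not dominated:
            front.append(n1)
    return front

# Hypothetical improvement areas: (name, expected effect, required resources)
areas = [("A", 10, 5), ("B", 8, 7), ("C", 12, 5), ("D", 6, 2)]
selected = pareto_front(areas)
```

Here "B" is dominated by "C" (higher effect at lower cost) and "A" by "C" as well, so only the non-dominated areas survive for the next stage.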

An approach to the design of a neural network for the formation of an individual trajectory of knowledge testing

The paper discusses the implementation of an adaptive testing system based on artificial neural network (ANN) modules that solve the problem of intelligently choosing the next question, thereby forming an individual testing trajectory. The aim of the work is to increase the accuracy with which the ANN sets the difficulty level of the next test question for two architecture types: a feedforward neural network (FNN) and a recurrent network with long short-term memory (LSTM). The data affecting training quality are analyzed, and input-layer architectures for the feedforward ANN that significantly improved network quality are considered. To solve the problem of choosing the thematic block of a question, a hybrid module structure is proposed, comprising the ANN itself and a software module that algorithmically processes the ANN's outputs. A study of the feasibility of using feedforward ANNs in comparison with the LSTM architecture was carried out; the network's input parameters were identified, and various architectures and training parameters (weight-update algorithms, loss functions, number of training epochs, batch sizes) were compared. The choice of a feedforward network in the structure of the hybrid module for selecting a thematic block is substantiated. The results were obtained using the high-level Keras library, which allows rapid prototyping at the initial stages of research. Training was traditionally carried out over a large number of epochs.
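A minimal feedforward sketch of the difficulty-selection idea (assumptions: plain NumPy instead of Keras, a toy two-feature input, and a synthetic target rule that raises difficulty when the student answers well):

```python
import numpy as np

rng = np.random.default_rng(0)
# Toy data: input = [share of recent answers correct, current difficulty in [0, 1]];
# target rule (an assumption): raise difficulty when the student answers well.
X = rng.uniform(0.0, 1.0, size=(500, 2))
y = np.clip(X[:, 1] + (X[:, 0] - 0.5), 0.0, 1.0)

W1 = rng.normal(0.0, 0.5, (2, 16)); b1 = np.zeros(16)
W2 = rng.normal(0.0, 0.5, (16, 1)); b2 = np.zeros(1)

for _ in range(5000):                         # full-batch gradient descent, MSE loss
    h = np.tanh(X @ W1 + b1)
    err = (h @ W2 + b2).ravel() - y
    gW2 = h.T @ err[:, None] / len(X)
    gb2 = err.mean(keepdims=True)
    dh = (err[:, None] @ W2.T) * (1.0 - h ** 2)   # backprop through tanh
    gW1 = X.T @ dh / len(X)
    gb1 = dh.mean(axis=0)
    W1 -= 0.2 * gW1; b1 -= 0.2 * gb1; W2 -= 0.2 * gW2; b2 -= 0.2 * gb2

def next_difficulty(frac_correct, current):
    """Predicted difficulty level for the next question."""
    h = np.tanh(np.array([frac_correct, current]) @ W1 + b1)
    return float(h @ W2 + b2)
```

In the Keras setting described in the paper, this corresponds to a small `Dense` stack trained with an MSE-style loss over many epochs.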

Analysis and testing of neural network TCP/IP packet routing algorithms in private virtual tunnels

Traffic control and management systems are among the most important components of the global Internet. To achieve uninterrupted information and communication interaction, the organization of this process is constantly evolving, covering not only individual subnets but also p2p network architectures. The dominant directions for improving network structure include 5G, IoT, and SDN technologies, but their practical implementation leaves the information security of networks built on them without a satisfactory solution. Current virtual tunnel deployment topologies and intelligent traffic distribution components provide only partial solutions, notably access control based on user traffic and security through dedicated user certificates. Tunnel deployment is particularly important where the consistency and coordination of complex socio-economic systems must be ensured, an example being the information and communication exchange between participants of scientific and industrial clusters formed to implement projects for creating innovative products. Existing solutions, however, have drawbacks such as the need to purchase a license for full-featured access to the software product and the specialized configuration of client-server authentication required for secure access to a remote network route. The approach proposed by the authors, based on neural network distribution of traffic between clients of a private dedicated network, eliminates these shortcomings. On this principle, a multi-module system for intelligent packet routing was created and verified through unit testing. The effectiveness of the trained network address distribution model is analyzed in comparison with a DHCP server based on the isc-dhcp-server package, distributed as the dhcpd service.
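A minimal stand-in for the neural address distribution idea (assumptions: a single-layer softmax classifier on toy per-client features with a synthetic routing policy as the label, nothing like the authors' multi-module system):

```python
import numpy as np

rng = np.random.default_rng(0)
# Toy per-client features: [average packet rate, latency sensitivity] (standardized).
X = rng.normal(0.0, 1.0, (400, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(int)        # synthetic policy: which tunnel/pool

W = np.zeros((2, 2)); b = np.zeros(2)
for _ in range(500):                            # softmax regression by gradient descent
    logits = X @ W + b
    p = np.exp(logits - logits.max(axis=1, keepdims=True))
    p /= p.sum(axis=1, keepdims=True)
    grad = p.copy()
    grad[np.arange(len(y)), y] -= 1.0           # cross-entropy gradient
    W -= 0.1 * (X.T @ grad) / len(y)
    b -= 0.1 * grad.mean(axis=0)

def assign_route(features):
    """Index into the tunnel / address pool for one client."""
    return int(np.argmax(features @ W + b))

acc = float(np.mean([assign_route(x) == t for x, t in zip(X, y)]))
```

Unlike a DHCP lease table, such a model assigns a client to a pool from its observed behavior rather than from a static configuration.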

Application of an ontological approach to the problems of energy consumption data exchange

Author: Fomin I.
The article describes the technical concept of organizing data exchange between a specialized settlement center that performs billing of consumed heat energy and an energy sales company that supplies heat energy to industrial enterprises, government agencies, and the population. It describes the features of the technical data exchange problem that determine the parameters of the mathematical models for calculating the volumes and costs of consumed energy resources, and then reviews approaches to solving this class of problems. To solve the problem, the features of the data preparation stage for the initial exchange were formalized, and schemes for organizing a regular data flow based on an ontological data model were proposed. The originality of the approach lies in defining classes and their properties for concepts reflecting sets of information about the parameters of energy supply facilities and the parameters for calculating volumes, prices, and costs of energy resources. This made it possible, using an ontology editor, to form graphically formalized semantics that became the basis for the data processing rules governing information exchange. The concepts of the ontological model were related to each other by sets of classified predicates, whose use is illustrated by examples of description logic queries. The implemented data exchange process based on the ontological model is illustrated with a data flow diagram. The ontological approach made it possible to organize an end-to-end connection between the formalized representation of the calculation models required for billing and the exchange data model, balancing and satisfying both management and information technology requirements for this procedure.
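A toy sketch of the ontological modeling idea (the class, predicate, and instance names below are illustrative assumptions, not the author's actual schema):

```python
# Toy triple store standing in for the ontology: each entry is
# (subject, predicate, object), as an ontology editor would export it.
triples = {
    ("Boiler_12", "rdf:type", "SupplyFacility"),
    ("Plant_A", "rdf:type", "IndustrialConsumer"),
    ("Boiler_12", "suppliesHeatTo", "Plant_A"),
    ("Plant_A", "hasTariff", "Tariff_T1"),
}

def query(subject=None, predicate=None, obj=None):
    """Description-logic-style lookup: match triples against a partial pattern."""
    return [(s, p, o) for (s, p, o) in triples
            if (subject is None or s == subject)
            and (predicate is None or p == predicate)
            and (obj is None or o == obj)]

# "Which individuals belong to the class IndustrialConsumer?"
consumers = [s for s, _, _ in query(predicate="rdf:type", obj="IndustrialConsumer")]
```

The data processing rules for the exchange then become queries of exactly this shape: classify each incoming record, follow its predicates to the billing parameters, and emit the result into the regular data flow.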