Journal archive

№2(110) April 2024

Contents:

IT management

Performance management

The recruitment industry is at an inflection point: the integration of artificial intelligence has already affected traditional recruitment processes and has the potential to revolutionize them. This article presents an approach to classifying resumes into job categories that uses semantic similarity search to improve candidate selection in recruiting. Unlike traditional keyword-based systems, our method is a deep learning framework that understands and processes the complex semantics of work-related documents. The purpose of the study is to develop a method for classifying resume texts with a complex organizational structure. The study addresses two problems at once: increasing the accuracy of resume classification and finding the most stable model for this task. We compared standard machine learning methods with neural network methods and showed the effectiveness of the latter. The results indicate an improvement over traditional ML models and suggest an approach that can be used for AI-based pre-screening in recruiting, which selects suitable candidates from among all applicants. We also discovered that results are unstable when large language models are retrained: even with the same hyperparameter values, a model gives different results. To better understand this phenomenon, we conducted a series of experiments with the main BERT models, varying two parameters: learning rate and seed. As a result, we find a significant increase in performance at a certain parameter threshold, and we quantify which of the models found perform better.
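The idea of assigning a resume to a job category by semantic similarity can be sketched as follows. This is a minimal illustration, not the authors' framework: the three-dimensional vectors stand in for embeddings that a BERT-like encoder would produce, and the category names, vector values and function names are all hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def classify(resume_vec, category_vecs):
    """Return the category whose reference vector is most similar to the resume."""
    return max(category_vecs, key=lambda name: cosine(resume_vec, category_vecs[name]))

# Hypothetical reference embeddings, one per job category (assumed values).
CATEGORY_VECS = {
    "data_science": [0.9, 0.1, 0.0],
    "accounting": [0.1, 0.9, 0.1],
    "logistics": [0.0, 0.2, 0.9],
}

# Hypothetical embedding of one resume.
predicted = classify([0.8, 0.3, 0.1], CATEGORY_VECS)
print(predicted)
```

In a real pipeline the reference vectors would typically be built from labeled resumes, for example as per-category averages of their embeddings.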

Software engineering

Algorithmic efficiency

The fourth industrial revolution has made the automation of production processes, including the use of computer vision, machine learning and artificial intelligence methods, relevant for light industry enterprises. A key role in production is played by the quality of the manufactured products, textile fabrics, which depends directly on the defectoscopy process. The development of digital technologies and the growth of computing power make it possible to automate the defectoscopy of textile fabrics using computer vision, reducing labor costs and increasing the accuracy of defect detection. The purpose of this paper is to conduct experimental studies of the annotation and detection of specific classes of textile defects using a hardware-software computer vision system and a neural network approach. To achieve this goal, the paper describes the existing classification of textile fabric defects, describes the hardware-software system used, and applies a neural network model of the Mask R-CNN architecture to the problem of defect instance segmentation. As part of the study, more than 400 fabric photographs were manually annotated with two classes of defects, “weft crack” and “water damage”, to extend the training sample. The results of the neural network model were evaluated using IoU metrics: the best result for the class “weft crack” was DIoU = 0.2, and for the class “water damage” DIoU = 0.87. Based on the experimental results, conclusions are drawn about the potential of the neural network approach for defectoscopy of similar classes of defects. The presented results can be used for training and retraining various object detection models, and the experience gained can be applied in other industries.
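For readers unfamiliar with the metric, a minimal sketch of the standard intersection-over-union (IoU) score for two binary segmentation masks is given below. The abstract does not state how its DIoU value is defined, so this shows plain IoU only; the function name, the example masks and the convention for two empty masks are assumptions.

```python
def mask_iou(pred, truth):
    """Intersection over union of two binary masks given as equal-size nested lists."""
    inter = 0
    union = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            inter += 1 if (p and t) else 0
            union += 1 if (p or t) else 0
    # Assumed convention: two empty masks count as a perfect match.
    return inter / union if union else 1.0

# Hypothetical 3x3 masks: 1 marks a defect pixel.
predicted_mask = [[0, 1, 1],
                  [0, 1, 1],
                  [0, 0, 0]]
annotated_mask = [[0, 1, 1],
                  [0, 1, 0],
                  [0, 0, 0]]

print(mask_iou(predicted_mask, annotated_mask))
```

Here three pixels are shared and four are covered by at least one mask, so the score is 3/4.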

Information security

Models and methods

Due to the enormous changes that have occurred in the world over the last few years, the economic efficiency of their activities has become vitally important for many companies. The events industry is no exception. The economic side of mass events has not yet been adequately described in the scientific literature, which makes it difficult to create high-quality IT tools. Such a toolkit should be based on a mathematical model that takes into account the economic features of these activities, which in turn requires studying those features. Thus, the purpose of the study is to formulate a target integer programming problem based on the factors that influence economic indicators during a mass event. The work is based on an analysis of the results of “field” observations, internal documentation of organizing companies, publicly available materials on large public events, and scientific and popular literature on the organization and conduct of mass events. Based on the characteristics of mass events, the article identifies the factors that influence economic indicators during an event and uses them to form the target integer programming problem. The author proposes a three-level model for optimizing the distribution of resources during a mass event, based on the criteria of fulfilling the event plan and reducing the costs of attracting and redistributing resources. This model can serve as the basis for IT tools that automate mass event processes and provide the greatest flexibility and rational use of available resources in various situations.

Nonlinear regression models are an important tool in agricultural research, as many biological processes are theoretically and experimentally described by nonlinear functions. In addition to accurately describing experimental data, nonlinear models have physically interpretable parameters and are more robust outside the domain of the studied sample. Existing methods for calculating model coefficients, such as Ordinary Least Squares, Weighted Least Squares, and Generalized Least Squares, have several drawbacks. The most advanced of them, Generalized Least Squares, relies on a large number of axioms that often do not hold in real examples, and its theoretical proof is not apodictic. This article introduces a flexible, robust, and accurate method for calculating coefficients of arbitrary single-factor regression models based on maximum likelihood estimation. The method is theoretically justified with a minimal number of axioms, and results from the software implementation are provided for the logistic function and the Michaelis function using synthetic test data and experimental samples of dry grass mass production as a function of the volume of nitrogen fertilizers. The main advantage of the method lies in the simplicity of its theoretical proof and the small number of theoretical constraints on the input parameters of the problem. Unlike Generalized Least Squares, the proposed method deterministically converges to the global minimum thanks to the use of the DIRECT algorithm. It can account for heteroscedasticity and does not require manual tuning of optimization parameters to ensure convergence. Considerations for possible extensions of the method to multifactorial regression analysis and potential improvements in heteroscedasticity estimation are also presented.
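A minimal sketch of maximum likelihood fitting for a single-factor model is shown below. It rests on several assumptions not taken from the article: the Michaelis function is written as y = a·x / (b + x), the noise is Gaussian with a known standard deviation per point (a simple way to represent heteroscedasticity), and an exhaustive grid search stands in for the DIRECT algorithm. All names and numbers are illustrative.

```python
import math

def michaelis(x, a, b):
    """Michaelis-type saturation curve (assumed form): a * x / (b + x)."""
    return a * x / (b + x)

def neg_log_likelihood(params, xs, ys, sigmas):
    """Gaussian negative log-likelihood with a separate sigma for each observation."""
    a, b = params
    total = 0.0
    for x, y, s in zip(xs, ys, sigmas):
        resid = y - michaelis(x, a, b)
        total += 0.5 * math.log(2 * math.pi * s * s) + resid * resid / (2 * s * s)
    return total

def grid_search(xs, ys, sigmas, a_grid, b_grid):
    """Deterministic global search over a finite grid (a stand-in for DIRECT)."""
    return min(((a, b) for a in a_grid for b in b_grid),
               key=lambda p: neg_log_likelihood(p, xs, ys, sigmas))

# Synthetic noise-free data generated with a = 10, b = 2 (assumed values).
xs = [0.5, 1.0, 2.0, 4.0, 8.0, 16.0]
ys = [michaelis(x, 10.0, 2.0) for x in xs]
sigmas = [0.1 + 0.05 * x for x in xs]  # assumed noise level that grows with x

a_grid = [0.5 * i for i in range(1, 41)]   # 0.5 .. 20.0
b_grid = [0.25 * j for j in range(1, 41)]  # 0.25 .. 10.0
best = grid_search(xs, ys, sigmas, a_grid, b_grid)
print(best)
```

A grid search is deterministic like DIRECT but scales poorly with the number of parameters, which is one reason a dedicated global optimizer is preferable in practice.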

Software engineering

The paper formulates requirements for representation models and for algorithms for obtaining, fusing and processing weakly formalized heterogeneous data in order to build a spatial model of the research object. An approach to the aggregation of multispectral data has been developed using the example task of combining visual data from aerial photography with geographic coordinates of objects obtained using unmanned aerial vehicles. An algorithm for combining visual data is proposed, based on the recurrent combining of aerial photography images; it includes keypoint detection in the images and the construction of a RANSAC regression model from these points. An algorithm for matching geographic coordinates to points of the combined image is also proposed. It is based on the idea of equivalent transformations of the visual data and the geographic coordinates of objects. The proposed algorithms are implemented as a software tool, which has been tested on several sets of aerial photography data. Prospects for the development of the proposed approach are identified, along with shortcomings of its algorithms that need to be eliminated. It has been established that further optimization of memory use when combining aerial photography images and further research on compensating for perspective distortion are necessary. The approach is shown to be applicable to obtaining, fusing, processing and visualizing weakly formalized multispectral data in aerial photography across various ranges (thermal imaging, optical, etc.), as well as in other areas of data processing and analysis, such as detection and semantic segmentation of objects in aerial photography images. Additional spatial information can improve the accuracy of classification and segmentation of objects in images.
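The role of RANSAC in combining two overlapping images can be sketched with a deliberately simplified model. The example below assumes the images differ only by a 2D translation, so a single keypoint match is enough to propose a model; the article's regression model is not specified in the abstract and may well be richer. The match coordinates, tolerance and function name are hypothetical.

```python
import random

def ransac_translation(matches, n_iter=100, tol=2.0, seed=0):
    """Estimate a 2D shift from keypoint matches ((x1, y1), (x2, y2)), ignoring outliers.

    Expects a non-empty list of matches.
    """
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    best_inliers = []
    for _ in range(n_iter):
        (x1, y1), (x2, y2) = rng.choice(matches)  # minimal sample: one match
        dx, dy = x2 - x1, y2 - y1
        inliers = [m for m in matches
                   if abs(m[1][0] - m[0][0] - dx) <= tol
                   and abs(m[1][1] - m[0][1] - dy) <= tol]
        if len(inliers) > len(best_inliers):
            best_inliers = inliers
    # Refit on the consensus set by averaging the inlier offsets.
    n = len(best_inliers)
    dx = sum(b[0] - a[0] for a, b in best_inliers) / n
    dy = sum(b[1] - a[1] for a, b in best_inliers) / n
    return (dx, dy), best_inliers

# Hypothetical matches: eight consistent with a shift of (50, -20), three wrong.
jitter = [0.0, 0.4, -0.3, 0.2, -0.5, 0.1, 0.3, -0.2]
good = [((10 * i, 5 * i), (10 * i + 50 + j, 5 * i - 20 - j)) for i, j in enumerate(jitter)]
bad = [((0, 0), (200, 90)), ((30, 40), (-75, 10)), ((60, 10), (61, 300))]

shift, inliers = ransac_translation(good + bad)
print(shift, len(inliers))
```

For real aerial imagery a homography or affine model is the more usual choice, which needs several matches per sample instead of one.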

Author: Dmitry Chitalov

The purpose of the presented research is to refine the initial release of the graphical shell for the OpenFOAM package by designing and connecting an additional module for numerical experiments with the twoPhaseEulerFoam solver in the modeling of continuum mechanics problems. Unlike existing comparable applications, this module is open source software, does not require the purchase of maintenance services, and has a Russian-language interface. To simplify further support and modification, the source code of the application's front end is separated from the code that implements its operating logic. The key original approaches proposed by the author also include a subsystem for serializing design parameters, which converts the parameters of a design case into JSON and CSV objects and performs the reverse conversion. This allows the user to switch between different parameter sets for one design case. In addition, the software module includes a mechanism for checking the completeness of the design case before the numerical experiment is started. Some features of the solver and the principles of its use in preparing calculation cases are considered. The purpose of the study is defined and a list of required tasks is compiled. The selected technology stack and the development aids are described. A process diagram demonstrates how the application works, with a description of each step. The results of the study were tested on one of the fundamental problems of continuum mechanics and are presented as an updated version of the graphical shell, publicly available on GitHub. The results confirm the effectiveness of the selected technology stack for achieving the development goals, and the completed tasks are noted. The practical significance of the results lies in the potential saving of working time for engineers and researchers, minimizing modeling errors, and simplifying the preparation of a design case.
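The serialization subsystem described above can be illustrated with a minimal round trip between a parameter set and JSON or CSV text. This is a sketch under assumptions, not the module's actual code: the flat dictionary layout, the two-column CSV format, the function names and the sample parameter values are all hypothetical.

```python
import csv
import io
import json

def to_json(params):
    """Serialize a flat parameter dictionary to JSON text."""
    return json.dumps(params, indent=2, sort_keys=True)

def from_json(text):
    """Restore the parameter dictionary from JSON text."""
    return json.loads(text)

def to_csv(params):
    """Serialize to two-column CSV; values are JSON-encoded to keep their types."""
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow(["name", "value"])
    for name, value in sorted(params.items()):
        writer.writerow([name, json.dumps(value)])
    return buf.getvalue()

def from_csv(text):
    """Restore the parameter dictionary from the CSV text produced by to_csv."""
    reader = csv.reader(io.StringIO(text))
    next(reader)  # skip the header row
    return {name: json.loads(value) for name, value in reader}

# Hypothetical parameter set for one design case (illustrative values only).
case = {"startTime": 0, "endTime": 10.5, "deltaT": 0.001, "writeFormat": "ascii"}

assert from_json(to_json(case)) == case
assert from_csv(to_csv(case)) == case
```

Storing each parameter set as a separate file in either format is one simple way to let a user switch between sets for the same case.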

The article presents the results of a study aimed at creating an intelligent machine learning system for modeling charge agglomeration during the processing of phosphate ore raw materials. The relevance of the study stems from the need to improve the information support of technological systems management in the context of the digital transformation of the production environment. This transformation is taking place within the framework of the Fourth Industrial Revolution and is characterized by the massive introduction of the industrial Internet of things, which leads to an avalanche-like increase in the volume of technological data. Processing these data with modern analysis methods, including artificial intelligence methods, can improve the quality of decisions and provide competitive advantages. The scientific novelty of the results is the structure of the proposed hybrid intelligent machine learning system for modeling phosphate ore processing, which is based on the joint use of a dynamic model of the sintering process in the Simulink environment and a deep neural network. The architecture of the neural network was developed taking into account the specifics of the mathematical description of the agglomeration process. It includes input fully connected layers that receive measurements of process variables and a recurrent layer that processes the combined sequence of outputs of the fully connected layers. Integrating a Simulink model with a deep neural network makes it possible to quickly adapt the intelligent system to a specific sintering machine through a two-stage machine learning procedure: first on a Simulink simulation model, and then on the real object. Given the significant inertia of the processes accompanying agglomeration, this approach ensures prompt changes in the settings of the hybrid intelligent machine learning system for a new composition of raw materials and new technological parameters. A program has been developed that provides a convenient graphical interface for preparing and using the intelligent system. Simulation experiments have shown that additional training for new technological parameters is much faster than initial training while maintaining high accuracy of the modeling results.

Laboratory

Research of processes and systems

The speed, precision and recall of information search in e-commerce are critical indicators for business success. Many academic studies aim to improve these indicators through more efficient utilization of hardware and the development of machine learning models with new “layers”, training data and loss functions. In this study, the author focuses on the practical task of speeding up search by using knowledge about the nature of the load and the data. Widely used Approximate Nearest Neighbors methods rely on artificial clusters to reduce the processing time of a search query, but this worsens the recall of the list of candidates found. The approach is justified by its universality with respect to data. On an online trading platform, however, the data are products and their modalities (name, description, product class, images), and this knowledge about the data can be used to create more effective search structures and algorithms. When product data change rapidly, speed, precision and recall must also be taken into account for both offline and online processes. Therefore, the author treats the task of ensuring the recall and precision of search results within an end-to-end process, and not only as a retrieval phase. As a result, the author achieved an improvement of more than 50% in the recall and precision of the retrieved product information without reducing the speed of search query processing.
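One way to picture the use of data knowledge in place of artificial clusters is to partition the index by a known product attribute and search exactly within the partition. The sketch below is a hypothetical illustration of that general idea, not the author's method: the field names, the two-dimensional vectors and the assumption that the product class of a query is known in advance are all invented for the example.

```python
from collections import defaultdict

def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def build_index(products):
    """Group products by their known class instead of by learned clusters."""
    index = defaultdict(list)
    for product in products:
        index[product["product_class"]].append(product)
    return index

def search(index, product_class, query_vec, k=2):
    """Exact top-k search by dot product, restricted to one product class."""
    candidates = index.get(product_class, [])
    ranked = sorted(candidates, key=lambda p: -dot(query_vec, p["vec"]))
    return [p["name"] for p in ranked[:k]]

# Hypothetical catalog with assumed embedding vectors.
products = [
    {"name": "red mug", "product_class": "kitchen", "vec": [0.9, 0.1]},
    {"name": "blue mug", "product_class": "kitchen", "vec": [0.7, 0.3]},
    {"name": "red scarf", "product_class": "clothing", "vec": [0.95, 0.05]},
]

index = build_index(products)
results = search(index, "kitchen", [1.0, 0.0], k=1)
print(results)
```

Because the search inside a class is exact, no candidate from that class is lost, while items from other classes (such as the scarf here) are never scored at all.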

REVIEWS AND COMMENTARIES

Author: V. Meshalkin

A review of the new edition of the textbook “Competition in Entrepreneurship” for higher educational institutions, published by the University publishing house “Synergy”, is presented. The main advantage of the textbook is the author’s own view on such issues as the process of competitiveness, its objects, subjects, actors, results and resources, the types and character of competitive interaction, types of competitive actions, methods of operational interaction between competing parties, competitive stability and competitive ability, competitive positions, competitive status, competitive strategies, tactical competitive operations and combinations, competitive situations, competitive tricks and puzzles, management of competitive actions, and competitive analysis. In the context of the digital economy and the application of intelligent management technologies, using this publication in the educational process will allow university students to develop the knowledge and skills necessary to conduct successful business activities in modern economic conditions. The tools described in the textbook can be used effectively at the strategic, operational and situational levels of participation in competition, and can potentially be implemented as part of an organization’s corporate information systems.