Degree
|
Cand. Sci. (Phys.-Math.), Head of Neural Networks and Machine Learning Department, Institute of Applied Mathematics and Automation KBSC RAS |
---|---|
E-mail
|
lylarisa@yandex.ru |
Location
|
Nalchik, Russia |
Articles
|
Forecasting mudflow characteristics with incomplete and imprecise data based on machine learning models
In this paper we propose a method for analyzing incomplete and inaccurate data in order to identify factors for predicting the volume of mudflows. The analysis is based on mudflow activity inventory data for the south of Russia; the data are poorly formalized, have missing values in the mudflow type field, and require significant additional processing. Because the cadastral records lack information on mudflow type, the primary objective of the study is to develop and apply a methodology for classifying mudflow types to fill in the missing data. For this purpose, a comparative study of machine learning methods was performed, including neural networks, support vector machines, and logistic regression. The experimental results indicate that the neural network-based model has the highest prediction accuracy among the methods considered. However, the support vector machine demonstrated higher sensitivity for classes represented by a small number of examples in the test sample. It was therefore concluded that an integrated approach combining the strengths of both methods is appropriate and can help improve overall classification accuracy in this subject area. Forecasting the volume of material removed and clustering the data revealed nonlinear dependencies, incompleteness, and poor structuring of the data even after the missing mudflow type values were filled in, which required a transition from numerical to categorical data. This transition increased the model's resistance to outliers and noise, allowing a highly accurate forecast of the one-time removal volume. Since the forecast does not reveal the factors influencing its result, an analysis was conducted to identify these factors and present the discovered patterns in the form of logical rules.
The logical rules were formed using two methods: association analysis and the construction of a logical classifier. Association analysis yielded rules that reflect some patterns in the data but, as it turned out, need significant correction. The developed logical methods made it possible to clarify and correct the patterns identified by the association rules, which in turn made it possible to determine a set of factors influencing the volume of the mudflow. |