Classification problems form a class of machine learning tasks in which each instance is
associated with a category or label. An individual classifier, such as a neural network or a decision
tree, is conventionally trained on a pre-labeled or pre-processed data set. Depending on the distributions
of the parameters, a data set may pose difficulties: such a classifier may fail to learn all the indicators
efficiently, which results in inconsistent performance on the test sets. An ensemble of classifiers is
a set of individual classification algorithms that are trained simultaneously on the same classification problem. The
aim of the paper is twofold. We present an ensemble-of-classifiers approach with high predictive power for
predicting the bankruptcy of Russian trade-related companies. At the first stage, we split the data into a
training set (70%) and a test set (30%). At the second stage, the precision of the standard algorithms is measured
on the empirical indicators of the data. The algorithms are trained and tested, and then
compared using performance metrics. The standard algorithms include random forest; decision trees
and their modifications: chi-square automatic interaction detection (CHAID), classification and regression
trees (CRT, C5), and the Quick, Unbiased, Efficient Statistical Tree (QUEST); linear discriminant analysis
(LDA); support vector algorithms (LSVM, SVM); and neural networks (multilayer and radial). Based on the
ROC-curve metrics and the predictive ability of the algorithms, we select the most efficient methods
to form the ensemble of classifiers. The empirical data set includes 713 trade companies,
334 of which are known bankrupts. The results demonstrate the efficiency of the ensemble of classifiers
based on simple voting: its precision exceeds that of the individual algorithms
(e.g., random forest, SVM, Logit). We also show that including macroeconomic factors improves
the predictive power of almost all studied algorithms by at least 8%. In particular, more sophisticated
classifiers, such as multilayer neural networks and random forests, demonstrate higher
precision and recall when the external variables are used in the training process.
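
The 70/30 split, the simple-voting ensemble, and the precision and recall metrics described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (split_train_test, majority_vote, precision_recall), the random seed, the tie-breaking rule, and the toy labels are all hypothetical and chosen only for the example.

```python
import random
from collections import Counter


def split_train_test(items, test_share=0.3, seed=0):
    """Shuffle the items and split them into train and test parts.

    The 30% test share follows the abstract; the seed is an assumption.
    """
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_share))
    return shuffled[n_test:], shuffled[:n_test]


def majority_vote(predictions):
    """Combine label lists from several classifiers by simple (unweighted) voting.

    predictions: one list of 0/1 labels per classifier, all of equal length.
    Ties go to label 1 (bankrupt); this tie-breaking rule is an assumption.
    """
    combined = []
    for votes in zip(*predictions):
        counts = Counter(votes)
        combined.append(1 if counts[1] >= counts[0] else 0)
    return combined


def precision_recall(y_true, y_pred):
    """Return (precision, recall) for the positive class 1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Toy labels (1 = bankrupt, 0 = not bankrupt); invented for illustration only.
y_true = [1, 0, 1, 1, 0, 0]
clf_a = [1, 0, 0, 1, 0, 1]
clf_b = [1, 1, 1, 1, 0, 0]
clf_c = [0, 0, 1, 1, 0, 0]

ensemble = majority_vote([clf_a, clf_b, clf_c])
ensemble_scores = precision_recall(y_true, ensemble)
single_scores = precision_recall(y_true, clf_a)

# Splitting 713 company indices in the 70/30 proportion.
train_ids, test_ids = split_train_test(range(713))
```

In this toy case each individual classifier makes at least one error, while the voted labels match the true ones, which is the effect an ensemble aims for. Real classifiers would replace the hand-written label lists.
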
Key words
bankruptcy prediction, classifier ensemble, machine learning, information systems, combined algorithm