Degree
|
PhD (math), OJST «Lafarge cement»
|
Location
|
Chimky
|
Articles
|
In this paper we solve the problem of classifying literary texts by genre and author using statistical methods. The main instruments of the analysis are the distribution function of a text over letters and the sample distribution function. Every text under consideration is sufficiently large that its distribution can be treated as stationary to an accuracy of about 3%. For such texts, the distances between their distributions are calculated in the space of integrable functions. The classification criterion is based on the closeness of the two-lateral text distributions. This method makes it possible to determine the author with an accuracy of 5% and the genre with an accuracy of 15%. One-lateral ...
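The approach summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes single-letter frequencies over the Latin alphabet, uses the discrete L1 distance as a stand-in for the distance in the space of integrable functions, and assigns a text to the nearest reference sample. The names `letter_distribution`, `l1_distance`, and `nearest_label` are hypothetical.

```python
from collections import Counter
import string

# Assumption: lowercase Latin alphabet; the abstract does not state the alphabet.
ALPHABET = string.ascii_lowercase


def letter_distribution(text, alphabet=ALPHABET):
    """Return the relative frequency of each alphabet letter in the text."""
    counts = Counter(ch for ch in text.lower() if ch in alphabet)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("text contains no letters from the alphabet")
    return [counts[ch] / total for ch in alphabet]


def l1_distance(p, q):
    """Discrete L1 distance between two distributions of equal length."""
    return sum(abs(a - b) for a, b in zip(p, q))


def nearest_label(text, references):
    """Return the label whose reference text is closest in L1 distance.

    `references` maps a label (author or genre) to a sample text.
    """
    dist = letter_distribution(text)
    return min(
        references,
        key=lambda label: l1_distance(dist, letter_distribution(references[label])),
    )
```

For example, `nearest_label(unknown_text, {"Author A": sample_a, "Author B": sample_b})` returns the label of the reference sample with the closest letter distribution. Texts would need to be long enough for the frequencies to be stable, in line with the stationarity condition stated in the abstract.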
|