Degree
|
PhD in Technique, Samsung R&D Institute Rus |
---|---|
E-mail
|
itrub@yandex.ru |
Location
|
Moscow |
Articles
|
Simulation of hierarchical bitmap-indicesThe article describes simulation model of search queries to database with logical operations on hierarchical
bitmap indices. Proposed approach considers bitmaps on time property and combines multileveling
of bitmaps with their binning on sequence of different time units. The input data are distribution
of creation new records flow and distribution of query time interval, the varying condition is
bitmap hierarchy, the model output is average number of OR/XOR logical operations to satisfy query.
The target is to choose bitmap indices hierarchy, that minimizes this number. Simulation technique for
such kind of model is very special, that is why model is manually implemented on C language in order
to carefully cover all its features. Much attention is payed to output analysis as stopping rule when
required confidence interval for target parameter is obtained. Mathematical background of model includes
random number generators and some numerical methods, such as least square method, integration
and spectral analysis. Model verification is provided by several ways, in particular, by comparison
with existing analytical model results when input distribution is exponential. Several experiments
confirm existence of optimal hierarchy and were carried out for different kinds of distributions including
heavy-tailed. Graphical illustrations of simulation results are presented.
Read more...
Advanced simulation based algorithm for search optimal size of bitmap indexThe article continues learning of database hierarchical bitmap indices simulation model, that was
earlier built by author. Target of research is to find optimal time unit size of the second level index
for arbitrary distributions of new database records flow, query time length interval and granularity of
query interval edges. The optimized value is average number of OR/XOR bitstring operations to satisfy
range query. The usual way is to run model many times with different values of second level index
size in accordance with chosen optimization method. Article proposes heuristic algorithm which
allows to get very good initial approximation to optimal value with only one model run and in practice
it quite can be assigned as the final solution. This result is obtained by some facts from renewal
theory but mainly by consideration of special integral function from random model input data and replacement
model output by values of this function. Mathematical properties of this function, that help
to do such trick, are formulated and proved by means of simple number theory suggestions. It is noted,
that this approach is similar to one, in which well known slice indexes are introduced . Proposed algorithm
is confirmed by some numerical results for some kinds of input data distributions. Graphical
illustrations of results are also presented.
Read more...
|