Soloviev A.V., Tishchenko V.A. The problems of constructing of alphabetical classifier (on an example of an array of NIKA DBMS)

Abstract.

This paper considers the problems that arise when constructing an alphabetic classifier for sufficiently large arrays of text keys. Because words (text keys) are unevenly distributed across alphabetic combinations, it is difficult to construct an optimal classifier structure for navigating to a given key. Characteristics of the classifier are examined, such as the random distribution of the key length and the random distribution of the number of vertices in a group. A regression model based on orthogonal polynomials is proposed for the dependence of the average key length in a group on the maximum number of vertices in a group. An example of constructing such a dependence for the name field is given. Several example dependencies are used to analyze their form and range of applicability. An example of a dependence constructed with a fuzzy regression analysis model is also given.
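The two quantities related in the abstract, the maximum number of vertices in a group and the average key length, can be illustrated with a minimal sketch. The abstract does not describe the authors' construction, so everything here is an assumption made for illustration: the function names `prefix_groups` and `average_prefix_length`, the sample keys, and the splitting rule itself (a prefix is lengthened by one character until no more than `max_vertices` keys share it). The regression step with orthogonal polynomials is not shown.

```python
def prefix_groups(keys, max_vertices):
    """Split keys into prefix groups holding at most max_vertices keys.

    Hypothetical splitting rule (not taken from the paper): a prefix is
    extended one character at a time until its group is small enough.
    Returns a dict mapping each prefix to the sorted keys it covers.
    """
    if max_vertices < 1:
        raise ValueError("max_vertices must be at least 1")
    groups = {}
    pending = [("", sorted(set(keys)))]
    while pending:
        prefix, members = pending.pop()
        if len(members) <= max_vertices:
            groups[prefix] = members
            continue
        # Too many keys share this prefix: split on the next character.
        buckets = {}
        for key in members:
            if len(key) == len(prefix):
                # The key is the prefix itself and cannot be extended.
                groups[key] = [key]
            else:
                buckets.setdefault(key[:len(prefix) + 1], []).append(key)
        pending.extend(buckets.items())
    return groups


def average_prefix_length(groups):
    """Mean prefix length over all groups (one possible 'average key length')."""
    return sum(len(prefix) for prefix in groups) / len(groups)


# Illustrative keys only; they are not data from the NIKA DBMS array.
sample_keys = ["adam", "adrian", "alex", "alice", "bob", "boris", "carl"]
for limit in (2, 4, 7):
    groups = prefix_groups(sample_keys, limit)
    print(limit, sorted(groups), average_prefix_length(groups))
```

Under this rule a smaller group limit forces longer prefixes, which is the kind of dependence a regression model would then describe. Because keys are unevenly distributed, the prefix lengths differ between branches of the classifier.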

Keywords:

multilevel alphabetic classifier, regression dependence, key length in the classifier, number of vertices in the group.

pp. 63-73 


 
