System analysis in medicine and biology
E.N. Kuznetsov, A.A. Anashkina, A.A. Dorofeyuk, J.A. Dorofeyuk, N.G. Esipova, A.G. Spiro, V.G. Tumanyan "Cluster analysis of DNA-protein spatial contacts using the Voronoi-Delaunay procedure"


A classification of amino acid residues is proposed based on features of the contacts between protein amino acids and DNA nucleotides; classifications with different types of fuzziness are considered. Voronoi-Delaunay tessellation was used to determine the number and area of contacts of each amino acid with each nucleotide. It is shown that clustering invariants of amino acids exist and that a fuzzy classification of amino acids into 6 classes is optimal for the protein-nucleic acid recognition problem.

Keywords:

cluster analysis, fuzzy classification, amino acid-nucleotide contacts, Voronoi-Delaunay tessellation, properties of amino acid residues.
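
The contact-detection step named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it works in 2D rather than 3D, finds Delaunay neighbours by brute force (a triangle is kept when its circumcircle contains no other point), counts contacts only and does not compute contact areas. The atom coordinates, the `atoms` tuple layout, and the function names are hypothetical.

```python
from collections import Counter
from itertools import combinations


def circumcircle(a, b, c):
    """Return (cx, cy, r_squared) of the circle through a, b, c, or None if collinear."""
    ax, ay = a
    bx, by = b
    cx, cy = c
    d = 2.0 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
    if abs(d) < 1e-12:
        return None
    a2, b2, c2 = ax * ax + ay * ay, bx * bx + by * by, cx * cx + cy * cy
    ux = (a2 * (by - cy) + b2 * (cy - ay) + c2 * (ay - by)) / d
    uy = (a2 * (cx - bx) + b2 * (ax - cx) + c2 * (bx - ax)) / d
    return ux, uy, (ax - ux) ** 2 + (ay - uy) ** 2


def delaunay_edges(points):
    """Brute-force 2D Delaunay edges: keep triangles with an empty circumcircle."""
    edges = set()
    n = len(points)
    for i, j, k in combinations(range(n), 3):
        circle = circumcircle(points[i], points[j], points[k])
        if circle is None:
            continue
        ux, uy, r2 = circle
        empty = all(
            (points[m][0] - ux) ** 2 + (points[m][1] - uy) ** 2 >= r2 - 1e-9
            for m in range(n)
            if m not in (i, j, k)
        )
        if empty:
            edges.update({(i, j), (i, k), (j, k)})
    return edges


def contact_counts(atoms):
    """Count Delaunay edges that join a protein atom to a DNA atom.

    atoms: list of (x, y, molecule, unit), molecule being "protein" or "dna".
    Returns {(residue, nucleotide): number_of_contacts}.
    """
    points = [(x, y) for x, y, _, _ in atoms]
    counts = Counter()
    for i, j in delaunay_edges(points):
        mol_i, unit_i = atoms[i][2], atoms[i][3]
        mol_j, unit_j = atoms[j][2], atoms[j][3]
        if {mol_i, mol_j} == {"protein", "dna"}:
            pair = (unit_i, unit_j) if mol_i == "protein" else (unit_j, unit_i)
            counts[pair] += 1
    return dict(counts)


# Hypothetical toy coordinates, chosen only to make the geometry easy to check.
atoms = [
    (0.0, 0.0, "protein", "ARG"),
    (2.0, 0.0, "dna", "G"),
    (1.0, 1.5, "protein", "LYS"),
    (1.0, -1.5, "dna", "A"),
]
contacts = contact_counts(atoms)
print(contacts)
```

In this toy layout LYS and A are not Delaunay neighbours, so no contact is recorded for that pair even though both atoms are present. A real analysis would need a 3D tessellation and the areas of the shared Voronoi faces.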

Pp. 85-96.


E.N. Kuznetsov, A.A. Anashkina, A.A. Dorofeyuk, J.A. Dorofeyuk, A.G. Spiro

"Cluster analysis of DNA-protein spatial contacts using the Voronoi-Delaunay procedure"

Abstract. The paper addresses the classification of amino acid residues based on the parameters of contacts between protein amino acids and DNA nucleotides. Amino acid residues have many different properties and functions and can belong to several classes simultaneously. It was therefore of interest to classify amino acids using different types of fuzziness. Voronoi-Delaunay tessellation was used to determine the number and area of contacts of each amino acid with each nucleotide in 1937 complexes. A general variational approach was used to classify amino acids with different types of fuzziness. Results: About 30% of all contacts between amino acids and nucleotides in protein-DNA complexes were shown to be non-random. Crisp classification methods revealed clustering invariants of amino acids at the lowest level of association. Fuzzy classification methods showed that six classes are optimal for the protein-DNA recognition task. Conclusions: The fuzzy classification of amino acids can be used to construct a substitution matrix for DNA-binding protein sequences and for protein-DNA binding analysis.

Keywords: cluster analysis, crisp classification, fuzzy classification, protein-DNA interactions.
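
To make the idea of fuzzy classification concrete, the sketch below uses standard fuzzy c-means, in which each residue receives a degree of membership in every class instead of a single label. This is a swapped-in textbook technique, not the general variational approach used in the paper. The two-dimensional contact profiles, the choice of two classes, and all names (`profiles`, `memberships`, `fuzzy_c_means`) are assumptions made for illustration only.

```python
import math


def memberships(data, centers, m=2.0):
    """Fuzzy membership of every data point in every cluster (each row sums to 1)."""
    table = []
    for x in data:
        dists = [math.dist(x, c) for c in centers]
        if any(d == 0.0 for d in dists):
            # A point sitting exactly on a center belongs fully to that center.
            row = [1.0 if d == 0.0 else 0.0 for d in dists]
        else:
            row = [d ** (-2.0 / (m - 1.0)) for d in dists]
        total = sum(row)
        table.append([v / total for v in row])
    return table


def fuzzy_c_means(data, centers, m=2.0, n_iter=100, tol=1e-9):
    """Alternate membership and center updates until the centers stop moving."""
    dim = len(data[0])
    for _ in range(n_iter):
        u = memberships(data, centers, m)
        new_centers = []
        for k in range(len(centers)):
            weights = [u[i][k] ** m for i in range(len(data))]
            total = sum(weights)
            new_centers.append(
                [sum(w * x[j] for w, x in zip(weights, data)) / total for j in range(dim)]
            )
        shift = max(math.dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return memberships(data, centers, m), centers


# Hypothetical contact profiles (made-up numbers, not data from the paper).
profiles = {
    "ARG": [0.90, 0.80],
    "LYS": [0.85, 0.75],
    "ASP": [0.10, 0.20],
    "GLU": [0.15, 0.10],
    "SER": [0.50, 0.45],
}
names = list(profiles)
data = [profiles[name] for name in names]

# Deterministic start: use the ARG and ASP profiles as initial centers.
u, centers = fuzzy_c_means(data, [data[0], data[2]])
for name, row in zip(names, u):
    print(name, [round(v, 3) for v in row])
```

With these made-up profiles, ARG and LYS fall almost entirely into one class and ASP and GLU into the other, while SER keeps a substantial membership in both. That is the situation the abstract describes when it says a residue can belong to different classes simultaneously.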



