Computer Analysis of Texts
A.O. Shelmanov, M.A. Kamenskaya "Training an analyzer for semantic role labeling of Russian-language texts on an automatically annotated corpus"

Abstract.

The paper investigates approaches to semantic role labeling that rely on semi-supervised machine learning. It presents a way to improve the quality of semantic analysis by training on a corpus annotated automatically by a dictionary-based (rule-based) semantic analyzer. It proposes an approach to semantic role labeling for "unknown" predicate words, that is, predicates absent from the semantic dictionary of the dictionary-based analyzer. The paper also presents a hybrid semantic analyzer that combines two machine learning models, one for "known" and one for "unknown" predicate words, with the dictionary-based semantic analyzer. Experiments on a manually annotated Russian-language corpus show that the proposed modifications improve the recall and the overall quality of semantic role labeling.
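The hybrid arrangement described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the toy `LEXICON`, the case-based features, the `MajorityModel` stand-in for a trained classifier, and the order in which the three components are consulted. In particular, the delexicalized "case only" feature merely stands in for whatever generalization the authors use for unknown predicates (the keywords mention word embeddings).

```python
from collections import Counter, defaultdict

# Hypothetical toy semantic dictionary: predicate lemma -> {argument case: role}.
LEXICON = {
    "give": {"nom": "agent", "acc": "object", "dat": "recipient"},
}


def rule_based(arg):
    """Dictionary analyzer: returns a role only if the lexicon covers the argument."""
    predicate, case = arg
    return LEXICON.get(predicate, {}).get(case)


class MajorityModel:
    """Stand-in for a trained classifier: most frequent role per feature key."""

    def __init__(self, key):
        self.key = key  # function mapping an argument to its feature key
        self.counts = defaultdict(Counter)

    def fit(self, examples):
        for arg, role in examples:
            self.counts[self.key(arg)][role] += 1
        return self

    def predict(self, arg):
        seen = self.counts.get(self.key(arg))
        return seen.most_common(1)[0][0] if seen else None


# Step 1: label a raw corpus automatically with the dictionary analyzer,
# keeping only the arguments it was able to label.
raw_corpus = [("give", "nom"), ("give", "acc"), ("give", "dat"), ("give", "ins")]
auto_labeled = [(arg, rule_based(arg)) for arg in raw_corpus]
auto_labeled = [(arg, role) for arg, role in auto_labeled if role is not None]

# Step 2: train two models on the automatically labeled data.
# "known" uses lexicalized features; "unknown" ignores the predicate lemma
# (an assumed simplification) so it can generalize to unseen predicates.
known = MajorityModel(key=lambda arg: arg).fit(auto_labeled)
unknown = MajorityModel(key=lambda arg: arg[1]).fit(auto_labeled)


def hybrid_label(arg, known_model, unknown_model):
    """Route an argument to a component (the routing order is an assumption)."""
    predicate, _ = arg
    if predicate in LEXICON:
        return rule_based(arg) or known_model.predict(arg)
    return unknown_model.predict(arg)


print(hybrid_label(("give", "dat"), known, unknown))
print(hybrid_label(("send", "dat"), known, unknown))
```

In this toy setup the predicate "send" is absent from the dictionary, so its argument is labeled by the model for unknown predicates, which is the kind of recall gain the abstract refers to.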

Keywords:

semantic role labeling, semi-supervised machine learning, semantic analysis, word embeddings.

Pp. 104-120.



 
