Computer analysis of texts
M.I. Ananyeva, D.A. Devyatkin, M.A. Kamenskaya, M.V. Kobozeva, I.V. Smirnov. Extraction of financial and economic information from texts in Russian

Abstract. 

In this article we consider some problems that arise when developing methods and a system for the automatic extraction of economic events such as capital investment (e.g. in ecological projects), financial provision (e.g. of regions), and purchases (e.g. of equipment). Our research focuses on a particular geographical area, the Arctic Region. The aim of the project is to develop a pilot decision support system that analyzes Internet media. We propose a method for extracting economic events together with the sums spent, the investors, and the location of the object being financed. We created an experimental dataset in Russian that includes materials from electronic media and journals dedicated to the Arctic. The quality of the proposed method was confirmed experimentally on this dataset.

Keywords:

information extraction, detection of economic events, decision support.

pp. 23-30

