Methods of artificial intelligence and intelligent systems
E.V. Chistova, A.O. Shelmanov, I.V. Smirnov. Natural language dialogue modelling with deep learning


Building natural language dialogue systems that can converse coherently with a user is an important open problem in artificial intelligence. This paper presents an overview of open-domain generative neural network dialogue models. We consider the main problems in constructing dialogue models based on machine learning, along with methods for solving them. An experimental comparison of the vanilla encoder-decoder neural network model with its modification augmented by an attention mechanism was carried out on Russian-language data.
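The comparison described in the abstract contrasts a vanilla encoder-decoder, which compresses the entire input utterance into a single fixed-size vector, with an attention-based modification that re-weights all encoder hidden states at every decoding step. A minimal sketch of dot-product attention in pure Python (all names and toy values below are hypothetical illustrations, not the authors' implementation):

```python
import math

def softmax(xs):
    # numerically stable softmax over a list of scores
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def attention(decoder_state, encoder_states):
    """Dot-product attention: score each encoder hidden state
    against the current decoder state, normalize the scores with
    softmax, and return the weighted context vector."""
    scores = [dot(decoder_state, h) for h in encoder_states]
    weights = softmax(scores)
    dim = len(decoder_state)
    context = [
        sum(w * h[i] for w, h in zip(weights, encoder_states))
        for i in range(dim)
    ]
    return context, weights

# toy example: three 2-dimensional encoder states
enc = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
dec = [1.0, 0.0]
ctx, w = attention(dec, enc)
```

Unlike the vanilla model, the decoder here receives a different context vector at each step, which mitigates the information bottleneck of a single fixed representation on long inputs.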


Keywords: dialogue systems, natural language processing, natural language generation, neural networks, artificial intelligence, deep learning, encoder-decoder model.

PP. 105-115.

DOI: 10.14357/20790279190110




© FRC CSC RAS, 2008-2018.