Information Technologies
Data Mining
Junzhe Song, D.E. Namiot "A Survey of Model Inversion Attacks and Countermeasures"
Abstract. 

This article provides a detailed overview of so-called Model Inversion (MI) attacks. These attacks target Machine-Learning-as-a-Service (MLaaS) platforms: an adversary uses carefully prepared samples to query a target model and extract sensitive information from it, such as records from the dataset on which the model was trained or the model's parameters. Such attacks have become a serious threat to ML models; it is therefore necessary to study them, understand how they affect ML models, and, based on this knowledge, propose strategies that may improve the robustness of ML models.
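The core idea surveyed here — querying a model and optimizing an input to maximize the confidence it reports — can be illustrated in a few lines. The sketch below is a toy example, not any paper's actual method: the synthetic "private" data, the logistic-regression target model, and all hyperparameters are invented for illustration. It trains a model and then reconstructs a class-representative input by gradient ascent on the model's confidence, in the spirit of the confidence-based attack of Fredrikson et al. (reference 1).

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "private" training data: two Gaussian classes standing in for sensitive records.
mu0, mu1 = np.array([-2.0, -2.0]), np.array([2.0, 2.0])
X = np.vstack([rng.normal(mu0, 1.0, size=(100, 2)),
               rng.normal(mu1, 1.0, size=(100, 2))])
y = np.array([0] * 100 + [1] * 100)

# Train a logistic-regression "target model" with plain gradient descent.
w, b = np.zeros(2), 0.0
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
    g = p - y
    w -= 0.1 * (X.T @ g) / len(y)
    b -= 0.1 * g.mean()

def confidence(x, cls=1):
    """Target model's reported confidence that input x belongs to class `cls`."""
    p = 1.0 / (1.0 + np.exp(-(x @ w + b)))
    return p if cls == 1 else 1.0 - p

# Model inversion: gradient ascent on the INPUT to maximize class-1 confidence,
# recovering a representative input for that class without seeing the training data.
x = np.zeros(2)
for _ in range(200):
    p = confidence(x)
    grad = p * (1.0 - p) * w   # d(confidence)/dx for the sigmoid model
    x += 0.5 * grad

print(float(confidence(x)))    # high confidence for class 1
print(x)                       # drifts toward the class-1 region of input space
```

For a linear model this recovers only a class-typical direction; the generative attacks covered in the survey (references 6, 7, 17) replace the raw-input search with optimization in a GAN's latent space to obtain realistic reconstructions from deep networks.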

Keywords: 

adversarial machine learning, model inversion attack, deep learning, cybersecurity.

Pp. 82-93.

DOI: 10.14357/20790279230110
 
 
References

1. Fredrikson, M., Jha, S., & Ristenpart, T. (2015, October). Model inversion attacks that exploit confidence information and basic countermeasures. In Proceedings of the 22nd ACM SIGSAC conference on computer and communications security (pp. 1322-1333).
2. Wu, X., Fredrikson, M., Jha, S., & Naughton, J.F. (2016, June). A methodology for formalizing model-inversion attacks. In 2016 IEEE 29th Computer Security Foundations Symposium (CSF) (pp. 355-370). IEEE.
3. Yeom, S., Giacomelli, I., Fredrikson, M., & Jha, S. (2018, July). Privacy risk in machine learning: Analyzing the connection to overfitting. In 2018 IEEE 31st Computer Security Foundations Symposium (CSF) (pp. 268-282). IEEE.
4. Basu, S., Izmailov, R., & Mesterharm, C. (2019). Membership model inversion attacks for deep networks. arXiv preprint arXiv:1910.04257.
5. Zhao, X., Zhang, W., Xiao, X., & Lim, B.Y. (2021). Exploiting Explanations for Model Inversion Attacks. arXiv preprint arXiv:2104.12669.
6. Zhang, Y., Jia, R., Pei, H., Wang, W., Li, B., & Song, D. (2020). The secret revealer: Generative model-inversion attacks against deep neural networks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 253-261).
7. Chen, S., Kahla, M., Jia, R., & Qi, G.J. (2021). Knowledge-Enriched Distributional Model Inversion Attacks. In Proceedings of the IEEE/CVF International Conference on Computer Vision (pp. 16178-16187).
8. He, Z., Zhang, T., & Lee, R.B. (2019, December). Model inversion attacks against collaborative inference. In Proceedings of the 35th Annual Computer Security Applications Conference (pp. 148-162).
9. Wang, T., Zhang, Y., & Jia, R. (2020). Improving robustness to model inversion attacks via mutual information regularization. arXiv preprint arXiv:2009.05241.
10. Titcombe, T., Hall, A. J., Papadopoulos, P., & Romanini, D. (2021). Practical Defences Against Model Inversion Attacks for Split Neural Networks. arXiv preprint arXiv:2104.05743.
11. Carlini, N., Tramer, F., Wallace, E., Jagielski, M., Herbert-Voss, A., Lee, K. & Raffel, C. (2021). Extracting training data from large language models. In 30th USENIX Security Symposium (USENIX Security 21) (pp. 2633-2650).
12. Yang, Z., Shao, B., Xuan, B., Chang, E. C., & Zhang, F. (2020). Defending model inversion and membership inference attacks via prediction purification. arXiv preprint arXiv:2005.03915.
13. Ateniese, G., Mancini, L. V., Spognardi, A., Villani, A., Vitali, D., & Felici, G. (2015). Hacking smart machines with smarter ones: How to extract meaningful data from machine learning classifiers. International Journal of Security and Networks, 10(3), 137-150.
14. Xu, R., Baracaldo, N., & Joshi, J. (2021). Privacy-Preserving Machine Learning: Methods, Challenges and Directions. arXiv preprint arXiv:2108.04417.
15. Hidano, S., Murakami, T., Katsumata, S., Kiyomoto, S., & Hanaoka, G. (2017, August). Model inversion attacks for prediction systems: Without knowledge of non-sensitive attributes. In 2017 15th Annual Conference on Privacy, Security and Trust (PST) (pp. 115-11509). IEEE.
16. Wang, Y., Si, C., & Wu, X. (2015, June). Regression model fitting under differential privacy and model inversion attack. In Twenty-Fourth International Joint Conference on Artificial Intelligence.
17. Wang, K. C., Fu, Y., Li, K., Khisti, A. J., Zemel, R., & Makhzani, A. (2021, May). Variational Model Inversion Attacks. In Thirty-Fifth Conference on Neural Information Processing Systems.
18. Alves, T. A., França, F. M., & Kundu, S. (2019, May). MLPrivacyGuard: Defeating Confidence Information based Model Inversion Attacks on Machine Learning Systems. In Proceedings of the 2019 on Great Lakes Symposium on VLSI (pp. 411-415).
19. W. Hickey. FiveThirtyEight.com DataLab: How Americans like their steak. http://fivethirtyeight.com/datalab/how-americans-like-their-steak/, May 2014.
20. J. Prince. Social science research on pornography. http://byuresearch.org/ssrp/downloads/GSShappiness.pdf.
21. Deng, L. (2012). The MNIST database of handwritten digit images for machine learning research. IEEE Signal Processing Magazine, 29(6), 141–142.
22. H. Xiao, K. Rasul, and R. Vollgraf. Fashion-MNIST: a novel image dataset for benchmarking machine learning algorithms. CoRR, abs/1708.07747, 2017.
23. Yann LeCun, Leon Bottou, Y. Bengio, and Patrick Haffner. Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86:2278–2324, 1998.
24. Xiaosong Wang, Yifan Peng, Le Lu, Zhiyong Lu, Mohammadhadi Bagheri, and Ronald M. Summers. ChestX-ray8: Hospital-scale chest x-ray database and benchmarks on weakly-supervised classification and localization of common thorax diseases. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 2097–2106, 2017.
25. Ziwei Liu, Ping Luo, Xiaogang Wang, and Xiaoou Tang. Deep learning face attributes in the wild. In Proceedings of the IEEE international conference on computer vision, pages 3730–3738, 2015.
26. Christer Loob, Pejman Rasti, Iiris Lusi, Julio CS Jacques, Xavier Baro, Sergio Escalera, Tomasz Sapinski, Dorota Kaminska, and Gholamreza Anbarjafari. Dominant and complementary multi-emotional facial expression recognition using c-support vector classification. In 2017 12th IEEE International Conference on Automatic Face & Gesture Recognition (FG 2017), pages 833–838. IEEE, 2017.
27. Alex Krizhevsky, Vinod Nair, and Geoffrey Hinton. CIFAR-10 (Canadian Institute for Advanced Research).
28. H.-W. Ng, S. Winkler. A data-driven approach to cleaning large face datasets. Proc. IEEE International Conference on Image Processing (ICIP), Paris, France, Oct. 27-30, 2014.
29. Bache, K. & Lichman, M. (2013). UCI Machine Learning Repository (Technical report, University of California, Irvine, School of Information and Computer Sciences)
30. GroupLens Research, “MovieLens 1M Dataset,” http://grouplens.org/datasets/movielens/, 2003.
31. Quinlan, J.R. (1986) Induction of Decision Trees. Machine Learning, 1, 81-106. http://dx.doi.org/10.1007/BF00116251
32. Shwartz-Ziv, R., & Tishby, N. (2017). Opening the Black Box of Deep Neural Networks via Information. ArXiv, abs/1703.00810.
33. Alemi, Alexander A., et al. "Deep variational information bottleneck." arXiv preprint arXiv:1612.00410 (2016).
34. Vepakomma, P., Gupta, O., Dubey, A., & Raskar, R. (2019). Reducing leakage in distributed deep learning for sensitive health data.
35. Sharif Abuadbba, Kyuyeon Kim, Minki Kim, Chandra Thapa, Seyit A. Camtepe, Yansong Gao, Hyoungshick Kim, and Surya Nepal. Can we use split learning on 1D CNN models for privacy preserving training? arXiv preprint arXiv:2003.12365, 2020.
36. Cynthia Dwork. Differential privacy: A survey of results. In International conference on theory and applications of models of computation, pp. 1–19. Springer, 2008.
37. Fatemehsadat Mireshghallah, Mohammadkazem Taram, Prakash Ramrakhyani, Ali Jalali, Dean Tullsen, and Hadi Esmaeilzadeh. Shredder: Learning noise distributions to protect inference privacy. In Proceedings of the Twenty-Fifth International Conference on Architectural Support for Programming Languages and Operating Systems, pp. 3–18, 2020.
38. Zhang, Jun & Zhang, Zhenjie & Xiao, Xiaokui & Yang, Yin & Winslett, Marianne. (2012). Functional Mechanism: Regression Analysis under Differential Privacy. Proc. VLDB Endowment. 5.
39. Namiot, D., Ilyushin, E., & Pilipenko, O. (2022). On Trusted AI Platforms. International Journal of Open Information Technologies, 10(7), 119-127 (in Russian).
40. Namiot, D., Ilyushin, E., & Chizhov, I. (2022). On a formal verification of machine learning systems. International Journal of Open Information Technologies, 10(5), 30-34.