A. A. Sorokin, S. I. Malkovsky Performance Evaluation of Heterogeneous Computing Systems Based on Modern IBM POWER Processors

This article presents a comprehensive study of the hardware and software of heterogeneous computing systems based on modern IBM POWER processors and NVIDIA Tesla graphics coprocessors. Using various parallel programming technologies, the performance of the memory subsystem and the central processors is investigated in parallel mode. The efficiency of math libraries is studied, including those that offload calculations to the coprocessor. Based on the results, basic recommendations are given on using this class of equipment to solve various scientific problems.


heterogeneous computing system, computer architecture, IBM POWER8, IBM POWER9, Intel Xeon Platinum 8160, GPU, math library, simultaneous multithreading, performance, benchmark.

PP. 27-40.

DOI 10.14357/20718632210303

