Research and development of inefficiency patterns in MPI, UPC applications

Bibliographic details
Main authors: M. S. Akopyan, N. E. Andreev
Format: Article
Language: English
Published: Ivannikov Institute for System Programming of the Russian Academy of Sciences 2018-10-01
Series: Труды Института системного программирования РАН
Subjects:
Online access: https://ispranproceedings.elpub.ru/jour/article/view/950
Description
Summary: Most existing analysis tools for parallel programming libraries (MPI, OpenMP) and languages take a low-level approach to analyzing the performance of parallel applications. Many profiling tools and trace visualizers produce tables and graphs with various statistics about the executed program. In most cases the developer has to search these statistics and graphs manually for bottlenecks and opportunities for performance improvement. The amount of information the developer has to handle manually increases dramatically with the number of cores, the number of processes, and the problem size of the application. New performance analysis methods that fully or partially automate the handling of this output would therefore be more beneficial. To apply the same analysis tool to different parallel paradigms (MPI applications, UPC programs), paradigm-specific inefficiency patterns have been developed. This paper discusses code patterns that result in performance penalties. It considers patterns in parallel MPI applications for distributed-memory parallel computing systems as well as in parallel UPC programs for systems with a partitioned global address space (PGAS). A method for automatic detection of inefficiency patterns in parallel MPI applications and UPC programs is proposed. It reduces the tuning time of parallel applications.
ISSN: 2079-8156
2220-6426