A Method to Analyze and Optimize Parallel Programs on CPU/MIC Heterogeneous Architecture

Yun-chun LI, Tian-yuan WANG


Since the Intel® Many Integrated Core (MIC) Architecture was announced in 2010 as a massively parallel coprocessor, heterogeneous CPU/MIC nodes have been widely deployed in petascale supercomputers such as TianHe-II. Performance analysis of massively parallel applications on such heterogeneous architectures plays an important role in preparing for next-generation exascale supercomputers. In this paper, we propose a method to analyze and optimize parallel programs running on a CPU/MIC heterogeneous computing node in the offload programming mode. To monitor runtime behavior during offload, we use TAU to instrument the offloaded code regions and collect performance events. We then compare the performance of different programming modes using the NAS Parallel Benchmarks. The results indicate that the asynchronous offload mode performs better than the synchronous offload mode.
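
The difference between the two offload modes can be illustrated with a minimal sketch. It does not use the Intel offload pragmas or TAU; a Python thread pool stands in for the coprocessor, and the names device_kernel, host_work, run_synchronous, and run_asynchronous are hypothetical helpers introduced only for illustration. Under the assumption that host and device work are independent, it shows why overlapping them may shorten the total run time; it does not reproduce the measurements reported in the paper.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def device_kernel(n):
    """Hypothetical stand-in for an offloaded region.

    The sleep mimics transfer and compute latency on the coprocessor.
    """
    time.sleep(0.2)
    return sum(i * i for i in range(n))


def host_work(n):
    """Hypothetical independent work that stays on the host CPU."""
    time.sleep(0.2)
    return sum(range(n))


def run_synchronous(n):
    """Synchronous mode: the host blocks until the offload returns."""
    start = time.perf_counter()
    dev = device_kernel(n)
    host = host_work(n)
    return dev + host, time.perf_counter() - start


def run_asynchronous(n):
    """Asynchronous mode: start the offload, keep working, then wait."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=1) as pool:
        pending = pool.submit(device_kernel, n)  # launch and return at once
        host = host_work(n)                      # overlaps with the offload
        dev = pending.result()                   # wait for the offload to finish
    return dev + host, time.perf_counter() - start


sync_result, sync_time = run_synchronous(1000)
async_result, async_time = run_asynchronous(1000)
print(f"synchronous:  {sync_time:.2f} s")
print(f"asynchronous: {async_time:.2f} s")
```

Both variants compute the same value; only the point at which the host waits differs. The benefit disappears when the host work depends on the offloaded result, since the wait can then no longer be deferred.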


CPU/MIC architecture, Performance analysis, Program optimization

