Using high level GPU tasks to explore memory and communications options on heterogeneous platforms

Conference

Liu, C, Bhimani, J, Leeser, M. (2017). Using high level GPU tasks to explore memory and communications options on heterogeneous platforms. 21-28. 10.1145/3085158.3086160

cited authors

  • Liu, C; Bhimani, J; Leeser, M

abstract

  • Heterogeneous computing platforms that use GPUs for acceleration are becoming prevalent, so developing parallel applications for them and optimizing those applications for performance is important. In this work, we develop a set of applications based on a high-level task design, which provides a well-defined structure that improves portability. Alongside the GPU task implementation, we use a uniform interface to allocate and manage the memory blocks shared by host and device, so that the memory type used for host/device communication can be chosen easily and flexibly within GPU tasks. Asynchronous task execution and CUDA streams let us explore concurrent GPU kernels to improve performance when running multiple tasks. We developed a benchmark set containing nine different kernel applications. Our tests show that pinned memory can improve host/device data transfer on GPU platforms. The performance of unified memory varies considerably across GPU architectures, and it is not a good choice when performance is the main focus. The multiple-task tests show that applications based on our GPU tasks can make effective use of the concurrent kernel capability of modern GPUs for better resource utilization.

publication date

  • June 26, 2017

Digital Object Identifier (DOI)

  • 10.1145/3085158.3086160

start page

  • 21

end page

  • 28