A GPU performance estimation model based on micro-benchmarks and black-box kernel profiling
Abstract
Over the last decade, GPUs have become established in the High Performance Computing sector as compute accelerators. The primary characteristics that justify this trend are their exceptionally high compute throughput and remarkable power efficiency. However, GPU performance is highly sensitive to many factors, e.g. memory access patterns, branch divergence, the degree of parallelism and potential latencies. Consequently, the execution time of a kernel on a GPU is difficult to predict. Unless the kernel is latency bound, a rough estimate of the execution time on a particular GPU can be obtained by applying the roofline model, which maps the program’s operational intensity to the peak expected performance on a particular processor. Though this approach is straightforward, it cannot provide accurate prediction results. In this thesis, after validating the roofline principle on GPUs by employing a micro-benchmark, an analytical throughput or ...
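The rough roofline estimate mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's model, only the textbook roofline bound: attainable performance is the smaller of the compute peak and the memory bandwidth multiplied by the operational intensity. The function names and the device figures used in the example (1000 GFLOP/s, 200 GB/s) are hypothetical and chosen purely for illustration.

```python
def roofline_attainable_gflops(intensity, peak_gflops, peak_bandwidth_gbs):
    """Upper bound on performance (GFLOP/s) for a kernel with the given
    operational intensity (FLOP per byte of memory traffic)."""
    # Memory-bound region: bandwidth * intensity; compute-bound region: peak.
    return min(peak_gflops, peak_bandwidth_gbs * intensity)


def roofline_time_seconds(flops, bytes_moved, peak_gflops, peak_bandwidth_gbs):
    """Lower bound on execution time implied by the roofline.
    Assumes the kernel is not latency bound."""
    intensity = flops / bytes_moved
    attainable = roofline_attainable_gflops(intensity, peak_gflops, peak_bandwidth_gbs)
    return flops / (attainable * 1e9)


if __name__ == "__main__":
    # Hypothetical device: 1000 GFLOP/s compute peak, 200 GB/s memory bandwidth.
    # Hypothetical kernel: 2e9 FLOPs over 8e9 bytes, i.e. 0.25 FLOP/byte.
    t = roofline_time_seconds(2e9, 8e9, peak_gflops=1000.0, peak_bandwidth_gbs=200.0)
    print(f"estimated lower bound: {t:.3f} s")
```

With these assumed figures the kernel falls in the memory-bound region, so the bound is set by bandwidth alone. Because the roofline gives only an upper bound on performance, the real execution time can be considerably longer, which is the inaccuracy the abstract points to.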
All items in the National Archive of Ph.D. Theses are protected by copyright.