Graphics Processing Units (GPUs) have enabled significant improvements in computational performance compared to traditional CPUs in several application domains. Until recently, GPUs were programmed using C/C++ based methods such as CUDA (NVIDIA) and OpenCL (NVIDIA and AMD), so Fortran Numerical Weather Prediction (NWP) codes had to be completely rewritten to take advantage of GPU performance gains. Emerging commercial Fortran GPU compilers may allow NWP codes to exploit GPU processing power with much less software development effort. At NOAA's Global Systems Division, we have been investigating the application of GPUs to NWP since 2008. At that time, there were no commercial Fortran compilers for GPUs, so we developed a translator, "F2C-ACC", to convert our Fortran codes to CUDA. With help from F2C-ACC, we created a hand-optimized CUDA version of the Non-Hydrostatic Icosahedral Model (NIM), a prototype dynamical core for global NWP. Now that commercial Fortran GPU compilers have become available from CAPS and PGI, we use F2C-ACC and NIM as a baseline for performance comparison.
In this seminar, we will present a brief overview of GPU hardware and software technology as applied to scientific computation. We will examine commercial directive-based Fortran GPU compilers, comparing code porting effort and computational performance with hand-optimized CUDA on NVIDIA GPUs. We will describe ongoing efforts to parallelize segments of the WRF physics code and to develop a GPU-efficient tridiagonal solver for NIM, and we will discuss strategies we are using to minimize inter-GPU communication. We will also briefly mention AMD's GPU offerings, OpenCL as a potential solution for performance portability, and Intel's Many Integrated Core (MIC) architecture, which may soon compete with GPUs.