Project
Converting CUDA programs to run on AMD GPUs
Master's thesis on converting CUDA code and external CUDA libraries to HIP so the same project can run on AMD and Nvidia GPUs.
23-04-2024
Highlights
This thesis is about a practical portability problem: a lot of scientific GPU code is written for CUDA, but new systems are not always Nvidia-based. I focused on converting existing CUDA programs, including external CUDA libraries, into HIP so the same codebase could run on both AMD and Nvidia hardware.
The motivation was concrete. Nvidia has historically dominated GPU computing, but systems like LUMI in Finland use AMD GPUs in their GPU partition. That makes portability a real requirement for research teams that want to keep using existing CUDA code on newer hardware.
I describe a general conversion process where both the project's own CUDA sources and external CUDA library code are hipified. In practice, this means converting CUDA APIs to HIP equivalents, updating build scripts and Makefiles to use hipcc, and then compiling and testing the full stack together. One important conclusion from the implementation is that this is not a one-click migration: some manual CUDA-to-HIP porting is still needed depending on how each project and dependency is written.
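To make the conversion step concrete, here is a minimal, illustrative sketch of the kind of text-level translation that tools such as hipify-perl perform: known CUDA runtime identifiers are rewritten to their HIP equivalents. This is a toy reimplementation for illustration only, not the real tool, which covers a far larger API surface and also handles kernel-launch syntax and more complex cases that require the manual porting mentioned above.

```python
import re

# Small, hand-picked subset of the CUDA-to-HIP renaming table.
# The real hipify tools map hundreds of API names; this is a sketch.
CUDA_TO_HIP = {
    "cuda_runtime.h": "hip/hip_runtime.h",
    "cudaMalloc": "hipMalloc",
    "cudaMemcpy": "hipMemcpy",
    "cudaMemcpyHostToDevice": "hipMemcpyHostToDevice",
    "cudaMemcpyDeviceToHost": "hipMemcpyDeviceToHost",
    "cudaDeviceSynchronize": "hipDeviceSynchronize",
    "cudaFree": "hipFree",
}

def hipify(source: str) -> str:
    """Replace known CUDA identifiers with HIP ones.

    Matches longest names first so that e.g. cudaMemcpyHostToDevice
    is not clobbered by the shorter cudaMemcpy substitution.
    """
    pattern = re.compile("|".join(
        re.escape(name)
        for name in sorted(CUDA_TO_HIP, key=len, reverse=True)))
    return pattern.sub(lambda m: CUDA_TO_HIP[m.group(0)], source)

cuda_snippet = """#include <cuda_runtime.h>
cudaMalloc(&d_x, n * sizeof(double));
cudaMemcpy(d_x, h_x, n * sizeof(double), cudaMemcpyHostToDevice);
cudaDeviceSynchronize();
cudaFree(d_x);"""

print(hipify(cuda_snippet))
```

After this kind of source translation, the build scripts are pointed at hipcc instead of nvcc, and the whole project plus its hipified dependencies is compiled and tested together.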
I evaluated the method with a Quasi-Minimal Residual (QMR) solver using data from DREAM (Disruption and Runaway Electron Analysis Model), which models runaway electrons in tokamak fusion scenarios. In this specific QMR problem, quadruple-precision arithmetic was required to reach the target tolerance: with one million variables, the solver did not converge without it.
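The role of working precision can be illustrated with a toy example (this is not the thesis QMR solver): the residual an iterative method can attain is floored by the arithmetic it runs in. Below, Newton's iteration for sqrt(2) stalls near 1e-16 in double precision, but run at 34 significant decimal digits, roughly the precision of IEEE quadruple (binary128) arithmetic, it reaches a far smaller residual. A solver whose target tolerance lies below double's floor therefore cannot converge without higher precision.

```python
from decimal import Decimal, getcontext

def newton_sqrt2_double(iters: int = 50) -> float:
    """Newton iteration for sqrt(2) in ordinary double precision."""
    x = 1.5
    for _ in range(iters):
        x = 0.5 * (x + 2.0 / x)
    # Residual stalls around 1e-16, double's rounding floor.
    return abs(x * x - 2.0)

def newton_sqrt2_quad(iters: int = 50) -> Decimal:
    """Same iteration at 34 significant digits (~quad precision)."""
    getcontext().prec = 34
    x = Decimal("1.5")
    two = Decimal(2)
    half = Decimal("0.5")
    for _ in range(iters):
        x = half * (x + two / x)
    # Residual drops roughly to the 1e-33 level instead.
    return abs(x * x - two)

print(newton_sqrt2_double())
print(newton_sqrt2_quad())
```

The same effect, scaled up, is why the million-variable QMR case only reached its target tolerance in quadruple precision.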
The implementation showed that CUDA programs with external CUDA-library dependencies can be converted and compiled to run on either vendor's GPUs through the outlined process. For the QMR case, runtimes on consumer AMD GPUs were reported as comparable to those on a professional Nvidia Tesla V100-16GB. The broader takeaway is that HIP is a viable path for cross-vendor GPU execution, but reliable results depend on careful build-system work, validation of dependencies, and project-specific adjustments.
The full text and figures are available via the project link below.