With the help of some folks at NVIDIA, I finally figured out the issue I was running into with our reductions. It turns out that the reduction kernels require more registers in debug mode, and the GPU didn’t have enough for the launch configuration I was using. I fixed this by setting the kernel launch parameters with the occupancy API instead of just using the maximum number of threads and blocks the GPU supports.
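For anyone who hits the same problem, here is a minimal sketch of what asking the occupancy API for launch parameters can look like. It assumes the CUDA runtime; `sum_kernel`, `LaunchParams`, and `occupancy_launch_params` are hypothetical names made up for this example, not the kernels or helpers in our code. Because `cudaOccupancyMaxPotentialBlockSize` looks at the resources the compiled kernel actually uses, the suggested block size shrinks on its own when a debug build needs more registers.

```cuda
#include <cstddef>
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical stand-in for a reduction kernel: each thread sums a
// grid-stride slice of `in`, then adds its partial sum to `*out`.
// `*out` must be zeroed before the launch.
__global__ void sum_kernel(const float* in, float* out, size_t n)
{
  float local = 0.0f;
  size_t const stride = static_cast<size_t>(gridDim.x) * blockDim.x;
  for (size_t i = static_cast<size_t>(blockIdx.x) * blockDim.x + threadIdx.x;
       i < n; i += stride) {
    local += in[i];
  }
  atomicAdd(out, local);
}

// Illustrative container for the launch configuration.
struct LaunchParams {
  int grid_size;
  int block_size;
};

// Ask the runtime for a block size that maximizes occupancy for
// sum_kernel as it was actually compiled (debug or release), plus the
// minimum grid size needed to reach that occupancy across the device.
// Returns false if the query fails, e.g. when no GPU is available.
bool occupancy_launch_params(LaunchParams* params)
{
  int min_grid_size = 0;
  int block_size    = 0;
  cudaError_t const err = cudaOccupancyMaxPotentialBlockSize(
      &min_grid_size, &block_size, sum_kernel,
      0,   // no dynamic shared memory
      0);  // no upper limit on the block size
  if (err != cudaSuccess) {
    std::fprintf(stderr, "occupancy query failed: %s\n",
                 cudaGetErrorString(err));
    return false;
  }
  params->grid_size  = min_grid_size;
  params->block_size = block_size;
  return true;
}
```

The kernel would then be launched as `sum_kernel<<<params.grid_size, params.block_size>>>(dev_in, dev_out, n)`. The grid-stride loop is what lets the same kernel cover the whole array with whatever grid size the query returns.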
My MHD code passes all the compute sanitizer checks now that I’ve made sure to initialize all the GPU memory. I’m currently working on getting it to pass all the hydro tests again, since it no longer passes all of them. Once that’s done, I’m going to implement some analytical MHD tests so that I know when I have exactly correct results; the analytical cases are also much simpler.
- Consulted on the refactoring of the I/O routines
- Updated this website to v5.3.0