
Scaling Plots & Paper Writing

Scaling Tests

I finished running the scaling tests this week and ran a second set to see whether the degraded performance at scale was real or just a fluke. The second run confirmed it to be real but also showed degraded performance at a different number of ranks. In the “slow” runs there is also sometimes a ~17 ms difference between the minimum and maximum time per time step. We’re not sure exactly what’s going on; it might be network congestion, a slow GPU, an issue in the code, or just that running at scale is sometimes slow.
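One way to make the min/max comparison systematic is to compute the spread in time step times for each rank count and flag the runs where it is large. The following is only a minimal sketch of that idea, not the script used for these tests: the function names, the threshold, and the timing values are all hypothetical, and the numbers are made up for illustration rather than taken from the runs.

```python
from statistics import mean


def timestep_spread(times_ms):
    """Return (min, max, mean, max - min) of per-time-step wall times in ms."""
    lo, hi = min(times_ms), max(times_ms)
    return lo, hi, mean(times_ms), hi - lo


def flag_slow_runs(runs, threshold_ms):
    """Return the rank counts whose max-min time step spread exceeds threshold_ms.

    `runs` maps a rank count to a list of per-time-step times in ms.
    """
    return [
        ranks
        for ranks, times in sorted(runs.items())
        if timestep_spread(times)[3] > threshold_ms
    ]


# Made-up illustrative numbers, NOT measurements from these scaling tests.
example_runs = {
    8: [100.0, 101.0, 100.5],
    64: [100.0, 118.0, 101.0],
    512: [102.0, 103.0, 102.5],
}

print(flag_slow_runs(example_runs, threshold_ms=10.0))
```

With these placeholder values only the 64-rank entry is flagged, since its spread is 18 ms. A check like this does not say why a run was slow, only which rank counts are worth a closer look.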

First Run

[Plots: ms_per_gpu_run_1, cells_per_second_run_1]

Second Run

[Plots: ms_per_gpu_run_2, cells_per_second_run_2]

Paper Writing

I wrote the section on the scaling plots and did some general revisions.

Testing

I wrote a test for the _ctSlope function. Since I’m thinking about rewriting that function to use templates, I’m holding off on merging the test until I decide what I actually want to test.

Other

  • Finalized and submitted the PR for the test coverage work (PR #318)
  • Fixed some issues with the function that checks the configuration of Cholla (PR #319)
This post is licensed under CC BY 4.0 by the author.