How can I see which Python line causes a CUDA crash further down the line in PyTorch, which executes CUDA code asynchronously outside of the GIL?
Here is a case where PyTorch crashed CUDA: running this code on this dataset, every run crashed with the debugger stopped on a different Python line, which made it very difficult to debug.
Answer
I found an answer in a completely unrelated thread in the forums. Couldn’t find a Googleable answer, so posting here for future users’ sake.
Since CUDA calls are executed asynchronously, you should run your code with
CUDA_LAUNCH_BLOCKING=1 python script.py
This makes sure the right line of code will throw the error message.
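If changing the command line is inconvenient (for example when launching from an IDE debugger), the same variable can probably be set from inside the script instead. This is only a minimal sketch: the helper name `enable_cuda_launch_blocking` is made up for illustration, and it assumes the variable is set before `torch` is imported, or at least before any CUDA work starts, since it is expected to be read when CUDA initializes.

```python
import os


def enable_cuda_launch_blocking() -> None:
    """Set CUDA_LAUNCH_BLOCKING=1 for the current process.

    Hypothetical helper, not part of PyTorch. Assumption: it runs before
    `import torch` (or before any CUDA call), so the setting is already in
    place when CUDA is initialized.
    """
    os.environ["CUDA_LAUNCH_BLOCKING"] = "1"


# Call this at the very top of the script, then import torch afterwards.
enable_cuda_launch_blocking()
```

With the variable set, kernel launches should block until they finish, so the Python traceback should point at the line that actually triggered the failure. Expect the script to run slower, so this is best kept for debugging runs only.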