Also, I still need to test it on multiple CPUs/GPUs.