PyTorch DataLoaders are often horribly inefficient; a lot of what they do could benefit from being implemented in Rust or C++.
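
A quick way to check whether this applies to a given training loop is to time how long each iteration blocks waiting for the next batch versus how long the step itself takes. This is only a rough sketch: it works on any iterable (a `torch.utils.data.DataLoader` included), and `measure_loader_wait`, `slow_loader`, and `step_fn` are hypothetical names made up for illustration, not part of PyTorch.

```python
import time
from typing import Callable, Iterable, Iterator, List, Tuple


def measure_loader_wait(
    loader: Iterable, step_fn: Callable[[object], None]
) -> Tuple[float, float]:
    """Return (seconds blocked on the loader, seconds spent in step_fn)."""
    wait_s = 0.0
    step_s = 0.0
    it = iter(loader)
    while True:
        t0 = time.perf_counter()
        try:
            batch = next(it)  # time spent here is data-loading stall
        except StopIteration:
            break
        t1 = time.perf_counter()
        step_fn(batch)  # stand-in for the forward/backward/optimizer step
        t2 = time.perf_counter()
        wait_s += t1 - t0
        step_s += t2 - t1
    return wait_s, step_s


def slow_loader(n_batches: int, delay_s: float) -> Iterator[List[int]]:
    """Toy loader (assumption): each batch costs delay_s of preparation."""
    for i in range(n_batches):
        time.sleep(delay_s)
        yield [i] * 4


if __name__ == "__main__":
    wait_s, step_s = measure_loader_wait(slow_loader(5, 0.01), lambda b: sum(b))
    print(f"waiting on data: {wait_s:.3f}s, compute: {step_s:.3f}s")
```

If the wait time dominates, the loader is the bottleneck. One caveat: when the step runs on a GPU, CUDA work is launched asynchronously, so the step time will look too small unless `step_fn` synchronizes (for example with `torch.cuda.synchronize()`) before returning.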