I think it’s the move towards GPU-based computing is probably more significant - the constraints put in place by GPU programming (no branching, try not to update tensors in place, etc) sync up with the constraints put in place by differentiable programming.
Once people had a sufficiently compelling reason to write differentiable code, the frameworks around differentiable programming (theano, tensorflow, torch, JAX) picked up a lot of steam.