I suspect the act of running nvidia-smi itself prevents the GPU from being put into a low-power state.

From memory this is true, and NVML (the NVIDIA Management Library) is the way to get stats without waking the GPU.
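For anyone who wants to try it, here is a minimal sketch of reading stats through NVML from Python. It assumes the `pynvml` module (from the `nvidia-ml-py` package) and an NVIDIA driver are installed. The helper names `format_stats` and `read_gpu_stats` are made up for illustration. Whether polling this way really lets the GPU stay in a low-power state is the claim above, not something this code verifies; it prints the performance state so you can check on your own hardware.

```python
def format_stats(index, name, power_mw, temp_c, pstate):
    """Render one GPU's readings; NVML reports power draw in milliwatts."""
    return f"GPU {index} ({name}): {power_mw / 1000:.1f} W, {temp_c} C, P{pstate}"


def read_gpu_stats():
    """Query every GPU through NVML and return one formatted line per device."""
    # Imported lazily: assumes the nvidia-ml-py package and an NVIDIA driver.
    import pynvml

    pynvml.nvmlInit()
    try:
        lines = []
        for i in range(pynvml.nvmlDeviceGetCount()):
            handle = pynvml.nvmlDeviceGetHandleByIndex(i)
            name = pynvml.nvmlDeviceGetName(handle)
            if isinstance(name, bytes):  # older bindings return bytes
                name = name.decode()
            power_mw = pynvml.nvmlDeviceGetPowerUsage(handle)
            temp_c = pynvml.nvmlDeviceGetTemperature(
                handle, pynvml.NVML_TEMPERATURE_GPU
            )
            # Performance state: 0 is maximum performance, higher is lower power.
            pstate = pynvml.nvmlDeviceGetPerformanceState(handle)
            lines.append(format_stats(i, name, power_mw, temp_c, pstate))
        return lines
    finally:
        # Always release the NVML handle, even if a query fails.
        pynvml.nvmlShutdown()


if __name__ == "__main__":
    for line in read_gpu_stats():
        print(line)
```

Run it in a loop at whatever interval you were calling nvidia-smi and watch whether the reported P-state drops when the GPU is idle.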