They do well on LLMs and other purely memory-bound workloads, but for e.g. diffusion models their FPU SIMD performance is lacking.
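To put rough numbers on the memory-bound vs. compute-bound distinction, here is a minimal roofline-style sketch. Everything in it is an assumption for illustration: the device figures (`PEAK_FLOPS`, `MEM_BANDWIDTH`) are made-up placeholders rather than measurements of any real chip, and the diffusion intensity is only a stand-in for "much higher than LLM decoding". The LLM figure comes from the usual back-of-the-envelope estimate that batch-1 decoding with 16-bit weights does about 2 FLOPs per 2-byte weight read.

```python
def attainable_flops(peak_flops: float, mem_bandwidth: float, intensity: float) -> float:
    """Roofline estimate: throughput is capped by either the compute roof
    or by memory bandwidth times arithmetic intensity (FLOPs per byte)."""
    return min(peak_flops, mem_bandwidth * intensity)


def is_memory_bound(peak_flops: float, mem_bandwidth: float, intensity: float) -> bool:
    """A workload is memory-bound when its intensity is below the ridge point."""
    return intensity < peak_flops / mem_bandwidth


# Hypothetical device, NOT a real spec sheet.
PEAK_FLOPS = 10e12      # 10 TFLOP/s of SIMD/FPU throughput (assumed)
MEM_BANDWIDTH = 400e9   # 400 GB/s of memory bandwidth (assumed)

# Illustrative arithmetic intensities in FLOPs per byte.
LLM_DECODE_INTENSITY = 1.0    # batch-1 decode: ~2 FLOPs per 2-byte weight
DIFFUSION_INTENSITY = 100.0   # placeholder for a compute-heavy workload

llm_rate = attainable_flops(PEAK_FLOPS, MEM_BANDWIDTH, LLM_DECODE_INTENSITY)
diffusion_rate = attainable_flops(PEAK_FLOPS, MEM_BANDWIDTH, DIFFUSION_INTENSITY)

print("LLM decode memory-bound:", is_memory_bound(PEAK_FLOPS, MEM_BANDWIDTH, LLM_DECODE_INTENSITY))
print("Diffusion memory-bound:", is_memory_bound(PEAK_FLOPS, MEM_BANDWIDTH, DIFFUSION_INTENSITY))
```

Under these assumed numbers the LLM case is limited by bandwidth alone, so extra FPU throughput would not help it, while the diffusion case hits the compute roof, which is where weak SIMD performance starts to hurt.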