110+ Tok/s as another data point on the RTX 5090 (Gemma 4 31B QAT + MTP at UD-Q4_K_XL) (at peak used 27 GB of vram)

The real lovely thing was getting 300+ Tok/s (Gemma 4 26B QAT + MTP at UD-Q4_K_XL) (at peak, I think I saw vram usage reach 21 GB of vram)