The bigger E4B model is pretty fast on my Galaxy S21 Ultra even with thinking enabled. Maybe GPU acceleration was not enabled?

I think there's quite a performance difference between the Snapdragon 888 and Exynos 2100 variants of the S21 Ultra.

Qualcomm has optimized libraries for running LLMs on its chips, and I don't believe Samsung has bothered to do the same.