Did you try the MLX model instead? In general, MLX tends to provide much better performance than GGUF/llama.cpp on macOS.
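If it helps, here's a quick way to try an MLX build via the `mlx-lm` package (Apple Silicon only; the model name below is just an illustrative example, not necessarily the one you were testing):

```shell
# Install Apple's MLX LM tooling
pip install mlx-lm

# Generate with a 4-bit MLX model from the mlx-community org on Hugging Face
# (swap in the MLX conversion of whatever model you were running in GGUF)
mlx_lm.generate --model mlx-community/Mistral-7B-Instruct-v0.3-4bit \
  --prompt "Hello" --max-tokens 64
```

`mlx-lm` will download and cache the model on first run, so the initial invocation takes longer than later ones.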