One nice recent development is Ollama's support for MLX optimization on Mac hardware. It's still rough around the edges, though: it isn't yet obvious how to tell whether the model you're running actually works with it.

https://ollama.com/blog/mlx