I'm running an M4 Max as well, and I've found that Project Goose works decently well with Qwen3 Coder loaded in LM Studio (Ollama doesn't support MLX yet unless you build it yourself, I think). You can configure it in Goose as an OpenAI model, since LM Studio's local server exposes an OpenAI-compatible API. Goose adds a bunch of tools and plugins that make the model more effective.
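The same OpenAI-compatible trick works for anything that uses the OpenAI client, not just Goose. Here's a minimal sketch with the openai Python client, assuming LM Studio is serving on its default port 1234 and that the model identifier matches whatever you've loaded (both the port and the model name are assumptions, so check your LM Studio server settings):

```python
from openai import OpenAI

# LM Studio's local server speaks the OpenAI API; the API key is ignored for local use.
# Port 1234 is LM Studio's default -- adjust if you've changed it.
client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")

# The model id should be whatever identifier LM Studio shows for the loaded model
# (the name below is just a placeholder for a Qwen3 Coder MLX build).
response = client.chat.completions.create(
    model="qwen3-coder",
    messages=[{"role": "user", "content": "Write a function that reverses a string."}],
)

print(response.choices[0].message.content)
```

Goose's custom/OpenAI provider setup points at the same base URL, which is why the pairing works without any special integration.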