This is no different from mlx-lm, except that it uses a closed-source inference engine.
