MLX does not run on NPUs AFAIK; just the GPU and CPU. You have to use Core ML to officially run code on the Neural Engine.

Even then, there is no transparency about how it decides what runs on the ANE, the GPU, etc.

Correct. OS-level stuff gets first priority, so you can't count on using it.

Turns out third-party code actually gets priority on the ANE.