I'm having trouble understanding what the problem is -- as in, what are the actual symptoms that users are seeing? How much latency can the app tolerate and how much are you seeing in practice? It would be helpful (to me at least) in thinking about potential solutions if that information were available up front.

Perhaps there's something in this video that might help you? They made a lot of changes to scheduling and resource allocation in the M3 generation:

https://developer.apple.com/videos/play/tech-talks/111375/

It's a real-time audio app, so if it falls behind real time, there's no audio. You get cracks and pops, and the whole thing becomes unusable. If the user is doing audio at 48 kHz, the time budget is 1/48,000 seconds per sample (about 20.8 µs), or realistically somewhat less than that to account for variance and overhead.
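
To put rough numbers on that budget, here's a quick sketch of the arithmetic. The function names and the 128-frame buffer size are just illustrative assumptions on my part, not anything taken from the app itself.

```python
def sample_period_us(sample_rate_hz: float) -> float:
    """Time available per sample, in microseconds."""
    return 1_000_000.0 / sample_rate_hz


def buffer_deadline_ms(frames: int, sample_rate_hz: float) -> float:
    """Time available to produce one buffer of `frames` samples, in milliseconds."""
    return 1000.0 * frames / sample_rate_hz


if __name__ == "__main__":
    rate = 48_000.0  # sample rate mentioned above
    frames = 128     # hypothetical buffer size, for illustration only
    print(f"per-sample budget: {sample_period_us(rate):.2f} us")
    print(f"{frames}-frame buffer deadline: {buffer_deadline_ms(frames, rate):.2f} ms")
```

If the processing for a buffer takes longer than its deadline, the output underruns, which is where the cracks and pops come from.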

I find it hard to believe that users would notice latency under 1 ms. Probably not even under 5 ms.

Have you tried buffering for 5 ms? Was the result bad? What about 1 ms?
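
For reference, this is roughly how those buffer durations translate into frame counts at 48 kHz. It's a minimal sketch; `frames_for_latency` is a made-up helper name, and I'm assuming the 48 kHz rate mentioned earlier in the thread.

```python
def frames_for_latency(latency_ms: float, sample_rate_hz: int) -> int:
    """Number of frames that correspond to a buffer of `latency_ms` milliseconds."""
    return round(latency_ms * sample_rate_hz / 1000)


if __name__ == "__main__":
    rate = 48_000  # assumed sample rate
    for ms in (5, 1):
        print(f"{ms} ms at {rate} Hz = {frames_for_latency(ms, rate)} frames")
```

So a 5 ms buffer is 240 frames and a 1 ms buffer is 48 frames at that rate.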