I'm skeptical that a supervisor needs to be deterministic in order to be useful. I realize it's not exactly analogous, but consider the current human parallel: senior/principal/architect-level SWEs reviewing code from less experienced devs (or even similarly or more experienced devs) is far from deterministic, yet it certainly improves quality.
Think about how differently a current agent behaves when you say "here is the spec, implement a solution" vs. "here is the spec, here is my solution, make refinements". You get very different output, and I would argue that the "check my work" approach tends to produce better results.