I had an idea similar to this for a model that allows you to parameterize a performance vs. accuracy ratio, essentially an imbalanced MoE-like approach where instead of the "quality score" in your example, you assign a score based on how much computation it used to achieve that answer, then you can dynamically request different code paths be taken at inference time.