> measure when a deep learning system is making stuff up or hallucinating

That's a great problem to solve! (Maybe I'm biased, since this is my primary research direction.) One popular approach is OOD detection, but that framing has always seemed ill-posed to me. My colleagues and I have been approaching this from a more fundamental direction using measures of model misspecification, but that work is admittedly niche because it's very computationally expensive. It could still be a while before a breakthrough comes from any direction.
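
For concreteness, the most common OOD-detection baseline is thresholding the maximum softmax probability of a classifier (the Hendrycks & Gimpel baseline): confident, peaked predictions are treated as in-distribution, flat ones as OOD. A minimal sketch with made-up logits and an assumed threshold (in practice you'd tune it on held-out data):

```python
import numpy as np

def softmax(z):
    z = z - z.max()  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def msp_score(logits):
    """Maximum softmax probability: higher = more 'in-distribution'."""
    return softmax(logits).max()

# Hypothetical logits for two inputs: one peaked (confident),
# one nearly flat (the kind of case an MSP detector would flag).
logits_in = np.array([8.0, 1.0, 0.5])
logits_ood = np.array([1.1, 1.0, 0.9])

threshold = 0.7  # assumed for illustration
print(msp_score(logits_in) > threshold)   # True: treated as in-distribution
print(msp_score(logits_ood) > threshold)  # False: flagged as OOD
```

Part of why this feels ill-posed is visible even here: the score only reflects the model's own confidence, and a miscalibrated model can be confidently wrong.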

> Could still be a while before a breakthrough comes from any direction.

Solving it would be valuable enough that getting significant funding to work on it is probably possible, especially with all the money being thrown at AI.

Could you elaborate on what you mean by OOD detection seeming ill-posed?