Thank you for sharing! Based on your experience, do you think a two-model system might fare better? For example, two models in serial where the second model is trained to "sniff out" potential hallucinations and fact check them (and possibly iterate with the first model)?