Of the mistakes actually found and confirmed, how likely do you think it is that they could be progressively transformed into well-defined rules that don't depend on an LLM?