The findings are based on older models and assuming recent models behave similarly, what kind of prompt style one should use then to improve the outcome to avoid the increase in variance especially when you ask a model to solve really complex problems?