Your model can absolutely improve

How would that work out, barring a complete retraining or human-in-the-loop evals?