> In fact I think long-term autonomy (in the range of several hours) and self-correcting is going to be where we see most improvements in coming years.
Right, model intelligence defines the scope of things they can one shot
I also suspect that users naturally calibrate to a model's useful scope, gradually getting positive/negative feedback and gradually making their requests bigger/smaller than before