> I've had _every_ model fail this
That seems to be because LLMs can't reliably follow procedures (e.g. counting).