Agree, especially if they're not given access to the web, or if they're not strongly prompted to use the web to gather context. It's tough to judge models and harnesses by pure feel until you understand their proclivities.
Agree, especially if they're not given access to the web, or if they're not strongly prompted to use the web to gather context. It's tough to judge models and harnesses by pure feel until you understand their proclivities.