Hacker News

hgaddipa001 4 days ago

We did a lot of internal testing but no official benchmark.

We find that the less the agent knows, the more it hallucinates.