Hacker News

noosphr a day ago

To this day, frontier models treat "A and not B" as "A and B" once the sentence gets pushed far enough back in their context window. The context length a model can reason over without obvious errors is much smaller than the advertised context: somewhere between 1/4 and 1/20 of what is advertised on the tin.

antonvs a day ago

Critiques like this tend to focus very hard on what models can't do. It's true, they have limitations.

But they're also superhuman in so many other ways. It's valid to point out limitations, but that doesn't support the conclusion that models are not incredibly powerful and capable of the functional equivalent of reasoning at human or superhuman levels in many scenarios.

noosphr a day ago

They may be better than humans at reasoning, but they are substantially worse than the first-generation logic programs from the 1950s.

cheevly 16 hours ago

These types of comments help demonstrate first-hand how human reasoning stacks up against what an LLM would say in this situation.

leecommamichael 5 hours ago

Agreed. Both are true. I sometimes think of the calculator as being superhuman as well.

antonvs 2 hours ago

Yes, although the calculator couldn't "reason" the way ML models can.

All the political and emotional reactions to LLMs seem to obscure how absolutely amazing this technology is. I've pointed them at codebases I wrote entirely myself and had them find bugs, point things out I had missed, plan and implement refactorings to improve code quality, etc. I may be "smarter" than the models in some ways but there's no question they're smarter than me in others. They're unlike any tool we've ever had access to.

Yes, the politics and economics around them leave a lot to be desired (read: are absolutely terrible), and there are a lot of valid justifications for the "AI backlash", but there's a very important baby in that bathwater.

Npovview a day ago

Do you also happen to remember what you ate last Thursday?

ethin 19 hours ago

Do you have a point? Because last time I checked, AIs were supposed to be better than us fragile faulty humans, and weren't designed to emulate us and all our faults.

Npovview 10 hours ago

If you have been following the news, the harness is also a scaling direction now. Prompt your AI better so it doesn't forget relevant stuff, or have it write that stuff to a file it can refer to later. This way the context can be refreshed. This is the cached-facts or rolling-window method of refreshing memory, just as you would ask a colleague to explain a concept again. These are solved problems.

ethin 4 hours ago

Are they though? Because I really shouldn't have to use Claude Code (and I don't) just to get even decent results. As I said, I thought one of the biggest advantages AI was supposed to have was that it wouldn't need such constant reminding of things because it wasn't trying to emulate us faulty, forgetful, fragile humans who do have memory loss?

Npovview 3 hours ago

You can convert your best practices into a skill or a best-practices .md file, and CC will keep that in view.

UncleEntity a day ago

"If you have a question look in the specification for the answer and don't just guess" seems a fairly important thing to remember for more than a couple of minutes...

Npovview a day ago

I had a coding session where I was doing stuff across two repositories. CC forgot exactly which repository a particular file was in, so it was grepping the parent directory. I just asked it to write all the key-value pairs it thinks are important to a file, and it never grepped the parent directory again.
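
For anyone wondering what such a facts file could look like, here is a minimal sketch. Everything in it is an assumption for illustration: the file name `session_facts.json`, the helper names `load_facts`, `remember` and `as_prompt_block`, and the JSON format. It is not how CC stores anything internally, just one plausible way to persist key-value notes that get re-read into the context.

```python
import json
from pathlib import Path

# Hypothetical file name; any path the agent is told to re-read would do.
FACTS_FILE = Path("session_facts.json")


def load_facts(path: Path = FACTS_FILE) -> dict[str, str]:
    """Return the stored key-value facts, or an empty dict if none exist yet."""
    if not path.exists():
        return {}
    return json.loads(path.read_text(encoding="utf-8"))


def remember(key: str, value: str, path: Path = FACTS_FILE) -> None:
    """Add or overwrite one fact and persist the whole file."""
    facts = load_facts(path)
    facts[key] = value
    path.write_text(json.dumps(facts, indent=2, sort_keys=True), encoding="utf-8")


def as_prompt_block(path: Path = FACTS_FILE) -> str:
    """Render the facts as plain text that can be re-inserted into the context."""
    facts = load_facts(path)
    return "\n".join(f"{key}: {value}" for key, value in sorted(facts.items()))
```

Something like `remember("parser.py", "repo-a/src")` would record where a file lives, and the output of `as_prompt_block()` could be pasted back in whenever the session starts drifting.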

leecommamichael a day ago

Is that the same gap as what you’re responding to? To me, it seems his critique is about advertised capability and logical statements, and your rhetorical(?) question is about memory.