Hacker News

This is solved easily by one additional sanity check API call to a different AI. I’m not sure why people think these bugs are like, complete showstopper insurmountable things. It’s a quick fix.

shakna 2 days ago

If Anthropic couldn't achieve that with Project Vend [0], why do you seem to think that everyone else could?

> Claudius, believing itself to be a human, told customers it would start delivering products in person, wearing a blue blazer and a red tie. The employees told the AI it couldn’t do that, as it was an LLM with no body.

> Alarmed at this information, Claudius contacted the company’s actual physical security — many times — telling the poor guards that they would find him wearing a blue blazer and a red tie standing by the vending machine.

[0] https://techcrunch.com/2025/06/28/anthropics-claude-ai-becam...

Zagreus2142 3 days ago

Taco Bell knows and controls its own menu, and the valid options are already directly encoded in their POS system, including purchase limits. Why would you call out to a different non-deterministic model instead of validating against the complete and deterministic data you have? Taco Bell can afford 1-2 engineers to manage that.
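For illustration, the deterministic check described above could look something like the sketch below. Everything in it is an assumption made up for the example: the `MENU` table, the item names, prices, and per-item limits, and the `OrderLine` / `validate_order` names. A real system would read this data from the POS rather than hard-code it.

```python
from dataclasses import dataclass

# Hypothetical menu data; a real system would load this from the POS.
MENU = {
    "crunchy taco": {"price_cents": 189, "max_qty": 20},
    "bean burrito": {"price_cents": 229, "max_qty": 20},
    "water cup": {"price_cents": 0, "max_qty": 5},
}


@dataclass(frozen=True)
class OrderLine:
    item: str
    quantity: int


def validate_order(lines):
    """Return a list of problems; an empty list means the order passes."""
    problems = []
    totals = {}
    for line in lines:
        name = line.item.strip().lower()
        if name not in MENU:
            problems.append(f"unknown item: {line.item!r}")
            continue
        if not isinstance(line.quantity, int) or line.quantity < 1:
            problems.append(f"{name}: invalid quantity {line.quantity!r}")
            continue
        # Sum per item so a limit cannot be dodged by splitting lines.
        totals[name] = totals.get(name, 0) + line.quantity
    for name, qty in totals.items():
        limit = MENU[name]["max_qty"]
        if qty > limit:
            problems.append(f"{name}: quantity {qty} exceeds limit {limit}")
    return problems
```

Whatever the speech model thinks it heard, the order only goes through if `validate_order` returns an empty list; otherwise the problems can be read back to the customer or handed to a human.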

zamadatix 2 days ago

This would work better if the LLM were used for the human interface while traditional logic handled the ordering API and its sanity checks. I.e., accept that the LLM can bug out on occasion, but keep rigorous boundaries around how much risk that carries.

saagarjha 2 days ago

AIs are not resilient against deliberate attacks, even if you use multiple different models.

diamond559 2 days ago

Maybe they shouldn't work w/ customers then. Retail workers have to deal w/ hostile customers all the time.

zamadatix 2 days ago

The net result would surely be retail workers only get to deal with hostile (or just difficult, even) customers while the LLM deals with the easy ones. That's what has happened with every other technology introduced to retail - less "business as usual" and "overhead" work and more "oddball" handling. E.g. the electronic PoS and intercom system already have the same kind of effect.

jameslk 3 days ago

It seems so, and yet here we are

There are other videos out there (not just of Taco Bell’s implementation) of these systems bugging out.
