While I'm certainly sceptical of pure LLM (re)-written software, I would have to assume in the case of the cyberattack vector that Anthropic used their new Mythos model to adequately test against.
Maybe someone has more info of them mentioning that.
While I'm certainly sceptical of pure LLM (re)-written software, I would have to assume in the case of the cyberattack vector that Anthropic used their new Mythos model to adequately test against.
Maybe someone has more info of them mentioning that.
> to adequately test against
How does one determine what "adequate" looks like for a million lines of code?
You can't fit a million lines of code in a 1M token context window unless every line of code is one token. So you're just sort of praying you spend enough time/money burning tokens to shake out all the stuff that's bad or wrong.
I wouldn't be surprised if the kinds of security issues LLMs tend to create are the exact types of security issues LLMs are bad ar detecting.
so they are defending the LLM-generated code using another one of their LLMs, against attacks from yet other LLMs? So regardless of the outcome and impact on us, they win?
Jarred said this had nothing to do with Mythos or Anthropic.
I have a very, very hard time believing that. Surely the acquisition left his wealth largely in the form of Anthropic stock, so his personal definition of success is "rep Anthropic so my stock goes up" and at that point he has succeeded.
Me, I still have to be competent to succeed. I don't just get to declare that because I used AI the effort was a success, and I have 0 desire to work with those kinds of people.
The concept of a "useful fool" is apt here.