There is going to be a flurry of this sort of stuff as the AIs get smart enough to find them. It will naturally die down as the legitimate ones are fixed. Yes, there will always be some level of this, but I’d expect it to be low and the exploits found to be increasingly complex. This is a time of transition.

> a flurry of this sort of stuff as the AIs get smart enough to find them.

I really think this characterization is misleading. It's not "getting smart"; it's being tailored more closely to a specific use, with better-curated datasets, better harnesses, better prompts, better labeling of results, documentation of failures and successes, etc.

The outcome is (hopefully) better overall, but this anthropomorphized wording makes it sound like AI itself is somehow changing or evolving. No: academia doing fundamental research, industry making it available commercially, and finally security researchers packaging the entire tooling and process as a service are all actively shaping it to make it better. There is no "it".

Do you have a definition of "smart" such that there is something an AI could do to prove itself intelligent?

Or are you just defining "fast" as something only horses can do, and considering that a useful insight about cars?

A future AI may be intelligent, but LLMs are clearly not. They have no agency, no ability to reason, and no world model. The most effective way to use them is to treat them as next-token prediction machines, because that’s what they are.

edit: downvotes but no rebuttals. feel free to show me where the agency, reasoning from first principles, world model etc. exist. or you can ask an LLM and it'll tell you it doesn't have those.

Yes, of course. I’m definitely anthropomorphizing as a shorthand. I’m the first one to say that these models are just a lot of matrix math.

> It will naturally die down as the legitimate ones are fixed.

Every software update introduces and reintroduces them.

> It will naturally die down as the legitimate ones are fixed.

Seems like we're already in the middle of this phase, but rather than dying down, the 'reports' have just gotten noisier and more obtuse, making it more difficult to establish the actual degree of threat or the attack vector.

And if you are a state agency who'd like to keep the undisclosed zero-days you rely on secret, spamming maintainers with reports makes sense.

As a bonus, if you find any actual zero-days among your mass-generated reports, you don't report them and you get new ones to play with.

I mean, it makes sense until adversary states start walking through the same doors you’re using. At which point you might regret that maintainers are too flooded to deal with it.

Assuming, of course, said state agency is operating under sufficiently strategic governance and management…

Honestly, execution complexity is becoming a lower and lower barrier over time, too.