> “We have high confidence that the actor likely leveraged an A.I. model to support the discovery and weaponization of this vulnerability,” the report said.

I wonder what gives them that "high confidence", as opposed to this being just a traditional zero-day?

I'm not being snarky or critical, I'm genuinely wondering what about an attack could possibly indicate it was discovered with LLM assistance?

Like, unless the attackers' computers have been seized and they've been able to recover the actual LLM transcript history? But nothing in the article indicates that the hackers have been caught, just that a patch was developed.

From Google's GTIG report: https://cloud.google.com/blog/topics/threat-intelligence/ai-...

"Although we do not believe Gemini was used, based on the structure and content of these exploits, we have high confidence that the actor likely leveraged an AI model to support the discovery and weaponization of this vulnerability. For example, the script contains an abundance of educational docstrings, including a hallucinated CVSS score, and uses a structured, textbook Pythonic format highly characteristic of LLMs training data (e.g., detailed help menus and the clean _C ANSI color class) "

This only indicates that an AI coding agent was used to write an exploit.

No such circumstantial evidence can prove that an AI model has been used to find the bug.

Of course, it is quite likely that an AI model was used to speed up the search for bugs, but this can never be proven as long as you see only the code used to exploit the bug.

[dead]

That's evidence the script was written by an AI, but not necessarily that the exploit was found by it.

I think it would be rather worth reporting these days if hackers totally handcrafted all code without any use of any AI.

H4ndM4.de

The post reads like Ai wrote it - from that I can deduce that all strategy at google has been generated by Ai.

> I wonder what gives them that "high confidence", as opposed to this being just a traditional zero-day?

Google, Cloudflare, and Microsoft are a trio of companies that get to see most of what's going on the internet. I imagine that if they see you attacking them, they can work back from that and get remarkably far, even against sophisticated actors. If it's their LLM, they presumably keep transcripts. If you searched for the affected API function via a search engine, they almost certainly know. Even if you used a competing search product, you probably went to a site that has Google Analytics. Oh, and one of these companies probably has your DNS lookups. And a good chunk of the world's email traffic. And telemetry from your workstation. And auto-uploaded crash reports... And if it's bad, they can work together behind the scenes to get to the bottom of it.

So, when their threat intel orgs say they have high confidence in something, I'd be inclined to believe it.

None of what you've said is untrue. And if this was an internal report to an executive, I'd agree with it. But this is a public statement and I'm more inclined to believe that this is part of a coordinated run up to a move to ban the import of 'dangerous' Chinese AI models -- or something else equally self serving -- than a simple statement of truth.

I don't doubt that they found some evidence of AI use. I'm just skeptical that the amount and strength of evidence has anything to do with their making this statement.

I've been thinking about why the AI companies are making so much use of fear based marketing. And I'm wonder if it isn't just naked Machiavellianism at work.

For a long time tech companies were forced to compete for power by being the most loved (or at least not the most hated). But now they've found an avenue to cultivate fear.

Well, it’s great marketing for LLM products at the enterprise level. Even if they weren’t sure, they have every incentive to run with it now, and the issue a “whoopsie daisy” apology later after the tech media stopped paying attention.

This is why i can't wait for a new AI winter or atleast a fall(the bubble deflating slowly). Just like you can now really see how useful web3 and NFT really are...

Are you roughly comparing the long term viability of LLMs to NFTs as if they are anywhere in the same realm?

How long can llms exist at this current price level? Once they raise prices the market gets split. One side is the companies who will pay the increases and the other side is the public portals which become unaffordable. Public side might compare to NFTs while the other looks like more like the cloud where companies will overpay for better features they don't really need.

We have open-weight LLMs like DeepSeek that prove the cost of running inference with near-frontier models can be very cheap.

The article strongly implies they have the (Python) source code, and that it looks LLM generated. I don't know about you, but I can usually tell LLM code from a mile away.

That can prove only a half of that sentence, that an AI coding assistant was used for writing the exploit (a.k.a. "weaponization").

For the other half of the sentence ("discovery"), one could claim that it is true only if the identity of the attackers were discovered and evidence about their prior activities would be gathered.

Even if it is likely that today anyone who searches for bugs would also use AI agents to accelerate that, I find unacceptable in announcements like that of Google the use of careless sentences that are obviously either false or they might be true only if Google knew something else that they do not disclose.

We are going to be seeing a lot of these moving forward. It's the easy way out. If you've worked with Google, you will know that it's an environment where accountability doesn't thrive. You will find people who know nothing about Google's product portfolio hold advisory roles around the products. They don't care, there's no one to even question them. They just know to make colourful graphs with the most useless metrics to justify they "add value" to the company. Expecting them to take accountability is like trying to mix oil and water.

The article says it included excessive explainer text. And I'm almost positive an earlier version of the article referenced hallucinated library references though I don't see it in the present version of the article.

Humans can sometimes find a needle in a haystack, but its impossible for us to find multiple needles in multiple haystacks and chain them together into an attack. AIs can work through a complex search space much more efficiently, that's the tell.

They did it before AI.

sorry but that’s just wrong. it’s not impossible in the slightest. i built an attack against mozilla deepspeech in my phd from multiple needles (two of which i personally discovered).

did it take a lot of effort? sure. lots of dead ends. but that does not mean it is impossible.

all fair points, but this 'splot could have been a team of two operating over a couple of days, as opposed to a multi-year Phd level effort.

that's the scary part. not that super-expert Phd folks can eventually do it with serious effort, but that AI can do it faster and while guided by a plucky college freshman

> its impossible for us to find multiple needles in multiple haystacks and chain them together

Except "we" have been successfully chaining attacks long before AI started automating it. AI doesn't make any of this possible, it just takes the drudgery out of it and lowers the cost of an attack.

Maybe after they realized how they were vulnerable they asked an LLM to find the exploit through a similar means to try and replicate it. Still doesn't prove it but maybe gives them confidence this weird thing can only really be found that way etc.

> I wonder what gives them that "high confidence", as opposed to this being just a traditional zero-day?

Excessive use of em-dashes, and emoji bullet points in the readme

Maybe they saw traffic that looked like AI prodding an API and quickly adapting to find the bug?

But at this point I feel like odds are everyone looking for vulnerabilities is using AI to some extent. Why wouldn't they? It'd be stranger if they didn't.

[flagged]

Presumably the attacker used Google's own LLM and they searched the history of all user chats to find the transcript.

I say this only slightly in jest, as that's about the only thing I can think of which would legitimately give them 'high confidence'.

In the article (AP one, at least) Google explicitly said it does not believe it was Gemini or Mythos.

Clearly that's because they searched the history of all chats and didn't find the perpetrator

I know we're talking about Google here, but the privacy violations and concerns from this sort of search are massive.

We need local AI ASAP.

Don't get me wrong, I'm with you here, but we are back to the days when we had to rent mainframe time for compiling programs. Not because of software limitations, but you just didn't have consumer grade hardware capable of running them.

This time, however it's even worse, because it'll be a really long time until either we get consumer GPUs with enough VRAM for full models or LLMs that fit in 16-32GB capable enough to compete with cloud providers.

I run locally qwen3.6 27b on my 3090 and it's really impressive for what it is, but it is still generations away from being capable of delivering a level of quality that we can confidently default to solo drive them on a daily basis.

> We need local AI ASAP.

That is an excellent idea, once we, the GPU-poor mice, figure out who is going to bell the SoTA training cat. Chinese models being banned is well within the realms of lobbied possibilities.

They probably used AI for the search.

The real game would be to put a “nothing of interest here” prompt injection attack in the original series of prompts so a LLM parsing them later would ignore the attackers’ session.

So its a provider but not these two which imples OpenAI