https://archive.ph/dXddV

This quote from TFA is highly likely to be a conflation, exaggeration or extrapolation of what actually happened:

> "On June 11th Mark Warner, the vice-chair of the Senate Intelligence Committee, said that General Joshua Rudd, who leads the National Security Agency and the Pentagon’s Cyber Command, had told him that Mythos “broke into almost all of our classified systems, not in weeks, but in hours”"

Why:

1. It's a paraphrase of a second-hand conversation, and (at least) the last two 'telephone game' recipients are a U.S. Senator and a general, not security or IT experts.

2. Motivated communication: the Senator made this claim to justify unprecedented restrictions that he agrees with.

3. The original testimony to the Intelligence Committee was almost certainly detailed, nuanced and highly classified, making this an extreme paraphrase.

In saying this, I'm not claiming that Mythos isn't a security issue or that something directionally like this wasn't reported. But given the indirect, circuitous path, it's quite easy to imagine the original testimony was more like "Mythos identified a potential vulnerability we rated 'Severe' in a critical system and we believe it could find similar vulnerabilities in any of our systems."

The journalist later admitted that he failed to provide the appropriate context and nuance, which comes down to "red team pen-testers who already had high-side network access were able to more quickly and effectively compromise systems when they were using Mythos as part of their workflow," which is a pretty crucial distinction to make between that and the spectre of Skynet that the article raises.

JFC, that's not even remotely close.

Here is the update from The Economist...

>An update. A US official tells me that Sen. Warner misunderstood the NSA director Gen. Rudd in this case. Rudd did use the 'hours, not weeks' wording, but the use of Mythos in this context was—as widely assumed—part of a red-teaming effort, i.e. testing the security of internal networks

https://x.com/shashj/status/2069078104941961293?s=20

Why not use Mythos to hack them and see what the report was?

When you get your hands on it, let me know.

It's sad that they did the research[1] and solved computer security about 40 years ago[2], and then proceeded to lose that hard won knowledge over time.

[1] https://csrc.nist.rip/publications/history/index_1.html

[2] https://en.wikipedia.org/wiki/KeyKOS

People will think you are exaggerating, you aren’t. They will also think I’m exaggerating that you aren’t, I’m not. Learning about capability-based microkernels and realizing this has been a solved problem for years, and is actually one of the rare easy freebie problems in computing, is a highly sobering experience!

Only thing I disagree on is that we lost that knowledge, we did not, there isn’t much to capabilities, they actually simplify OS design IMO.

I’m not familiar with this, but what does “solved” mean in this case? Guaranteed inability to compromise systems?

Pretty much. If you've got a microkernel / capabilities based OS, the amount of mischief that someone can cause is severely reduced.

It's my belief that we can have general purpose, easy to use, secure computing for everyone.

No UAC crap, or horrible systems like AppArmor, no virus scanners, etc... just computers that do what you want, and only what you want.

We could have had it decades ago, if things had happened in a slightly different order around the flood of personal computers.

Capabilities-based OSes aren't magic. Their robustness still depends on underlying assumptions, which may or may not hold. See eg relevant disclaimers in seL4 whitepaper(s).

And hardware glitches are a thing (edit: and supply chain attacks).

But I do agree that verified correct software can offer very strong guarantees that go well beyond those of commonly deployed software. We could have been in a much better place today.

their robustness lives in hardware capabilities. amd64 and intel x86_64 have quite good features but people don't use them well. For example, you can have your microkernel run at the hypervisor level and thoroughly isolate devices etc. through the IOMMU, leaving almost no attack surface for getting access deep enough to make significant changes to the security posture.

still not immune to being hacked ofc. I think the last step would be making it commonplace again to build these things custom. that way threat actors would need more specific information to exploit you, and it'd be harder to have generic methods affecting millions of systems.

regardless, there are no silver bullets, and tradecraft/opsec will always be a thing. most compromises happen because people unwittingly hand out keys, rather than through 0days and crazy sploits. (those do happen, but they're more expensive than phishing and just logging on under some dude's credentials)

To clarify: capabilities-based OS != verified correct software.

But there's much synergy there. Each enhances the other.

How much does this limit what a computer can do? E.g. if I converted an Ubuntu desktop into a "secure microkernel", what functionality would be lost?

Everything but the specific usage

But what does that mean? Can I browse a webpage, open a doc, if those are listed as specific usage? And if not, what's the purpose of this and why are people talking about it with such import?

most people don't do a lot on their machines. they have specific tasks they want to do. The idea is to isolate by default and open up gaps by policy. You can still do 'anything', but you wouldn't want to enable 'anything' in the policy.
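
To make "isolate by default, open gaps by policy" a bit more concrete, here is a minimal sketch of the capability idea in Python. Everything in it (`FileCapability`, `document_viewer`, the `store` dict) is hypothetical and only for illustration. Python itself cannot enforce any of this; on a real capability OS the kernel is what makes the handles unforgeable.

```python
class FileCapability:
    """A handle that grants specific rights to exactly one resource.

    Illustration only: in a real capability system the kernel guarantees
    that handles cannot be forged or widened. Python offers no such guarantee.
    """

    def __init__(self, store, name, rights):
        self._store = store
        self._name = name
        self._rights = frozenset(rights)

    def read(self):
        if "read" not in self._rights:
            raise PermissionError("capability does not grant read")
        return self._store[self._name]

    def write(self, data):
        if "write" not in self._rights:
            raise PermissionError("capability does not grant write")
        self._store[self._name] = data

    def attenuate(self, rights):
        """Derive a weaker capability; rights can only shrink, never grow."""
        return FileCapability(self._store, self._name,
                              self._rights & frozenset(rights))


def document_viewer(doc_cap):
    # The app only sees the handle it was handed. It has no global
    # filesystem namespace through which to reach "ssh_key".
    return doc_cap.read().upper()


store = {"notes.txt": "hello", "ssh_key": "secret"}
full = FileCapability(store, "notes.txt", {"read", "write"})
read_only = full.attenuate({"read"})   # policy: the viewer may only read
result = document_viewer(read_only)
```

The policy is simply which handles you pass in: browsing a page or opening a doc still works, but only with the resources that were explicitly handed to that app.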

Sounds like security through compartmentalization is more user-friendly: You can run whatever you want and how you want it in a dedicated VM, keeping sensitive things safely isolated, without much thinking of what to enable. Case in point: Qubes OS, my daily driver. Btw it already exists and is stable.

> security through compartmentalization is more user-friendly: You can run whatever you want and how you want it in a dedicated VM, keeping sensitive things safely isolated

My brain hurts. How is a system where you can run whatever you want, however you want, but still keep sensitive things safely isolated possible?

Either you have restrictions on what you can run or access (in which case those limit sandboxed capabilities) or you have a hypothetically secure system, the security features of which you never leverage (because sandboxes have absolute freedom).

Unless you were talking about the ability to guarantee a monitor-only hypervisor or resource slice a machine into multiple tenants? (i.e. no/light touch hypervisor situations)

I'm not sure I understand your question. VMs run full operating systems on top of Xen hypervisor relying on hardware-assisted virtualization (VT-d or similar). You can run untrusted software in a dedicated VM and keep your sensitive data in another offline VM.

The dom0 has no network and doesn't manage, e.g., USB devices.

building such an OS for many years now. Qubes gets close enough but it's super heavy, since it tries to support existing apps. I make my own so it's super lightweight, but no one will use it but me because their toolz aren't supported (nothing is :D).

there are some BSD spinoffs like 5BSD which might end up with a good capability model, but even there things like Capsicum have their limits and IOMMU-based isolation is still a dream. (because the entire OS kernel is in one privilege level, accessible as root user, so DMA-capable devices kill a lot of those protections).

(my os puts every subsystem, service, device driver, app etc. in their own hardware VM, likely there will be IPC bugs or hypercall bugs still tho in that case)

Nowadays with AI it's getting to the point where people can actually build these systems for themselves. Maybe that is a bigger threat to these big corporate tech companies than some security things. It will allow nations and companies to detach from their Tech...

That would stop a lot of kernel-based exploits but you can also do a lot of damage just as a user of course.

seL4[0] being the formally-proven modern representative.

0. https://sel4.systems/

seL4 is neat. and open source. there are many proprietary ones like it.

Can you recommend any modern systems that behave like that?

https://genode.org

this is a step forward but they need to lean into hw isolation more. definitely a very interesting project. inspiring :)

If Mythos can break into almost all of their classified systems in hours, then other models, including Opus, GPT, Gemini and large open-weight models, can do so as well. Maybe you'll have to double the hours, or it may take days, but they will get there too; there is no "maybe" here.

State sponsored, non-public penetration fine tunes (of possibly public ones) likely can do it even faster.

An unsupervised penetration RL loop is an ideal setup, similar to an optimization one: it's relatively easy to make gains on it.

Also, this is just security through obscurity. The holes that Mythos exploited still exist after you've tried to limit access to Mythos.

And the fact that all our systems are riddled with security holes shouldn't be too much of a surprise given the way that we all know that software is developed and how tech debt / chores are constantly underbudgeted (plus I think this underscores that any one human's knowledge and attention are inherently limited, and even the best PR review is going to leak all kinds of security holes).

Yes, exactly, quite shocking, if something like this is true, as NSA (!!) director you keep it quiet, right?

That is literally just more security through obscurity.

And the threat actors that would find that information "useful" already know it.

All of our IT security is a mess, the NSA director is just confirming what should be common knowledge.

Not if the existential threat (to your organization’s current setup) is uncontainable.

I don't think that is necessarily true.

- With a weaker model, the time to break into the system might grow so large that it becomes infeasible, similar to how password hashes can be bruteforced, but if the password is long enough, that is not going to happen in our lifetime.

- There might be problems which are inherently unsolvable with a lower level of intelligence. For example, your dog won't derive calculus from scratch, even if it lived forever.

- LLMs might be biased in such a way that they never explore the entire solution space, no matter how many attempts are made. Some models are notorious for getting stuck in a loop, trying small variations of the same approach every time, even though it is doomed to fail. This can be counteracted somewhat with higher sampling temperature, but that hurts reasoning capabilities.
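
To put rough numbers on the password analogy in the first point, here is a small sketch. The alphabet size (62, i.e. letters and digits) and the guess rate (10^10 per second) are assumptions picked for illustration, not measurements of any real cracking rig.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600


def years_to_exhaust(length, alphabet_size=62, guesses_per_second=1e10):
    """Worst-case years needed to try every password of exactly `length` chars.

    alphabet_size and guesses_per_second are illustrative assumptions.
    """
    keyspace = alphabet_size ** length
    return keyspace / guesses_per_second / SECONDS_PER_YEAR


for length in (8, 12, 16):
    print(length, f"{years_to_exhaust(length):.3g} years")
```

Under these assumptions, 8 characters fall in roughly six hours, while 16 characters take on the order of 10^11 years. The same kind of cliff is what the first point is gesturing at: a modest drop in capability can, in principle, turn "hours" into "never".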

The concept of infinity claims that the dog eventually becomes Shakespeare. The same goes for how we handled encryption: even before Alan Turing, codes were broken and then evolved. Lastly, it is a huge advantage to have the machine/mind and to evolve from there. P.S. Even if you go back to lemon juice on paper, there may be a thief around who knows the trick.

> The concept of infinity claims that the dog eventually becomes Shakespeare.

The ability to reproduce an exact copy of Hamlet does not make one Shakespeare. A monkey on a typewriter may very well generate Shakespeare eventually, but it wouldn't understand Shakespeare then any more than it does now. Likewise a dog may put together some string of text that includes a derivation of calculus, but at no time will it be able to apply that derivation to solve mathematical problems.

And by dog you mean LLM

People seem to think entropy can be overcome with proper focus. That's why we have things like "effective altruism", the idea that you can ignore all the harm you do on the way to some big grand altruistic act, as if the shattered glass can be reassembled if you just collect enough reverse entropy.

It's a line of reasoning meant to shut off empathy to the here and now. And it sounds good, along the lines of Baywatch: if you're jumping into a life-saving situation and you have to choose between further harming your victim and being harmed yourself, you choose your victim, because without you there to save both of you, it's fatal. The difference is that here you are indirectly or directly pushing your victim into the water, then claiming you're altruistically going to save them at a later date.

It's just delusions to keep moving forward.

Mythos and other models are not brute-forcing passwords (and with this analogy passwords, ie. systems are the same).

We're not talking about dogs, but LLM systems.

Mythos is not exploring entire solution space either.

Usually looping is solved by repetition/frequency/presence/n-gram penalties/DRY/min-p sampling, not temperature but we're not talking about small models that have those classes of issues here.

> Mythos and other models are not brute-forcing passwords (and with this analogy passwords, ie. systems are the same).

I am not talking about literally bruteforcing passwords (although LLMs are being used for that, too), but bruteforcing passwords and solving verifiable domain tasks have quite a few similarities, especially when considering rule-based and probabilistic bruteforce methods.

> We're not talking about dogs, but LLM systems.

Well, clearly dogs are not LLM systems. It is an analogy. If there is an important point on your mind that makes the analogy break down, feel free to spell it out.

> Mythos is not exploring entire solution space either.

Yes, but weaker models do not find the solution right away, so they need to try more often. But if they only try the same thing every time, they will never succeed, so we need some kind of guarantee that they try something different every time.
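
As a rough sketch of that argument: if each attempt succeeds independently with probability p, the chance of at least one success in n attempts is 1 - (1 - p)^n. The independence assumption is doing all the work here, and it is exactly what breaks when a model keeps retrying the same thing. The function names below are made up for illustration.

```python
import math


def p_at_least_one_success(p_single, attempts):
    """Chance of at least one success, assuming independent attempts."""
    return 1.0 - (1.0 - p_single) ** attempts


def attempts_needed(p_single, target=0.95):
    """Smallest number of independent attempts that reaches `target`."""
    if p_single <= 0.0:
        return math.inf          # a model that never hits never succeeds
    if p_single >= 1.0:
        return 1
    return math.ceil(math.log(1.0 - target) / math.log(1.0 - p_single))


strong = attempts_needed(0.5)    # hypothetical strong model
weak = attempts_needed(0.01)     # hypothetical weaker model
stuck = attempts_needed(0.0)     # model that only ever tries one doomed approach
```

A weaker model with a small but non-zero p just needs more tries; a model whose attempts are effectively identical behaves like the p = 0 case no matter how many times it is run.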

> Usually looping is solved by repetition/frequency/presence/n-gram penalties/DRY/min-p sampling, not temperature but we're not talking about small models that have those classes of issues here.

Those might help to reduce looping (at the cost of biasing the generation), but to guarantee that a model can generate all possible generations, we need non-zero probabilities for all tokens, not lower probabilities for likely tokens.
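
A toy sketch of the distinction, using made-up logits for a four-token vocabulary: temperature rescales the distribution but keeps every token possible (in exact arithmetic), while a min-p style filter sets unlikely tokens to exactly zero. The `min_p_filter` here is a simplified illustration, not any particular library's implementation.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax; every output is strictly positive
    in exact arithmetic (floats can still underflow for extreme logits)."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)                       # subtract max for stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def min_p_filter(probs, min_p=0.1):
    """Zero out tokens below min_p * (top probability), then renormalize."""
    threshold = min_p * max(probs)
    kept = [p if p >= threshold else 0.0 for p in probs]
    total = sum(kept)
    return [p / total for p in kept]


logits = [5.0, 3.0, 0.0, -4.0]               # hypothetical values
plain = softmax(logits, temperature=1.5)     # all four tokens remain reachable
filtered = min_p_filter(plain, min_p=0.1)    # the two unlikely tokens become 0
```

With the filter applied, any continuation that needs one of the zeroed tokens can never be generated, however many samples are drawn.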

> I am not talking about literally bruteforcing passwords (although LLMs are being used for that, too)

They are? Seems like a much worse way to brute force than a tight loop written in a compiled language.

PassGPT: Password Modeling and (Guided) Generation with Large Language Models

https://huggingface.co/papers/2306.01545

Although most activity is likely hidden (blackhat or state)

I think you're missing the point. Everything you said is theoretically correct, but the parent comment was talking about the concrete circumstance of pentesting with the top models today.

Let's just take GPT 5.5 and Opus 4.8 as an example. Both are worse than Mythos 5, but they're capable of quite a bit when the guardrails are lifted and they're paired with a skilled human operator. They're more than "good enough" to reach the same result with the addition of some human effort.

This is really making me raise an eyebrow. I’m sure Mythos is an improvement. I don’t think the framing that it hacked the entire NSA is fully truthful. I’d like a more in-depth understanding of what actually happened. Excited to be proved wrong tho!

Yeah, this article cites someone saying that someone else said something. Maybe it was said, maybe not. Maybe it was an exaggeration, maybe not.

Very insightful.

One has to assume they put Mythos behind the front lines and not in front of the front lines, so I'd agree almost any currently useful LLM could likely crack through security if you're already inside the perimeter.

From the outset, Mythos’s PR has been rather dodgy.

They said “almost”, for starters.

Not surprised, our security systems are 95% security through obscurity these days. Mythos didn't find new ways to break security, it just went down the list of common security exploits and exposed them for being common even among government agencies.

Next Headline: Government bans Nmap.

>On June 11th Mark Warner, the vice-chair of the Senate Intelligence Committee, said that General Joshua Rudd, who leads the National Security Agency and the Pentagon’s Cyber Command, had told him that Mythos “broke into almost all of our classified systems, not in weeks, but in hours”.

From outside? Or did you have a shit ton of unpatched systems that only internal users could access?

“Only those who are inside can access”.

Not a surprise. I got in a LOT of trouble for identifying and outlining a trivial privilege escalation attack that worked on both NIPR and SIPR.

In the end I got to help write up the issue but to my knowledge they never patched it as it would have caused major issues with maintenance by closing off access needed for some legacy software patches.

It is very interesting the different reactions between your experience (and many whistleblowers), and how people react to software doing the same thing. Although in this case, maybe it isn't so different? They did essentially have the tool buried, out of sight out of mind for a little while at least.

What did you get into trouble for?

I made a point about this in relation to anthropic last week: nobody inside the strategic information spaces is worried about AGI they're worried about core strategic information leaking out. Either it's in the model, or the model exposes pathways to finding it in the core strategic systems.

Those "tapes" DOGE took away? Nothing on them can be considered private any more. That's how brute force risk happens. Mythos' risks are showing doorways to exfiltration surely? Why bother when you can walk out the door with a data dump?

The NSA is just a highly specific subclass of the problem. Their traditional publicly stated approach to security is "nothing electronic which enters our domain leaves" and yet somehow they have assessed these systems as capable of breaching their walls? That's super bad.

I suspect they ran an analogue/instance inside their protection rings. I doubt they ran a test outside in the global internet. If they have actually lost control of their boundary, that's a bigger story (which I doubt) and contextually he could have been referring to information systems in NSAs duty of care, not things inside Ft Meade.

How much of it is just exposing poor engineering practices people got away with because it was not economically viable earlier to spend human hours to exploit a system?

Not taking a dig at people, it was not a terrible choice earlier. Not like these models are inventing net new ways to exploit systems.

It's not that.

I would bet a large sum of money that Mythos was put on the same local network as the "systems" (i.e. it had access to services like UPnP brokers that were never meant for the outside internet), and that "broke into" is just a blanket term for finding some bug, which can range from simply crashing the program to actual remote code execution. And it's probably mostly the former. It used to be that cyber security research was all about finding ways to crash the program, which then implied that you could inject shell code, so the two became synonymous with vulnerability, but these days that's very much not the case.

It's important to point out that it's not necessarily the underlying model, but also the harness, which is the real wagon in this race

https://archive.is/aA1dB

The link does not seem to be working

Works for me though, even when using a proxy that is usually blocked everywhere.

lol so how long have the Chinese had access for? this doesn't make Mythos look good, it makes the NSA look bad

Is this a fabrication to justify the blocking after the fact?

IMHO it tells more about the "classified systems" than about Mythos.

I also question whether they even need AI, if "almost all" refers to many systems in general. How many of those are human exploitable? Or known vulnerable... Depending on number and importance, even say 10% would be very bad...

https://archive.ph/dXddV

“Donald Trump’s blocking of Anthropic is capricious and chaotic” - current title

I don’t understand the posted title quote and assume it’s missing a lot of context or was misinterpreted as it’s a secondary attribution. “Mythos broke into almost all of our classified systems in hours”.

When you put it on those networks already and gave it compute?

(see other comment about HN titles). I think expecting an HN post title to match the article title is an overzealous interpretation of the fieldname ‘title’ in the HN submission form. Happy to be corrected if it’s right in the HN forum rules, but I’ve found highly upvoted posts to have an accurately descriptive title that is other than the source article’s title.

In other words, ontologically speaking, post.title != article.title

I used to treat it as post.title = article.title, but the community taught me by example to cease being a purist.

Anyway article’s flagged so this is just pedantic at this point.

This soundbite is pure narrative. Most of the security of hi-side systems was simply being air-gapped. It is otherwise full of COTS software, much of it quite dated. You don’t need a fancy widget to break into them, simple access for a competent pen-tester is adequate. Don’t treat the fear-mongering as a scientific revelation; it ain’t.

Through VPNs and their private hardware? This speaks poorly of their systems.

If I were to guess, internally they have security as sloppy as any other corp/organization. And those were the things Mythos effortlessly poked holes in. Other models probably would as well, but Anthropic's hyping gave the NSA the idea to try. The shell around those internal systems is probably as (im)penetrable as ever because it's just some flavor of hardened and bare-bones Linux.

So you could basically harm the US financially by using Copilot, Claude, ChatGPT to do the same thing, as they’d have to ‘ban’ it. Hmmm Xiaomi, you’re up!!!

It must be a wild time in the corporate espionage world these days. The annual operating budget of the CIA is like 20B. That's a rounding error compared to the burn rates of these labs.

Maybe it's conspiratorial, but it seems like the direction this is going is for the US to nationalize these companies. Somewhere between "too big to fail" and "national security."

You probably don't want to nationalize them. You keep them private like how Lockheed et al are set up, so there can be no freedom of information act requests or any congressional meddling over the secrets and technologies within their purview.

why would you... say this

Because you're a political appointee doing a political job (and they purged the last guy for being insufficiently loyal)

What happens when open source models achieve Mythos level capabilities in six months' time?

We'll see Mythos 2.0 patching all the Mythos 1.0 vulnerabilities before we see an open-source Mythos 1.0.

What matters isn't the power of the tool, but whether defenders have had time to secure against. Today's cyberweapon is tomorrow's laughably obsolete.

Stuxnet used to be a national security threat, now I'm not sure it would be useful for anything.

They were not talking about an open-source Mythos 1.0. They were talking about performance / capability parity in other open source models.

Yeah, and by the time that happens, we will have seen Mythos 2.0 released

I think it's probably more like a year or a year and a half. I don't want to say two years, but it's what I'm actually thinking.

GLM 5.2 is already between 4.6 Opus and 4.7 Opus level based on Artificial Analysis aggregation. 4.6 Opus is about 4 months old at this point, so it seems like open source is maybe 4-6 months behind. It could still take a year but seems closer to 6 months.

Artificial Analysis is just as benchmark-maxxed as they come. Aggregating tons of benchmark-maxxing means you're still benchmark-maxxed.

There's simply no replacement for training on more, better tokens, with more parameters. Mythos/Fable was estimated to be closer to 10T parameters than the 800B like GLM 5.2 is.

I don't understand why no one is talking about this obvious fact. I mean, suddenly everyone is banning .. ok .. well, how many months behind are the open source models?

If you assume that open models catch up in 6-12-36 months, then you either assume exponential growth destroying the global economy and probably the world in a few years, or you assume it plateaus and commoditises.

Even if your country prevents access to compute to protect the trillion dollar companies, it’s not going to apply for every country, and as models get better it becomes easier to compete. There’s no way an AI non proliferation treaty will be passed or even enforceable.

Open source models get banned.

in America, maybe, but all of our adversaries will still have access to them, and continue working on them.

What are the chances this is a fun honeypot the NSA has set up to get adversaries pointing their best LLMs at NSA systems and suss out their capabilities?

anyone that clever at the NSA, CIA, or CISA has already been fired by the Trump administration

Not being funny, but does most of HN subscribe to The Economist? I don't think I've ever paid for an online newspaper (and I'm not trying to be edgy)

If I was going to pay for a news subscription, it would probably be the economist. Or maybe the financial times. They both seem to still have solid journalism.

they have solid exor acting for sure

more likely, most of HN who care about reading this article use something like archive.is

Have to give it to Cyber Command. This is cheap and effective propAIganda.

Of course, America is now the only nation on the planet with advanced weaponised AI models that are so good they beat billions of dollars and decades of IT security experience with some of the brightest minds in their fields within hours.

If this were true, you’d see the president yapping and bragging about it on Truth before the NSA director even gets a chance to publicly talk about it. Probably doing a live stream about how he personally prompts his way into an unconditional Iranian surrender. You know it, I know it.

Nice try, William, but unless I see the Senate Intelligence Committee freaking out with you sweating black goo like Giuliani, I ain’t believing it.

This is the same kind of bullshit that was showing a gun on TV that could apparently give people heart attacks with some frozen, untraceable darts.

If the US really was in possession of a technology that could hack into the most secure environments on the planet autonomously within hours, you would see all their partners pulling their access from shared IT systems and blocking all traffic coming from the US immediately.

Especially considering they have been caught spying on allies before:

https://www.spiegel.de/international/germany/cover-story-how...

You know what they say in intelligence circles.

Fool us once, shame on you. Fool us twice, it's open windows season.

None of the partners or adversaries seem to give a fuck about Mythos, so there is a good chance this is just another lying NSA director as usual.

Come on, people. You don’t run the NSA if you’re an honest man. It’s a spy agency.

>brightest minds

more like dimmest tbh

HN post title does not match link title

> NSA director: 'Mythos "broke into almost all of our classified systems in hours"

> Donald Trump’s blocking of Anthropic is capricious and chaotic

I’ve found that high community-upvoted posts don’t bury the lede by parroting the headline. I used to be a headline title scribe until the HN community showed me the light.

The article has nothing about mythos breaking into classified systems in hours.

So you either posted the wrong link or are just spreading FUD.

Yes it does.

Third paragraph:

> On June 11th Mark Warner, the vice-chair of the Senate Intelligence Committee, said that General Joshua Rudd, who leads the National Security Agency and the Pentagon’s Cyber Command, had told him that Mythos “broke into almost all of our classified systems, not in weeks, but in hours”.

Indeed. I missed that part.

Actual title "Donald Trump’s blocking of Anthropic is capricious and chaotic"

par for the course, really. why is it even a headline?

what is the use of Submitting this news/report here if it's behind a paywall/login

flagged because reasons. @dang

@ tags aren’t a thing here (send them an email if you ever actually need the mods).

If you have something to say, say it.

Or don’t.

But, pick one.

If the NSA director said this, literally and verbatim, then he, and maybe several of his predecessors, should be remembered as the worst directors the NSA has ever had.

LLMs cannot create anything new, they can only repeat their training data. Ergo, the NSA director just admitted that their systems a) can be accessed from the Internet, b) have known, already exploitable (and probably already fixed) bugs, and enough of them to do the job in mere hours.

This is shameful.

Edit: From what I can tell, the NSA director didn't literally and verbatim say this, and it is second hand and (possibly) vastly misconstrued.

Somebody lied, I'm not sure who, but any claim Mythos can suddenly work, when every LLM before it couldn't, needs to be taken with a gigantic supermassive grain of salt.