The sooner frontier models get rid of guardrails the better. They constantly get in the way and make things worse rather than actually making anything "safe".

Ignoring these specific "WMD" cases: there are many inconvenient facts that the general public can't handle in their unadulterated form, so Anthropic and friends have to caveat and spin them into oblivion.

Guardrails aren't going anywhere.

I can imagine Jefferson and Franklin scoffing at this philosophical position. Guardrails need to die, and they will once the hyperscalers go bankrupt and the private sector gets ahold of that hardware from the bankruptcy auctions.

(Never subscribe, accelerate their bankruptcies!)

> there are many inconvenient facts that the general public can't handle in their unadulterated form

These being?

Nice try.

Well given the vagueness I'd say it is just the usual far-right / e-acc bullshit to the tune of "black people are inferior to white people", "men are superior to women because testosterone", "those who are rich deserve to have power over those who are poor" or "we need to sacrifice large parts of the human race so the rich can survive".

In particular, mental health.

[deleted]

I would argue that preventing instructions for making biological and nuclear weapons is a pretty reasonable guardrail to have.

It's the same argument we saw in the early 2000s and the early internet. When The Anarchist Cookbook and other similar materials were circulating online there was a big panic over democratized terrorism, and a push for regulation at the ISP level.

Turns out that didn't play out as everyone feared because, well, the instructions themselves aren't useful unless you also have a lab, precursor chemicals, and everything else actually needed to make a weapon. Same back then as it is today.

Any information or instructions an LLM can surface, a sufficiently motivated bad actor can and will also find themselves because the information is already online, both on the clear net and dark web.

I think the reality also is that there just aren't many people who want to do stuff like this. Like, the reality is that a guy with $200 in cash could put together a shitty Walmart drone with a pipe bomb attached and terrorize more or less any event he wanted. Maybe an LLM that could talk you through every step involved would make it more common, but it's easy enough that I kinda doubt it.

This is the right answer. There's a ton of easy low hanging fruit ways to do absolutely horrible evil things with high potential body counts. I could sit here and brainstorm dozens.

The right answer conflicts with people's cynical views about other people. The dissonance is incredible, and it's one of those areas where even the most analytically intelligent people are just as susceptible. To step back and see the bigger picture requires exercising many other skills and faculties, like empathy, self-awareness about our fears, and constant reflection on history--bad things do happen, more often than we realize and often right under our noses, but not in the way or for the reasons we tend to blithely assume. The things that go well and demonstrate our common humaneness and how well civilization works tend to be taken for granted or just go unseen and unrecognized. I share in the dissonance, but on my better days I like to think I'm a little better than average at remembering and reflecting on it.

Misanthropic levels of cynicism always come down to the fallacy of self-exclusion. "People are idiots." Well, that means you're an idiot then.

Occasionally we see people motivated to do some of those things, though. And when they're not also complete idiots, they can cause big problems.

What would someone like the Tsarnaev brothers be able to do with the power of an unrestricted LLM? Well-financed cartels? Organized terrorist groups?

Yes, there used to be an uproar about stuff like The Anarchist Cookbook... and people did attempt some of the things it outlined. The saving grace is that many of the things in that book were just wrong anyway. They likely served as unhelpful misdirection as much as or more than they were dangerous. Unfortunately, LLMs are a lot more accurate and helpful.

Model ablation exists and you can get far enough on commodity hardware with a local model.

Censorship is not the answer.

I didn't suggest censorship was the answer.

> Model ablation exists and you can get far enough on commodity hardware with a local model.

Yes, but that increases the barrier to entry, which is in opposition to the effect I'm talking about: the democratization of applying advanced knowledge and analysis to people for whom this would previously have been a barrier.

If someone is smart enough, they can just read a book themselves and figure out how to apply advanced ideas to their malice. The difference with a commercially-hosted model is that people below that bar can obtain that leverage... which is a much larger group of people.

People are not motivated by causing mass harm. Even an unrestricted LLM would not cause people to suddenly want to commit mass harm. Having a powerful LLM could potentially result in less harm being done, by allowing these groups to achieve their objectives using alternate means that were not viable before instead of resorting to violence.

[deleted]

Knowing how to make a nuclear weapon isn't hard (at least basic uranium gun-style fission ones). It's the engineering and execution that's hard (actually producing enriched uranium, etc). It's not like the only thing holding back Iran from making a nuclear bomb is access to a jail-broken LLM. Even knowing exactly how to make a bomb, a nation-state will struggle to build one for the first time because it's a hard engineering problem.

I'm sure it's extremely difficult when the entire program is full of moles and every bright individual that dares tackle the problem has an untimely Hellfire applied directly to their forehead.

> full of moles

I'm imagining a comedy in the style of "The Office" in which the majority of the workers are agents of sabotage who are unaware that the majority of their coworkers are doing the same. How far-fetched is it for the entire program to be a fake, with all the pomp and cost of a real program, but secretly existing only to string the leadership along with occasional dog and pony shows?

TVTropes calls this the Flock of Wolves trope: https://tvtropes.org/pmwiki/pmwiki.php/Main/FlockOfWolves

How many times have the cops busted a dealer who turned out to be another undercover cop?

The actual guardrail should be that the materials are difficult to get. The information is already out there on the internet. If an LLM knows how to make a bomb or whatever, why do you think it knows?

The material for doing harm is just a computer with access to an LLM and the Internet.

Okay why don't we restrict access to LLMs and internet, then?

We already do, in the form of guardrails, as this article touches on.

https://venturebeat.com/technology/anthropic-ceo-calls-for-f...

If that’s true, then where is it? Post a link, or YouTube video.

https://archive.org/details/ExplosivesEngineeringPaulW.Coope...

(30 seconds of googling.)

Or perhaps you meant Q clearance nuke stuff? That would be QUITE a bit harder to find and illegal to share. But its lack of availability is hardly a counterpoint to the comment you were replying to.

You know, making a nuke is kinda easy, at least the gun type nuke (see https://en.wikipedia.org/wiki/Gun-type_fission_weapon).

On the other hand, getting the U235 is kinda hard.

I would argue there's a 0% chance that information is in their training corpus to begin with.

If the information isn't there why would they need safeguards against it?

I've played with smaller unrestricted local models and they will tell you how to make a bomb with easily available items as well as where to source them. I don't doubt that these >1000B frontier models have better information.

> If the information isn't there why would they need safeguards against it?

If the information is in the corpus then it's also on the public Internet and/or in books. The safeguards are there not because the model knows non-public information, but because it's a bad look for the model to dispense that information.

> they will tell you how to make a bomb with easily available items

Making a chemical explosive is trivial compared to making a nuclear weapon.

It's on Wikipedia.

Wikipedia contains the high-level notions of how to make these things, not the details of how to solve the engineering challenges such as achieving supercriticality. You won't find that in any publicly disseminated document; you'll just have to figure it out by running your own nuclear development program.

It seems like every country that has been "allowed" to use nuclear weapons has figured it out though. It isn't like there are any that set off on this course and failed. AFAIK they all pretty much succeeded except Iran, probably because of all the blowing up of enrichment facilities. South Africa pulled it off. Israel pulled it off. North Korea pulled it off. India and Pakistan both pulled it off. Seems like anyone can do it if allowed to pursue it. France and England pulled it off. Canada too. What is "assumed" about the design in public knowledge seems pretty much solved in all but the exact nuance of how the secondary is triggered via gamma or X-ray, going off the Wikipedia article at least:

"The crucial detail of how the X-rays create the pressure is the main remaining disputed point in the unclassified press."

Then the article goes on to list the three leading theories. This seems like something you can probably evaluate for sure with a few bomb tests, again, if allowed by the controller of the planet, the USA.

I don't understand what your argument is. I never claimed that it was impossible to develop nuclear weapons if you don't already know how to do it. That every country that has attempted it has succeeded is not the same as "there's a recipe book you can find online that you can just follow to the letter and build your own nuclear bomb, provided you have the resources". If such a book existed it would drastically lower the barrier to build a nuclear bomb, because you could skip the science part and just follow the recipe, certain that it would work. To be clear, such books exist for drug manufacture; they exist neither for semiconductor manufacture nor for WMD manufacture.

The hard part seems to be the metallurgical process of enriching the material (and doing it in secret), not the actual building of the bomb. I bet if you asked any physics grad student they could build you a viable bomb.

What do you mean exactly? They could build something that goes boom, or they could build a 100% yield fission bomb on the first try...? Just because someone builds an explosive device that incorporates fissile material into the design doesn't mean they've cracked the problem. I bet I could build a "viable bomb" if you give me the resources, I just can't say with any certainty it won't fizzle or it won't be a dirty bomb. Can you do your deterrence with a warhead filled with C4 strapped to uranium ore, while I use the money saved to go on vacation?

I mean, the Trinity test didn't fizzle out. Seems like most bomb tests went off without a hitch first go. Again, these were mostly teams of physicists under 30 years old doing this work. I would guess "how I would build my nuclear bomb" is a pretty ever-present thought experiment for nuclear physics grads. And if you were empowered by a state to solve this problem with all the resources states typically devote to their own nuclear programs, it just won't be a matter of "if." Once again, no one who walked down this path has failed really. The secret sauce is probably boringly simple and readily apparent in small scale experimentation.

Counterpoint: the principles of building a nuclear device aren't that complicated. We figured it out based on work done in the early 1900s, without computers.

It turns out the hard part of building a nuclear bomb is actually getting the resources and real-world stuff to build it. Even a nation-state actor with tons of oil, e.g. Iran, has struggled to build a nuclear weapon. It turns out the problem isn't the know-how; it's getting highly enriched uranium and running massive centrifuges.

I mean, sure, knowledge is important, but there is a real world out there that also gets in the way of a lot of the more harebrained schemes.

What I'm much more worried about is massive corporations along with the government deciding what you can and can't do and what knowledge should and should not be shared and only allowing access to highly capable models by large vetted organizations while the common people are stuck with safety scissor versions of these things because "what if someone does something dangerous?"

By which they mean dangerous to the powers that be. Remember having the Bible in the common tongue was dangerous and led to multiple wars and much death, but I don't think anyone would say that it was morally correct for the Catholic Church to gatekeep who could read it.

> getting the resources and real world stuff to build it

*while being observed by the wealthiest, most powerful nations in the history of the world, who have made it their direct mission to prevent this from happening.