$ rg 'unsafe [{]' src/ | wc -l
10428
$ rg 'unsafe [{]' src/ -l | wc -l
736
Language      Files    Lines     Code   Comments   Blanks
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Rust           1443   929213   732281     116293    80639
Zig            1298   711112   574563      59118    77431
TypeScript     2604   654684   510464      82254    61966
JavaScript     4370   364928   293211      36108    35609
C               111   305123   205875      79077    20171
C++             586   262475   217111      19004    26360
C Header        779   100979    57715      29459    13805
Cool, you can at least search specifically for potentially unsafe code in Rust. How do you search for unsafe code in Zig? Or do you just have to assume it's everywhere?
If half of your code is unsafe then unless you exercise tremendous discipline (Claude basically doesn't) you will just end up with a big ball of unsafe, peppered with hallucinations in whatever random documentary comments Claude decided to make. I doubt they enforced the confinement of unsafe to a specific architectural layer or anything like that.
Aren't the Rust unsafes a reflection of the Zig code it was ported from? However, now that you're working with Rust, you're in a position to continue improving and eliminating the unsafes.
Plus I seem to recall the Rust community addressed this issue by making tooling that checks whether unsafe code is truly unsafe. I remember one of the concurrency frameworks got scanned and people freaked out; the creator was about to abandon ship entirely as a result. I don't recall what fully came of it. Anyway, my overall point being: if there's already tooling to find the truly unsafe / bad code, it might make fixing it simpler / quicker to accomplish.
There is no Rust tooling that tells you if your unsafe code is shit or not. If there was you wouldn't need the unsafe stuff at all.
The Actix web stuff was the maintainer using unsafe code to increase performance (iirc, it was a long time ago) in what was the most popular Rust web framework at the time. It has since declined and been supplanted by other projects, but the pushback was mainly that a web framework shouldn't need so much unsafe. They eventually ceded the project to another maintainer and went off to work on something else.
Fuzzing can help with that. But it’s not only applicable to Rust.
For Actix web he was using “unsafe” to increase performance. That doesn’t mean the code written was actually unsafe… The Rust community was turning into a cult on this topic: perfectly experienced C++ developers can write code that would need `unsafe` in Rust while knowing perfectly well the code isn’t unsafe. It’s good for the community to push people to avoid `unsafe`, but not to that extent of drama and bullying…
Maybe my memory is hazy on it, but yeah that's the exact drama I remember.
You're likely thinking of Miri, a sanitiser. It's not a proof solver, but it screams to high heaven about this code nonetheless.
https://github.com/oven-sh/bun/issues/30719
There is a qualitative difference between unsafe Rust and Zig as far as I know.
In principle static analysis is possible. (Note: WIP)
https://github.com/ityonemo/clr
if half of your files in a million line codebase are unsafe that doesn't tell you much any more. Presumably the point of a Rust rewrite is that you actually make use of Rust's safety features in a coherent way.
But given the whole "let AI rewrite this for me" stunt nature of this project that was not going to happen, because that would require, well, actual thinking and a re-design. So now you have Zig disguised as Rust and a line-by-line port, because the semantics of idiomatic Rust don't map onto the semantics of Zig.
>if half of your files in a million line codebase are unsafe that doesn't tell you much any more.
If half of your files in the first pass of a million line rewrite are unsafe then that's completely fine. Do you understand what the tag actually is? It doesn't even mean that the code is actually unsafe, just that the compiler can't guarantee its safety, which can happen for a number of reasons, some benign.
Who rewrites a 700K codebase trying to be idiomatic from the get-go? That's setting yourself up for failure, whether you're a human or a machine.
A 1:1 translation, warts and all, is the _only_ foolproof way to do a language to language rewrite. Anything else in a non-trivial codebase is almost guaranteed to introduce regressions.
And? This is absolutely the correct and standardized way to do mechanical rewrites: you do a rewrite that maps directly to the original source so you can rely on the original correctness guarantees and bug-for-bug compatibility and log issues, and then you go into the next phase where you begin to use idiomatic constructs.
This is the same in COBOL-to-Java ports that have been done in banking and insurance for the past 20 years.
If the rewrite was zig to C and half the code was in __asm blocks is that different or the same?
COBOL to Java is a completely different thing and pretty much unrelated.
Rust can easily call C libraries and vice versa, and so can Zig. A more deliberately designed rewrite would identify the core pieces of the Zig code that were the primary sources of all the big issues. Then you rewrite that component in Rust and verify that you get the expected improvements. That keeps the codebase stable, it keeps you honest about actually reducing bugs and issues, and it has other benefits. Then you either just keep it that way or slowly rinse and repeat.
Without doing the analysis of what the core issues were in the first place, the author of Bun can make no claims towards the rewrite. He claims to have fixed flaky tests and improved memory safety. Where is the analysis that shows this? Where is the proof and data? Does he even know where the issues in the Zig codebase were at? I saw a commit where a test had a one second sleep put in place.
Compare this to say the Racket rewrite where a significant portion of the C core was replaced by Chez Scheme and Racket itself. There were several blog posts doing both pre- and post-analysis, and Racket has far less users than Bun.
This rewrite is totally unprofessional and has been poorly and even antagonistically communicated. The author was on this site just days ago telling everyone to relax and that he'd probably throw out this code, and that was even after it had been brought up that this wasn't pre-communicated to users. If I were dependent on Bun, I would migrate off immediately.
So I push back on the idea that this is the way to do a rewrite like this.
>This is the same in COBOL-to-Java ports
it isn't, because those guys didn't think a naive 1:1 machine translation would give them the benefits of Java. Somehow the people involved in this Rust rewrite seem to think they've already gained those benefits despite the virtually identical code.
If the whole point genuinely had been to do a purely mechanical translation, they could and should have written a transpiler, which would have had significantly higher correctness guarantees than this, given that it'd be deterministic. But of course that would have defeated the PR purpose of this whole thing, which frankly just looks like marketing for Anthropic.
You gain some benefits. You could in theory gain benefits in compilation speed, portability or even memory use and execution speed, from an automatic language translation. But everyone, including the bun people, understand that you certainly don't get code clarity benefits, and safety benefits is extremely dubious.
> If the whole point genuinely had been to do a purely mechanical translation, they could and should have written a transpiler, which would have had significantly higher correctness guarantees than this, given that it'd be deterministic. But of course that would have defeated the PR purpose of this whole thing, which frankly just looks like marketing for Anthropic.
If it were just a marketing stunt you wouldn't have all but a fraction of a percent of the test suite passing, with the remaining bugs being realistically very fixable, and everything written in a language whose type system gives far more guarantees than anything possible in COBOL.
You're being extremely negative about this whole endeavour without looking at the evidence that this effort is going far more smoothly than expected, and maps with many people's experience with using LLMs for tasks like these.
>You're being extremely negative about this whole endeavour without looking at the evidence that this effort is going far more smoothly than expected
no I'm being negative because as I just said, if you want to do a purely syntactic translation you don't even need an LLM, that's called transpilation and we've been doing it programmatically for decades.
This is the kind of thing that looks great to people who can't program, think this is some new superpower unlocked by the mystery magic of LLMs and that is exactly the kind of impression Claude wants to sell.
Transpilation won't get you to 99.8% of a comprehensive test suite passing on a 700K+ line codebase in a week (and maybe wouldn't get any passing at all), and that's assuming transpilation is even practical for the language pair in question. So if you remotely want these kinds of results, then you most certainly do need an LLM.
There are literally formally verified language transpilers out there today. They can get you 100% coverage without "cheating" like LLMs tend to do by modifying test suites to pass, etc.
I'm currently using an LLM in my day job to accelerate such a 1:1 translation, and it's certainly "working"/making progress, but God I wish I had a formally verified machine translator instead of this probabilistic bullshitting LLM.
Don't get me wrong, it's extremely helpful and impressive in what it can do. But I trust it somewhat less than if I had done it myself, and for good reason. The lies I tell myself tend not to take down production. The lies my LLM tells me do however.
I mean, no-one is forcing you to not use a transpiler, right? If it were quicker to use one, or to build a specific, limited one for your existing codebase and run it, then you would certainly have done that already.
Sadly none is available for my current use case. Building one is so far out of scope that it'd be the most epic yak shaving of all time. If this was a personal project I would consider it. My personal projects are all about the journey and not the destination so side quests are all part of the fun. Not true for my day job however...
A. Transpilation is not 100% compatible, because many idioms in some languages cannot be directly translated to others. The lifetime system in Rust disallows a lot of constructs coming from languages with more relaxed constraints. Ironically, transpilation will produce code with worse semantics than an LLM. B. At this point it's clear that LLMs reason very effectively about code and its intent. If you haven't asked Claude Opus with max reasoning to do this kind of work, I suggest you give it a try, because the results are pretty fantastic.
Push comes to shove, you could probably still ask an LLM to generate transpiler code, if you're so inclined, and then have it fix the remaining "edge cases" afterward, right…?
[dead]
It's worth pointing out that "unsafe" in rust is not a very sound concept - it's not like a monad or "function colour" whereby the compiler can say "this code ultimately calls unsafe". It's more like a comment on steroids; you call unsafe in a function, write a comment about it, and no caller of that function would have any idea that it's calling unsafe code.
Yes, the point of unsafe is that you promise it's safe, you promise to preserve the necessary invariants to make it safe to call no matter from where. It was never supposed to "taint" all code that calls it, that would defeat its purpose. It's sound enough, it's just not at all trying to do that.
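Concretely, here's a minimal sketch of that design (hypothetical function, not from Bun): the `unsafe` block lives inside a safe function that upholds the invariant itself, so call sites are ordinary safe Rust with no trace of `unsafe`.

```rust
/// Returns the first byte of a slice, skipping the bounds check.
fn first_byte(bytes: &[u8]) -> Option<u8> {
    if bytes.is_empty() {
        return None;
    }
    // SAFETY: we just checked the slice is non-empty, so index 0 is in
    // bounds. The invariant is discharged *here*, inside the function;
    // callers never see or write the `unsafe` keyword.
    Some(unsafe { *bytes.get_unchecked(0) })
}

fn main() {
    // Plain safe Rust at the call site -- no taint, no warning.
    assert_eq!(first_byte(b"hello"), Some(b'h'));
    assert_eq!(first_byte(b""), None);
}
```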
Yes I understand what they were trying to do and this is the ugly hack they came up with.
I just don't like it. I am not ignorant of their intentions, it just does not work well.
Unsafe code is normal. Trying to hide it is unsound. And I stand by that.
Half of the files contain the 'unsafe' keyword? That doesn't seem like a good rewrite. What is the point of rewriting into Rust if ~half of your code is still unsafe?
Bun is fundamentally a boundary-heavy system and it also rolls its own version of a lot of things that people typically use via libraries, where unsafe is hidden. (no async, memory arenas, etc). It also uses FFI heavily which requires unsafe.
It also looks like the top 2 maintainers are currently actively working on getting the amount of unsafe down and it's going down quickly.
If the unsafe can be iteratively removed and the final code is of reasonable quality that seems like a sane strategy. Any large migration just needs to be doable incrementally so progress can be made.
1. Rewrite from zig to rust in as close to zig as you can.
2. Turn into idiomatic rust.
1. Get hired into a company where you have a solid bet on making multi-century lasting generational wealth (>$50,000,000).
2. Every waking moment do everything in your power to boost the company that might give you the ability to define the direction of technology for the rest of your life.
3. Use the only thing you have (bun) to help push you in this direction and do things to help boost LLM marketing (a technology that already deeply struggles to find customers and has to rely on welfare (lucrative government contracts) to make sales).
---
Honestly think this generation of tech workers in SF are more evil than those that worked at Google + Facebook in the early 10s.
> a technology that already deeply struggles to find customers
As far as I know it's the opposite, Anthropic struggles to satisfy demand, they have tons of paying customers and their customer base is growing fast.
Wow as far as you know? That settles it then! Just ignore this:
https://www.flyingpenguin.com/wheres-ed-anthropic-told-court...
So, your link shows that they probably have like $1 billion in sales per month (but they publicly overstated this by 30%), and that's the struggle to find customers?
There are tons of posts and reporting about Anthropic's problems with meeting demand, usage limits (on paid plans, especially during peak hours), fast growth (your link confirms that), and problems with infrastructure.
Some links:
https://uk.finance.yahoo.com/news/anthropic-throttles-claude...
https://techcrunch.com/2026/03/28/anthropics-claude-populari...
So the takeaway here is that they scaled to just over $5bn instead of $6.6bn in revenue in just a few years…? Still sounds like plenty of demand exists?
What does that have to do with rewriting from zig to rust??? This thread is what's pushing LLM marketing, not the rewrite itself.
If the rewrite is just a stunt and it will crash and burn it will do that whether we spend our free (or work) time writing comments. If there is any hype around this particular topic, it's happening here not in the GitHub repo.
This is exactly the case here.
The author of Bun is a Thiel Fellow, so he's already been trained in The Way.
People are trying to wash away the recklessness of this rewrite by applying engineering principles the author himself didn't apply. It's like trying to make sense of a certain president's words. There is a lot of analysis missing before this rewrite, during it, and after. And given that Zig and Rust can interoperate with each other via C, it makes a wholesale rewrite even more bizarre.
I’m honestly confused. What is it that you think makes these workers “more evil” than Google and Facebook workers from the early 2010s?
Google and Facebook workers just made a lot of cash and mostly made everyone's life harder via Leetcode and bad interview processes; they didn't threaten and actively work to put millions of SEs on the street.
> they didn't threaten and actively work to put millions of SE on the street
Programmers in the 90s weren't less evil or had a stronger moral compass. They simply didn't have the opportunity to reduce the need for their fellow developers on a massive scale. They (we) would have, had we had the chance.
They (we) did it to tons of other industries. And we collectively patted ourselves on the back, saying that automation is a good thing and we're the good guys for doing it and people who lost their jobs will adapt and maybe they should just learn to code.
Now it's happening to (some of) us and suddenly it's evil?
No. The point is: programmers are whores. We like to act all righteous on forums, but very very few of us care enough about the consequences of our code to do something about it.
We either don't think about it ("what could go wrong?"), don't care about it (eh), justify it ("I need to eat!!!", "I'm just following orders"), or actively embrace it ("It's the future!").
> Programmers in the 90s weren't less evil or had a stronger moral compass. They simply didn't have the opportunity to reduce the need for their fellow developers on a massive scale. They (we) would have, had we had the chance.
Nah. The fact that such opportunity wasn't available attracted a different sort of person.
[flagged]
And definitely not more evil than the workers at current Meta.
> What is the point of rewrite
To win a news cycle.
For the foreseeable future, the AI market competition is not about which product can provide the most valuable utility to users. It's about which product can hold the protective aura of social media and investment zeitgeist while competitors buckle under the strain of unfulfilled hype and over-leveraging.
Utility, engineering, efficiency... these are all menial details for the winners to reluctantly iron out in 2035.
Bannon’s ‘flood the zone’ strategy applied to AI.
unsafe just means that you take responsibility for the safety of the code contained within. Calling into non-Rust libraries has to be wrapped in unsafe. Making syscalls has to be wrapped in unsafe.
Bun needs to interact with FFI code. This gets wrapped in unsafe blocks.
There are many places where a JavaScript interpreter and library would need to make unsafe calls and operations.
It doesn't literally mean the code is unsafe. It means the code contained within is not something that can be checked by the compiler, so the writer takes responsibility for it.
There are many low-level data-munging and other benign operations that a human can demonstrate are safe, but that need to be wrapped in unsafe because they do things outside of what the compiler can check.
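For illustration, a minimal sketch of the FFI pattern described above (a hypothetical wrapper, not Bun's actual code): the call into C must go through `unsafe` because the compiler can't see what the C side does, and a safe wrapper confines it behind an API that upholds the invariants.

```rust
use std::ffi::CString;
use std::os::raw::c_char;

// Foreign function from libc: Rust can't verify what happens in C,
// so every call to it must be inside an `unsafe` block.
extern "C" {
    fn strlen(s: *const c_char) -> usize;
}

/// Safe wrapper: the unsafe FFI call is confined here.
fn c_string_len(s: &str) -> usize {
    let c = CString::new(s).expect("no interior NUL bytes");
    // SAFETY: `c` is a valid, NUL-terminated C string that stays
    // alive for the duration of the call.
    unsafe { strlen(c.as_ptr()) }
}

fn main() {
    assert_eq!(c_string_len("bun"), 3);
}
```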
There's actually a good example of this in the rewrite [1], in `PathString::slice`. They are doing an unsafe operation to return a slice that could be a use-after-free, if the caller had not already guaranteed that an invariant will remain true. Following proper rust idiomatic practices, claude has added a SAFETY comment to the unsafe block to explain why it's safe: "caller guarantees the borrowed memory outlives this".
Now, normally, you'd communicate this contract to your API users by marking the type's constructor (PathString::init) as "unsafe", and including the contract in its documentation. Unfortunately in this case, this invariant does not exist - it appears to have been fabricated out of thin air by the LLM [2]. So, not only does this particular codebase have UB problems caused by unsafe code, the SAFETY blocks for the unsafe code are also, well, lies.
[1] https://github.com/oven-sh/bun/blob/63035b3e37/src/bun_core/...
[2] https://github.com/oven-sh/bun/blob/63035b3e37/src/bun_core/...
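For anyone wondering what the idiomatic version of that contract looks like, here's a hedged, hypothetical sketch (invented names, not the actual PathString code): the *constructor* carries the `unsafe`, so the lifetime contract is stated at the API boundary instead of being asserted in a SAFETY comment deep inside a method.

```rust
/// Hypothetical type that borrows memory it does not own.
struct RawStr {
    ptr: *const u8,
    len: usize,
}

impl RawStr {
    /// # Safety
    /// Caller must guarantee `bytes` outlives every use of this `RawStr`.
    unsafe fn new(bytes: &[u8]) -> RawStr {
        RawStr { ptr: bytes.as_ptr(), len: bytes.len() }
    }

    /// # Safety
    /// Caller must guarantee the borrowed memory is still live.
    unsafe fn slice<'a>(&self) -> &'a [u8] {
        std::slice::from_raw_parts(self.ptr, self.len)
    }
}

fn main() {
    let data = b"bun".to_vec();
    // SAFETY: `data` outlives every use of `raw` below.
    let raw = unsafe { RawStr::new(&data) };
    // SAFETY: `data` is still alive here.
    let bytes = unsafe { raw.slice() };
    assert_eq!(bytes, &b"bun"[..]);
}
```

With this shape, a caller that lets the backing memory die while holding the `RawStr` has visibly broken a documented `unsafe` contract, rather than silently hitting UB behind a safe-looking API.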
`PathString` worked the exact same way in our Zig code, with less visibility from the compiler & type system. And yes, it will be refactored heavily (or deleted overall) in the next week or so.
One potential way to solve this in a principled manner is to turn at least some "unsafe" annotations into ghost capability tokens that are explicitly threaded through the code and consistently checked by the compiler. Manufacturing the capability could itself be left as an unsafe operation, or require a runtime check of some kind.
You already see this in some cases, for example the NonZero<T> generic type can be viewed as a T endowed with a capability or token that just says "this particular value of type T is nonzero, so the zero value is available for niche purposes". But this could be expanded a lot, especially with some AI assistance.
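A small concrete illustration of the NonZero point (standard library API only): the type is effectively a `u32` bundled with evidence that it isn't zero, and that evidence can't be forged without either a runtime check or an explicit `unsafe`.

```rust
use std::num::NonZeroU32;

// The NonZeroU32 parameter is the "capability": callers must have
// already proven the divisor is nonzero, so no re-check is needed.
fn checked_div(numerator: u32, divisor: NonZeroU32) -> u32 {
    numerator / divisor.get()
}

fn main() {
    // Manufacturing the capability via a runtime check (safe)...
    let d = NonZeroU32::new(4).expect("4 is not zero");
    assert_eq!(checked_div(12, d), 3);
    // ...while the check itself can't be bypassed safely:
    // NonZeroU32::new(0) is None, and new_unchecked is marked unsafe.
    assert!(NonZeroU32::new(0).is_none());
}
```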
This already happens all the time in rust, including in the standard library. The typical pattern is to define your CheckedType to be
pub struct CheckedType(UncheckedType);
e.g. where its inner field is private. Then, you only present safe constructors that check your invariant, and only provide methods that maintain the invariant.
For a concrete example, String in rust is a Vec<u8> with the guarantee that the underlying bytes correspond to valid UTF8. Concretely, it is defined as
#[derive(PartialEq, PartialOrd, Eq, Ord)]
#[stable(feature = "rust1", since = "1.0.0")]
#[lang = "String"]
pub struct String {
    vec: Vec<u8>,
}
You can construct a string from a vec of bytes via
fn from_utf8(vec: Vec<u8>) -> Result<String, FromUtf8Error>;
as well as the unsafe method
unsafe fn from_utf8_unchecked(vec: Vec<u8>) -> String;
Note here that there isn't a separate capability/token, though. That is typically viewed as bad practice in Rust, since a standalone capability/token can always go unchecked. See for example Rust's mutexes: Mutex<T> carries the data (T) you want access to itself, so to get at the data you must call .lock(). There is a similar philosophy behind Rust's `Result` type: to get the data underlying it, you must handle the possibility of an error somehow (which can include panicking upon detecting the error, of course).
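Putting those fragments together into one runnable sketch (a hypothetical `Even` type, purely for illustration of the pattern, not anything from std or Bun):

```rust
/// Invariant: the inner value is always even.
/// The field is private, so the invariant can only be established
/// through the constructors below.
pub struct Even(i64);

impl Even {
    /// Safe constructor: checks the invariant at runtime.
    pub fn new(n: i64) -> Option<Even> {
        if n % 2 == 0 { Some(Even(n)) } else { None }
    }

    /// Unsafe constructor: the caller promises the invariant instead.
    ///
    /// # Safety
    /// `n` must be even.
    pub unsafe fn new_unchecked(n: i64) -> Even {
        Even(n)
    }

    /// Every method may rely on the invariant.
    pub fn half(&self) -> i64 {
        self.0 / 2 // exact, because self.0 is even
    }
}

fn main() {
    assert!(Even::new(3).is_none());
    let e = Even::new(10).expect("10 is even");
    assert_eq!(e.half(), 5);
}
```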
Yes, or you could review the code.
It’d only take an hour if you reviewed a million lines per hour
[Sorry guys, I couldn't review this code because I generated it all]
Even before AI, deterministic checks by compilers are almost always better than "review the code"
"review the code" as a solution will eventually fail and cause a problem, even pre-AI.
The entire point of unsafe blocks and SAFETY comments is that they are easy for humans to find and audit, but not compiler checkable. If it can be compiler-checked by some clever token system, then ... it's just plain safe rust, and you don't need to document any special safety invariants in the first place
even when you can review the code, it's good to have the compiler check for you. This is for reasons similar to why it's better to have CI check correctness on each code change, versus testing the code thoroughly one time and then being careful going forward.
> unsafe just means that you take responsibility for the safety of the code contained within.
In this case it means you delegated the responsibility to a notably flaky heuristic.
> a JavaScript interpreter
Bun is not a Javascript interpreter. But I do see the point.
Someone correct me if I'm wrong, but it's unlikely they wrote this initial version of Rust and will leave it unchanged as-is. What's there now is a step in a long process, not the final destination.
The point is to serve as marketing for Claude. Absolutely nothing else.
Rust has a ton of other features besides safety, like exhaustive checking of enum variants and the ability to avoid null with Option and Result.
Zig has these modern language features too fwiw.
I think the goal was to do a massive rewrite for Anthropic (they acquired bun) and show that rewriting projects from lang -> lang with Claude can reduce security vulnerabilities to help with the hype for an IPO.
I don’t use/know Rust so I can’t comment on the quality, but there was a public security review that found issues with the new Rust code: https://x.com/SwivalAgent/status/2054468328119279923
This is an interesting experiment but I’m skeptical of any claims of success by Jarred/Anthropic due to the incentive to hype agents. There’s probably a trillion dollars at stake with the IPO. And Anthropic seems to be developing this part of their business with Mythos and the super review features.
But I’d like to see the same experiment done on a project without so much relying on the story being success.
There's a reasonable request to run the same analysis for the Zig version of the code as a comparison.
In lieu of that, it seems the Swivel devs ran an analysis on Tigerbeetle, one of the other major Zig projects, and found only 7 medium/low priority issues:
https://xcancel.com/SwivalAgent/status/2054063291266113994
To clarify, those are things an LLM considers to be issues, and LLMs can make mistakes.
Some of those are clear false positives, others I need to revisit tomorrow to say one way or another.
that sounds like a starting point and an honest translation. If it was originally unsafe and suddenly becomes safe immediately after the rewrite, it would mean they break existing behaviors
Better to know where memory bugs may happen than to have them potentially everywhere. Also, the Bun team is looking to reduce the unsafe by a large margin. Since it was a line-by-line port, there is good room for improvement. By the first Rust release, a significant amount of it should be resolved.
Wouldn't it be better to port more idiomatically? Otherwise, you've done nothing but port all the existing bugs while creating new ones.
That's one problem with LLMs. I had claude write a function in python for me that did a bit of math, because, like most programmers, I don't know math.
The function worked perfectly mathematically speaking, but after a bit of research I realized a human being would never write a piece of code so bad.
I don't remember exactly, but it looked like this:
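Roughly this (reconstructed from memory, so take the exact names with a grain of salt; the shape was reduce over a hand-rolled gcd-based lcm):

```python
import math
from functools import reduce

def lcm(a, b):
    # LCM via the classic identity: lcm(a, b) = a*b / gcd(a, b)
    return a * b // math.gcd(a, b)

def common_denominator(denominators):
    return reduce(lcm, denominators)
```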
There are 2 problems with this code. First: yes, that is the correct way to calculate the LCM, the one you'll quickly learn if you google it (or if you ask claude). The problem: math.lcm already exists! Any human being writing this would have paused to think "wait, Python has math.gcd, does it have math.lcm as well?" And then they would have just used that.
Second, you don't even need reduce. You can just math.lcm(*denominators). A human being would have realized this when intellisense showed it takes any number of arguments instead of just 2.
Pretty much every time I use an LLM to generate code, it generates a rough draft barely held together that needs to be completely rewritten later. With Qt, for example, it generated 2 push buttons for Ok/Cancel when there is QDialogButtonBox for this, which even orders the buttons to match the typical system order. Or when generating a combo box that associated labels with objects, it tried to figure out which object was selected from the text of the item's label, when there is already a way to just set an arbitrary object for each item and then get it later with .currentData().
Every single time it makes me think: yes, this works. But no, not like this.
I can't imagine what 1 million lines of this feels like.
Sure hope Mythos is as world beating as they claim, they’re gonna need it now.
We got memory safety at home!
At home:
> 10428