Claude is doing the decompilation here, right? Has this been compared against using a traditional decompiler with Claude in the loop to improve the decompilation and ensure matched results? I would think that Claude’s training data would include a lot more pseudo-C <-> C pairs than pairs of C and the MIPS assembly that GCC 2.7 emits, and even if the traditional decompiler were kind of bad at N64 code, it would be more efficient to fix bad decompiler C than to start from assembly.
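
To be concrete about what I mean by "in the loop" and "matched results", here is a rough Python sketch, not anything the project actually does. Every name in it is made up for illustration: `compile_c` stands in for building the candidate C with the original toolchain, and `revise` stands in for asking the model to fix the candidate given where the bytes first diverge.

```python
from typing import Callable, Optional


def first_mismatch(built: bytes, target: bytes) -> Optional[int]:
    """Return the offset of the first differing byte, or None if identical."""
    for offset, (a, b) in enumerate(zip(built, target)):
        if a != b:
            return offset
    if len(built) != len(target):
        # One input is a prefix of the other: they diverge where the shorter ends.
        return min(len(built), len(target))
    return None


def match_loop(
    target: bytes,
    draft: str,
    compile_c: Callable[[str], bytes],  # hypothetical: C source -> object bytes
    revise: Callable[[str, int], str],  # hypothetical: (source, mismatch offset) -> new source
    max_rounds: int = 5,
) -> Optional[str]:
    """Refine a decompiler draft until it compiles to the target bytes.

    Returns the matching source, or None if no match within max_rounds.
    """
    source = draft
    for _ in range(max_rounds):
        offset = first_mismatch(compile_c(source), target)
        if offset is None:
            return source  # byte-for-byte match
        source = revise(source, offset)
    return None
```

The point is that `draft` would come from the traditional decompiler rather than from raw asm, so the model only has to close the gap to a match.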

It's wild to me that they wouldn't try this first. Feeding the asm directly into the model seems like intentionally ignoring a huge amount of work that has gone into traditional decompilation. What LLMs excel at (names, context, searching in high-dimensional space, making shit up) is very different from, e.g., coming up with an actual AST with infix expressions that represents asm code.

I've been doing some decompilation with Ghidra. Unfortunately, it's a C++ game, and Ghidra isn't really great at C++, so Claude gets a bit confused by it all too. But all in all: it does work, and I've been able to reconstruct a ton of things already.

One of the other PhD students in my department has an NDSS 2026 paper about combining the strengths of both LLMs and traditional decompilers! https://lukedramko.github.io/files/idioms.pdf

Not Claude, but there are open-weight LLMs trained specifically on Ghidra decomp and tested on their ability to help reverse engineers make sense of it:

https://huggingface.co/LLM4Binary/llm4decompile-22b-v2

There's also a dataset floating around HF which is... I think a popular N64 decomp to pseudo-C? Maybe the Mario one?