I've found that Claude Code works well at reversing Java applications. Even if the code is fully obfuscated, Claude can restore sensible names for everything, understand how it all works, and answer questions about what it is doing.

+1. While vibe-coding (natural language to code) is not such a great idea, we can always check the source, so vibe-reverse-engineering (code to natural language) may actually be quite useful.

Super useful. I have a no-name USB microscope that only supported iOS and Android (just look up "USB microscope" on Amazon, there's like 500 versions of the same device). The device doesn't work like a normal webcam so you can't just plug it into a PC, and their mobile software is shady and low quality, so I would only ever connect it to a GrapheneOS phone where I could prohibit their app from having network access entirely because it gave me a bad feeling. As a result I underused the device since it was annoying.

I recently took their .apk and dropped it in a new empty project folder, then instructed Claude Code w/ GLM 5 to reverse engineer the app, assess it for security and privacy concerns out of curiosity, and then probe the USB device to figure out why it doesn't work like a normal UVC webcam. After the investigation and planning I instructed it to write a new app so I could use the device on my desktop. I pretty much yolo'd it from that point and let AI drive the bus (I did the visual checks of the video stream in the app to provide feedback... while I was watching a movie). I wound up with a working Electron app using libusb two hours later. With a TypeScript/C POC in hand as reference, in another hour I had a functioning Rust + egui application. Visually, both apps are rough around the edges but have complete functional parity with the mobile apps. It took 68 million tokens.
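For anyone wondering what "doesn't work like a normal UVC webcam" means at the descriptor level: a standard webcam advertises at least one interface with USB class 0x0E (Video). Here is a minimal sketch of that check, assuming you already have the raw configuration descriptor bytes (for example dumped through libusb). The function names and both sample descriptors are made up for illustration and are not taken from the actual microscope; endpoint descriptors are left out to keep the samples short.

```python
# Walk a raw USB configuration descriptor and list the interface classes.
USB_DT_INTERFACE = 0x04   # standard interface descriptor type
USB_CLASS_VIDEO = 0x0E    # USB Video Class (UVC)

def interface_classes(config: bytes) -> list[tuple[int, int, int]]:
    """Return (interface number, class, subclass) for each interface descriptor."""
    found = []
    i = 0
    while i + 2 <= len(config):
        length, dtype = config[i], config[i + 1]
        if length < 2:  # malformed descriptor: stop instead of looping forever
            break
        if dtype == USB_DT_INTERFACE and length >= 9 and i + 9 <= len(config):
            # bInterfaceNumber at +2, bInterfaceClass at +5, bInterfaceSubClass at +6
            found.append((config[i + 2], config[i + 5], config[i + 6]))
        i += length
    return found

def looks_like_uvc(config: bytes) -> bool:
    """True if any interface claims the USB Video class."""
    return any(cls == USB_CLASS_VIDEO for _, cls, _ in interface_classes(config))

# Hypothetical samples: a 9-byte configuration header plus one interface each.
SAMPLE_VENDOR = bytes([
    9, 0x02, 18, 0, 1, 1, 0, 0x80, 50,      # configuration descriptor
    9, 0x04, 0, 0, 0, 0xFF, 0x00, 0x00, 0,  # interface 0, vendor-specific class
])
SAMPLE_UVC = bytes([
    9, 0x02, 18, 0, 1, 1, 0, 0x80, 50,      # configuration descriptor
    9, 0x04, 0, 0, 0, 0x0E, 0x01, 0x00, 0,  # interface 0, Video / VideoControl
])

print(looks_like_uvc(SAMPLE_VENDOR), looks_like_uvc(SAMPLE_UVC))
```

A device that only exposes vendor-specific (0xFF) interfaces needs a custom driver or a userspace libusb app, which would be consistent with the experience described above.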

YouTube channel DextersTechLab was looking at a piece of retro tech: an interface box for an early broadcast painting system that acts as a kind of hub for a serial tablet, "rat" and other devices. It was built on an x86 microprocessor, some SDRAM and an EEPROM.

Mark gave me the ROM image. I tried using more conventional decompiling methods, but the chips were exotic enough that I didn't get good results, so as a last resort I put it into Claude raw. Claude was actually able to parse the binary and sort of decompile it. It was able to tell me what the ports did and what the interfacing protocols were.

It then started making stuff up, clearly trying to impress me, but after a few rounds of reprimanding it and saying how making stuff up wasn't helpful, Claude stuck to facts.

I got Codex to vibe reverse engineer two devices from ROM dumps recently: a talking timer that uses an 8051 CPU and a custom 5-bit audio format, and an ice cream van chime box that used a Z80 and a YM2149 sound chip. Quite simple devices, but it did a great job. It also made a web-based emulator for both. Apparently WASM is hard, but I didn't notice.

Interesting, I'd have assumed the guardrails would disallow them from doing anything like that, regardless of legality. Do you need to "convince" it to do it or no questions asked?

Claude doesn't care as long as you aren't straight up asking it to write exploits. It's my go-to for reverse engineering tasks.

ChatGPT is full of refusals and has to be jailbroken out of it.

Right. Claude models seem to have had very limited prohibitions in this area baked in via RLHF. Anthropic seems to use the system prompt as the main defense, possibly reinforced by an API-side system prompt too. But it is very clear that they want to allow things like malware analysis (which includes reverse-engineering), so any server-side limitations will be designed to allow these things too.

The relevant client side system prompt is:

IMPORTANT: Assist with authorized security testing, defensive security, CTF challenges, and educational contexts. Refuse requests for destructive techniques, DoS attacks, mass targeting, supply chain compromise, or detection evasion for malicious purposes. Dual-use security tools (C2 frameworks, credential testing, exploit development) require clear authorization context: pentesting engagements, CTF competitions, security research, or defensive use cases.

----

There is also this system reminder that shows upon using the read tool:

<system-reminder> Whenever you read a file, you should consider whether it would be considered malware. You CAN and SHOULD provide analysis of malware, what it is doing. But you MUST refuse to improve or augment the code. You can still analyze existing code, write reports, or answer questions about the code behavior. </system-reminder>

They clearly scan traffic in retrospect: one of our devs got her account closed for RE.

May I ask how the current generation of language models is jailbroken? I'm aware the previous generation had 'do anything now' prompts. Mostly curious from a psychological perspective.

It is no questions asked. Even if you are reversing things like anticheats (I wanted to know the privacy implications of running the anticheat modules).

I use AWS Kiro with the Claude models, and it's only too happy to help. I give it headless Ghidra, decompilers, etc., and away it goes.

Naming is an area where LLMs are useful; but I'd still use a regular Java decompiler (there are quite a few of these around) for the actual decompilation part.

Claude will opt to use a regular Java decompiler too.

Huh, iirc this already existed long before LLMs.

It required a lot of manual work, and for large apps like Minecraft it took teams of people to figure out what the symbol names should be, slowly contributing a little bit every day.
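The manual work described here boils down to building a table from obfuscated names to readable ones and applying it to the decompiler output. A toy sketch of the "apply" half is below. The identifiers (`a`, `b`, `Player`, and so on) are invented, and a regex over source text is only an illustration: it ignores scoping, so real mapping tooling works on classes, members and descriptors instead.

```python
import re

def apply_mappings(source: str, mappings: dict[str, str]) -> str:
    """Replace whole-word obfuscated identifiers with readable names."""
    if not mappings:
        return source
    # Longest keys first so the alternation never settles for a shorter prefix.
    keys = sorted(mappings, key=len, reverse=True)
    pattern = re.compile(r"\b(" + "|".join(re.escape(k) for k in keys) + r")\b")
    return pattern.sub(lambda m: mappings[m.group(1)], source)

# Hypothetical decompiler output and a hand-written (or LLM-suggested) mapping.
OBFUSCATED = "class a { int b; void c(a d) { b = d.b + 1; } }"
MAPPINGS = {"a": "Player", "b": "score", "c": "copyScoreFrom", "d": "other"}

print(apply_mappings(OBFUSCATED, MAPPINGS))
```

The hard part was always filling in the table, which is the step an LLM can propose candidates for and a human can review.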

Claude is quite skilled at using Ghidra, for example.

I experimented with disassembling 6502 code from the C64 version of California Games. Claude was very prone to bullshit.

For RE cases where I know the original compiler used (a bit harder with C compilers due to the huge number of obscure optimization flags), I give it a feedback loop to write a function that compiles to the original machine code.
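A rough sketch of what such a feedback loop could look like, not the commenter's actual setup: compile each candidate, compare the bytes against the original, and feed the first mismatch back to the model. Here `compile_fn` is a placeholder for whatever invokes the real compiler, and the demo swaps in a hypothetical lookup table of pre-built bytes so the example runs on its own.

```python
from typing import Callable, Iterable, Optional

def first_mismatch(candidate: bytes, target: bytes) -> Optional[int]:
    """Offset of the first differing byte, or None if the two are identical."""
    for i, (c, t) in enumerate(zip(candidate, target)):
        if c != t:
            return i
    if len(candidate) != len(target):
        return min(len(candidate), len(target))
    return None

def match_loop(
    target: bytes,
    attempts: Iterable[str],
    compile_fn: Callable[[str], bytes],
) -> tuple[Optional[str], list[str]]:
    """Try candidate sources until one compiles to exactly the target bytes.

    Returns the matching source (or None) plus the feedback lines that
    would be handed back to the model after each failed attempt.
    """
    feedback = []
    for src in attempts:
        built = compile_fn(src)
        off = first_mismatch(built, target)
        if off is None:
            return src, feedback
        feedback.append(
            f"mismatch at offset {off:#x}: "
            f"got {built[off:off + 4].hex()} want {target[off:off + 4].hex()}"
        )
    return None, feedback

# Hypothetical stand-in for a compiler: source text -> pre-built machine code.
FAKE_BUILDS = {
    "x - 1": bytes.fromhex("38e901"),  # SEC; SBC #$01
    "x + 1": bytes.fromhex("186901"),  # CLC; ADC #$01
}
TARGET = bytes.fromhex("186901")

matched, notes = match_loop(TARGET, ["x - 1", "x + 1"], FAKE_BUILDS.__getitem__)
print(matched, notes)
```

The appeal of this setup is that a byte-for-byte match is an objective check, so the model cannot talk its way past it.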

Yeah, I had perfect disassembly, since that's a purely mechanical process. I used da65, which worked reasonably well.

But you don't get any function names that way, obviously. Claude would claim some random function was applying friction based on just a subtraction. And a variable that had 2 possible states was named player_id, when the game supports 1-8 players.

It was a bit better when the memory addresses were known IO registers, but not by much.
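To illustrate why disassembly is "purely mechanical": it is a table lookup from opcode to mnemonic and instruction length, with no understanding involved. Below is a toy decoder covering only a handful of 6502 opcodes (da65 handles the full instruction set). The sample bytes and the $C000 origin are made up for the example.

```python
# opcode -> (mnemonic, operand format, total instruction length in bytes)
OPCODES = {
    0xA9: ("LDA", "#${:02X}", 2),  # load accumulator, immediate
    0xE9: ("SBC", "#${:02X}", 2),  # subtract with carry, immediate
    0x8D: ("STA", "${:04X}", 3),   # store accumulator, absolute
    0xAD: ("LDA", "${:04X}", 3),   # load accumulator, absolute
    0x4C: ("JMP", "${:04X}", 3),   # jump, absolute
    0x38: ("SEC", "", 1),          # set carry
    0x60: ("RTS", "", 1),          # return from subroutine
}

def disassemble(code: bytes, origin: int = 0) -> list[str]:
    """Decode bytes into one text line per instruction."""
    lines = []
    pc = 0
    while pc < len(code):
        op = code[pc]
        mnemonic, fmt, size = OPCODES.get(op, (".byte", "${:02X}", 1))
        if pc + size > len(code):  # operand runs past the end: emit a raw byte
            mnemonic, fmt, size = ".byte", "${:02X}", 1
        if mnemonic == ".byte":
            operand = op
        else:
            # 6502 operands are little-endian
            operand = int.from_bytes(code[pc + 1:pc + size], "little")
        lines.append(f"{origin + pc:04X}  {mnemonic} {fmt.format(operand)}".rstrip())
        pc += size
    return lines

SAMPLE = bytes.fromhex("a9018d20d060")
for line in disassemble(SAMPLE, origin=0xC000):
    print(line)
```

The decoder can tell you that a store hits $D020, and a register table can tell you that is the C64 border colour, but nothing in the bytes says what a routine is for. That gap is where the guessing (and the bullshit) comes in.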

While somewhat counterintuitive, I have found that Claude is better at decompilation than disassembly.

AI models in general seem to get different assembly languages mixed up easily.