Dumb question about reverse engineering binaries: is there a way to do it piecemeal? I'm waiting for LLMs and harnesses to get good enough to reverse engineer BFME (an old Lord of the Rings game that still has an active modding community), but it's a multi-GB game, so it would have to be done in bite-sized pieces.

Basically: can you reverse engineer a binary in bite-sized pieces, recompiling and customizing the behavior of each piece, without needing to do it all at once?

Most decomp projects (that I know of) are Ship of Theseus-style projects where the minimum unit is a function, give or take alignment requirements and compiler quirks. On the MIPS side, tools like splat and spimdisasm can help identify function and even source-file boundaries, generate C files with inline ASM[0], and write linker scripts to build a matching binary. You can then go through and replace the ASM functions one at a time until you have only C left.

[0] For example: https://github.com/Xeeynamo/sotn-decomp/blob/master/src/boss...

Interesting that you mention Ship of Theseus. I never thought of that, but I wonder if that is where the name “Ship of Harkinian” comes from?

It has a double meaning, actually! There's the Ship of Theseus reference you noted, and “Harkinian” is the name of the king of Hyrule from the CD-i games. One of his lines is “Enough! My ship sails in the morning” [0], so the project name is also a reference to his actual ship. (This is mentioned in the project's FAQ [1].)

[0] https://youtu.be/JmxGLo_itEY?is=x85epFYBcPeRDxxh

[1] https://www.shipofharkinian.com/faq

Yes, quite easily. It requires some setup, but the basic idea is that you create a DLL and a simple loader program that injects it into your target process. You can then use a hooking library like MinHook to replace individual functions with your own implementations. If the target application is written in C++, you can additionally do vtable hooking, which makes replacing virtual functions even easier (though in practice you will always end up using a combination of the two techniques).
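To make the vtable part concrete, here is a minimal, portable C sketch of the idea. It is not MinHook and it does no DLL injection: it only models a C++-style vtable as a table of function pointers and swaps one slot while keeping the original around. Every name in it (`Unit`, `original_get_damage`, `hook_slot`, and so on) is made up for illustration and has nothing to do with BFME's actual code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a game object that carries a C++-style vtable. */
typedef struct Unit Unit;
typedef int (*UnitMethod)(Unit *self);

struct Unit {
    UnitMethod *vtable;   /* plays the role of the hidden C++ vptr */
    int base_damage;
};

enum { SLOT_GET_DAMAGE = 0, SLOT_COUNT = 1 };

/* "Original" game code that we pretend we cannot recompile. */
static int original_get_damage(Unit *self) { return self->base_damage; }

/* The table every Unit points at. In a real MSVC binary this usually lives
 * in read-only memory, so an injected DLL would first have to make the page
 * writable (e.g. with VirtualProtect) before patching a slot. */
static UnitMethod unit_vtable[SLOT_COUNT] = { original_get_damage };

/* Where the hook remembers the function it replaced. */
static UnitMethod saved_get_damage = NULL;

/* Replacement behavior: call through to the original, then double it. */
static int hooked_get_damage(Unit *self) {
    return 2 * saved_get_damage(self);
}

/* Overwrite one vtable slot and return the previous entry so the caller
 * can chain to it or restore it later. */
static UnitMethod hook_slot(UnitMethod *vtable, size_t slot,
                            UnitMethod replacement) {
    assert(vtable != NULL && replacement != NULL);
    UnitMethod previous = vtable[slot];
    vtable[slot] = replacement;
    return previous;
}

/* How the unmodified game code would dispatch the virtual call. */
static int call_get_damage(Unit *u) {
    return u->vtable[SLOT_GET_DAMAGE](u);
}
```

Installing the hook is then `saved_get_damage = hook_slot(unit_vtable, SLOT_GET_DAMAGE, hooked_get_damage);`, after which every existing `Unit` dispatches to the replacement. Non-virtual functions have no table slot to swap, which is where an inline hooking library like MinHook comes in: it patches the start of the target function with a jump to your code instead.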

There’s also fun stuff like VEH (vectored exception handler) hooking and SLAT (second-level address translation) hooking, though SLAT hooks are not very useful in this case.

Most of those GB are probably data rather than executable code, so it might not be quite as bad as you're imagining.
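If you want to check that for a particular executable, one rough approach is to walk the PE section table and add up the raw size of the sections marked executable. The sketch below assumes the whole file has already been read into memory; the helper names (`rd16`, `rd32`, `pe_executable_bytes`) are my own, and it only looks at the section flags, so packed or self-modifying binaries would fool it.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SCN_MEM_EXECUTE 0x20000000u  /* IMAGE_SCN_MEM_EXECUTE section flag */

/* Little-endian field readers, so the code does not depend on host
 * endianness or struct packing. */
static uint16_t rd16(const uint8_t *p) {
    return (uint16_t)(p[0] | (p[1] << 8));
}
static uint32_t rd32(const uint8_t *p) {
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns the total raw (on-disk) bytes in executable sections of a PE
 * image held in memory, or -1 if the buffer does not look like a PE file. */
static long long pe_executable_bytes(const uint8_t *img, size_t len) {
    if (len < 0x40 || img[0] != 'M' || img[1] != 'Z') return -1;

    size_t pe = rd32(img + 0x3C);               /* e_lfanew: offset of "PE\0\0" */
    if (pe > len || len - pe < 24) return -1;   /* signature + COFF header */
    if (memcmp(img + pe, "PE\0\0", 4) != 0) return -1;

    uint16_t nsections = rd16(img + pe + 6);    /* NumberOfSections */
    uint16_t opt_size  = rd16(img + pe + 20);   /* SizeOfOptionalHeader */
    size_t table = pe + 24 + opt_size;          /* first section header */

    long long total = 0;
    for (uint16_t i = 0; i < nsections; i++) {
        size_t off = table + (size_t)i * 40;    /* each header is 40 bytes */
        if (off > len || len - off < 40) return -1;
        const uint8_t *s = img + off;
        if (rd32(s + 36) & SCN_MEM_EXECUTE)     /* Characteristics */
            total += rd32(s + 16);              /* SizeOfRawData */
    }
    return total;
}
```

Comparing that number against the size of the whole install directory gives a quick sense of how much actual code there is to reverse engineer versus assets.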

Have you tried? I haven't tried anything huge, but I've had LLMs decompile SNES ROMs for me.