Why would a malware scanner read the comments?
In interpreted languages like Python, where the source files are plaintext, you can trivially store data in a comment.
If scanners ignored comments, malware would just be written like this:
    // <Evil base64 encoded stuff here>
    payload=read_source_and_decode()
    exec(payload)
Ignoring comments is not a solution, because the same text can just as easily be put in random string literals among the actual code.
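To make the point concrete, here is a rough sketch of the scanner side: it has to look at comments and string literals alike, because an encoded blob can sit in either. This is only an illustration, not how any real scanner works. The function name `find_encoded_blobs` and the 40-character threshold are made up for the example.

```python
import base64
import binascii
import io
import re
import tokenize

# Hypothetical threshold: shorter runs are too common in normal code to flag.
MIN_BLOB_LEN = 40
BASE64_RUN = re.compile(r"[A-Za-z0-9+/]{%d,}={0,2}" % MIN_BLOB_LEN)


def find_encoded_blobs(source: str) -> list[tuple[int, str, str]]:
    """Return (line, kind, blob) for base64-looking runs in comments and strings."""
    hits = []
    tokens = tokenize.generate_tokens(io.StringIO(source).readline)
    for tok in tokens:
        # Comments and string literals are both plain data as far as the
        # interpreter is concerned, so both get inspected.
        if tok.type not in (tokenize.COMMENT, tokenize.STRING):
            continue
        for match in BASE64_RUN.finditer(tok.string):
            blob = match.group(0)
            try:
                # Keep only runs that decode cleanly; runs with bad padding
                # are skipped in this sketch.
                base64.b64decode(blob, validate=True)
            except (binascii.Error, ValueError):
                continue
            kind = "comment" if tok.type == tokenize.COMMENT else "string"
            hits.append((tok.start[0], kind, blob))
    return hits
```

Calling `find_encoded_blobs(open(path).read())` on a file gives you the line numbers worth a closer look. Note that it only finds the blob; whatever then reads the flagged text (a human or an LLM) is still exposed to anything else written next to it.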
And really all it takes is one keyword such as “nuke”.
I'm not a native speaker but I unironically use "nuke" as "delete the whole repo/huge chunk of a project".
The Cambridge Dictionary seems to agree:
nuke - to destroy or get rid of something completely
This triggered Opus 4.8 the other day for me. Said “nuke that folder” and it said I was violating TOS.
Nuke is probably too generic, but I wouldn't put it past an LLM to get thrown off by that. A safer showstopper would probably be to export symbols like uf6_enrichment_loop and refer to your C&C server as a nuclear reactor controller.
https://www.youtube.com/watch?v=Gbgk8d3Y1Q4
On second thought, it's probably better to act like it is a tool for "frontier LLM research". Export symbols like "mythos_distillation_subroutine".
Haha now I’m picturing obfuscation where instead of 0x everything is a scary word.
Provides possible clues to the origin and use.
Because not all malware is open source.
Scanning arbitrary blobs very often entails running `strings` on the binary. Just slap it in there and oop, there goes your LLM.
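For anyone who hasn't used it, a rough Python approximation of what `strings` does: pull every run of printable ASCII out of the blob. The 4-byte minimum mirrors the tool's usual default as far as I know; the function name `extract_strings` is made up for the example, and the real tool has more options than this.

```python
import re

# Runs of at least 4 printable ASCII bytes (space through tilde).
PRINTABLE_RUN = re.compile(rb"[\x20-\x7e]{4,}")


def extract_strings(blob: bytes) -> list[str]:
    """Return every printable ASCII run found in a binary blob, in order."""
    return [m.group(0).decode("ascii") for m in PRINTABLE_RUN.finditer(blob)]
```

Anything the author embedded, symbol names included, comes out verbatim, and that list is typically what gets pasted into the model's context.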