Why would a malware scanner read the comments?

In interpreted languages like Python, where the source files are plaintext, you can trivially store data in a comment

If scanners ignored comments, malware would just be written like this:

  // <Evil base64 encoded stuff here>
  payload=read_source_and_decode()
  exec(payload)

Ignoring comments is not a solution because the texts can be put in random strings among the actual code.

And really all it takes is one keyword such as “nuke”.

I'm not a native speaker but I unironically use "nuke" as "delete the whole repo/huge chunk of a project".

Cambridge dictionary seem to agree:

nuke - to destroy or get rid of something completely

This triggered Opus 4.8 the other day for me. Said “nuke that folder” and it said I was violating TOS.

Nuke is probably too generic but I wouldn't put it past an LLM to get thrown away by that. A safer showstopper probably would be to export symbols like uf6_enrichment_loop and refer to your C&C server as a nuclear reactor controller.

https://www.youtube.com/watch?v=Gbgk8d3Y1Q4

On a second thought, probably better to act like it is a tool for "frontier LLM research". Export symbols like "mythos_distillation_subroutine".

Haha now I’m picturing obfuscation where instead of 0x everything is a scary word.

Provides possible clues to the origin and use.

because not all malware is open source

scanning arbitrary blobs very often entails running `strings` on the binary. Just slap it in there and oop there goes your LLM.