At what point does something like this cross the line into being malware?

When it includes executable code?

The fact that so many people are now running around with "agentic" software that fundamentally can't distinguish between its own "thoughts"/rules and untrusted user input doesn't turn a meme into malware.

Token predictors by themselves are fundamentally insecure, and cannot be made secure without a strong semantic world model. It's like `eval`-ing everything, or auto-coercing strings to objects or function calls, vs having a strong static type system.
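
To make the analogy concrete, here's a rough Python sketch. The names `handle_with_eval`, `handle_as_data`, and `Message` are made up for illustration, and the payload is deliberately harmless. It only shows the difference between one channel where input can become code and a typed boundary where input stays data; it says nothing about how any particular agent framework works.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    """Hypothetical typed wrapper: the text is data and nothing else."""
    text: str


def handle_with_eval(untrusted: str):
    # Data and code share one channel: whatever arrives gets executed.
    return eval(untrusted)


def handle_as_data(untrusted: str) -> Message:
    # The input is stored as a string and never interpreted as code.
    return Message(text=untrusted)


# A harmless stand-in for "instructions" smuggled in as input.
payload = "__import__('os').getcwd()"

evaluated = handle_with_eval(payload)  # the payload ran and called into os
stored = handle_as_data(payload)       # the payload is still just text
```

In the first handler the input decides what happens; in the second, the type does.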