I'm sure you're right that some clever prompting or tricks could get it to print inappropriate stickers, but I think in this case that may be OK.

If you consider a threat model where the threat is printing inappropriate stickers, who are the threat actors? Children who are attempting to circumvent the controls? If they already know about the topics they shouldn't be printing and are actively trying to get the device to print them, they probably don't truly _need_ the guardrails at that point.

In the same way that many small businesses don't (and most likely can't afford to) put security controls in place that are only relevant to blocking nation-state attackers, this device only needs enough controls to prevent a child from accidentally getting an inappropriate output.

It's just a toy for kids to print stickers with, and as soon as the user is old enough to know about or want more adult content, they can just go find it on a computer.

ChatGPT allegedly has similar guardrails in place, and has now allegedly encouraged minors to harm themselves. There is no threat actor; it's not a security issue. It's an unsolved and, as far as we know, intrinsic problem with LLMs themselves.

The word "accidentally" is slippery, our understanding of how accidents can happen with software systems is not applicable to LLMs.