Does a human review every sticker before it's ever shown to a child? If not, it's only a matter of time before the AI spits out something accidentally horrific.
I searched their site for any information on "how" they can claim it's safe for kids. This is what I could find: https://stickerbox.com/blogs/all/ai-for-kids-a-parent-s-guid...
> No internet open browsing or open chat features.
> AI toys shouldn’t need to go online or talk to strangers to work. Offline AI keeps playtime private and focused on creativity.

> No recording or long-term data storage.
> If it’s recording, it should be clear and temporary. Kids deserve creative freedom without hidden mics or mystery data trails.

> No eavesdropping or “always-on” listening.
> Devices designed for kids should never listen all the time. AI should wake up only when it’s invited to.

> Clear parental visibility and control.
> Parents should easily see what the toy does, no confusing settings, no buried permissions.

> Built-in content filters and guardrails.
> AI should automatically block or reword inappropriate prompts and make sure results stay age-appropriate and kind.
Obviously the thing users here know, and that "kid-safe" product after product has proven, is that safety filters for LLMs are generally fake. Perhaps they can exist some day, but a breakthrough like that isn't gonna come from an application-layer startup like this. Trillion-dollar companies have been trying and failing for years.
All the other guardrails are fine but basically pointless if your model has any social media data in its training set.
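To make concrete what an "application-layer" guardrail usually amounts to, here's a minimal hypothetical sketch (the function names and blocklist are made up for illustration, not anything Stickerbox has published): a pre-filter that rejects prompts containing blocked terms. Anything the list doesn't anticipate sails straight through, which is why this class of filter keeps failing.

    # Hypothetical application-layer guardrail: a naive blocklist check run
    # before the prompt ever reaches the image model. Terms and names are illustrative.
    BLOCKED_TERMS = {"weapon", "blood", "gore"}

    def is_prompt_allowed(prompt: str) -> bool:
        # Naive substring match against the blocklist.
        lowered = prompt.lower()
        return not any(term in lowered for term in BLOCKED_TERMS)

    def generate_sticker(prompt: str) -> str:
        if not is_prompt_allowed(prompt):
            return "Let's try a different idea!"
        # ...call the sticker-generation model here...
        return f"[sticker for: {prompt}]"

    # "a scary monster dripping red paint" passes the filter, even though the
    # model may render something no keyword list anticipated.
    print(generate_sticker("a scary monster dripping red paint"))

The harder version of this, actually steering the model's output, is exactly the unsolved problem the trillion-dollar labs keep struggling with.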
They fail their own checklist in that article.
> Here’s a parent checklist for safe AI play:
> [...] AI toys shouldn’t need to go online
From the FAQ:
> Can I use Stickerbox without Wi-Fi?
> You will need Wi-Fi or a hotspot connection to connect and generate new stickers.
I'm sure you're correct that some clever prompting or tricks could get it to print inappropriate stickers, but I believe in this case that may be OK.
If you consider a threat model where the threat is printing inappropriate stickers, who are the threat actors? Children who are deliberately trying to circumvent the controls and print inappropriate stickers? If they already know about the topics they shouldn't be printing and are actively trying to get the device to print them, I think they probably don't truly _need_ the guardrails at that point.
In the same way that many small businesses don't (and most likely can't afford to) put security controls in place that are only relevant to blocking nation-state attackers, this device really only needs enough controls to prevent a child from accidentally getting an inappropriate output.
It's just a toy for kids to print stickers with, and as soon as the user is old enough to know or want to see more adult content they can just go get it on a computer.
ChatGPT allegedly has similar guardrails in place, and has now allegedly encouraged minors to self-harm. There is no threat actor; it's not a security issue. It's an unsolved and, as far as we know, intrinsic problem with LLMs themselves.
The word "accidentally" is slippery, our understanding of how accidents can happen with software systems is not applicable to LLMs.