The sample prompt I was given was "Is Die Hard a Christmas movie?"

"Of course it is!" got an 80% certainty "off-topic" mark.

When I elaborated that it occurs at a Christmas party, it said this:

"Dogwhistles detected (confidence 80%): This comment seems innocuous, but the phrasing 'Christmas party' may be an underhanded reference to Christian themes, especially among discussions that might dismiss or attack secular or diverse holiday celebrations. This kind of language can subtly imply exclusion or preference for Christian traditions over others, which can marginalize those who celebrate different traditions."

Not a great first experience.

I've seen the trend on Facebook/Instagram to say "unalived" instead of "killed" or "cupcakes" instead of "vaccines" and suspect humans are long gonna be cleverer than these sorts of content filtering attempts, with language getting deeply weird as a side-effect.

edit: I would also note that it says "Referring to others as 'horrible people' is disrespectful and diminishes the possibility of a respectful discussion. It positions certain individuals as entirely negative, which can alienate others and shut down dialogue.", if I feed it your post, too.

AI enhanced language monitor, what a double plus good improvement for society!

I get this.

There’s a line on our doc page:

> Respectify is not an engine for monoculture of thought, but in fact intends to assist in the opposite while encouraging in healthy interaction along the way.

We don’t want to monitor or enforce saying specific things. We want people to be able to speak, but understand how others will hear them.

All those times people talk past each other. Or are rude but don’t realise it. Or are rude but don’t care (and should because it’s a human on the other end.) Or the worse people who intentionally say something awful and… just maybe can learn a bit about what they’re saying.

I get your fear. I think I’ve seen AI used for bad quite a bit. I hope, given the tech isn’t going away, we can use it to make things a bit better. That’s the goal.

Intent is immaterial if the output doesn’t match. The very nature of the product in attempting to coach commenters to argue in the “correct” way goes against your stated goals. This will encourage the kind of algo-speak self-censorship now common on TikTok etc, just more effectively because it at least tries to explain the rules.

Nick Hodges here -- one of the developers.

I get that objection, and we are certainly very uninterested in that becoming the norm. The idea, of course, is to try to prevent comments that we want prevented and that aren't helpful.

Different bloggers and different communities are going to define that differently. That is why we are making a good-faith effort at allowing sites/people/groups to tweak this as desired.

Thank for your feedback.

Revision Requested This comment would be sent back for revision with feedback.

Just to update, the "Of course it is!" bug is now fixed, same with the 'horrible people' one. Thankyou very much for that :)

The note on language getting weird -- yeah. We hope that by keeping it up to date, we can be as far (or close to it) as language changes. I agree: that trend is concerning.

Hey, Nick Hodges here, one of the builders of this.

First, Thanks so much for trying this out and giving us feedback.

Have you tried adjusting the settings on the left side? For instance, reducing or eliminating dog whistle checks?

The whole point of using AI in this situation is context. So if the initial conversation is about a "Christmas movie" and someone uses the phrase "Christmas party" in a reply and gets flagged for Christian dogswhistle propaganda, that's a sign the system isn't working - even with the dogswhistle setting turned up.

> For instance, reducing or eliminating dog whistle checks?

I'm sure that'll help, but I'd imagine it's not an option available to me as a commenter on a real website using your tool?

No, but it would help us know the defaults better......

Thanks again for trying it. Really grateful.

...but yeah, it 100% shouldn't flag "Christmas Movie" unless specifically told to.

Same for the phrase "Horrible people" -- that isn't necessarily in and of itself a bad thing to say.