Hacker News

jsw97 10 hours ago

Given the high rate of false positives people are reporting for the non-silent cybersecurity, biological, etc., safeguards, there is a strong likelihood that you will encounter silently nerfed behavior even if you are _not_ violating their TOS.

Ultimately this will be evident in the way customers / external benchmarkers experience Fable. Hopefully competition will drive future models toward a lower false positive rate. Until that happens, Mythos and Fable users seem likely to have pretty divergent experiences.

nsingh2 9 hours ago

It's such an obviously bad policy that it's mind-boggling they thought this was a good idea. It just breeds paranoia and mistrust, especially when people are already a bit paranoid about silent model quantization for cost-cutting reasons.

SXX 5 hours ago

It's not paranoia when the entity you are dealing with can't be trusted and will do everything to abuse your trust.

llelouch 5 hours ago

What's the alternative? Not release the model at all?

"Make the guardrails better" is very hard and probably not worth the effort.

hagbarth 5 hours ago

The alternative is to be explicit when you nerf, so users know what they are working with.

port11 4 hours ago

I guess people would just game the system and find ways around these guardrails.

rootlocus 40 minutes ago

They have enough info on you and your sessions to eventually catch you, label you as a bad-faith actor, and ban you automatically. I don't think many would risk it.

KennyBlanken 8 hours ago

Another "knob" is reducing the thinking time...

azalemeth 3 hours ago

I'm a medical physicist. I use the word "nuclear" a lot. Opus is fine (well, 99% of the time - I've certainly hit the CBRN filters a few times and even been invited to email Anthropic about the false positives).

Fable has literally refused to work on any of my problems (even those about fluid dynamics!) and just tells me that I'm violating Anthropic's AUP.

jsw97 13 minutes ago

This problem is compounded by the fact that you can be banned (really by any provider) based on an algorithm, and the methods for restoring your account don't seem to work very well. So be careful with your queries, basically, or you might get locked out.

KennyBlanken 8 hours ago

If a benchmark is affected, the model owner will almost certainly tune it, so there will be a game of cat and mouse...

Honestly, wouldn't surprise me if the AI companies try to detect benchmarking. Most hardware companies do...