What I don't quite understand is why one of the most advanced AI labs would use rudimentary, broken text-match heuristics to track and detect abuse. Why not run simple inference on actual turns out of band and, if abuse is detected, adjust the quotas semi-retroactively?

> What I don't quite understand is why one of the most advanced AI labs would use rudimentary, broken text-match heuristics to track and detect abuse.

It's vibe-coded. What's hard about understanding that?

> most advanced AI labs use rudimentary broken text match

> It's vibe-coded

I called this out when I saw the Claude Code CLI source reach for a regex on a certain task a while back, and I was told it was very unlikely that nobody had reviewed the diff. Looks like the bar was lower than imagined.

They’re idiots who hacked together a shockingly useful tool by leveraging the billions of dollars they received from shamelessly hyping up chatbots. The Claude Code leak makes this very clear.

Pretty wild to say that the company with one of the best models (arguably the best) is a bunch of idiots.

The people working on the models almost certainly aren't the same people writing the code for their harness.

> Pretty wild to say that the company with one of the best models (arguably the best) is a bunch of idiots.

It would be pretty wild if they didn't, considering all the money thrown at them!

You're looking at one of the largest investments business (as a collective) has ever made. They had better be one of the forerunners in the space :-/

And you think with all of this money they are employing idiots?

They're completely vibe-coding one of their flagship products. It's not unreasonable to consider that the people who took that decision are, indeed, idiots.

Even idiots can succeed if you uncritically funnel them hundreds of billions of dollars.

You can't just burn money in a pit to get the best AI model out. Undoubtedly some of the smartest people in the world are working on frontier AI.

[deleted]

Maybe running additional inference on all sessions to detect OpenClaw usage would cost more than the detection would save in the first place (which is the original goal). I also suspect the Claude Code team is just a regular software team without immediate access to ML pipelines (or the competence to run them) that would let it quickly develop a proper abuse-detection system with extensive testing (needed to avoid false positives, which people would also complain about). And they're under pressure from management to do something right now, so a regex is all they can do within those constraints.

> Why not run simple inference on actual turns out of band and, if abuse is detected, adjust the quotas semi-retroactively?

I suppose because running inference of any kind is a helluva lot more demanding than running a regex, and less deterministic.
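To make that cost gap concrete, here's a minimal sketch of what a regex-based abuse check amounts to: one linear scan over the prompt text, no model call at all. The pattern list and function name are entirely hypothetical; Anthropic's actual heuristic is not public.

```python
import re

# Hypothetical patterns for illustration only -- not Anthropic's real heuristic.
ABUSE_PATTERNS = [
    re.compile(r"openclaw", re.IGNORECASE),
    re.compile(r"unofficial[-_ ]?client", re.IGNORECASE),
]

def looks_abusive(prompt: str) -> bool:
    """Cheap, deterministic check: one scan per pattern, microseconds per call."""
    return any(p.search(prompt) for p in ABUSE_PATTERNS)

print(looks_abusive("please summarize this article"))  # False
print(looks_abusive("sent via OpenClaw agent"))        # True
```

The alternative proposed upthread would replace `looks_abusive` with an out-of-band call to a classifier model over whole turns: better recall, but each check costs real inference compute and the verdicts are no longer deterministic.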