that's an inaccurate characterization on #1. he's not saying ai risk is _all_ hype to boost valuations, but that it's misused to serve purposes other than warning.
Those words come directly from TFA:
> It’s all just nonsensical hype
And he’s referring to completely real and reasonable warnings in anthropic blog posts about the exponential rate of AI development.
What's "exponential" about AI development? Model parameter counts? Anthropic doesn't publish those for their own models, last I checked. Datacenter buildouts? Water consumption per request? There just isn't enough evidence that AI smarts are growing all that much, once you account for the scaling of inputs. That's what OP probably means by "nonsensical hype".
> What's "exponential" about AI development?
The METR task-completion time horizons, for one.
https://metr.org/time-horizons/
Lousy benchmark, they explicitly focus on the easiest tasks to automate for AI (i.e. heavily cherry picked outcomes) and it seems that they don't bother to test anything except just-released proprietary models.
> Lousy benchmark
Make your own then. It can go on the pile with all the others that keep getting saturated too fast to be useful.
> they explicitly focus on the easiest tasks to automate for AI (i.e. heavily cherry picked outcomes) and it seems that they don't bother to test anything except just-released proprietary models.
What?
They made the benchmark last year, and included a bunch of models going back as far as 2019.
When they first announced it, the top end of their tests were things AI could not actually automate, and even now only does erratically. Examples of the tasks SOTA models are now saturating (at the 50% success level, not at 80%) include:
They're benchmarking against the time it takes humans to do the same things, which means everything they ask every AI to do must have also been done by a human.
It's a blog. He's using hyperbole. It's not a Supreme Court opinion.
what the "tfa" doesn't say is that geohot writes like how most people talk. you're framing the observation you don't like around a single word and ignoring everything else.