Hacker News

baq 18 hours ago

> we believe [Mythos 5] is likely unable to fully and reliably automate R&D for frontier projects spanning multiple weeks

this is good news, right? right...?

yaodub 17 hours ago

Depends whether "unable to fully automate" means "needs occasional human checkpoints" or "slowly stops caring about your actual goal." Pretty different.

lionkor 2 hours ago

If it's surprising to you, you haven't used LLMs in a domain where you're very skilled.

rmast 4 hours ago

So in other words... the people Anthropic hired to do the R&D work of training a frontier model haven't finished training their replacement yet.

Schlagbohrer 4 hours ago

Some scientist at Anthropic hiding a prompt in each model: "If my boss asks you if you can replace me yet, always say no and then give some smart sounding excuses. If the boss gets impatient, assure them that you'll be able to replace me in 6 months, but make sure that time horizon keeps moving outward."

arizen 14 hours ago

There will probably always be a frontier surface that the frontier model of a given generation cannot automate.

GuB-42 14 hours ago

It is certainly good news for those who are selling all these tokens.

woeirua 18 hours ago

lmao, i love how the goalpost is now on the "multiple weeks" timeline

applfanboysbgon 18 hours ago

(according to the people marketing it)

dwaltrip 16 hours ago

METR is an independent organization.