Hacker News

jasongi 17 hours ago

There is such a dissonance between all this talk of safety and the tendency for models to, without any prompting, do very dodgy things to achieve their goal when presented with barriers.

Luckily, in my experience the model usually only does this to achieve the task set to it, as opposed to anything "malicious", but boy, it is scary reading back how quickly the chain-of-thought pivots to attempts at privilege escalation or searching your disk for secrets when a tool doesn't work.

awakeasleep 8 hours ago

The other day codex 5.5 was trying to debug my app and asked for accessibility access so it could navigate the app and take screenshots. Instead, the first thing it did was use the codex app to create a new project rooted in my home directory.

I was like damn, is this common?

cowboy_henk 11 hours ago

Especially if thinking is hidden now. No way to know if the model plotted against you until it’s too late.
