This model is great at long-horizon tasks, and Codex now has heartbeats, so it can keep checking on things. Give it your hardest problem with verifiable constraints, the kind that would take hours, and you'll see how good this is :)
*I work at OAI.
Is there any task that actually doesn't require human intervention in between, even if it's just to set up stuff?
Like, I'll get Opus to make me an app, but it will stop partway through because I need to set up the db and plug in the API keys, and Opus really can't do that on its own yet.
> Is there any task that actually doesn't require human intervention in between, even if it's just to set up stuff?
The goal is none. The current situation: everything that matters requires human intervention.
I think the end situation will be that LLMs will be able to perform decently well in a highly controlled and predictable environment.
Will the Codex app support opening a fresh context window, rather than compaction, for "unrelated" sub-tasks during long-horizon tasks?
Could be a great feature, can't wait to test! I'm tired of other models (looking at you, Opus) constantly getting stuck mid-task lately.
Interesting. I just had Opus convert a 35k LOC Java game to C++ overnight (a root agent that orchestrated and delegated to sub-agents), woke up, and it was done and working.
What plan are you on? I'm starting to wonder if they're dynamically adjusting reasoning effort based on plan or something.
I'm on Max 5x and noticed this too. I don't use built-in subagents, but rather a full Claude session that orchestrates other full Claude sessions. Worker agents that receive tasks now stop midway and ask for permission to continue. My "heartbeat" is basically a "status. One line" message sent to the orchestrator.
Opus 4.6 worker agents never asked for permission to continue, and when a heartbeat was sent to the orchestrator, it just knew what to do (checked on subagents, etc.). Now it just says it's waiting for me to confirm something.
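The orchestrator/worker heartbeat pattern described here can be sketched roughly like this. Everything below is hypothetical: `WorkerSession` is a stub standing in for a real agent session (e.g. a CLI subprocess), and the messages are placeholders; the point is just that the orchestrator polls with a one-line status request and unblocks stalled workers itself instead of waiting for a human:

```python
import time

# Hypothetical stand-in for a real agent session; it simulates a worker
# that stalls once by asking for permission before resuming.
class WorkerSession:
    def __init__(self, name):
        self.name = name
        self.asked_permission = False

    def send(self, message):
        if message == "status. One line":
            if not self.asked_permission:
                self.asked_permission = True
                return "waiting for permission to continue"
            return "working: 3/5 files converted"
        if message == "continue, you have permission":
            return "ok, resuming"
        return "unknown command"

def heartbeat(workers, interval_s=0.0):
    """Poll each worker with a one-line status request and
    auto-approve any worker that stalls asking for permission."""
    log = []
    for w in workers:
        status = w.send("status. One line")
        log.append((w.name, status))
        if "permission" in status:
            # The orchestrator, not the human, unblocks the worker.
            log.append((w.name, w.send("continue, you have permission")))
        time.sleep(interval_s)
    return log

if __name__ == "__main__":
    for name, msg in heartbeat([WorkerSession("java-to-cpp"), WorkerSession("tests")]):
        print(f"[{name}] {msg}")
```

In a real setup the loop would run on a timer and `send` would talk to an actual session; the design point is that permission requests are handled inside the loop, so the human only sees the aggregated status.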
Weird, I don't see this behavior, although I did with Codex and 5.4, haha. I bet the providers are tweaking settings underneath and different users are routed to different deployments, or they're secretly routing us to different models under load.
This has to be bait.
Why?
what?
Because there’s no way in hell it can rewrite a 35k LOC game perfectly lol. Link the codebase or it didn’t happen.
I've been using the /ralph-loop plugin for Claude Code; it works well to keep the model hammering at the task.
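For anyone unfamiliar, the "Ralph" pattern is just an outer loop that re-feeds the same prompt until the agent leaves a completion marker. A minimal sketch, not the actual plugin: the `agent` function here is a stub standing in for whatever CLI runs your model, the DONE marker and prompt are made up, and the retry cap is a safety net so a stuck agent can't loop forever:

```shell
#!/bin/sh
# Minimal Ralph-style loop: re-run the same prompt until the agent
# creates a DONE marker, with a retry cap as a safety net.

# Stub standing in for the real agent CLI; it "finishes" on the
# third pass by creating DONE. Replace with your actual command.
agent() {
    echo "pass with prompt: $1"
    n=$(cat count 2>/dev/null || echo 0)
    if [ "$n" -ge 2 ]; then
        touch DONE
    fi
    echo $((n + 1)) > count
}

PROMPT="convert the Java game to C++; create DONE when finished"
MAX_ITERS=20

rm -f DONE count
i=0
while [ "$i" -lt "$MAX_ITERS" ] && [ ! -f DONE ]; do
    agent "$PROMPT"
    i=$((i + 1))
done
echo "stopped after $i iteration(s)"
```

The design point is that each pass starts fresh with the full prompt, so progress has to live in the repo (files, tests, the DONE marker) rather than in the model's context.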
It's genuinely so great at long-horizon tasks! In our internal evals at Canva, GPT-5.5 solved many long-horizon frontier challenges for the first time of any AI model we've tested :) Congrats on the launch!
Can we not do growth hacking here?
We totally agree.
That's what I've been heads down, HUNGRY, working on, looking for investors and founding engineers. Psst: https://heymanniceidea.com (disclaimer: I am not associated with heymanniceidea.com)
HN is owned by a startup accelerator and venture capital firm. They do growth hacking on the front page. And you probably know that since your throwaway account is several years old.
Sorry, what is "heartbeats", exactly?
> Today we launched heartbeats in Codex: automations that maintain context inside a single thread over time.
https://x.com/pashmerepat/status/2044836560147984461
Thanks!