Hacker News

mohsen1 a day ago [ - ]

> Additionally, we’re introducing a new `ultra` mode that goes beyond the capabilities of a single agent by leveraging subagents to accelerate complex work.

I'm curious how this works. Do the subagents also get to use the same tools? Will the client be flooded with tool calls? Why charge extra for a new "model" when the same thing can happen in the client with more control?

And if it's an army of subagents, why do they compare it to Fable and Mythos? I'm guessing those models would benchmark better with a similar harness.

gck1 a day ago [ - ]

If it's anything like ClaudeCode's ultracode, it's nothing new or revolutionary.

It's essentially a bunch of subagents called by a deterministic script that the main model thread writes, each eating tokens for lunch, with their output synthesized by an orchestrator agent.
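
For readers who have not seen the pattern, the fan-out and fan-in described above might look roughly like the sketch below. It is only an illustration under stated assumptions: `run_subagent`, `synthesize`, and `ultra` are hypothetical names, the subagent is a stub rather than a real model call, and nothing here is taken from the Codex or Claude Code source.

```python
from concurrent.futures import ThreadPoolExecutor


def run_subagent(task: str) -> str:
    """Hypothetical stand-in for one subagent; a real one would call a model."""
    return f"result for {task!r}"


def synthesize(results: list[str]) -> str:
    """Hypothetical orchestrator step: merge subagent outputs into one answer."""
    return "\n".join(results)


def ultra(tasks: list[str], max_workers: int = 4) -> str:
    # Fan out: the script hands each task to its own subagent, in parallel.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(run_subagent, tasks))  # map preserves input order
    # Fan in: the orchestrator synthesizes the collected outputs.
    return synthesize(results)


if __name__ == "__main__":
    print(ultra(["audit auth module", "write tests", "update docs"]))
```

Every subagent call is a separate model invocation, which is where the token cost multiplies. `max_workers` is the knob a client-side harness would use to cap how many run at once.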

Sidio a day ago [ - ]

The fact that it's even named Ultra is pretty telling.

bogdan a day ago [ - ]

Ultra expensive

mohsen1 a day ago [ - ]

The confusion is that ultracode is not a different model with its own benchmarks.

gck1 a day ago [ - ]

Neither is OpenAI's ultra. The article specifically calls it a 'mode', and it's not even mentioned in the model card.

It's for sure a codex harness feature.

EDIT: yeah, it's the same thing. https://github.com/openai/codex/blob/main/codex-rs/core/test...

enraged_camel a day ago [ - ]

>> If it's anything like ClaudeCode's ultracode, it's nothing new or revolutionary.

OpenAI flat out copying Anthropic is a pretty funny development. It's strong evidence that they've been in catch-up mode.

gck1 a day ago [ - ]

Eh, pretty much everyone that spent some time tweaking their harness already had a homemade 'ultracode' long before Anthropic did it.

OpenAI is just way more careful with what features they add or enable by default in their harness. Anthropic's harness is a junk drawer of random features, with a new feature added every few hours. It feels like they're in panic mode, dropping random things to see what sticks when models are eventually commoditized.

I prefer the OpenAI way: slow and steady.

derwiki a day ago [ - ]

Don’t all the major harnesses (pi, Claude Code, Codex) use subagents? Definitely if you direct them to, but I’ve seen at least pi spin them up without explicit instruction.

alansaber a day ago [ - ]

Absolutely yes

te_chris a day ago [ - ]

With pi they’re an extension, but that’s pi

MVQ93 a day ago [ - ]

Which specific subagent extension do you use?

rolisz a day ago [ - ]

If it's anything like Claude Ultracode, it burns 3 million tokens in half an hour with a single prompt.

koolala 15 hours ago [ - ]

Sounds like an Agent using an Agent like Mr. Meeseeks.

jamilton a day ago [ - ]

Yeah, I'm interested too. My guess for the reason, if not purely to eke out more performance, is so they can cleanly gather real-world data on this kind of usage.

alansaber a day ago [ - ]

I'm shocked they didn't use subagents already. Maybe they're just talking about their web deployment being unified with codex?

Sidio a day ago [ - ]

With Codex, subagents are only used if you specifically prompt for them, unlike Claude Code. Odd, since OpenAI is the one with excess compute available.

helloplanets a day ago [ - ]

Deep Research has been using the Orchestrator -> Subagents -> Synthesizer loop since the beginning. It's just strange that they'd put a loop benchmark next to actual model benchmarks.

Maybe it's a tune of the base model that works especially well with the subagent loop?

simianwords a day ago [ - ]

Claude also has an ultracode mode, which is exactly the same thing. This seems to be different from Pro, however.

jiggawatts 11 hours ago [ - ]

> Will the client be flooded with tool calls?

I was just saying to colleagues that I hadn't felt the need to go past an 8-core machine until this month, when I started running parallel GPT 5.5 agents on a decent-sized codebase (over 4 MB of code). There were times I could barely move my mouse cursor!