I can't shake off the feeling that Google's Deep Think models are not really different models, just the old ones run with a higher number of parallel subagents, something you could do yourself with their base model and opencode.

They could do it this way: generate 10 reasoning traces, and every N tokens prune the 9 with the lowest likelihood, then continue from the highest-likelihood trace.
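The pruning idea above resembles beam search with periodic re-branching. A minimal toy sketch, where `step` is a hypothetical stand-in for one decoding step (a real version would need the model's per-token log-probabilities):

```python
import math
import random

def step(loglik, rng):
    """Toy stand-in for one decoding step: extend a trace's cumulative
    log-likelihood by one token's log-probability. A real implementation
    would read this from the model's logits before sampling."""
    return loglik + math.log(rng.uniform(0.1, 1.0))

def pruned_search(num_traces=10, prune_every=8, total_steps=32, seed=0):
    rng = random.Random(seed)
    traces = [0.0] * num_traces  # cumulative log-likelihood per trace
    for t in range(1, total_steps + 1):
        traces = [step(ll, rng) for ll in traces]
        if t % prune_every == 0:
            # prune the 9 weakest traces, then branch 10 fresh
            # continuations from the surviving best trace
            traces = [max(traces)] * num_traces
    return max(traces)
```

This only sketches the bookkeeping; the actual model calls and likelihood readout are the part you can't do from outside the API.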

This is a form of task-agnostic test-time search that is more general than multi-agent parallel-prompt harnesses.

10 traces makes sense because ChatGPT 5.2 Pro is 10x more expensive per token.

That's something you can't replicate without access to the network's output before token sampling (the logits).

And after I do that, how do I combine the output of 1000 subagents into one output? (I'm not being snarky here; I think it's a nontrivial problem.)

You just pipe it to another agent that does the reduce step (fan-in) of the map-reduce, after the subagents have done the map step (fan-out).
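The fan-out/fan-in shape can be sketched in a few lines. Here `run_agent` is a hypothetical placeholder for whatever LLM call your harness makes; it's stubbed as an echo so the structure is visible:

```python
from concurrent.futures import ThreadPoolExecutor

def run_agent(prompt):
    """Hypothetical agent call; stubbed as an echo for illustration."""
    return f"answer({prompt})"

def map_reduce(task, subtasks):
    # fan-out (map): each subagent gets one focused subtask,
    # with its whole context window to itself
    with ThreadPoolExecutor() as pool:
        partials = list(pool.map(run_agent, subtasks))
    # fan-in (reduce): a final "reducer" agent merges the partial answers
    merge_prompt = f"Merge these answers to '{task}': " + "; ".join(partials)
    return run_agent(merge_prompt)
```

The reducer is itself just another agent call, which is the whole point of the thread.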

It's agents all the way down.

The idea is that each subagent focuses on a specific part of the problem and can use its entire context window for a more focused subtask than the overall one. So ideally the results aren't conflicting, they're complementary, and you just have a system that merges them, likely another agent.

Claude Cowork does this by default, and you can see exactly how it coordinates them.

Start with 1024 and halve the number of agents each round to distill the final result.
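That halving scheme is a tournament-style reduction: pair up outputs, merge each pair with an agent call, and repeat until one remains. A sketch, assuming a power-of-two number of outputs and a hypothetical `merge` function standing in for the merging agent:

```python
def distill(outputs, merge):
    """Halve the field each round by merging adjacent pairs until one
    output remains. Assumes len(outputs) is a power of two."""
    rounds = 0
    while len(outputs) > 1:
        outputs = [merge(outputs[i], outputs[i + 1])
                   for i in range(0, len(outputs), 2)]
        rounds += 1
    return outputs[0], rounds

# 1024 drafts collapse to one result in log2(1024) = 10 merge rounds
final, rounds = distill([f"draft{i}" for i in range(1024)],
                        merge=lambda a, b: f"merged({a},{b})")
```

The appeal is that each merge only ever sees two inputs, so no single reducer has to hold all 1024 outputs in its context window at once.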