Hacker News

Business-wise, it would make sense to hold off on details till they're at least ready to serve. Look at what happened with OpenAI and reasoning models. Everyone struggled with getting RL to work with LLMs for a good while. OpenAI figured it out, and a few months later everyone had their prototypes out in short order. Don't forget who these labs employ. They're some of the brightest people around. Sub-q aren't really in a position for that lol. If they'd shared details at the first announcement, for instance, the big labs might have had something out by now while they're still pulling resources to scale, and then what?

cmogni1 3 hours ago

I don't think it makes sense from a business perspective to hold off on details as a new lab. OpenAI will not implement new architectural changes unless they've tested the changes themselves internally. Even if someone claims some great innovation, they'd need to do scaling experiments at somewhere between the size of GPT-4 and GPT-5 before they'd decide it is worth implementing themselves. Plenty of mechanisms that seem to work at one scale do not translate to the next.

Because the cost to OpenAI to make an architectural shift is far greater than the cost to a new lab to try something different, providing details is usually a net benefit for recruiting, building trust, getting acquired, etc. The lack of details is a poor business decision because it makes them seem untrustworthy.

I'm not advocating that they should open source their model, but there is already so much noise in the space and many bad papers that being cagey is a poor strategy for winning over talent, developers, etc.

famouswaffles 2 hours ago

>OpenAI will not implement new architectural changes unless they've tested the changes themselves internally.

OpenAI validating it can still happen faster than they can get the compute to serve the models themselves[1]. It doesn't make a lot of sense to give out details if they want to be a serious contender or even, as some have said, be acquired.

Yeah, there's noise, but if they have the real deal then it doesn't matter. The only thing they need to do is let people pay to use the models.

[1] I'm assuming this is the primary cause of the delay. That may not be the case of course.