Hacker News

gandreani a day ago

Using gpt-5.4-mini in off-peak hours already feels like super-speed to me. That's probably no more than 100-150 tok/s. I can't imagine 750!

I've always eyed Cerebras but never had a use for it that would justify paying for the API directly. Although now that I think about it, trying out the API would probably cost less than a subscription for a month...

jasonjmcghee a day ago

Try gpt-5.3-codex-spark - it's 1000 TPS and from my experience more capable than 5.4 mini.

If you have a subscription it's a different pool of usage.

small_model a day ago

Used it: very fast, but it has a tiny context window and doesn't reason well. (Good for quick, simple code changes.)

trollbridge 7 hours ago

MIMO 2.5 Pro ultraspeed has a 1M window. 1,000 tok/sec is great for planning since you can have a rapid conversation with a lot of turns.

beering a day ago

Agreed, 1000 tok/s just fills up the context window (which is big by 2004 standards) super fast. But it seems like 5.3-spark was just a taste of what’s to come.

taneq a day ago

2004 standards? O.o

mlinsey 20 hours ago

In 2004, I took a class where we trained "language models" that were bigram word models, on an archive of a couple years of the Wall Street Journal.

I remember someone who literally announced to the whole room at the end of a lecture that they were dropping the class, saying "This isn't AI!!!"

partsch a day ago

1904

bogeholm 12 hours ago

Back when we were kids, we would get 0 tokens/sec _if we were lucky_

embedding-shape a day ago

The ChatGPT subscription gives you access to the -spark model(s) in Codex, which are blazing fast (but pretty dumb) and which I think run on Cerebras hardware too.

rrvsh 16 hours ago

Is this specifically in Codex? I have been trying to use the models for months on opencode, then pi, but it says ChatGPT subscriptions don't have access to it. I was under the assumption that OpenAI doesn't lock down their models based on harness, a la Claude Code.

cactusplant7374 4 hours ago

What plan are you on? It is only available to Pro users.

kegs_ a day ago

I have a pretty good use case for gpt-oss. The time savings have actually been wild. Definitely worth a try. Just to be clear, it gets like 2000 tok/s.