Used it. Very fast, but it has a tiny context window and weak reasoning. (Good for quick, simple code changes.)

MIMO 2.5 Pro ultraspeed has a 1M window. 1,000 tok/sec is great for planning since you can have a rapid conversation with a lot of turns.

Agreed, 1,000 tok/s just fills up the context window (which is big by 2004 standards) super fast. But it seems like 5.3-spark was just a taste of what’s to come.

2004 standards? O.o

In 2004, I took a class where we trained "language models" that were bigram word models on an archive of a couple of years of the Wall Street Journal.

I remember someone who literally announced to the whole room at the end of a lecture that they were dropping the class, saying "This isn't AI!!!"

1904

Back when we were kids, we would get 0 tokens/sec _if we were lucky_