Hacker News

tomtom1337 9 hours ago

This is literally what Taalas has done with chatjimmy.ai.

Try it, it's Llama 3.1 8B at 16,000 tokens per second.

chatjimmy.ai https://taalas.com/the-path-to-ubiquitous-ai/

jupr 5 hours ago

Wow, that's incredibly fast. I like this outcome more than centralized datacenters.