This is literally what Taalas has done with chatjimmy.ai.
Try it, it's Llama 3.1 8B at 16,000 tokens per second.
chatjimmy.ai https://taalas.com/the-path-to-ubiquitous-ai/
Wow, that's incredibly fast. I like this outcome more than centralized datacenters.