I think people will always disagree on what qualifies as a "usable rate". But keep in mind that practically no one sensible is running the latest Opus or GPT around the clock, especially not at sustainable, unsubsidized prices. With open-weights models it's easy to do that.
Also, for people working with medical, privacy-critical, or otherwise sensitive data, there's an almost incalculable value (depending on industry niche) in having absolutely no external network traffic to any servers or systems you don't fully control.
I have downloaded Kimi-K2.6 (the original release).
du -sh moonshotai/Kimi-K2.6
555G moonshotai/Kimi-K2.6
du -s moonshotai/Kimi-K2.6
581255612 moonshotai/Kimi-K2.6
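The two du figures above agree with each other, and with the "600GB" quoted elsewhere in the thread, once the units are untangled. A quick sketch of the arithmetic, assuming `du -s` reports 1 KiB blocks (the GNU default; the thread doesn't say which du or environment was used) and that `du -h` rounds up to the next whole unit:

```python
import math

# Figure reported by `du -s moonshotai/Kimi-K2.6`.
# Assumption: units are 1 KiB blocks (GNU du default, not confirmed in the thread).
KIB_BLOCKS = 581_255_612

size_bytes = KIB_BLOCKS * 1024
size_gib = size_bytes / 2**30  # binary units, what `du -h` labels "G"
size_gb = size_bytes / 10**9   # decimal units, as in "600GB"

print(f"{size_gib:.1f} GiB (rounded up: {math.ceil(size_gib)}G)")
print(f"{size_gb:.1f} GB")
```

Under those assumptions the download is about 554.3 GiB, which is the 555G that `du -sh` shows, or about 595 GB in decimal units.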
For comparison (sorted by decreasing size: 3 bigger models and 3 smaller ones, all recently launched):
du -sh zai-org/GLM-5.1
1.4T zai-org/GLM-5.1
du -sh XiaomiMiMo/MiMo-V2.5-Pro
963G XiaomiMiMo/MiMo-V2.5-Pro
du -sh deepseek-ai/DeepSeek-V4-Pro
806G deepseek-ai/DeepSeek-V4-Pro
du -sh XiaomiMiMo/MiMo-V2.5
295G XiaomiMiMo/MiMo-V2.5
du -sh MiniMaxAI/MiniMax-M2.7
215G MiniMaxAI/MiniMax-M2.7
du -sh deepseek-ai/DeepSeek-V4-Flash
149G deepseek-ai/DeepSeek-V4-Flash
So, realistically, $100K for an 8x RTX 6000 Pro system that can run it at a usable rate.
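As a rough sanity check on that claim, here is a sketch comparing the du sizes listed in this thread against the pooled VRAM of an 8-GPU box. The 96 GB per card is my assumption, not something stated in the thread, and the check only covers the weights: KV cache, activations, and runtime overhead are ignored, and GiB vs GB is treated loosely.

```python
# Rough check: do the weights alone fit in the pooled VRAM of an 8-GPU system?
GPUS = 8
VRAM_PER_GPU_G = 96  # assumption: 96 GB per card, not stated in the thread
TOTAL_VRAM_G = GPUS * VRAM_PER_GPU_G

# Sizes as reported by `du -sh` in this thread (1.4T taken as 1.4 * 1024 G).
SIZES_G = {
    "zai-org/GLM-5.1": 1.4 * 1024,
    "XiaomiMiMo/MiMo-V2.5-Pro": 963,
    "deepseek-ai/DeepSeek-V4-Pro": 806,
    "moonshotai/Kimi-K2.6": 555,
    "XiaomiMiMo/MiMo-V2.5": 295,
    "MiniMaxAI/MiniMax-M2.7": 215,
    "deepseek-ai/DeepSeek-V4-Flash": 149,
}


def weights_fit(size_g: float, total_g: float = TOTAL_VRAM_G) -> bool:
    """True if the checkpoint is smaller than the pooled VRAM (weights only)."""
    return size_g < total_g


for name, size_g in SIZES_G.items():
    verdict = "fits" if weights_fit(size_g) else "does not fit"
    print(f"{name}: {size_g:.0f}G vs {TOTAL_VRAM_G}G -> {verdict}")
```

With these numbers, Kimi-K2.6 and the three smaller models fit in the assumed 768G, while the three larger ones would need more cards, offloading, or further quantization.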
The 'unsloth' link above is from a third party who has quantized it to Q8; the original release is considerably larger than 600GB:
https://huggingface.co/moonshotai/Kimi-K2.6
No.
That page mentions that the model is natively INT4 for most of the params, and 600GB is in the ballpark of what's available there for download.