How can a medium-sized model like Deepseek-V4-Flash be cheaper than a much smaller models like Qwen3.5-35B-A3B.
It's five times bigger in both total and active parameters!
How can a medium-sized model like Deepseek-V4-Flash be cheaper than a much smaller models like Qwen3.5-35B-A3B.
It's five times bigger in both total and active parameters!