> 98% smaller in terms of active parameters (since it's a mixture of experts model).
I don’t think that’s right. This flash model has 5B active params, and Qwen3.6-35B-A3B has 3B, so it’s 40% smaller.
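
For anyone who wants to check the arithmetic, here is a quick sketch. It uses only the 5B and 3B active-parameter figures given above; the helper name `percent_smaller` is made up for illustration.

```python
def percent_smaller(baseline: float, candidate: float) -> float:
    """Relative reduction of `candidate` versus `baseline`, in percent."""
    return 100.0 * (baseline - candidate) / baseline


# Active parameters in billions, as stated in the comment above.
flash_active_b = 5.0
qwen_active_b = 3.0

print(f"{percent_smaller(flash_active_b, qwen_active_b):.0f}% smaller")
```

The reduction is measured against the larger model, so the denominator is 5B rather than 3B.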