Hacker News

The smaller of the two models is open-weights and available on Hugging Face:

https://huggingface.co/Qwen/Qwen-AgentWorld-35B-A3B

Give it a day or two and the Unsloth people will probably publish Q6 and Q8 (maybe Q8XL?) quantizations in GGUF format for users of llama-server and similar tools.

npodbielski 11 hours ago

I tried to run it, but it seems that either the file is broken or it does not work on dockerized llama.cpp:

0.01.865.326 E llama_model_load: error loading model: missing tensor 'blk.40.attn_norm.weight'

khimaros 5 hours ago

That particular quant is just corrupted. These work, but seem to loop in reasoning a lot: https://huggingface.co/groxaxo/Qwen-AgentWorld-35B-A3B-GGUF
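
For anyone hitting the same "missing tensor" error, one rough way to sanity-check a downloaded file before blaming the Docker setup is to read the fixed-size GGUF header and look at the tensor count. This is only a sketch: it assumes a little-endian GGUF file of version 2 or later (where the counts are 64-bit), and `read_gguf_header` is a made-up helper name, not something from llama.cpp.

```python
import struct

GGUF_MAGIC = b"GGUF"  # first four bytes of every GGUF file


def read_gguf_header(path):
    """Return (version, tensor_count, metadata_kv_count) from a GGUF header.

    Assumes GGUF v2+ layout: 4-byte magic, uint32 version,
    uint64 tensor count, uint64 metadata key/value count, little-endian.
    """
    with open(path, "rb") as f:
        header = f.read(24)
    if len(header) < 24:
        raise ValueError("file is too short to contain a GGUF header")
    if header[:4] != GGUF_MAGIC:
        raise ValueError("not a GGUF file (bad magic bytes)")
    version, tensor_count, kv_count = struct.unpack("<IQQ", header[4:24])
    if version < 2:
        raise ValueError("GGUF v1 uses a different header layout")
    return version, tensor_count, kv_count
```

Comparing the tensor count against a quant of the same model that is known to load might hint at a bad conversion. It cannot say which tensor is missing, and a file with an intact header can still have damaged tensor data further in.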