I tried it with Qwen/Qwen3.5-27B UD-IQ3_XXS on an RTX 5060 Ti with 16 GB of VRAM. The conversion step succeeded, but loading the converted model failed with:

    Model load deferred: 'Qwen3_5Config' object has no attribute 'vocab_size'

My transformers version is up to date, and a plain AutoModelForCausalLM.from_pretrained() call shouldn't raise this exception - what's going on here?
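For what it's worth, here is a minimal stand-in that reproduces the same failure mode, assuming (and this is only my guess, the real Qwen3_5Config isn't on my machine to inspect) that the config nests vocab_size under a sub-config such as text_config, as some newer multimodal configs do, rather than exposing it at the top level:

```python
# Hypothetical stand-in objects: vocab_size lives on a nested
# text_config instead of the top-level config object.
class TextConfig:
    vocab_size = 151936  # placeholder value, not the real one


class Qwen3_5ConfigStandIn:
    text_config = TextConfig()


cfg = Qwen3_5ConfigStandIn()

# Top-level access fails, matching the reported error message:
try:
    cfg.vocab_size
except AttributeError as e:
    print(e)  # 'Qwen3_5ConfigStandIn' object has no attribute 'vocab_size'

# The value is still reachable through the nested sub-config:
print(cfg.text_config.vocab_size)
```

If that layout is indeed the cause, whatever code path prints "Model load deferred" is presumably reading config.vocab_size directly instead of falling back to the nested sub-config.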