There are a bunch of tutorials on how to use GRPO to fine tune a small Qwen. Depending what you're doing LoRA or even just prefix tuning can give pretty good results with no special hardware.
There are a bunch of tutorials on how to use GRPO to fine tune a small Qwen. Depending what you're doing LoRA or even just prefix tuning can give pretty good results with no special hardware.