Exactly, inference cost is a very good reason to fine-tune a model like Qwen.