Hell, most of us are still using llama.cpp for inference in some form.