I ran Gemma on a 2015 thinkpad to do something similar. Fortunately, I could upgrade the memory otherwise it would have been a painful exercise.

Not gonna lie, llama.cpp had the fans spinning at max speed. But it worked and I got the job done.

> the fans spinning at max speed

This always confuses me - don't people want their computations to run as fast as possible and thus inevitably produce more heat that needs to be vented?

I suppose sometimes it is just an analogy for "its utilizing 100% of my resources" (which I'm guessing it is here), but I've definitely had people say it as an actual complaint in different contexts

[delayed]