I ran Gemma on a 2015 ThinkPad to do something similar. Fortunately, I could upgrade the memory; otherwise it would have been a painful exercise.
Not gonna lie, llama.cpp had the fans spinning at max speed. But it worked and I got the job done.
> the fans spinning at max speed
This always confuses me - don't people want their computations to run as fast as possible and thus inevitably produce more heat that needs to be vented?
I suppose sometimes it is just an analogy for "it's utilizing 100% of my resources" (which I'm guessing it is here), but I've definitely had people say it as an actual complaint in different contexts.