Any halo strix laptop, I have been using the hp zbook ultra g1a with 128gb of unified memory. Mostly with the 20B parameters models but it can load larger ones. I find local models (gpt oss 20B) are good quick references but if you want to refactor or something like that you need a bigger model. I’m running llama.cpp directly and using the api it offers for neovim’s avante plugin, or a cli tool like aichat, it comes with a basic web interface as well.
Do you run into hibernation/sleep issues under current mainline Linux kernels by chance? I have this laptop and that's the only thing which isn't working out of the box for me on the Linux side, but it works fine in Windows. I know it's officially supported under the Ubuntu LTS, but I was hoping that wouldn't be needed as I do want a newer+customized kernel.
Under current kernels (6.17) it seems there is an issue with the webcam driver, https://bugzilla.kernel.org/show_bug.cgi?id=220702 . looks like there are still some issues with sleep/webcam at this time, they might be fixed by the 6.18 release.
I got sleep working by disabling webcam in the bios for now.
Well shucks, my sleep was still broken after disabling that :/. Will have to keep poking at it - thanks!