Hacker News

It's not a fixed split. I don't know whether it can be changed live or requires a reboot, but it's not hardwired.

I want to know if it's possible. 4 GB for Linux, a bit of room for the calculations, and then you can load a 122 GB model entirely into VRAM.
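To make that budget concrete, here is a quick back-of-the-envelope sketch. It assumes a machine with 128 GB of unified memory in total, which is my guess and not something stated above; the 4 GB and 122 GB figures are the ones from this comment. The function name is just for illustration.

```python
# Assumption: 128 GB of unified memory in total (not stated in the thread).
TOTAL_GB = 128
LINUX_GB = 4     # reserved for the OS, per the comment
MODEL_GB = 122   # model weights, per the comment


def headroom_gb(total_gb, os_gb, model_gb):
    """Memory left over for the calculations (KV cache, activations, scratch)."""
    return total_gb - os_gb - model_gb


print(headroom_gb(TOTAL_GB, LINUX_GB, MODEL_GB))  # 2
```

Under that assumption only about 2 GB would be left for the calculations, so the usable context length would probably be the tight part.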

How would that perform in real life? Someone please benchmark it!

yencabulator 4 days ago

You're still thinking of the old-school approach, where you set the split in the firmware and it's fixed for that boot. There's dynamic allocation on top of that these days.

I have that split set at the minimum 2 GB and I'm giving the GPU a 20 GB model to process.
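If you want to check this on your own machine, something like the sketch below might help. It assumes an AMD GPU on Linux with the amdgpu driver, which exposes the firmware carve-out as "VRAM" and the dynamically allocated system memory as "GTT" through sysfs. The `card0` path is also an assumption (the GPU may be a different card number on your system), and `read_amdgpu_mem` is just a name made up for this example.

```python
from pathlib import Path

GIB = 1024 ** 3

# amdgpu sysfs counters, reported in bytes.
COUNTERS = [
    "mem_info_vram_total",  # fixed carve-out set in firmware
    "mem_info_vram_used",
    "mem_info_gtt_total",   # system memory the GPU can map dynamically
    "mem_info_gtt_used",
]


def read_amdgpu_mem(device_dir="/sys/class/drm/card0/device"):
    """Return the counters that exist under device_dir, converted to GiB."""
    result = {}
    for name in COUNTERS:
        path = Path(device_dir) / name
        if path.is_file():  # skip silently on non-amdgpu systems
            result[name] = int(path.read_text().strip()) / GIB
    return result


if __name__ == "__main__":
    for name, gib in read_amdgpu_mem().items():
        print(f"{name}: {gib:.1f} GiB")
```

With a small firmware split, a loaded model larger than `mem_info_vram_total` should show up mostly under `mem_info_gtt_used`.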