> The RAM is split between CPU and GPU at a user-configurable ratio.
I believe the fixed split thing is a historical remnant. These days, the OS can allocate memory for the GPU to use on the fly.
Indeed, it can be reallocated, though it needs a reboot. I've gotten up to around 110 GB before running into OOM issues. I set it at 108 GB to provide a little headroom: https://www.jeffgeerling.com/blog/2025/increasing-vram-alloc...
Also, from your link:
> It seems like tools will have to adapt to dynamic VRAM allocation, as none of the monitoring tools I've tested assume VRAM can be increased on the fly.
amdgpu_top shows VRAM (the old fixed thing) and GTT (dynamic) separately.
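For anyone who wants the same two numbers without a monitoring tool, here is a minimal sketch that reads the counters the amdgpu driver exposes in sysfs. The `card0` path is an assumption (the card index differs between machines), and `read_amdgpu_memory` is just an illustrative helper name, not part of any tool.

```python
from pathlib import Path

# Per-device counters exposed by the amdgpu driver, all in bytes.
COUNTERS = (
    "mem_info_vram_total",
    "mem_info_vram_used",
    "mem_info_gtt_total",
    "mem_info_gtt_used",
)


def read_amdgpu_memory(device_dir):
    """Return the VRAM and GTT counters (in bytes) found under device_dir."""
    base = Path(device_dir)
    return {name: int((base / name).read_text().strip()) for name in COUNTERS}


if __name__ == "__main__":
    # Assumption: the GPU is card0; adjust on multi-GPU systems.
    device = Path("/sys/class/drm/card0/device")
    if (device / "mem_info_gtt_total").exists():
        for name, value in read_amdgpu_memory(device).items():
            print(f"{name}: {value / 2**30:.2f} GiB")
    else:
        print("amdgpu memory counters not found at", device)
```

VRAM here is the fixed carve-out and GTT is the dynamically mapped system memory, so a small VRAM total next to a large GTT total is what you'd expect on a machine configured as described in this thread.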
Good to know!
No need for a reboot: echo 9999 > /sys/module/ttm/parameters/pages_limit
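The value written there is a page count, not a byte count, so the 9999 above reads as a placeholder (with 4 KiB pages it would be only about 39 MiB). A rough sketch of the conversion, assuming 4 KiB pages (typical on x86-64, but check your system) and using the 108 GB figure mentioned earlier in the thread, treated here as GiB; `gib_to_pages` is a made-up helper name:

```python
PAGES_LIMIT = "/sys/module/ttm/parameters/pages_limit"  # path from the comment above


def gib_to_pages(gib, page_size=4096):
    """Convert a size in GiB to a number of pages of page_size bytes."""
    return int(gib * 2**30) // page_size


if __name__ == "__main__":
    # Assumption: 4 KiB pages; os.sysconf("SC_PAGE_SIZE") reports the real value.
    pages = gib_to_pages(108)
    # Only prints the command; actually writing the file requires root.
    print(f"echo {pages} > {PAGES_LIMIT}")
```

For 108 GiB this works out to 28311552 pages.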
You're talking about an allocator policy for when to allow GTT and when not, not the old firmware-level VRAM split, where whatever size the BIOS sets for VRAM is permanently taken away from the CPU. The max GTT limit is there to prevent accidental footguns; it's not a technological limitation. At least earlier, the default policy was to reserve 1/4 of RAM for non-GPU use, and 1/4 * 128 GB = 32 GB is more than enough, so you're looking to adjust the policy. It's just an if statement in the kernel; GTT, the mechanism, doesn't limit it. Deallocating a chunk of memory used by the GPU returns it to the general kernel memory pool, where it can then be used by the CPU again.
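To make the arithmetic concrete, here is a toy model of that policy. It is not kernel code: the 1/4 reservation is the default described in the comment above, and both function names are invented for illustration.

```python
def gtt_limit_gb(total_ram_gb, reserved_fraction=0.25):
    """Return (gtt_limit, reserved_for_cpu) in GB under a reserve-a-fraction policy."""
    reserved = total_ram_gb * reserved_fraction
    return total_ram_gb - reserved, reserved


def gtt_allocation_allowed(used_gb, request_gb, limit_gb):
    """The 'if statement': allow a request only while it stays under the limit."""
    return used_gb + request_gb <= limit_gb


if __name__ == "__main__":
    limit, reserved = gtt_limit_gb(128)
    print(f"GTT limit: {limit:.0f} GB, reserved for non-GPU use: {reserved:.0f} GB")
    # A 108 GB request is refused under the 1/4 policy but fits once the limit is raised.
    print(gtt_allocation_allowed(0, 108, limit))
    print(gtt_allocation_allowed(0, 108, 110))
```

With 128 GB of RAM the model gives a 96 GB limit and 32 GB reserved, which is why loading something larger means changing the policy rather than the hardware.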
It's not a fixed split. I don't know if it's possible live, or if it requires a reboot, but it's not hardwired.
I want to know if it's possible. 4GB for Linux, a bit of room for the calculations, and then you can load a 122GB model entirely into VRAM.
How would that perform in real life? Someone please benchmark it!
You're still thinking of the old-school approach, where you set the split in the firmware and it's fixed for that boot. There's dynamic allocation on top of it these days.
I have that split set at the minimum 2 GB and I'm giving the GPU a 20 GB model to process.