It offloads to system memory, but since there are "only" 3 billion active parameters per token, it works surprisingly well. I've been able to run models up to 29GB in size on my system with 32GB RAM, albeit very, very slowly.
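To give a rough feel for why the active parameter count matters so much once weights sit in system RAM, here's a back-of-the-envelope sketch. It assumes generation is limited purely by memory bandwidth, meaning every active weight gets read once per token. The 4-bit quantization (0.5 bytes per weight), the 50 GB/s RAM bandwidth, and the 30B dense model used for comparison are all made-up illustration values, not measurements from my machine.

```python
def max_tokens_per_second(active_params: float,
                          bytes_per_param: float,
                          bandwidth_gb_s: float) -> float:
    """Upper bound on tokens/s if each active weight is read once per token.

    Ignores compute time, caches, the KV cache, and any overhead, so real
    numbers will come in lower than this.
    """
    bytes_per_token = active_params * bytes_per_param
    return bandwidth_gb_s * 1e9 / bytes_per_token


# Hypothetical values, for illustration only.
BYTES_PER_PARAM = 0.5   # assumed 4-bit quantization
RAM_BANDWIDTH = 50.0    # assumed system RAM bandwidth in GB/s

# 3 billion active parameters (from the post above).
moe_bound = max_tokens_per_second(3e9, BYTES_PER_PARAM, RAM_BANDWIDTH)

# A hypothetical dense model where all 30 billion parameters are active.
dense_bound = max_tokens_per_second(30e9, BYTES_PER_PARAM, RAM_BANDWIDTH)

print(f"3B active:  ~{moe_bound:.1f} tokens/s upper bound")
print(f"30B active: ~{dense_bound:.1f} tokens/s upper bound")
```

Under these assumptions the bound scales inversely with the active parameter count, so a model with 10x fewer active parameters has a 10x higher ceiling at the same bandwidth. Actual speeds depend on a lot more than this, so treat it as an order-of-magnitude estimate at best.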