Are there any open-source solutions, or is Cerebrium planning to open-source the technology behind this? I liked the write-up (side note: you might want to tone down the animations; as others have said, they were a bit dizzying), but I wish it had more technical details, since cold starts are something I am really interested in.

Are there any more resources the team could point to, or any plans to open-source it eventually? I would love a deeper dive into the internals.

This sort of technology is available on GKE

https://docs.cloud.google.com/kubernetes-engine/docs/concept...

Interesting! I didn't realize they had released this. Do you know what their benchmarks are? I know that for Cloud Run they are pretty slow.

gVisor is open-source, and `cuda-checkpoint` is freely available.

gVisor's `runsc checkpoint` subcommand supports a `--save-restore-exec-argv` flag, which lets you specify a program to execute before gVisor starts taking the process snapshot.

You can fill in the blanks from there.
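
To make that a bit more concrete, here is a rough, untested sketch in Python of how the two pieces might be wired together. It only builds the command lines and prints them; it does not run anything. The container ID, image path, and PID are hypothetical placeholders, and I am assuming the flag takes the hook's argv as a single space-separated string and that the PID is the one visible wherever the hook runs. Check `runsc checkpoint --help` and the cuda-checkpoint README before relying on any of this.

```python
import shlex


def cuda_toggle_argv(pid: int) -> list[str]:
    """argv that asks cuda-checkpoint to toggle the CUDA state of one process."""
    # `--toggle` and `--pid` are the options shown in the cuda-checkpoint README.
    return ["cuda-checkpoint", "--toggle", "--pid", str(pid)]


def runsc_checkpoint_argv(container_id: str, image_path: str,
                          hook_argv: list[str]) -> list[str]:
    """argv for `runsc checkpoint` with a program to run before the snapshot."""
    # ASSUMPTION: the flag value is the hook's argv joined by single spaces.
    hook = " ".join(hook_argv)
    return [
        "runsc",
        "checkpoint",
        f"--image-path={image_path}",
        f"--save-restore-exec-argv={hook}",
        container_id,
    ]


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    cmd = runsc_checkpoint_argv("gpu-job", "/tmp/ckpt", cuda_toggle_argv(1234))
    # shlex.join quotes the flag value so the line can be pasted into a shell.
    print(shlex.join(cmd))
```

The idea is simply that the hook moves CUDA state out of the GPU and into host memory first, so that the snapshot gVisor takes afterwards contains it. How the hook discovers which PIDs hold CUDA state is left open here.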

We and the team from Modal have been upstreaming changes to the gVisor repo (https://github.com/google/gvisor/pulls) to make it compatible with cuda-checkpoint and other parts of our system. While we are both contributing fixes and performance improvements, we are unfortunately keeping some secret sauce to ourselves, but what is upstream should hopefully get most folks to a successful implementation as is.