We support running warmup requests and then forking the VM. Eventually I’d like to add the ability to export the state of the VM so the warmup can be run on a different machine.
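To make the warm-then-fork idea concrete, here is a minimal process-level analogy in Python. It is not the TinyKVM/KVMServer API: `CACHE`, `handle` and `warm_then_fork` are hypothetical names, and a plain `os.fork` (Unix only) stands in for forking a VM.

```python
import os

# Hypothetical stand-in for state built up during warmup
# (caches, initialized modules, and so on).
CACHE = {}


def handle(path):
    """Toy request handler: the first call for a path is cold, later calls are warm."""
    if path not in CACHE:
        CACHE[path] = f"rendered:{path}"
        return "cold"
    return "warm"


def warm_then_fork(warmup_paths, request_path):
    """Run warmup requests in the parent, then serve one request in a forked child."""
    for p in warmup_paths:
        handle(p)

    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: starts with a copy-on-write view of the parent's warmed state.
        os.close(read_fd)
        os.write(write_fd, handle(request_path).encode())
        os._exit(0)

    # Parent: collect the child's answer; the parent's own state is untouched.
    os.close(write_fd)
    with os.fdopen(read_fd) as reader:
        result = reader.read()
    os.waitpid(pid, 0)
    return result
```

A child forked after warming `/` sees that path as warm, and anything the child adds to its copy of `CACHE` never reaches the parent, so every fork starts from the same warmed snapshot.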

I think there’s a number of challenges with that approach, mainly getting a representative set of sample queries that will accurately optimize the reference VM. I wonder if harvesting VM state at scale, based on pages that are duplicated across machines, might work. Of course, then you have problems with ASLR and with how to reconstruct a VM that can actually use that data.
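A rough sketch of what finding pages that are duplicated across machines could look like, assuming you can get raw memory snapshots as bytes. The 4 KiB page size and the `page_hashes` and `shared_pages` helpers are assumptions for illustration only.

```python
import hashlib
from collections import Counter

PAGE_SIZE = 4096  # assumed page size


def page_hashes(snapshot: bytes) -> set[str]:
    """Hash each page-sized chunk of one machine's memory snapshot."""
    return {
        hashlib.sha256(snapshot[i:i + PAGE_SIZE]).hexdigest()
        for i in range(0, len(snapshot), PAGE_SIZE)
    }


def shared_pages(snapshots: list[bytes], min_machines: int) -> set[str]:
    """Return hashes of pages that appear on at least `min_machines` snapshots."""
    counts = Counter()
    for snap in snapshots:
        # A set per snapshot, so a page repeated within one machine counts once.
        counts.update(page_hashes(snap))
    return {h for h, n in counts.items() if n >= min_machines}
```

This also shows where ASLR bites: a page that embeds absolute addresses differs byte for byte between machines, so its hash never matches even when the logical content is the same.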

The more representative the warmup set, the better the result, but even a quite simplistic approach is helpful, since much of what you want to optimise is not page dependent: the React rendering infrastructure, the router, the server, and, in the case of Deno, the runtime-level code written in JS.

I suspect harvesting VM state from a production workload would be counterproductive to the goal of isolation.

I meant harvesting the pages for the JIT and somehow reusing them to prewarm the JIT state, not the heap state overall. The heap state itself is definitely handled by the simple prewarming you describe, since it covers the state in various code paths that might take time to initialize.

I’m not saying it’s not helpful. I’m just flagging that JIT research is pretty clear that the performance improvements from a JIT are hugely dependent on actually running the realistic code paths and data types that you see over and over again. If there’s divergence, you get suboptimal or even negative gains, because the JIT will start generating code optimized for paths you don’t actually care about. If you have control of the JIT, you can mitigate some of these problems, but it sounds like you don’t, in which case it’s something to keep in mind as a problem at scale. I.e., it could end up being 5-10% of global compute, I think, if all your traffic is JIT, and it would certainly hurt the latencies of code running on your service. Of course, I’m sure you’ve got bigger technical problems to solve. It’s a very interesting approach for sure. Great idea!

Thanks! I can see how that would be useful, but it sounds like it would require deep integration with the JIT. With the TinyKVM/KVMServer approach we have a well-defined boundary in the Linux system call interface to work with. It's been quite surprising to me how much is possible with such a small amount of code.

For sure. I think, though, you might want more non-JIT customers because a) Cloudflare and AWS have a better story there, so customer acquisition is more expensive, and b) you have a much stronger story for things they have to break down to WASM, as WASM has significant penalties. E.g., if I had an easy Cloudflare-like way to deploy Rust, that would be insanely productive.