I have used Replicate Cog, which is built on Docker, fairly heavily and find it a decent compromise of features. Docker taking this use case more seriously is quite welcome, though surprisingly late. Local Metal GPU support (where available to the containerized application's APIs), which Cog does not currently offer, is attractive, though it would require generalizing application code so that the same container can run on CUDA, Metal, etc.
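
To make the "generalization" point concrete, here is a rough sketch of the kind of backend selection the application code would need. It assumes a PyTorch-based model; `pick_device` and `detect_device` are hypothetical helper names, not part of Cog or Docker, and whether Metal (MPS) is actually visible inside a given container is itself an assumption.

```python
def pick_device(cuda_available: bool, mps_available: bool) -> str:
    """Return the preferred backend name, falling back to CPU."""
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"
    return "cpu"


def detect_device() -> str:
    """Probe PyTorch, if installed, and pick a backend (hypothetical helper)."""
    try:
        import torch
    except ImportError:
        # No PyTorch in this environment: nothing to accelerate.
        return "cpu"
    # Older PyTorch builds have no torch.backends.mps, so look it up defensively.
    mps = getattr(torch.backends, "mps", None)
    return pick_device(
        torch.cuda.is_available(),
        mps is not None and mps.is_available(),
    )
```

The model code would then call something like `model.to(detect_device())` instead of hard-coding `"cuda"`, so one image could run on either kind of host.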

I knew about Replicate but not about Cog, so here's a link for others who are similarly interested: https://github.com/replicate/cog#how-it-works (Apache 2)